# PLANT PHENOTYPING AND PHENOMICS FOR PLANT BREEDING

EDITED BY: Gustavo A. Lobos, Anyela V. Camargo, Alejandro del Pozo, Jose L. Araus, Rodomiro Ortiz and John H. Doonan

PUBLISHED IN: Frontiers in Plant Science

#### Frontiers Copyright Statement

© Copyright 2007-2018 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use. ISSN 1664-8714 ISBN 978-2-88945-532-4 DOI 10.3389/978-2-88945-532-4


# PLANT PHENOTYPING AND PHENOMICS FOR PLANT BREEDING

Topic Editors:

Gustavo A. Lobos, Universidad de Talca, Chile; Anyela V. Camargo, National Institute of Agricultural Botany, United Kingdom; Alejandro del Pozo, Universidad de Talca, Chile; Jose L. Araus, University of Barcelona, Spain; Rodomiro Ortiz, Swedish University of Agricultural Sciences, Sweden; John H. Doonan, Aberystwyth University, United Kingdom

Plant phenotyping under controlled conditions. Image: National Plant Phenomics Centre (NPPC, Aberystwyth University, United Kingdom).

As a consequence of global climate change, reductions in both the yield potential and the available surface area of cultivated species will compromise the production of food needed for a constantly growing population. There is consensus about the significant gap between the world food consumption projected for the coming decades and the expected crop yield improvements, which are estimated to be insufficient to meet the demand.

The complexity of this scenario will challenge breeders to develop cultivars that are better adapted to adverse environmental conditions, therefore incorporating a new set of morpho-physiological and physico-chemical traits; a large number of these traits have been found to be linked to heat and drought tolerance.

Currently, the only reasonable way to satisfy all these demands is through the acquisition of high-dimensional phenotypic data (high-throughput phenotyping), giving researchers a holistic comprehension of plant responses, or 'phenomics'.

Phenomics is still under development. This Research Topic aims to be a contribution to the progress of methodologies and analysis to help understand the performance of a genotype in a given environment.

Citation: Lobos, G. A., Camargo, A. V., del Pozo, A., Araus, J. L., Ortiz, R., Doonan, J. H., eds. (2018). Plant Phenotyping and Phenomics for Plant Breeding. Lausanne: Frontiers Media. doi: 10.3389/978-2-88945-532-4

# First Latin-American Conference on Plant Phenotyping and Phenomics for Plant Breeding November 30, December 1 & 2, 2015

Hosted by

Universidad de Talca, Talca, Chile

Conveners

Dr. Gustavo A. Lobos, Plant Breeding and Phenomic Center, Facultad de Ciencias Agrarias, Universidad de Talca, Talca, Chile

Dr. Anyela Camargo, National Plant Phenomics Centre, IBERS, Plas Gogerddan, Aberystwyth University, Aberystwyth, United Kingdom


# Table of Contents

#### EDITORIAL

*Editorial: Plant Phenotyping and Phenomics for Plant Breeding* Gustavo A. Lobos, Anyela V. Camargo, Alejandro del Pozo, Jose L. Araus, Rodomiro Ortiz and John H. Doonan

#### 1. INTRODUCTION


#### 2. CHARACTERIZATION OF THE PLANT: FROM THE GENE TO POPULATION RESPONSES BY REMOTE SENSING


Henry M. Barber, Martin Lukac, James Simmonds, Mikhail A. Semenov and Mike J. Gooding


Kai Cao, Lirong Cui, Xiaoting Zhou, Lin Ye, Zhirong Zou and Shulin Deng

*Physiological Traits Associated With Wheat Yield Potential and Performance Under Water-Stress in a Mediterranean Environment* Alejandro del Pozo, Alejandra Yáñez, Iván A. Matus, Gerardo Tapia, Dalma Castillo, Laura Sanchez-Jardón and José L. Araus


James F. Hancock, Suneth S. Sooriyapathirana, Nahla V. Bassil, Travis Stegmeir, Lichun Cai, Chad E. Finn, Eric Van de Weg and Cholani K. Weebadde


Susan Medina, Rubén Vicente, Amaya Amador and José Luis Araus


Meilian Tan, Jianfeng Xue, Lei Wang, Jiaxiang Huang, Chunling Fu and Xingchu Yan

*Physiological Mechanisms Underlying the High-Grain Yield and High-Nitrogen Use Efficiency of Elite Rice Varieties Under a Low Rate of Nitrogen Application in China*

Lilian Wu, Shen Yuan, Liying Huang, Fan Sun, Guanglong Zhu, Guohui Li, Shah Fahad, Shaobing Peng and Fei Wang

*Overexpression of* OsDof12 *Affects Plant Architecture in Rice (*Oryza sativa *L.)*

Qi Wu, Dayong Li, Dejun Li, Xue Liu, Xianfeng Zhao, Xiaobing Li, Shigui Li and Lihuang Zhu


Dongdong Yu, Lihua Zhang, Kai Zhao, Ruxuan Niu, Huan Zhai and Jianxia Zhang

#### 3. NOVEL METHODOLOGICAL APPROACHES AND SOFTWARE DEVELOPMENT

*Methodology for High-Throughput Field Phenotyping of Canopy Temperature Using Airborne Thermography*

David M. Deery, Greg J. Rebetzke, Jose A. Jimenez-Berni, Richard A. James, Anthony G. Condon, William D. Bovill, Paul Hutchinson, Jamie Scarrow, Robert Davy and Robert T. Furbank

*Assessing Wheat Traits by Spectral Reflectance: Do We Really Need to Focus on Predicted Trait-Values or Directly Identify the Elite Genotypes Group?*

Miguel Garriga, Sebastián Romero-Bravo, Félix Estrada, Alejandro Escobar, Iván A. Matus, Alejandro del Pozo, Cesar A. Astudillo and Gustavo A. Lobos


Prashant Kaushik, Jaime Prohens, Santiago Vilanova, Pietro Gramazio and Mariola Plazas


Omar Vergara-Díaz, Mainassara A. Zaman-Allah, Benhildah Masuka, Alberto Hornero, Pablo Zarco-Tejada, Boddupalli M. Prasanna, Jill E. Cairns and José L. Araus

# Editorial: Plant Phenotyping and Phenomics for Plant Breeding

Gustavo A. Lobos<sup>1</sup>\*, Anyela V. Camargo<sup>2</sup>\*, Alejandro del Pozo<sup>1</sup>, Jose L. Araus<sup>3</sup>, Rodomiro Ortiz<sup>4</sup> and John H. Doonan<sup>5</sup>

<sup>1</sup> PIEI Adaptación de la Agricultura al Cambio Climático, Facultad de Ciencias Agrarias, Plant Breeding and Phenomic Center, Universidad de Talca, Talca, Chile, <sup>2</sup> The John Bingham Laboratory, Genetics and Breeding, National Institute of Agricultural Botany, Cambridge, United Kingdom, <sup>3</sup> Plant Physiology Section, University of Barcelona, Barcelona, Spain, <sup>4</sup> Department of Plant Breeding, Swedish University of Agricultural Sciences, Alnarp, Sweden, <sup>5</sup> National Plant Phenomics Centre, Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, United Kingdom

Keywords: Latin America, high-throughput phenotyping, forward phenomics, reverse phenomics, software development

**Editorial on the Research Topic**

#### **Plant Phenotyping and Phenomics for Plant Breeding**

#### Edited by:

Chengdao Li, Murdoch University, Australia

#### Reviewed by:

Saleh Alseekh, Max Planck Institute of Molecular Plant Physiology (MPG), Germany

#### \*Correspondence:

Gustavo A. Lobos, globosp@utalca.cl; Anyela V. Camargo, anyela.camargorodriguez@niab.com

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 27 September 2017; Accepted: 12 December 2017; Published: 22 December 2017

#### Citation:

Lobos GA, Camargo AV, del Pozo A, Araus JL, Ortiz R and Doonan JH (2017) Editorial: Plant Phenotyping and Phenomics for Plant Breeding. Front. Plant Sci. 8:2181. doi: 10.3389/fpls.2017.02181

#### INTRODUCTION

A major challenge for food production in the coming decades is to meet the food demands of a growing population (Beddington, 2010). The difficulty of expanding agricultural land, along with the effect of climate change and the increase in world population, are the current societal changes that make it necessary to accelerate research to improve yield potential and adaptation to stressful environments (Lobos et al., 2014; Camargo and Lobos). Increasing yields will require implementing novel approaches in gene discovery and plant breeding that will significantly increase both production per unit of land area and resource use efficiency (Parry and Hawkesford, 2010; Tanger et al., 2017). A critical component for accelerating the development of new and improved cultivars is the rapid and precise phenotypic assessment of thousands of breeding lines, clones or populations over time (Fu, 2015) and under diverse environments. The only reasonable way to satisfy all these demands is through the acquisition of high-dimensional phenotypic data (high-throughput phenotyping) or "phenomics" (Houle et al., 2010). This approach may predict complex characters that are relevant for plant selection (forward phenomics), and will also provide explanations as to why a given genotype stands out in a specific environment (reverse phenomics) (Camargo and Lobos).

Phenotype can be defined as the characteristics of an organism resulting from the interaction between genotype, environment and crop management. Phenomics involves the gathering of appropriate phenotypic data at multiple levels of organization, to progress toward a more complete characterization of the phenotypic space generated by a particular genome or set of genomes (Dhondt et al., 2013). Thus, plant phenotyping can operate at different levels of resolution and dimensionality, from the molecular to the whole plant, and in different environments, from controlled to field conditions. Although each level focuses on particular traits, the ultimate goal is to integrate knowledge from the bottom up to produce cultivars with higher performance. In that regard, the use of plant phenotyping methods as part of breeding programs has become a powerful research tool to help breeders generate cultivars more adaptable to diverse and challenging environmental scenarios (Camargo and Lobos).

This Research Topic (RT) issue is based on contributions from invited speakers at the First Latin-American Conference on Plant Phenotyping and Phenomics for Plant Breeding, held in 2015 at Universidad de Talca (Talca, Chile), and from other scientists who are currently working on phenomics and plant breeding. Thus, the categories and scope are diverse (review, perspective, and original research) and address different objectives through various levels of resolution and dimensionality. Interestingly, even though most of the phenotyping and phenomics research for plant breeding has been developed for model plants and cereals (Lobos and Hancock; Camargo and Lobos), this RT highlights the feasibility of implementing these approaches in breeding programs targeting other crop species.

#### CHARACTERIZATION OF THE PLANT: FROM THE GENE TO POPULATION RESPONSES BY REMOTE SENSING

Many of the articles comment on knowledge gained from model plants that was applied to crops. For example, Liu et al. proposed a pathway model for trichome development in cucumber and compared it with the model from Arabidopsis thaliana. Yu et al. identified VaERD15 as a transcription factor gene associated with cold tolerance in Chinese wild Vitis amurensis; its expression levels increased after low-temperature treatment, enhancing cold tolerance. Cao et al. analyzed gene expression in transgenic material to unveil the functional roles of four expressed FT-like genes in tomato. These authors also demonstrated the functional differentiation between FT-like genes in controlling flowering through overexpression in A. thaliana and VIGS-mediated knockdown in tomato. Awlia et al. were able to show further that phenotyping multiple quantitative traits in one experimental setup can provide new insights into the dynamics of plant responses to stress, suggesting the use of forward genetics studies to identify genes underlying early responses to stress.

The small grain cereals are very well represented in this RT issue. For example, Wu et al. found that overexpressing OsDof12 in rice could lead to reduced plant height, erect leaves, shortened leaf blades, and smaller panicles resulting from a decreased number of primary and secondary branches. Barber et al. performed 1-day transfers of pot-grown wheat to replicated controlled environments, providing strong evidence that the key phases susceptible to heat stress at booting and anthesis in wheat are discrete and that genotypes vary with regard to the most susceptible growth stage; at anthesis, the north European allele Rht-D1b was associated with higher tolerance to heat stress. Adriani et al. analyzed the effect of different quantitative trait loci (QTL) and their interaction with growing conditions on panicle size and number in rice. Their results showed that grain production was enhanced by qTSN only under shading conditions, where panicle number was not affected while photosynthesis and starch storage in internodes were enhanced. Similarly, Camargo et al. established the value of systematically phenotyping genetically unstructured populations to reveal the genetic architecture underlying morphological variation in commercial wheat, identifying QTL associated with phenological characters such as heading and the onset of flag leaf senescence, as well as with morphological traits such as stem height. In order to have consistency between stress conditions and seasons, del Pozo et al. highlighted the importance of defining and deeply characterizing the target environment before determining the set of phenotyping traits for selection.

Integration of molecular analyses with whole-plant phenotyping deepens our understanding of responses to environmental variables, as demonstrated by Sanchez-Bragado et al., who proposed the combination of δ<sup>15</sup>N and N content as an affordable tool to phenotype the relative contribution of different plant parts to the grain N in wheat under contrasting water and nitrogen conditions. Similarly, as selection criteria for breeding N-efficient rice cultivars, Wu et al. suggested that promoting pre-heading growth could increase total nitrogen uptake at maturity, while high biomass accumulation during the grain filling period and large panicles are important for higher nitrogen use efficiency for grain production. Nigro et al. also suggested the use of glutamine synthetase activity and expression as a candidate proxy to select genotypes having high grain protein content in wheat. Fisher et al. found that at early stages of drought stress, metabolic profiling could be used as an efficient tool to discriminate between tolerant and susceptible genotypes of the model plant Brachypodium distachyon. Medina et al. concluded for wheat that the combination of phenotyping and gene expression analysis is a useful approach to identify phenotype-genotype relationships and their behavior in response to different environments, which mostly follows from the combination of water regimes and CO<sub>2</sub> levels during vegetative stages.

Kumar et al. used molecular, phenotypic, and geographical diversity to develop a compact composite core collection in the oilseed crop safflower (Carthamus tinctorius L.) that will facilitate the identification of genetic determinants of trait variability. Mora et al. emphasized that a large number of spurious QTL could be detected when the genetic covariance matrix is ignored in the mixed model analysis, increasing the rate of false positives. In strawberry breeding programs, Hancock et al. proved that much of the cost associated with DNA marker discovery for marker-aided breeding (MAB) can be eliminated if a diverse, segregating population is generated, genotyped, and made available to the global breeding community. Tan et al. proposed the use of Digital Gene Expression (DGE) as an efficient tool to find differences in the transcriptional responses of different tissues/organs of castor plants (Ricinus communis L.) subjected to stress, and also to understand the molecular mechanisms associated with sex variation. Among approaches to improve yield potential, delayed leaf senescence or stay-green attributes were also addressed in this RT (Balakrishnan et al., 2016; Camargo et al.; Fisher et al.; Wu et al.; Yang et al.; del Pozo et al.).

#### NOVEL METHODOLOGICAL APPROACHES AND SOFTWARE DEVELOPMENT

This issue also considers methodological approaches to characterize and classify cells and to quantify fluorescence at the sub-cellular level (Hall et al.). Likewise, using a wheat panel of elite cultivars and non-adapted genetic resources growing under different adverse environments, Tattaris et al. compared on-ground proximal assessments of canopy temperature (CT) and NDVI with data collected from unmanned aerial vehicles (UAV) and satellite; considering statistical analyses, cost and the feasibility of measuring a high number of genotypes at any moment, the authors recommend the use of UAV for plant breeding purposes. Vergara-Díaz et al. highlighted the advantages of using spectral reflectance indices (SRI) derived from Red-Green-Blue (RGB) digital images as a low-cost tool for the prediction of several traits (e.g., grain yield, leaf N concentration and the ratio of carbon to nitrogen) that are highly valuable for maize breeders. Likewise, Garriga et al. proposed the use of classification methods (PLS-DA) as an efficient tool to directly identify the elite genotypes group instead of focusing on the estimation of predicted trait-values.
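As a brief illustration of the kind of index compared across platforms above, NDVI is conventionally computed from near-infrared (NIR) and red reflectance as (NIR − Red)/(NIR + Red). The following is a minimal Python sketch under that standard definition; the function name, plot names and reflectance values are hypothetical and are not data from the studies summarized here.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        raise ValueError("NIR + red reflectance must be non-zero")
    return (nir - red) / total


# Hypothetical per-plot mean reflectances (NIR, red); illustrative values only.
plots = {"plot_A": (0.50, 0.10), "plot_B": (0.30, 0.15)}

# Rank plots from the highest to the lowest NDVI.
ranking = sorted(plots, key=lambda name: ndvi(*plots[name]), reverse=True)
for name in ranking:
    print(f"{name}: NDVI = {ndvi(*plots[name]):.2f}")
```

In a breeding context the same per-plot calculation would be applied to reflectance extracted from ground, UAV or satellite imagery, so the index itself is platform-independent while its spatial resolution is not.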

Software developments were also highlighted in this RT. For example, Deery et al. showed the efficiency of a custom-developed airborne thermography image processing software (ChopIt), used for plot segmentation and extraction of canopy temperature from each individual plot by a non-technical user. Kaushik et al. performed a deep characterization of eggplant using the Tomato Analyzer (Rodríguez et al., 2010). Using Spectral Knowledge (SK-UTALCA; Lobos and Poblete-Echeverría, 2017), Garriga et al. performed an exploratory analysis of high-resolution spectral reflectance data, testing 255 SRI at the same time. RGB images were filtered by a retina filter (Benoit et al., 2010), which enhanced the contrast between plant and background, improving color consistency and providing spatial noise removal and luminance correction (Fisher et al.). Awlia et al. used the PlantScreen™ Compact System (PSI, Czech Republic), equipped with chlorophyll fluorescence and RGB imaging as well as automatic weighing and watering of plants, to test the response of A. thaliana to salinity. The open source Breedpix 0.2 software (Casadesus et al., 2007) was used by Vergara-Díaz et al. for the calculation of several RGB vegetation indices based on the different properties of color inherent in RGB images.
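To make the idea of a color-based RGB vegetation index concrete, the sketch below computes the fraction of pixels whose hue falls in a green window. It is only an illustrative example using Python's standard `colorsys` module and is not the code of any of the software packages named above; the function name, the 60 to 120 degree hue window and the pixel values are assumptions chosen for illustration.

```python
import colorsys


def green_fraction(pixels, hue_min=60.0, hue_max=120.0):
    """Fraction of RGB pixels whose hue (degrees) falls in [hue_min, hue_max]."""
    if not pixels:
        raise ValueError("at least one pixel is required")
    green = 0
    for r, g, b in pixels:
        # colorsys expects channels in 0..1 and returns hue in 0..1.
        hue, _sat, _val = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        if hue_min <= hue * 360.0 <= hue_max:
            green += 1
    return green / len(pixels)


# Hypothetical 4-pixel "image": two canopy-like greens, two soil-like browns.
image = [(51, 153, 26), (40, 120, 30), (128, 102, 77), (150, 110, 90)]
print(f"Green fraction: {green_fraction(image):.2f}")
```

Working in hue rather than raw channel values makes such an index less sensitive to overall brightness, which is one reason color-space transforms are attractive for low-cost field imaging.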

#### CONCLUSIONS

This RT highlights the importance of a holistic characterization of the crop, from the cell to the plant level, using currently available tools for plant phenotyping. Gathering, integrating, and making inferences on these data will help breeding programs to speed up the release of cultivars tolerant to stressful environments. Alongside more data, training programs should be established to guarantee the use and adoption of new technologies.

In addition to aspects associated with phenomics and breeding, the RT also discussed future challenges such as the need for multidisciplinary research and a better and deeper characterization of the environment. The progress of forward and reverse phenomics has great potential to accelerate the improvement of yield potential, and we expect rapid developments to continue in this subject area, especially in Latin America.

#### AUTHOR CONTRIBUTIONS

GL, AC, AdP, JA, RO, and JD co-wrote this editorial based on the various contributions to this Research Topic.

#### ACKNOWLEDGMENTS

We would like to thank all the companies supporting the First Latin-American Conference on Plant Phenotyping and Phenomics for Plant Breeding (Phenospex, LemnaTec, Photon Systems Instruments-PSI, and Ivens Chile). In Chile, this editorial was supported by the National Commission for Scientific and Technological Research CONICYT (FONDEF IDEA 14I10106 & 14I20106) and the Universidad de Talca (research programs "Adaptation of Agriculture to Climate Change-A2C2" and "Núcleo Científico Multidisciplinario").


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Lobos, Camargo, del Pozo, Araus, Ortiz and Doonan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Latin America: A Development Pole for Phenomics

Anyela V. Camargo<sup>1</sup>\* and Gustavo A. Lobos<sup>2</sup>\*

<sup>1</sup> The National Plant Phenomics Centre, Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK, <sup>2</sup> Facultad de Ciencias Agrarias, Plant Breeding and Phenomic Center, PIEI Adaptación de la Agricultura al Cambio Climático (A2C2), Universidad de Talca, Talca, Chile

Latin America and the Caribbean (LAC) has long been associated with the production and export of a diverse range of agricultural commodities. Due to its strategic geographic location, which encompasses a wide range of climates, it is possible to produce almost any crop. The climate diversity in LAC is a major factor in its agricultural potential, but this also means that climate change represents a real threat to the region. Therefore, LAC farming must prepare for and quickly adapt to an environment that is likely to feature long periods of drought, excessive rainfall and extreme temperatures. With the aim of moving toward a more resilient agriculture, LAC scientists have created the Latin American Plant Phenomics Network (LatPPN), which focuses on LAC's economically important crops. LatPPN's key strategies to achieve its main goal are to: (1) train LAC members in plant phenomics and phenotyping, (2) establish international and multidisciplinary collaborations, (3) develop standards for data exchange and research protocols, (4) share equipment and infrastructure, (5) disseminate data and research results, (6) identify funding opportunities and (7) develop strategies to guarantee LatPPN's relevance and sustainability over time. Despite the challenges ahead, LatPPN represents a big step forward toward the consolidation of a common mind-set in the field of plant phenotyping and phenomics in LAC.

#### Edited by:

Susana Araújo, Universidade Nova de Lisboa, Portugal

#### Reviewed by:

Sebastien Carpentier, KU Leuven, Belgium; Biswapriya Biswavas Misra, Texas Biomedical Research Institute, USA

#### \*Correspondence:

Anyela V. Camargo, avc1@aber.ac.uk; Gustavo A. Lobos, globosp@utalca.cl

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

Received: 12 May 2016; Accepted: 02 November 2016; Published: 06 December 2016

#### Citation:

Camargo AV and Lobos GA (2016) Latin America: A Development Pole for Phenomics. Front. Plant Sci. 7:1729. doi: 10.3389/fpls.2016.01729

Keywords: LAC, climate change, genomic, phenotyping, plant breeding, LatPPN

#### WHY IS PHENOMICS KEY TO FACING CLIMATE CHANGE AND FOOD SECURITY?

In the past decades, climatic variations related to the El Niño and La Niña phenomena have brought serious challenges to the agricultural sector in LAC. While drought is the main threat to food production associated with La Niña, El Niño can cause heavy rains, flooding or extremely hot or cold weather (Allen and Ingram, 2002). In the last 150 years, Earth's temperature increased at a rate of 0.045°C per decade, with an almost four-fold higher rate (0.177°C per decade) in the last 25 years (IPCC, 2007), and it will continue to rise by another 1.1–6.4°C over the next century (Jin et al., 2011). This increase in temperature can lead to several agriculture-related problems, such as yield reduction as a result of droughts and the emergence and spread of plant diseases and pests (FAO, 2016). Therefore, a better use of plant genetic resources and plant breeding (Borrás and Slafer, 2008) is key to tackling the imminent impact of climate change on food security. Further, a multidisciplinary approach that includes disciplines such as omics technologies (e.g., genomics, phenomics, proteomics, and metabolomics), plant physiology, eco-physiology, plant pathology and entomology, and soil science will be critical to increase crop resilience to climate change (Reynolds et al., 2016).

Undoubtedly, public and private breeding programs face the challenge of producing stress-tolerant cultivars whose yield potential and quality are also high. In order to increase the chances of producing desirable cultivars, breeders make a high number of crosses (e.g., Chilean wheat breeding programs generate ∼800 crosses per year) and screen them under a limited number of environmental conditions (Araus and Cairns, 2014). Line crossing is a common experimental design for mapping quantitative trait loci (QTLs) in plant breeding. Crosses are initiated from at least two inbred lines and yield populations such as backcrosses, F2, and more derived generations (Xie et al., 1998). To increase the statistical inference space of the estimated QTL variance and ensure that polymorphic alleles are present in the parental gene pool, a sufficient number of parents must be sampled (Muranty, 1996). The number of traits measured per plot is normally limited by the size of the population. Increasing the number of traits to be measured requires additional time, resources and the use of skilled labor (Kipp et al., 2014). This represents a limitation to the understanding of the genotype × environment interaction (G × E) (Furbank and Tester, 2011; Yang et al., 2014; Großkinsky et al., 2015; Rahaman et al., 2015).

Although genome sequencing has become relatively fast, cheap, and easy, plant phenomics still lags behind. This imbalance has become a bottleneck in the understanding of G × E and it also limits the possibility of carrying out tests under field conditions (Lobos and Hancock, 2015). Therefore, there is a need to incorporate the evaluation of multiple morpho-physiological and physico-chemical traits at the high-throughput level in order to understand, for example, pleiotropy or the genomic variants that give rise to a particular phenotype (Houle et al., 2010; Fahlgren et al., 2015).

Due to the cost of high-throughput plant phenotyping, several international phenotyping networks have been established with the idea of joining efforts and producing research with impact. Some of the most prominent networks are: the European Plant Phenotyping Network (EPPN), Food and Agriculture COST Action FA1306, the International Plant Phenotyping Network (IPPN), the Australian Phenomics Network (APN), the German Plant Phenotyping Network (DPPN) and the U.K. Plant Phenomics Network (UKPPN). In Asia, the 3rd International Plant Phenotyping Symposium was held in Chennai, India in 2014, and the 1st Asia-Pacific Plant Phenotyping will be held in Beijing, China in October 2016. More recently in North America, the United States of America launched the North American Plant Phenotyping Network (NAPPN).

#### DOES LATIN AMERICA AND THE CARIBBEAN NEED TO WORRY ABOUT PHENOMIC DEVELOPMENT?

Latin America is a region that includes Mexico, the Spanish/Portuguese speaking countries in Central America and the whole of South America, as well as the Caribbean (Latin America and the Caribbean, LAC). The region is highly heterogeneous in terms of climate, ecosystems, human population distribution, politics, economy and incomes, and cultural traditions. Out of a total of 17 megadiverse countries identified by the World Conservation Monitoring Centre (http://www.unep-wcmc.org), six are in Latin America, namely Brazil, Colombia, Ecuador, Mexico, Peru, and Venezuela. Furthermore, of the eight primary centers of origin and diversity, numbers VII (South Mexican and Central American) and VIII (South America Andes region: Bolivia, Peru, Ecuador; VIIIa The Chilean Center, and VIIIb Brazilian-Paraguayan Center) are based in the region (Vavilov, 1992).

Due to LAC's diverse geography, climate change will impact the region severely. Compared to pre-industrial times, the mean temperature in the region is estimated to increase by about 4.5°C by the end of the century (Reyer et al., 2015). Temperatures are expected to increase dramatically in the tropics and moderately in the subtropical regions in the north (Mexico) and south (southern Chile, Argentina and Uruguay) (Reyer et al., 2015). Annual precipitation is also likely to increase in Argentina, Uruguay, Brazil, Peru, Ecuador, and Colombia and to decrease in the rest of the countries (Reyer et al., 2015). These changes have a direct impact on agricultural crop yields. Crops such as wheat, soybean and maize are expected to lose yield potential, while others such as rice and sugar cane are expected to gain it (Fernandes et al., 2012; Marin et al., 2012).

The economic development of the regions where plant phenotyping and phenomics have been developed in the last 10 years (high-income countries) is completely different from that of LAC. According to the World Bank, around 37% of the LAC population lives in poverty or extreme poverty (World Bank, 2014), and nearly 60% of the people living in rural areas are in extreme poverty (RIMISP, 2011). Therefore, besides its effects on LAC agriculture, climate change also has a significant knock-on effect on the region's economy, particularly affecting the lower socioeconomic strata (Ortiz, 2012).

Although LAC countries have grown wealthier, government efforts are mainly focused on priority areas such as education, health, employability, and infrastructure. Research and innovation in areas such as agriculture have been given a low priority. As a result, most Latin American farmers do not have the resources or the support to adapt effectively to a changing climate that is already showing its negative impact on agriculture (Lobos and Hancock, 2015). LAC scientists and the private sector must therefore work together to develop strategies for moving toward a more resilient agriculture, and one of them is the use of plant phenomics and phenotyping for breeding.

Phenomics has become a powerful research tool to help breeders generate cultivars adaptable to more challenging environmental scenarios. In the past decade, phenomics has focused mainly on the breeding of grain crops, and its application in other species of relevance for LAC (e.g., fruit, vegetables, forage and others) is almost absent (Lobos and Hancock, 2015).

The potential of recent advances in phenomics encouraged the Plant Breeding and Phenomic Center (Dr. Gustavo A. Lobos, Universidad de Talca, Talca, Chile) and the National Plant Phenomics Centre (Dr. Anyela Camargo, IBERS, Aberystwyth University, U.K.) to organize the First Latin American Conference on Plant Phenotyping and Phenomics for Plant Breeding (November 30th to December 2nd, 2015, Talca, Chile). This event had three main goals: (1) to bring together Latin American researchers and students, international keynote speakers, and plant breeding companies from around the world to present their ongoing work on plant phenomics and phenotyping for plant breeding; (2) to hold a workshop to train Latin American scientists and postgraduate students in the use of key plant phenotyping tools, the analysis of data and the mapping of traits to the genome; and (3) to set up the Latin American Plant Phenomics Network (LatPPN), conceived to facilitate training in high-throughput phenotyping and pre-breeding methodologies, the scientific exchange of young and senior researchers and students, and improved access to resources and research facilities.

The conference covered a broad range of topics such as pre-breeding and breeding strategies, methods to measure and analyse trait data for plant breeding, and strategies to translate research from the bench to the field. International keynote speakers gave seminal talks and chaired the track of their expertise. Challenges and opportunities were also explored, such as the handling of the large volume of data generated through high-throughput phenotyping, and multiple ideas were discussed to deal with each particular challenge. Participants also had the opportunity to attend five workshops that covered aspects such as the use of software and equipment for plant phenotyping (mainly by remote sensing), and data handling and manipulation.

The LatPPN, which is chaired for two years by Chile (Dr. Gustavo A. Lobos) and Colombia (Dr. Anyela Camargo), held its first meeting on the third day of the conference. Representatives from LAC (Argentina, Brazil, Chile, Colombia, Ecuador, Mexico, and Uruguay) and from other countries (Australia, Germany, Saudi Arabia, Spain, U.K., and U.S.A.) met to discuss what LAC's breeding programs need to do to become more efficient in terms of plant phenotyping and phenomics. They also discussed the differences between phenotyping and the more complex concept of phenomics. This discussion helped to define where LAC currently stands (focused on phenotyping a few traits in a low number of genotypes) and where it needs to be in the future (oriented mostly to the multidimensional approach of phenomics, with a high number of genotypes assessed). For example, the wheat breeding program of INIA Chile used to follow a classical approach in which the number of traits evaluated increases as the generations advance: ∼9 traits at F2–F5 (susceptibility to *Puccinia triticina*, *P. graminis*, and *P. striiformis*, plant height, tillering capacity, type of spike, grain color, type of grain, and black point or other grain defects); ∼16 at F6–F8 (the previous ones plus heading date, grain yield, and some grain characteristics such as test weight, protein and gluten content, sedimentation, and seed hardiness); and ∼19 at F9–F10, where less than 5% of the original crosses are evaluated (the previous ones plus others required by millers, such as W flour value, falling number, and some baking aptitudes). Today, using spectrometry and thermography, this breeding program aims to predict some of these traits and also to consider 30 other morphophysiological and physico-chemical characters (some examples are covered in the next section), screening ∼800 genotypes per day.

#### IS LAC ORIENTED TO PHENOMICS OR PLANT PHENOTYPING?

Owing to the availability of resources such as equipment, skills and infrastructure, LAC has mainly focused on plant phenotyping. Although phenomics has not yet expanded properly in LAC, there are some good examples of institutions focusing on it: (i) The International Maize and Wheat Improvement Center (CIMMYT, Mexico) routinely uses remote sensing and high-specification sensor technologies to screen wheat and maize for responses to biotic and abiotic stresses, among them yield and its components, biomass, senescence (stay-green), water stress, water use efficiency, canopy cover, and photosynthetic capacity and activity (Zaman-Allah et al., 2015). Special emphasis is also put on 3D reconstruction for plant height, spike number and biomass determination; (ii) The Plant Breeding and Phenomic Center (University of Talca, Chile) has focused its efforts on the prediction of physiological traits by spectrometry and thermography (e.g., gas exchange, modulated chlorophyll fluorescence, pigment concentration, stem water potential, hydric and osmotic cell potential, cell membrane stability, lipid peroxidation, proline content, C and O isotopic composition) in several breeding programs (wheat, blueberries, alfalfa, strawberries, and quinoa) oriented to abiotic stresses (salt, water deficit and high temperature) (Garriga et al., 2014; Lobos et al., 2014; Estrada et al., 2015; Hernandez et al., 2015), and has also developed software for the exploratory analysis of high-resolution spectral reflectance data in plant breeding (Lobos and Poblete-Echeverría, in press).
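Spectral prediction of plant traits, as mentioned above, commonly starts from simple reflectance indices. The following is only a minimal sketch of that idea: the function name `normalized_difference`, the choice of a near-infrared and a red band, and the reflectance values are illustrative assumptions, not data or methods from the cited studies.

```python
def normalized_difference(r_a: float, r_b: float) -> float:
    """Return the normalized difference (r_a - r_b) / (r_a + r_b)."""
    total = r_a + r_b
    if total == 0:
        raise ValueError("sum of reflectances must be non-zero")
    return (r_a - r_b) / total


# Hypothetical canopy reflectance (fraction of incident light) for two
# genotypes at a near-infrared ("nir") and a red ("red") wavelength.
reflectance = {
    "genotype_1": {"nir": 0.48, "red": 0.06},
    "genotype_2": {"nir": 0.40, "red": 0.10},
}

# One index value per genotype; higher values here mean a larger
# contrast between the two bands.
indices = {
    name: normalized_difference(bands["nir"], bands["red"])
    for name, bands in reflectance.items()
}

for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

In a breeding context, indices like this would be computed for every plot and then related to measured traits; which bands are informative for which trait has to be established empirically.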

In terms of phenotyping, most research institutes across the region have done some form of low to medium throughput phenotyping, for example: (i) The International Centre for Tropical Agriculture (CIAT, Colombia) is screening root architecture to identify markers associated with drought stress tolerance in beans and grasses (Villordo-Pineda et al., 2015; Rao et al., 2016); (ii) Embrapa (Brazil) uses traditional phenotyping to screen for root morphology in wheat (Richard et al., 2015); (iii) Universidade Federal de Mato Grosso (Brazil) uses traditional phenotyping tools (e.g., gas exchange measurements) to look for photosynthetic responses of tree species to seasonal variations in hydrology in the Brazilian Cerrado and Pantanal (Dalmagro et al., 2016); (iv) researchers from Argentina use conventional phenotyping equipment to investigate the response of soybean seed weight and composition to changes in assimilate supply from the leaves, to the incident solar radiation reaching the pods, and to the combination of both (Bianculli et al., 2016), and they are also trying to develop low-cost tools to make that technology accessible to researchers from LAC; (v) The International Potato Center (Peru) has improved the screening of potato breeding lines by spectroscopy (Ayvaz et al., 2016); and (vi) INIA (Uruguay), in collaboration with INIA (Chile) and the Plant Breeding and Phenomic Center (University of Talca, Chile), applied genotyping-by-sequencing to identify single-nucleotide polymorphisms in the genomes of 384 wheat genotypes that were field tested in Chile under three different water regimes (Lado et al., 2013).

#### HOW WILL LAC BENEFIT FROM LATPPN?

The conference served as a platform to showcase LAC capabilities and to examine strengths and weaknesses, and thereby to identify where the challenges lie and what the knowledge and technological gaps between the region and the rest of the world are.

Given LAC's high heterogeneity in terms of climate, ecosystems and genetic diversity, as well as the differences in each country's vulnerability to climate change, participants agreed on how important it is for LAC's agri-food chain to take a more proactive role in developing strategies that lead to the selection of crops capable of withstanding the impact of climate change.

With the aim of identifying what LatPPN needs to do to strengthen LAC's plant phenotyping and phenomics research, the panel of participants identified the following key challenges. (i) Develop LatPPN's own tailored identity: there is not a common crop but rather a wide diversity of them, from grasses to forest species. As previously mentioned, plant phenotyping and phenomics have been developed almost exclusively for cereal improvement; LatPPN, however, needs to focus on other breeding programs that are important for particular countries. Examples are blueberries for Chile (Chile is the biggest exporter of fresh blueberries in the world, ∼90,000 tons during 2015/16), potato for Peru (production was estimated at 4.5 million tons for 2015), tangerines for Uruguay (production was ∼6000 tons in 2014), pineapple for Costa Rica (since 2000, pineapple production has increased by nearly 300%, but production is very inefficient: each plant produces only two fruit over a period of 18–24 months and requires significant amounts of fertilizer to do so), and coffee for Colombia (exports of ∼810,000 tons in 2015) and Brazil (exports of ∼2.6 million tons in 2014). The production of these cash crops will face serious challenges (e.g., post-harvest life, or the incidence of physiological disorders, pests and diseases) in the coming decades because of their sensitivity to water shortages and heat stress.

The meeting also highlighted the following: (ii) training on plant phenomics and phenotyping using strategies that allow the participation of several countries at the same time; we are aiming to find resources to implement distance-training courses using currently available technologies such as webinars and teleconferences; (iii) learning from experienced researchers and current plant phenotyping and phenomics initiatives; to facilitate the interaction between researchers and institutions, senior researchers in plant phenotyping and phenomics were invited to participate in the first meeting; (iv) promoting interdisciplinary work between researchers, since high-throughput phenotyping requires a broad range of capabilities (e.g., programmers, bioinformaticians, statisticians, biologists, agronomists, geneticists, physiologists); (v) identifying the state of the art of plant phenotyping and phenomics in LAC; to identify strengths, opportunities and weaknesses and to develop targeted strategies, key information such as breeding programs, researchers, equipment and infrastructure, regional and local financial sources, and capabilities should be surveyed, and all this information should be included on the future LatPPN webpage; (vi) distributing efforts on common goals (e.g., researchers from different countries working on the same species or problem), for which it will be necessary to standardize measurements and protocols; (vii) sharing equipment and infrastructure; and (viii) LatPPN visibility and presence; to avoid early disenchantment, LatPPN needs to carry out activities to promote the network (e.g., events, postgraduate grants or proposal calls).

In relation to weaknesses, the lack of a permanent budget to run network activities is one of LatPPN's main concerns. Currently, the Director and Co-Director, the executive committee (Dr. Paulo Hermann from EMBRAPA, Brazil, and Dr. Gustavo Pereyra from INTA-CONICET, Argentina), and the representative members (three per country, in charge of meeting local demands, thematic promotion, and leveraging economic resources) devote part of their time and resources to consolidating the network. However, they are looking into sources of support within LAC and worldwide. At the country level, a number of countries have access to grants provided by their own governments. At the regional level, organizations such as PROCISUR and PROCITROPICOS provide regular grant support for agricultural research initiatives. At the international level, several organizations, such as FAO (the Food and Agriculture Organization), EU (the European Union), and IDB (the Inter-American Development Bank), support agricultural research in LAC.

Another weakness is LAC's low publication rate and the limited access of LAC institutions to the main bibliographic databases. According to the World Bank, the number of publications produced by the most important economies in LAC in 2012 was 48,622 from Brazil, 13,112 from Mexico, 8,053 from Argentina, 5,158 from Chile and 4,456 from Colombia. Brazil is the only country whose output is comparable to that of the high-income countries where phenomics has been developing in the last 10 years: U.S.A. (412,542), Germany (101,074), U.K. (97,332), France (72,555), Spain (53,342), and Australia (47,806) (World Bank, 2012). In terms of access to bibliographic databases, most of the institutions in the region have limited or no access to the main databases such as Scopus and Web of Knowledge. This is a serious limitation to the dissemination of the work developed in LAC, especially if we are aiming to improve plant breeding programs through the use of plant phenotyping and phenomics.

Despite these weaknesses, several international research institutes are already formally collaborating with LAC on plant phenotyping and phenomics, among them Lemnatec (Germany), CSIRO (Australia), IBERS (U.K.), Universidad de Barcelona (Spain), the Julich Plant Phenomics Centre (Germany), and the James Hutton Institute (U.K.).

The establishment of LatPPN represents a big step toward the consolidation of a common mind-set in the field of plant phenotyping and phenomics across LAC. Clearly there are more opportunities than disadvantages, and each weakness needs to be addressed with a regional approach in mind.

#### CONCLUSIONS AND FUTURE WORK

Phenomics can complement the potential of new molecular/genotyping technologies and, together with agronomy and plant breeding efforts, would make a real contribution to developing new strategies to help mitigate the impact of climate change on agriculture. There are major opportunities for phenomics in LAC, not only because it has so far been adopted only in isolated initiatives, but also because worldwide development has focused mainly on grain breeding programs. LAC researchers have identified the need to collaborate to exploit these opportunities and have come together to organize the Latin American Plant Phenomics Network (LatPPN). Currently, LatPPN has prioritized work on several fronts to consolidate the network (e.g., grant applications to CYTED and Procisur; LatPPN's second meeting in April 2016 in Balcarce, Argentina; planning of a second regional conference organized by EMBRAPA during 2017; drafting of LatPPN's survey and white paper; and construction of LatPPN's webpage). What follows next is the development of strategies leading to the sustainability of the network. We are aware of the work ahead of us and know that collaboration among LatPPN members and with other networks will be crucial to build on the foundations laid.

#### AUTHOR CONTRIBUTIONS

Both authors contributed equally.

#### ACKNOWLEDGMENTS

We extend special gratitude to Dr. Carolina Saint Pierre (Wheat Phenotyping Coordinator at CIMMYT, Mexico) for valuable information and discussion about Latin American reality and challenges, and to Dr. Ivan Matus (INIA, Chile) for technical definitions and valuable discussion about the national wheat breeding program. In Chile, this activity was supported by Universidad de Talca (research programs "Adaptation of Agriculture to Climate Change (A2C2)" and "Núcleo Científico Multidisciplinario"), and the National Commission for Scientific and Technological Research CONICYT-CHILE (FONDEF IDEA 14I10106). In the U.K., this work was supported by the "National Capability for Crop Phenotyping" grant (award number BB/J004464/1).

#### REFERENCES


spectroradiometry in Fragaria chiloensis under salt stress. J. Integr. Plant Biol. 56, 505–515. doi: 10.1111/jipb.12193


drought-tolerant-associated SNPs in common bean (Phaseolus vulgaris). Front. Plant Sci. 6:546. doi: 10.3389/fpls.2015.00546


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Camargo and Lobos. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# **Breeding blueberries for a changing global environment: a review**

#### *Gustavo A. Lobos <sup>1,2</sup>\* and James F. Hancock <sup>2</sup>*

*<sup>1</sup> Faculty of Agricultural Sciences, Plant Breeding and Phenomic Center, Universidad de Talca, Talca, Chile, <sup>2</sup> Department of Horticulture, Michigan State University, East Lansing, MI, USA*

Today, blueberries are recognized worldwide as one of the foremost health foods, becoming one of the crops with the highest productive and commercial projections. Over the last 100 years, the geographical area where highbush blueberries are grown has extended dramatically into hotter and drier environments. The expansion of highbush blueberry growing into warmer regions will be challenged in the future by increases in average global temperature and extreme fluctuations in temperature and rainfall patterns. Considerable genetic variability exists within the blueberry gene pool that breeders can use to meet these challenges, but traditional selection techniques can be slow and inefficient and the precise adaptations of genotypes often remain hidden. Marker assisted breeding (MAB) and phenomics could aid greatly in identifying those individuals carrying advantageous traits, increasing selection efficiency and shortening the time to cultivar release. While phenomics has begun to be used in the breeding of grain crops in the last 10 years, its use in fruit breeding programs is almost non-existent.

#### *Edited by:*

*Edmundo Acevedo, University of Chile, Chile*

#### *Reviewed by:*

*Margherita Irene Beruto, Istituto Regionale per la Floricoltura, Italy; Carlos E. Muñoz Schick, Universidad de Chile, Chile*

#### *\*Correspondence:*

*Gustavo A. Lobos, Facultad de Ciencias Agrarias, Universidad de Talca, 2 Norte 685, Talca 3460000, Chile; globosp@utalca.cl; gustavol@msu.edu*

#### *Specialty section:*

*This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science*

*Received: 06 July 2015; Accepted: 10 September 2015; Published: 30 September 2015*

#### *Citation:*

*Lobos GA and Hancock JF (2015) Breeding blueberries for a changing global environment: a review. Front. Plant Sci. 6:782. doi: 10.3389/fpls.2015.00782*

**Keywords: Vaccinium, drought, heat, UV, phenotype, highbush, MAB, phenomics**

#### **Introduction**

Over the last 100 years, the geographical area where highbush blueberries are grown has expanded dramatically (Retamales and Hancock, 2012). The northern highbush blueberry (NHB) is native to the eastern and mid-western portions of the USA, where winters are very cold, summers are moderate and chilling hours are high (**Table 1**). The industry was first established in New Jersey (1910), but within a few decades had expanded to North Carolina (1920), Michigan (1930), and the Pacific Northwest (1940). From there it leapfrogged to Europe (1970s), New Zealand/Australia (1980s), central Chile (1980s), and most recently China (2000s).

The expansions into the Pacific Northwest, Mexico, and Chile were into climates with much less severe winters, while the introductions into China were into much colder regions. Cultivars developed in Michigan and New Jersey have generally thrived in the milder climates of the Pacific Northwest, but many of them suffer from the high irradiance in Chile and the cold of China.

Southern highbush blueberry (SHB) types were originally developed in the 1980s by incorporating genes from native species from the southern US to reduce the chilling requirement of NHB. SHB were first established in Florida and Georgia (1980s) and then moved to north central Chile (1980), Argentina and Spain (1990), California (2000) and most recently Mexico, Peru and Ecuador (2010s).

The introductions of SHB into California, north central Chile, and Spain were into hotter and drier climates than those in Florida and Georgia (**Table 1**), and those into southern Chile were into regions with much higher UV levels (Huovinen et al., 2006). The expansions into Mexico and Ecuador were from low to moderate chill conditions to regions with few to no hours under 7°C. In general, the cultivars that have done well in Florida and Georgia have also performed well in the hotter, drier production regions of California, central Chile, and Spain. However, only a few low chill cultivars have performed well in Mexico and Peru, and many of them suffer under the high UV levels of Chile.

During the last couple of decades, a constant stream of successful cultivars has been released from a number of breeding programs. These programs have focused on releasing cultivars with reduced chilling hours in warmer regions, increased cold hardiness in colder regions, and higher performance under high pH, temperature and radiative stress, but there is still much room for improvement. To achieve these goals, blueberry breeders have incorporated genes from many species within the *Vaccinium* genus through inter-specific hybridization (**Table 2**), which should prove to be a rich genetic pool for further improvements.

In this paper, we review the environmental challenges facing blueberry cultivation due to global warming. We describe the state of the art of blueberry breeding and outline how future varietal development can be enhanced by marker assisted breeding (MAB) and phenomics.

#### **TABLE 2 | Genetic composition of some of the cultivated blueberries.**

*VC, V. corymbosum; VA, V. angustifolium; VD, V. darrowii; Va, V. ashei; VT, V. tenellum; Vc, V. constablaei; VE, V. elliottii (Hancock and Siefker, 1982; Ehlenfeldt, 1994; Clark et al., 1996; Hancock et al., 1997; Brevis et al., 2008; Lee et al., 2012; Rowland et al., 2013).*

## **Environmental Challenges to Blueberry Cultivation**

The expansion of highbush blueberry growing into colder and warmer regions will be challenged by alterations in global temperature and rainfall patterns, both associated with increases in atmospheric CO2 concentrations. Since the Industrial Revolution, atmospheric carbon dioxide has increased significantly and will continue to do so. It is estimated that, under the most conservative scenario, atmospheric CO2 concentrations at the end of the century will be at least double those of the preindustrial era, increasing by 35% from 2005 levels (IPCC, 2007b). As atmospheric concentrations of greenhouse gases rise due to human activity, worldwide climatic patterns are being greatly altered (United Nations, 2010). The Intergovernmental Panel on Climate Change (IPCC) reports that during the past 150 years global mean temperatures rose by 0.045°C per decade, but that over the last 25 years the rate has been almost four times higher (0.177°C per decade) (IPCC, 2007a). Two separate analyses done recently by NASA (National Aeronautics and Space Administration) and NOAA (National Oceanic and Atmospheric Administration) concluded that 2014 was the warmest year since 1880 (NASA, 2015). It is expected that during the next century global temperatures will increase by an additional 1.1–6.4°C (Jin et al., 2011).
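The warming rates quoted in this section can be compared with a few lines of arithmetic. The sketch below assumes that both IPCC figures are linear trends in °C per decade and that the 1.1–6.4°C projection is spread evenly over ten decades; the variable names are ours and serve illustration only.

```python
# Observed linear trends quoted in the text (assumed to be °C per decade).
rate_last_150_years = 0.045
rate_last_25_years = 0.177

# How many times faster the recent trend is than the 150-year trend.
ratio = rate_last_25_years / rate_last_150_years
print(f"recent rate is {ratio:.2f} times the 150-year rate")

# Projected additional warming over the next century (°C), converted to an
# average per-decade rate under the even-spread assumption stated above.
projected_low, projected_high = 1.1, 6.4
decades_per_century = 10
projected_rate_low = projected_low / decades_per_century
projected_rate_high = projected_high / decades_per_century
print(f"projected: {projected_rate_low:.2f} to {projected_rate_high:.2f} °C per decade")
```

Under these assumptions the recent trend is a little under four times the 150-year trend, which matches the "almost four times" statement, and the upper end of the projected range corresponds to an average rate several times higher than the recent trend.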

The increases in temperature are associated with extreme variations in weather patterns, resulting in severe droughts, unusually heavy rains and atypically hot temperatures (Allen and Ingram, 2002). Since the 1970s, the frequency of warm nights and days has been increasing dramatically (IPCC, 2007a). For example, in the main blueberry production area in Chile (approximately between 35 and 38° S latitude), precipitation diminished by around 25% during the 20th century, and it is estimated that there will be a further reduction of 5–15% over the next 30 years (Meza et al., 2003; Santibañez and Santibañez, 2007, 2008; United Nations, 2010). These dramatic changes led Friend (2010) to suggest that "Quantifying and explaining the current global distribution of plant production, and predicting its future responses to climate change and increasing atmospheric CO2, are therefore major scientific objectives." High temperatures and drought can significantly reduce the productivity and the quality of the harvested organ (Moretti et al., 2010), restricting the areas (latitudes and soils) where economically important species can be grown.

The activity and development of humanity have increased not only atmospheric CO2 levels but also levels of chlorofluorocarbons from aerosols, refrigerators, and air-conditioning equipment. These compounds destroy the ozone layer, which selectively absorbs ultraviolet light. Ozone absorbs 100% of UV-C and about 90% of UV-B, but does not affect UV-A transmission (de Gruijl and van der Leun, 2000). In the mid-latitudes (35–60°) of the southern and northern hemispheres, the annual mean ozone quantities during 2006–2009 were lower than between 1964 and 1980 (by 6 and 3.5%, respectively) (WMO, 2011).

The average UV erythemal irradiance, which indicates potential biological damage to human skin from solar ultraviolet radiation, has steadily risen as the amount of ozone has decreased (WMO, 2011). Compared with the 1970s, surface erythemal UV radiation has increased 7% in winter-spring and 4% in summer-fall in the northern hemisphere mid-latitudes, 6% year-round in the southern hemisphere mid-latitudes, and 22% in the Antarctic and Arctic in the spring (Madronich et al., 1998). In the summertime, erythemal UV irradiance in the southern hemisphere is up to 40% higher than values in the northern hemisphere (Madronich et al., 1998). If the Montreal Protocol is followed, it is possible that UV values will return to 1980 levels by the middle of this century, but this is dependent on multifaceted global cooperation (Kazantzidis et al., 2010; McKenzie et al., 2011).

#### **Implications of Climate Change on Blueberry Breeding**

The aspect of global warming that most needs attention from blueberry breeders is the dramatic seasonal fluctuations now occurring in rainfall and temperature patterns. Cultivars well adapted to "average conditions" often do not have sufficient plasticity to perform well under the range of conditions now being faced. For example, an unusually warm spring in Michigan in 2012 led to very early floral development, and as a result, when temperatures returned to normal later in the spring, a high percentage of flowers were damaged by frost. An unusually hot summer in the Pacific Northwest in 2012 resulted in the fruit of most cultivars being too soft for extended storage. This was followed by an unusually cold winter in 2013–2014, in which high percentages of the floral buds were heavily damaged. In Chile, falls and winters are becoming progressively milder in many areas, causing some cultivars (O'Neal, Snowchaser and Misty, among others) to bloom out of season.

To maintain and extend the geographic range where blueberries are grown, breeders will need to be much more cognizant of the potential range of environments that the cultivars will face. They will need to take care not to release cultivars that are narrowly adapted to average conditions. Among the environmental challenges faced by blueberry breeders are:

#### **Winter Cold**

The range of the highbush blueberry has been limited by extreme winter cold. Cold hardiness is a complex interaction between rate of acclimation (development of freezing tolerance) and deacclimation (loss of developed freezing tolerance), as well as degree of mid-winter tolerance. This is extremely important since unseasonably warm midwinter spells can trigger a premature deacclimation, exposing the bush to freeze damage (Arora and Rowland, 2011).

In general, northern highbush cultivars survive much colder mid-winter temperatures than southern highbush ones, although considerable variability exists within groups and among *Vaccinium* species (Hancock et al., 1997; Ehlenfeldt et al., 2003, 2006, 2007; Dhanaraj et al., 2004; Rowland et al., 2004; Ehlenfeldt and Rowland, 2006; Hanson et al., 2007). In full dormancy, northern highbush genotypes have been found to range in tolerance from −20 to −30°C. Few southern highbush have been evaluated, although "Legacy" tolerates temperatures to −17°C and "Ozarkblue" to −26°C. US 245, an inter-species hybrid of US 75 ("Bluecrop" × *V. darrowii* "Fla 4B") × "Bluecrop," is tolerant to at least −24°C.

To date, the primary approach to developing more cold tolerant blueberries has been to hybridize lowbush with highbush to produce "half-high" types (Trehane, 2004; Hancock et al., 2008a). However, the shorter stature of the half-highs and the fact they become covered and protected with snow may be the primary basis of their increased tolerance (El-Shiekh et al., 1996). Due to the lack of formal comparisons of flower bud tolerance to winter cold in highbush, lowbush and half-highs, it is unknown how much more cold tolerant highbush can be made through introgression. It would be productive to determine if several other wild species carry useful genes for cold hardiness including *V. boreale*, *V. constablaei,* and *V. myrtilloides* (Galletta and Ballington, 1996; Ehlenfeldt and Rowland, 2006). Ehlenfeldt et al. (2007) showed that when *V. ashei* was hybridized with *V. constablaei*, cold hardiness was positively associated with the percentage of *V. constablaei* genes.

Little formal genetic analysis of cold tolerance of tetraploid blueberry has been performed. Arora et al. (2000) found in diploid populations that the cold hardiness data fit a simple additive-dominance model of gene action, with the additive effects being greater than the dominance ones. During cold acclimation, specific genes are expressed in floral buds that increase cold tolerance (Naik et al., 2007). Arora et al. (1997a), working with "Bluecrop," "Tifblue," and "Gulfcoast," found a close relationship between floral bud dehydrin concentration and the level of cold hardiness. Similar results were found by Rowland et al. (2004) and Dhanaraj et al. (2005). This suggests that dehydrin concentration might be a way to predict the cold hardiness of selections in a breeding program.

It is important to note that studies of cold hardiness under field or artificial conditions can lead to different conclusions. When "Bluecrop" (NHB) and "Tifblue" (rabbiteye blueberry, RE) flower buds were assessed in the field, their LT50 values (maximum level of cold hardiness) were close to −27 and −25°C, respectively, whereas the same cultivars in cold room conditions (4°C) reached maximums around −24 and −17°C, respectively (Arora et al., 1997b; Arora and Rowland, 2011). There were almost twice as many "Bluecrop" genes expressed in the cold room as in the field, suggesting that many of the genes induced in the cold room were responding to low temperature (specifically 4°C) and were not contributing to freezing tolerance *per se*. In contrast, more "Tifblue" genes were expressed in the field than under the controlled conditions. This suggests that there is a strong genotype × environment interaction associated with cold tolerance, and that any screen designed to select cold hardy genotypes must be conducted under field conditions or under realistic controlled protocols (Arora and Rowland, 2011).
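As an illustration of how an LT50 value might be derived from a controlled freezing test, the sketch below assumes the common convention that LT50 is the temperature at which 50% of flower buds are killed, and estimates it by linear interpolation between the two test temperatures that bracket 50% survival. The function name and the survival figures are hypothetical and are not taken from the studies cited above; published protocols may fit a sigmoid curve instead.

```python
from typing import Optional, Sequence


def lt50_by_interpolation(
    temps_c: Sequence[float], survival_pct: Sequence[float]
) -> Optional[float]:
    """Estimate LT50 (temperature giving 50% bud survival).

    temps_c must be ordered from warmest to coldest, with the matching
    percentage of surviving buds in survival_pct. Returns None when
    survival never drops below 50% within the tested range.
    """
    pairs = list(zip(temps_c, survival_pct))
    for (t_warm, s_warm), (t_cold, s_cold) in zip(pairs, pairs[1:]):
        if s_warm >= 50.0 > s_cold:
            # Linear interpolation between the two bracketing temperatures.
            frac = (s_warm - 50.0) / (s_warm - s_cold)
            return t_warm + frac * (t_cold - t_warm)
    return None


# Hypothetical freezing-test data, invented for illustration only.
test_temps = [-12.0, -16.0, -20.0, -24.0, -28.0]
bud_survival = [100.0, 90.0, 70.0, 30.0, 0.0]
print(lt50_by_interpolation(test_temps, bud_survival))
```

Running the same calculation on field-acclimated and cold-room-acclimated buds of one cultivar would give the kind of paired LT50 comparison described above.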

It may be possible to determine when a plant is approaching full dormancy by measuring the expression of the β-amylase gene. Lee et al. (2012) showed in the NHB "Jersey" and the SHB "Sharpblue" that there was an abrupt reduction in starch in shoots in the middle of cold acclimation, which was associated with an increase in the expression of the β-amylase gene. This change was positively correlated with the total amount of soluble solids in the wood, which likely served as osmoprotectants able to reduce the freezing point. Inter-species differences in the level of expression of β-amylase genes in northern and southern highbush were described by Rowland et al. (2008).

#### **Spring and Fall Frost**

Freezing damage to developing flowers in the spring is a major problem in most blueberry production regions, with both NHB and SHB. It is a rare year when at least a fraction of the flower buds is not damaged. Rate of deacclimation likely plays a role in early spring flower bud tolerance. Ehlenfeldt (2003) found the northern highbush "Duke" deacclimated the fastest in a mixed group of 12 cultivars, while the southern highbush "Magnolia," the northern highbush × rabbiteye pentaploid hybrid "Pearl River," the rabbiteye × *V*. *constablaei* "Little Giant" and the half-highs "Northcountry" and "Northsky" were the slowest. Northern highbush "Bluecrop" and "Weymouth," southern highbush "Legacy" and "Ozarkblue" were intermediate. While there is evidence of considerable variability, no formal genetic studies have been done on deacclimation rates.

Identifying late bloom or slower deacclimating genotypes will be useful for breeding spring-frost tolerant cultivars (Rowland et al., 2005). Because of the chances of frosts and the direct relation between the stage of floral development and the relative bud hardiness, those cultivars with late bloom dates tend to suffer less frost damage than those flowering earlier (Spiers, 1976; Hancock et al., 1987; Patten et al., 1991; Lin and Pliszka, 2003). When Hancock et al. (1987) assessed flower bud injury in 18 highbush blueberry cultivars after two spring frosts in Michigan, they found significant differences in proportion of brown ovaries among cultivars, ranging from 25 to 94%. Most of the variation was associated with stage of bud development.

Bloom date is strongly correlated with ripening date, but early ripening cultivars have been developed that have later than average flowering dates, such as the NHB "Duke," "Huron" and "Spartan," and the SHB "Santa Fe" and "Star." Bloom date, ripening interval and harvest dates are highly heritable in blueberry populations (Lyrene, 1985; Hancock et al., 1991) with strong genotype by environment interactions (Finn et al., 2003). Finn and Luby (1986) found additive genetic variation was more important than non-additive effects for date of 50% bloom, 50% ripe fruit and for length of fruit development interval in populations from hybrids between *V. angustifolium* and *V*. *corymbosum*. Where spring frosts are a problem, breeders can focus on developing cultivars with late bloom dates; where earliness is at a premium, selection will need to be made on ripening interval as well.

Flower buds also can be damaged by rapid freezes in the fall. The flower buds of SHB cultivars are generally considered to acclimate more slowly in the fall than those of NHB ones, and as a result are more subject to late fall freezes; however, few formal screens of germplasm have been conducted on this characteristic (Rowland et al., 2005, 2013; Hanson et al., 2007). Leaf retention in the fall does not appear to be a good predictor of rate of DA, as Hanson et al. (2007) found that "Ozarkblue" and US 245 retain their leaves until the very late fall, but they are just as hardy as the mid-season standard "Bluecrop." Bittenbender and Howell (1975) also found no correlation between flower bud hardiness and fall leaf retention.

Few studies have evaluated the effect of spring frost on open flowers, and most of them have been done on RE (Spiers, 1976; Gupton, 1983; NeSmith et al., 1999). Among RE cultivars, "Southland" proved to be more frost-tolerant than "Delite," "Woodard," "Climax," and "Tifblue" (Gupton, 1983). Nevertheless, interspecific crosses with RE would not be recommended to increase bud tolerance to frost since, at similar stages of floral bud development, RE tend to be more sensitive than NHB and SHB (Patten et al., 1991). Rowland et al. (2013), studying the sensitivity of five northern highbush blueberry cultivars ("Bluecrop," "Elliott," "Hannah's Choice," "Murphy," and "Weymouth") to frost damage of open flowers, concluded that "Hannah's Choice" and "Murphy" were the most tolerant whereas "Bluecrop" was the most susceptible. Among the cultivars analyzed by Rowland et al. (2013), female parts from "Elliott" (styles), "Hannah's Choice" (styles and exterior ovaries) and "Murphy" (styles) were more frost-tolerant than the same structures of "Bluecrop," and the male organs from "Murphy" (filaments and anthers) were more frost-tolerant than those of "Bluecrop." These differences need to be studied in other cultivars and exploited for breeding.

#### **Chilling Requirement**

Expanding the range of adaptation of the NHB by reducing its chilling requirement has been a major breeding goal over the last 50 years (Hancock et al., 2008a). This was largely accomplished by incorporating genes from the southern diploid species *V. darrowii* into *V. corymbosum* via unreduced gametes, although hybridizations with native southern *V. corymbosum* and *V. ashei* also played a role. Cultivars with an almost continuous range of chilling requirements (hours below 7°C) are now available, from 0 to 1000 h.
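The chilling measure used here (hours below 7°C) can be tallied directly from an hourly temperature log. The following is a minimal sketch under the assumption that each reading represents one full hour and that only readings strictly below the threshold count; the function name and the sample temperatures are hypothetical, and other chill models weight temperatures differently.

```python
from typing import Iterable


def chilling_hours(hourly_temps_c: Iterable[float], threshold_c: float = 7.0) -> int:
    """Count hourly readings strictly below the threshold.

    Assumes one reading per hour, so each qualifying reading
    contributes exactly one chilling hour.
    """
    return sum(1 for temp in hourly_temps_c if temp < threshold_c)


# Hypothetical 12-hour temperature record (°C), invented for illustration.
sample_record = [9.5, 8.0, 6.9, 5.2, 3.1, 2.0, 4.4, 6.5, 7.0, 7.8, 10.2, 12.0]
print(chilling_hours(sample_record))
```

Summing this count over a whole winter at a given site gives a figure that can be compared with the 0 to 1000 h range of cultivar requirements mentioned above.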

Most SHB are grown in areas with 250–600 chilling hours each winter. SHB cultivars vary widely in their performance without any chilling hours. Surprisingly, one of the best adapted cultivars to this system is "Biloxi," which requires 500 chilling hours (<7°C) in Mississippi, where it was developed. The response to chilling is clearly a complex interaction and many factors play a role including sensitivity to temperature shifts, floral development time, response to photoperiodic change and temperature thresholds.

The genetics of the chilling requirement has not been formally determined, although segregation patterns suggest that it is largely quantitatively inherited with the low chilling requirement showing some dominance. The precise temperature necessary to break dormancy has not been determined, but Mainland et al. (1977) and Spiers (1976) have proposed that the chilling requirement of highbush blueberries is at least partially satisfied by temperatures below 1.4 and above 12.4◦C. It is possible that blueberry genotypes vary in the threshold temperatures that are required to break dormancy, although this has not been documented. Southern highbush cultivars with complex ancestry may be particularly variable in their temperature thresholds.

#### **Heat and Drought Tolerance**

When blueberries are under heat stress, the rapid response needed to meet atmospheric demand puts the plant and its fruit under permanent threat (Chen et al., 2012). High summer temperatures, such as in subtropical southeast China or the dry Mediterranean north of Chile, affect the productivity of highbush blueberries across much of their range (Darnell, 2000; Chen et al., 2012). It is thought that SHB are more tolerant of high temperatures than NHB, but both types commonly experience summer temperatures in the field that have negative impacts on CO2 assimilation rates and fruit quality. In general, optimal temperatures have been shown to vary between 20 and 25°C (Davies and Flore, 1986).

No formal studies have been conducted on the genetics of photosynthetic heat tolerance in blueberry, but genetic variation has been documented. Moon et al. (1987b) evaluated the optimum temperature for photosynthesis in different highbush cultivars, determining ranges of 18–26°C for "Jersey" and 14–22°C for "Bluecrop." A temperature of 30°C has been shown to reduce photosynthesis in NHB cultivars by 22–51% (Hancock et al., 1992); the authors reported that "Jersey," "Elliott," and "Rubel" showed a decrease in photosynthesis between 22 and 27% whereas for "Spartan," "Bluejay," and "Patriot" it was between 41 and 51%. Trehane (2004) describes "Ozarkblue" and "Jubilee" as varieties that perform well in hot summers. Chen et al. (2012) found that at high temperatures, up to 40–45°C, a number of photosynthetic parameters were damaged in "Brigitta," but they stayed largely intact in "Sharpblue" and "Duke"; "Misty" was intermediate. In this study, at high temperatures, there were increases in hydrogen peroxide, superoxide radical and F0 (minimum fluorescence in the dark-adapted state), while Fv/Fm (maximum photochemical quantum yield of photosystem II) and ΦPSII (quantum efficiency of PSII photochemistry) decreased.
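The opposite directions of F0 and Fv/Fm reported above follow from the standard definition of variable fluorescence. In the relation below, $F_m$ (maximum fluorescence in the dark-adapted state) is notation introduced here as an assumption, since the text does not define it:

$$
\frac{F_v}{F_m} = \frac{F_m - F_0}{F_m} = 1 - \frac{F_0}{F_m}
$$

Under this definition, a rise in $F_0$ that is not matched by a proportional rise in $F_m$ necessarily lowers $F_v/F_m$.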

Southern highbush cultivars may have obtained higher photosynthetic heat tolerance from *Vaccinium darrowii* (Lyrene, 2002). Moon et al. (1987a) found that CO2 assimilation (*A*) in Fla 4B of *V. darrowii* was similar at 20 and 30°C, while *A* in the pure northern highbush "Bluecrop" dropped by almost 30% across this same range. Transpiration rates were also much lower in Fla 4B than in "Bluecrop." This difference was found to be heritable, with a tetraploid F1 hybrid actually having higher *A* than the two parents (Moon et al., 1987b; Hancock et al., 1992). The selection Fla 4B has been used to generate many of the important southern highbush cultivars including "Biloxi," "Emerald," "Legacy," and "Star" (Lyrene and Sherman, 2000; Draper and Hancock, 2003). There may also be additional sources of heat tolerance in the native southern species *V. tenellum*, *V. myrsinites*, *V. pallidum*, *V. ashei*, *V. elliottii*, *V. stamineum*, *Vaccinium arboreum*, and southern diploids and tetraploids of *V. corymbosum* (Luby et al., 1991).

It would seem likely that the photosynthetic heat tolerance of both NHB and SHB types can be increased by crossing the most heat tolerant genotypes, since there is considerable genetic variability for this trait both within and among blueberry species.

High temperatures also negatively impact fruit quality and storage life of highbush blueberries. Temperatures higher than 32°C during fruit maturation can give rise to smaller, soft fruit whose surface wax is more easily rubbed off, whether by leaves or during harvest (Mainland, 1989).

Blueberries have a relatively inefficient water conducting system, characterized by the lack of root hairs (Gough, 1994). Root anatomy and architecture should be key traits, but unfortunately they are almost unexplored. For example, *V. arboreum* is a drought tolerant species because it has deep tap roots, in contrast to the spreading, shallow root systems of highbush blueberry. Hence, drought tolerance of highbush blueberries might also be enhanced by using species material in breeding.

In their screens of wild species material, Erb et al. (1988a,b) found *V. elliottii*, *V. darrowii,* and *V. ashei* to be the most drought tolerant species and this characteristic was transmitted to hybrid progeny. Moon et al. (1987a) found transpiration rates and leaf conductance (*gs*) to water vapor to be much lower in *V. darrowii* than in "Bluecrop" at high temperature. This suggests that *V. darrowii* may have higher drought tolerance through decreased stomatal opening and subsequent restriction of water loss.

Other sources of drought tolerance likely include the native species *Vaccinium stamineum* and *V. arboreum* (Hancock et al., 2008a). *V. stamineum* is the most drought tolerant species in the southeastern U.S.A., but hybrids derived with species in section Cyanococcus have not been vigorous (Ballington, 1980; Lyrene, 2006). The use of *V. arboreum* appears to be more promising, as this species can be crossed with *V. darrowii* to produce vigorous hybrids, and these hybrids can be used as a bridge to tetraploid SHB types (Lyrene, 1991; Brooks and Lyrene, 1998; Olmstead et al., 2013).

In response to water deficit, plants stop shoot growth, which affects their final height and diameter (Mingeau et al., 2001). "Bluecrop" is one of the most studied cultivars and has proved to be highly sensitive to water deficit, showing rapid stomatal closure and reduced gas exchange (Cameron et al., 1989; Rho et al., 2012), berry size and yield (Améglio et al., 2000). When "Bluecrop" was subjected to severe water restriction, a yield reduction of 31–49% was observed (Perrier et al., 2000). Similar results were found for other highbush cultivars such as "Rancocas" (Lee et al., 2006) and "Jersey" (Cameron et al., 1989). Rho et al. (2012) also found that, along with the reduction in gas exchange in "Bluecrop" under water deficit (−1.9 MPa), the electron transport rate (ETR) increased, indicating that photorespiration is also affected.

When Estrada et al. (2015) studied how SHB, RE, and NHB responded to drought conditions, with or without heat stress, they found that under each stress SHB and RE had a better photoprotection capacity, while the NHB showed increases in photochemical capacity. When both stresses were present, only the NHB cultivars "Liberty" and "Elliott" had increased ETRmax (maximum electron transport rate), coefficients of photochemical fluorescence quenching (qP and qL) and effective photochemical quantum yield of photosystem II [Y(II)]. This suggests these two cultivars should be considered as parents if reduction of photo-oxidative damage is required. Chlorophyll fluorescence measurements of this kind could be used to screen genotypes much faster than other classic measurements, such as gas exchange rate, chlorophyll content, stem water potential, etc., making the monitoring of stressed plants more efficient (Ralph and Gademann, 2005; Estrada et al., 2015).

The priority of this trait for the breeder varies depending on location and irrigation availability. It should be mentioned that currently most blueberries are cultivated under irrigation or with irrigation supplementation.

#### **High UV Light**

Ozone depletion itself is not a major contributor to global warming, but increases in UV irradiance have large, direct impacts on plant productivity (Boesgaard et al., 2012). In some latitudes, plants will not only have to deal with extreme photosynthetically active radiation and heat, but also with high UV radiation. It is not unusual to observe damaged plants in commercial fields (Yáñez et al., 2009), as well as among breeding families, in central Chile, with reddened and curled leaves and what appear to be localized burns on fruit. Shading nets have been shown to enhance productivity of blueberries in central Chile, but the direct influence of UV light has not been investigated (Retamales et al., 2008; Lobos et al., 2009, 2012, 2013, 2015).

Kakani et al. (2003), reviewing 129 studies of the effect of UV-B on 35 crops, reported that higher levels of UV-B (most affected by the ozone depletion) were associated with vegetative and reproductive morphology alterations, decreases in chlorophyll content and photosynthesis, and chlorotic or necrotic patches on leaves or fruit. Little formal work has been done on the effect of UV light on northern highbush blueberries, and most of what has been done has focused on postharvest improvement through short treatments of UV-B on harvested fruit (Perkins-Veazie et al., 2008; Eichholz et al., 2011).

The damaging effects of high UV light have been documented in other *Vaccinium* species. Albert et al. (2008) and Boesgaard et al. (2012) found a reduction of net photosynthesis in *V. uliginosum* throughout the season and damage to photosystem II (PS II) through the diminution of the fluorescence measured as Fv/Fm. Kossuth and Biggs (1978) tested the effects of 15, 24, and 44 units of UV-B on the rabbiteye blueberry "Woodard" and found that the higher doses reduced fruit growth and surface bloom, and under high doses of UV-B the fruit skin actually appeared burned.

Among the mechanisms that might be selected to improve UV tolerance in blueberries is the ability of the leaf surface to reflect part of the incident radiation (Semerdjieva et al., 2003a,b). The thickness of the epidermis of the leaf and the concentration of absorbent compounds could also be improved to counteract the damaging effects of UV radiation (Batschauer, 1998; Boesgaard et al., 2012). Increases in levels of phenols (flavonoids and hydroxamic acid) could also help counter the degradative effects of high UV-B on DNA (Rozema et al., 1997; Ruhland et al., 2005). Defense responses to UV-B have not been evaluated in highbush blueberry; however, Semerdjieva et al. (2003a,b) noted in three other species (*V. myrtillus* L., *V. vitis-idaea* L., and *V. uliginosum* L.) in the north of Sweden that there were noticeable differences in quantity of phenolic compounds. In all species, high UV-B led to an increase in phenolic compounds, but some genotypes responded more than others, and the plants with the highest flavonoid content had the least UV-B damage.

#### **New Frontiers for the Breeding of Blueberries**

The breeding of highbush blueberries can be a long and tedious process. Traditional approaches take from 10 to 20 years from the original cross to cultivar release and often the precise adaptive range of a cultivar is not known until farmers have grown it for a number of years. Two relatively new techniques called "MAB" and "Phenomics" could greatly facilitate blueberry breeding. MAB would aid in the selection of those individuals most likely to carry advantageous traits, and Phenomics would allow for much easier, faster and more precise characterization of the superior types. It is possible that individuals could be selected for their adaptability to variable environmental conditions with MAB even though they were not exposed to those conditions in the field.

#### **Marker Assisted Breeding (MAB)**

All breeding programs revolve around identifying the optimal traits for a cultivar. Most blueberry breeding programs utilize traditional approaches to identify desirable types, such as walking along rows of crosses in the field or doing simple laboratory assays on fruit quality and disease resistance. However, in many other crops, MAB is used to facilitate and speed up the release of new cultivars (Cabrera-Bosquet et al., 2012; Araus and Cairns, 2014).

MAB is based on DNA diagnostic tests that can identify potential parents and progeny carrying desirable traits. This process allows selection to be moved all the way back to conception in the breeders' minds, helping them to make only those crosses that create desirable trait combinations in offspring, increasing the efficiency of the entire process. It also permits selection to be moved from the field to the greenhouse, so that only seedlings predicted to be superior are planted in the field for further evaluation. In addition, MAB allows for the assessment of traits that are difficult to predict in the field such as chilling hour requirement or heat tolerance.

To use MAB to broaden the environmental range where highbush blueberries can be grown, it will be necessary to find genetic variability associated with expanded adaptations. The rich germplasm diversity currently being used by blueberry breeders (**Table 2**) is likely to contain useful genes. The major stumbling block to using MAB will be the collection of precise data on the adaptations of potential breeding parents. Evaluations of genotypes in the field will require that the extreme conditions occur when the plants are in the field (Arora and Rowland, 2011). Care will need to be taken to evaluate genotypes in appropriate environments and in many cases controlled experiments will need to be undertaken. Most likely a genotype will need to be evaluated in multiple environments to obtain an accurate representation of its adaptations. This will often be tedious and time consuming, but once markers are found that are tightly linked to the genes regulating adaptations of interest, future screening will be greatly facilitated through MAB. As we discuss below, field screening could be greatly streamlined using phenomic techniques involving spectrometry and thermography.

The first genetic maps of blueberries are beginning to emerge that will set the groundwork for MAB. Rowland's group at the USDA-ARS (Genetics of Fruit and Vegetable Improvement Laboratory, Beltsville, MD, USA) developed the first blueberry map using a diploid population segregating for chilling requirement (Rowland and Levi, 1994). Their population was a cross between an inter-specific hybrid (*V. darrowii* × *V. corymbosum*) and another clone of *V. corymbosum*. They have continued to periodically add markers to this map and at last report had 265 markers mapped to 12 linkage groups. They have used this map to identify quantitative trait loci for cold tolerance and chilling requirement (Rowland et al., 2014). Allan Brown (North Carolina State University, NC, USA) and Eric Jackson (General Mills, Crop Bioscience, MN, USA) have led teams that sequenced the genome of a *V. darrowii* × *V. corymbosum* hybrid and used this information to generate a more dense chromosome map with 1200 markers.

A major research initiative has been undertaken by Lisa J. Rowland, Nahla Bassil (USDA-ARS, National Clonal Repository, Corvallis, OR, USA), Julie Graham and Susan McCallum (The James Hutton Institute, Dundee, UK) and Jim Olmstead (University of Florida, Gainesville, FL, USA) to develop a linkage map of the tetraploid cross "Jewel" (SHB) × "Draper" (NHB) (Rowland et al., 2012). Replicated progeny of that cross were planted at five locations across the USA and data was collected on a wide array of traits including fruit quality, developmental rates, chilling hour requirements and growth patterns. A QTL analysis is currently being conducted to search for markers for these traits.

The first few thousand expressed sequence tags (ESTs) have been generated and made publicly available for the Ericaceae family, about 5000 from blueberry and about 1200 from rhododendron (Rowland et al., 2010, 2014; Die and Rowland, 2013) (http://bioinformatics.towson.edu/BBGD/). These ESTs from blueberry and rhododendron were generated as parts of projects focused on cold acclimation research, and the ESTs are from non-acclimated and cold acclimated flower bud libraries, in the case of blueberry (Dhanaraj et al., 2004, 2007), and from non-acclimated and cold acclimated leaf libraries, in the case of rhododendron (Wei et al., 2005). Another ∼16,000 ESTs have been generated from blueberry fruit by the New Zealand Institute for Plant and Food Research Ltd. (formerly HortResearch), but they are not publicly available.

#### **Phenomics**

The success of breeding programs is reflected in the number of individuals released at the end of the selection process (Hancock et al., 2008b). In order to be successful, breeders must generate thousands of hybrids annually and evaluate them for a number of years. What ultimately is selected is dependent on local environmental conditions (Araus and Cairns, 2014).

Because of the large number of genotypes that need to be evaluated, deep phenotypic characterization of the material often becomes impractical due to the time and costs involved (White et al., 2012; Kipp et al., 2014). For this reason, conventional breeding generally focuses on different visual characteristics (e.g., fruit color, cluster tightness, disease resistance, growth habit, flowering, and ripeness dates) and a few that require measurements of average complexity (e.g., yield, soluble solids, firmness).

To effectively develop cultivars well adapted to fluctuations in environmental stresses, blueberry breeders will have to evaluate a number of morpho-physiological and physico-chemical traits that they are not used to considering (**Figure 1**). The only reasonable way to fulfill all these needs is through acquisition of high-dimensional phenotypic data (high-throughput field phenotyping) or "Phenomics" (Houle et al., 2010). Nowadays, there are a number of remote sensing devices, techniques and analysis, mostly non-destructive, which have proven very helpful in the characterization of the phenotype (Furbank and Tester, 2011; White et al., 2012; Araus and Cairns, 2014).

Among the remote sensing technologies with the greatest potential for use in phenomics are spectrometry and thermography. Spectroradiometers are widely used to measure plant reflectance (R), whose spectral signature (graphic characterization of reflected segment of wavelengths) is closely associated with the absorption at certain wavelengths that are linked to specific characters or plant conditions (Araus and Cairns, 2014). Thermography uses plant temperature as an efficient tool for the study of the spatial and temporal heterogeneity of plant water status and how it responds to the environment.

While these techniques are not new, their use expanded tremendously during the mid-1980s and they are now widely used in plant ecophysiology and postharvest studies (estimation of yield, nutritional content in leaves, gas exchange rate, fruit quality, biotic and abiotic stress, etc.) (Garriga et al., 2014; Lobos et al., 2014). Measurements that before usually took months, weeks, or days can now be accomplished in hours or even minutes for a large number of genotypes (**Figure 2**).

In spectrometry, reflectance data are used to generate "Spectral Reflectance Indices" (SRIs). Initially, SRIs were simple relationships between wavelengths or spectral bands. The first SRIs were the "Simple Ratio," calculated as the ratio of near infrared (NIR) to visible (VIS) reflectance ($SR = R_{NIR}/R_{VIS}$), and the "Normalized Difference Vegetation Index" [$NDVI = (R_{NIR} - R_{VIS})/(R_{NIR} + R_{VIS})$]. Since then, and incorporating specific wavelengths, SRIs have been used in different species to estimate green biomass and leaf area index (Tucker and Sellers, 1986), plant water status (Peñuelas et al., 1993), radiation use efficiency (Peñuelas et al., 1995), water content in leaves (Sims and Gamon, 2003), photosynthetic capacity and efficiency (Inoue et al., 2008), micro and macro nutrients in leaves (Basayigit and Senol, 2009), and yield and carbon isotope discrimination (Lobos et al., 2014), among many others.
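The two earliest indices described above can be computed from a pair of band reflectances in a few lines. The sketch below is a minimal illustration, assuming reflectance is expressed as a fraction between 0 and 1; the function names and the sample reflectance values are hypothetical and do not come from any of the cited studies.

```python
def simple_ratio(r_nir: float, r_vis: float) -> float:
    """Simple Ratio: NIR reflectance divided by visible reflectance."""
    if r_vis == 0:
        raise ValueError("visible reflectance must be non-zero")
    return r_nir / r_vis


def ndvi(r_nir: float, r_vis: float) -> float:
    """Normalized Difference Vegetation Index: (NIR - VIS) / (NIR + VIS)."""
    total = r_nir + r_vis
    if total == 0:
        raise ValueError("sum of reflectances must be non-zero")
    return (r_nir - r_vis) / total


# Hypothetical canopy reflectances (fractions), invented for illustration.
nir_reflectance = 0.50
vis_reflectance = 0.10
print(round(simple_ratio(nir_reflectance, vis_reflectance), 3))
print(round(ndvi(nir_reflectance, vis_reflectance), 3))
```

Because NDVI is normalized by the sum of the two bands, it is bounded between −1 and 1 for non-negative reflectances, whereas the Simple Ratio is unbounded; this is one practical reason the normalized form is often preferred when comparing many genotypes.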

Since the 1960s, plant temperature has been widely used as an indicator of water status (Tanner, 1963). Initially, temperature measurements were performed by thermocouples in contact with the leaves. Later, the development of infrared sensors allowed faster measurements of leaves or canopies. With thermal imaging it is possible to detect biotic or abiotic pre-symptomatic responses, providing a powerful tool to evaluate a high number of samples in only a few minutes (Costa et al., 2013; Araus and Cairns, 2014). The development of cheaper devices has made this approach available to farmers and breeders. Thermometry analysis has been fine-tuned over time and the different parts of the image (soil, air, leaves, stems, branches, etc.) can now be isolated, allowing for the evaluation of specific tissues, organs or individuals (Costa et al., 2013; Araus and Cairns, 2014).

To date, spectrometry and thermography have been used to only a limited extent on blueberries. There are a few studies where the antioxidant content in NHB was evaluated using spectrometry (total phenols, total flavonoids, total anthocyanins, and ascorbic acid) (Sinelli et al., 2008; Bai et al., 2014). The ideal "Brigitta" harvest date was determined in the field using reflectance data (Beghi et al., 2009), and a blueberry ripeness index (BRI) has been developed (Beghi et al., 2013). Guidetti et al. (2009) used a portable Vis-NIR spectroradiometer to accurately estimate soluble solids, firmness and functional compounds (anthocyanins, flavonoids, polyphenols, and ascorbic acid) in fresh and homogenized fruit samples of "Brigitta" and "Duke." Other work on blueberries focused on monitoring osmo and air dehydration processes (Sinelli et al., 2011), SHB cultivar identification (Yang and Lee, 2011; Yang et al., 2012), and the recognition of foreign materials (leaves and stems) among frozen blueberries (Tsuta et al., 2006; Sugiyama et al., 2010). In the wild lowbush blueberry (*V. angustifolium*), reflectance data have also been used for detection of internal larval infestation of fruit (Peshlov et al., 2009), *in situ* levels of foliar nitrogen (Bourguignon, 2006; Maqbool et al., 2012) and to evaluate vegetative (leaf area index) and reproductive (flower number, fruit set, and berry yield) parameters (Percival et al., 2012). Most recently, hyperspectral imaging has been used to predict soluble solids content and firmness in NHB fruit (Leiva-Valenzuela et al., 2013, 2014), to identify damaged fruit (Leiva-Valenzuela et al., 2012; Leiva-Valenzuela and Aguilera, 2013), to classify blueberry fruit growth stages (Yang et al., 2012, 2014), and as a tool for early detection of leaf rust in blueberries (Ahlawat et al., 2011). Escobar-Opazo (2015) found that in blueberry some of the physiological parameters were significantly correlated with reflectance data (e.g., ETRmax and chlorophyll a/b > 0.90; *A* and *gs* > 0.65).

During the last decade, spectrometry and thermography have begun to be used in the breeding of grain crops, but their use in fruit breeding programs is almost non-existent. Even in the grain crops, their use has been limited to the evaluation of small numbers of genotypes, in general *<*20. SRI will need to be replaced by more complex bio-mathematical models, to fully provide breeders with the solid and reliable information they need for plant selection. When this happens, the efficiency of selection should dramatically improve, and well adapted genotypes will be released at a faster rate. The future challenge will be to develop techniques that can screen a large number of genotypes simultaneously (hundreds or thousands) (**Figure 2**).

## **Concluding Remarks and Future Perspectives**

Considerable genetic variability exists in the highbush blueberry germplasm base that can be used by breeders to meet the environmental challenges associated with climate change. There has already been extensive range expansion of blueberries into hotter and drier environments. There is no reason to believe that there is not additional genetic variability that can be deployed to further enhance cold acclimation, heat, high UV, and drought tolerance of blueberries. Perhaps the greatest challenge associated with climate change and blueberry range expansion will be the development of blueberry cultivars that can resist extremes in environmental variability. Ongoing research to develop DNA diagnostic markers for key physiological tolerances will aid greatly in the breeding of these stress resistant types. To generate robust markers for MAB, it will be necessary to have precise phenotypic characterizations, making phenomics a powerful tool that could aid greatly in identifying the superior types.

## **Acknowledgments**

This work was supported by the National Commission for Scientific and Technological Research CONICYT Chile (FONDEF IDEA 14I10106 and FONDEQUIP EQM130073), and the research programs "Adaptation of Agriculture to Climate Change (A2C2)" and "Núcleo Científico Multidisciplinario" from Universidad de Talca, Chile.

**References**


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2015 Lobos and Hancock. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# The qTSN Positive Effect on Panicle and Flag Leaf Size of Rice is Associated with an Early Down-Regulation of Tillering

Dewi E. Adriani<sup>1,2</sup>, Tanguy Lafarge<sup>1</sup>, Audrey Dardou<sup>1</sup>, Aubrey Fabro<sup>3</sup>, Anne Clément-Vidal<sup>1</sup>, Sudirman Yahya<sup>4</sup>, Michael Dingkuhn<sup>1,3</sup> and Delphine Luquet<sup>1</sup>\*

<sup>1</sup> CIRAD, UMR AGAP, F-34398 Montpellier, France, <sup>2</sup> Faculty of Agriculture, University of Lambung Mangkurat, Banjarbaru, Indonesia, <sup>3</sup> Crop and Environment Science Division, International Rice Research Institute, Los Baños, Philippines, <sup>4</sup> Department of Agronomy and Horticulture, Bogor Agricultural University, Bogor, Indonesia

#### Edited by:

Alejandro Del Pozo, Universidad de Talca, Chile

#### Reviewed by:

Maoteng Li, Huazhong University of Science and Technology, China

Hao Peng, Washington State University, USA

\*Correspondence:

Delphine Luquet delphine.luquet@cirad.fr

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

Received: 10 October 2015 Accepted: 14 December 2015 Published: 07 January 2016

#### Citation:

Adriani DE, Lafarge T, Dardou A, Fabro A, Clément-Vidal A, Yahya S, Dingkuhn M and Luquet D (2016) The qTSN Positive Effect on Panicle and Flag Leaf Size of Rice is Associated with an Early Down-Regulation of Tillering. Front. Plant Sci. 6:1197. doi: 10.3389/fpls.2015.01197

The qTSN4 was identified as a rice QTL (Quantitative Trait Locus) increasing total spikelet number per panicle and flag leaf area but potentially reducing panicle number depending on the environment. So far, this trade-off was mainly observed at grain maturity and not specifically studied in detail, limiting the understanding of the agronomic interest of qTSN4. This study aimed to understand the effect of qTSN4 and of the environment on panicle sizing, its trade-off with panicle number, and finally plant grain production. It compared two high yielding genotypes to their Near Isogenic Lines (NIL) carrying either QTL qTSN4 or qTSN12, two distinct QTLs contributing to the enlarged panicle size, thereafter designated as qTSN. Traits describing C sink (organ appearance rate, size, biomass) and source (leaf area, photosynthesis, sugar availability) were dynamically characterized along plant and/or panicle development within two trials (greenhouse, field), each comparing two treatments contrasting for plant access to light (with or without shading, high or low planting densities). The positive effect of qTSN on panicle size and flag leaf area of the main tiller was confirmed. More precisely, it could be shown that qTSN increased leaf area and internode cross-section, and in some cases the photosynthetic rate and starch reserves, of the top 3–4 phytomers of the main tiller. This was accompanied by an earlier tillering cessation, which coincided with the initiation of these phytomers, and an enhanced panicle size on the main tiller. Plant leaf area at flowering was not affected by qTSN but fertile tiller number was reduced to an extent that depended on the environment. Accordingly, plant grain production was enhanced by qTSN only under shading in the greenhouse experiment, where panicle number was not affected and photosynthesis and starch storage in internodes were enhanced. The effect of qTSN on rice phenotype was thus expressed before panicle initiation (PI). Whether early tillering reduction or organ oversizing at meristem level is affected first cannot be entirely unraveled. Further studies are needed to better understand any signal involved in this early regulation and the qTSN × Environment interactions underlying its agronomic interest.

Keywords: rice, qTSN4, down-regulation of tillering, panicle size, top leaf and internode size, main stem growth rate

## INTRODUCTION

Grain yield elaboration in cereals depends on the establishment of panicle number per square meter and panicle size, grain filling rate, and individual grain size (Chen et al., 2008; Gaju et al., 2014). All these traits are known to compensate per unit area, i.e., among plants (Zhang and Yamagishi, 2010) and within the plant where they compete for the same pool of C and N resources (Okawa et al., 2003; Hashida et al., 2013). Amongst these traits in rice, panicle number and panicle size are those with the highest plasticity under favorable growing conditions and thus with the highest impact on yield elaboration. These two traits are determined along the plant cycle, particularly based on tillering and green leaf area dynamics, internode reserve remobilization, and reproductive sink size and number. All these processes depend on C assimilation and N availability, as reported by Lafarge and Bueno (2009) and Dingkuhn et al. (2015) in rice, and by Dreccer et al. (2008) in wheat. Regarding panicle size, many studies reported that panicle development, and the determination of spikelet number per panicle, is closely correlated to early plant vigor and underlying traits such as leaf appearance rate (Dong et al., 2004; Streck et al., 2009; Itoh and Shimizu, 2012; Rebolledo et al., 2012), tillering (Lafarge et al., 2002, 2010; Borràs-Gelonch et al., 2012), culm (Fujita and Yoshida, 1984; Wu et al., 2011) and peduncle (Liu et al., 2008) size, plant height (Wang et al., 2014; Chen et al., 2015), or even neck internode diameter (Zhang and Yamagishi, 2010).

Several genes were reported to control morphogenetic processes positively during both the vegetative and the reproductive phases. Amongst them, the genes MOC1 (Monoculm 1, Li et al., 2003) and LAX (Komatsu et al., 2003) were shown to promote axillary meristem initiation and accordingly to regulate both shoot and panicle branching. The Gn1 gene is involved in the control of plant height and grain number per panicle (Ashikari et al., 2005). The DEP1 enhances shoot apical meristem activity and grain number per panicle (Huang et al., 2009). The OsSPL14 promotes early tillering and grain number per panicle (Miura et al., 2010). The APO1 increases grain number per plant and harvest index (Ikeda et al., 2007; Terao et al., 2010). The OsEBS enhances plant biomass and spikelet number per panicle (Dong et al., 2013). By contrast, other genes were shown to have trade-off effects on some morphogenetic traits expressed during the vegetative and the reproductive phases. This is the case of the rice mutant this1, characterized by higher tiller number but lower plant height and fertile spikelet number compared to the wild type (Liu et al., 2013). The Ghd7 increases spikelet number per panicle through panicle branching but decreases tillering in a density-dependent manner (Weng et al., 2014); the OsSPS1 gene inhibits plant height but enhances spikelet number per panicle (Hashida et al., 2013). The Ltn (low tiller number) and, similarly, SPIKE (qTSN4) induce higher total spikelet number per panicle but decrease final panicle number (Fujita et al., 2010, 2013).

These studies point out that a gene is unlikely to have a positive effect on both organ size and number and, accordingly, that molecular breeding efforts are frequently confronted with physiological trade-offs among traits, compromising the direct use of a candidate gene or QTL in crop improvement. The trade-off between tiller number and organ size is of major concern for rice, not only with respect to vegetative growth (Rebolledo et al., 2012) but also reproductive growth, as it regulates panicle size vs. number (Lafarge et al., 2010). The qTSN4 was recently identified as a QTL enhancing flag leaf width and total spikelet number per panicle in IR64 background (Fujita et al., 2009, 2012, 2013). This QTL is known to co-locate with the Nal1 gene involved in the determination of leaf structure, veining pattern, and carboxylation (Qi et al., 2008), as well as other early traits like stem length and vascular bundle number (Fujita et al., 2012, 2013). It was also shown that its positive effect on panicle size was potentially depressive on panicle number (Fujita et al., 2012, 2013). Recently, Okami et al. (2015) confirmed that an isogenic line carrying qTSN4 produced fewer but larger tillers than its parent (IR64).

No studies, however, were conducted to further understand at plant level the physiological and environmental determinisms regulating (i) the trade-off between panicle size and number in genotypes introgressed with qTSN4 and (ii) accordingly, the positive effect of qTSN4 on plant grain production. This is the aim of the present study, which analyzes the effect of qTSN on plant growth and development, and thus on yield formation processes along the plant cycle, in contrasting situations of plant access to light. Two high yielding genotypes (IR64 and IRRI146) were compared to their Near Isogenic Lines (NIL) carrying QTL qTSN4 (Fujita et al., 2013) and QTL qTSN12, a different QTL reported to have a similar function to qTSN4, i.e., enhanced panicle size (Ishimaru, personal communication). This was conducted within three trials under contrasting situations of plant access to light, and thus to C assimilates, in order to modulate the level of competition among sinks within the plant. Plant growth was then characterized in terms of morphogenesis and biomass accumulation per organ type along plant development.

## MATERIALS AND METHODS

## Plant Materials

Two pairs of parents/near-isogenic lines (P/NIL), IR64 vs. IR64 NIL and IRRI146 vs. IRRI146 NIL, were studied in two experiments. In one of these experiments, an additional P/NIL pair was considered: IR64 and NIL1 carrying qTSN12. This QTL has been observed to increase panicle size and reduce panicle number, as observed with qTSN4, and was detected on chromosome 12 from donor parent YP4. The NILs were developed by self-pollination of a plant selected from a BC4F2 population (in IR64 background) and a BC3F1 population (in IRRI146 background; Fujita et al., 2013). The IR64 NIL was identified by composite interval mapping as carrying the high total spikelet number (TSN) QTL between Simple Sequence Repeat (SSR) markers RM3423 and RM17492 on the long arm of chromosome 4 (Fujita et al., 2012), whereas IRRI146 NIL was developed from recurrent backcrossing to IRRI146 and marker-assisted selection (MAS; Fujita et al., 2013). The details (cross combination, donor and category) of plant materials are available in **Table 1**. The recipient lines (parents) were chosen because of their wide adaptability: IR64 is a mega variety grown in many parts of the world (Khush, 1987), and IRRI146 is a 2nd-generation New Plant Type (NPT) variety developed at IRRI and released in the Philippines in 2007 under the name NSIC Rc158, a high-yielding indica cultivar as well (Brennan and Malabayabas, 2011).

Adriani et al. Early Tillering Regulation of qTSN4 in Rice

#### Experiments GH-CNRS

Experiment I was conducted in the greenhouse from May to August 2013 at the National Center for Scientific Research (CNRS), Montpellier, France. This experiment tested qTSN4 in IRRI146 and IR64 genetic backgrounds, comparing a control treatment (C) and a shading treatment (S) with a 58% light attenuation from panicle initiation (PI) up to heading (H), obtained with a gray net placed all around the table carrying the plants.

The seeds were germinated in a germination chamber at 29°C, then transplanted 4 days after germination into 3 l pots (three seeds per pot) when seedlings were about 3 cm tall. The plant population was thinned to one plant per pot (i.e., 45.65 plants m<sup>−2</sup>) at the four-leaf stage. Pots were filled to about 3/4 of their volume with EGOT 140 media (17N-10P-14K, pH of 5) and were placed side by side (corresponding to 14.8-cm spacing) on tables filled with water to a depth of 5 cm. Basal fertilizer (Basacot 6M+, 11N-9P-19K +2Mg) was incorporated at 2 g l<sup>−1</sup> before transplanting. Pots were arranged on four aluminum tables, each containing 104 pots, including border plants, at the beginning of the experiment. The tables were moved every week from 2 weeks after transplanting until maturity to avoid any bias due to the greenhouse structure.

Weather data were collected from an Automatic Weather Station (AWS) installed in the center of the tables, measuring Photosynthetic Active Radiation (PAR), global radiation (Rg), air temperature (T), and relative humidity (RH). The average daily air temperature throughout the crop cycle was 27.3 ± 0.6°C. The average daily PAR for the whole crop cycle was 24.7 ± 7.1 mol m<sup>−2</sup> d<sup>−1</sup> under full light and 10.3 ± 3.4 mol m<sup>−2</sup> d<sup>−1</sup> under shading, and the average RH was 66.8 ± 7.7%.

#### Field-IRRI

Experiment II was performed in the field at the International Rice Research Institute (IRRI) experiment station in Los Baños, Philippines (14°11′N, 121°15′E, 21 m altitude), from December 2013 to April 2014. This experiment tested the effects of qTSN4 and qTSN12 in IRRI146 and IR64 genetic backgrounds at two planting densities: low (LD, 25 plants m<sup>−2</sup>) and high (HD, 100 plants m<sup>−2</sup>).

The seeds were soaked for 24 h, drained and incubated for another 24 h, then sown in seeding trays in the greenhouse on December 5, 2013. The 2-week-old seedlings were transplanted in the field at one plant per hill in 2 × 2.4 m plots. The field was initially flooded for two puddlings and two harrowings; a standing water level of 3–5 cm was then maintained, following the IRRI field standard. Phosphorus (30 kg P ha<sup>−1</sup>), potassium (40 kg K ha<sup>−1</sup>), and zinc (5 kg Zn ha<sup>−1</sup>) were applied and incorporated into all the plots 2 days before transplanting. Nitrogen was applied at 60 kg N ha<sup>−1</sup> 1 day before transplanting, then at 40 kg N ha<sup>−1</sup> and 60 kg N ha<sup>−1</sup> at mid-tillering and PI stage, respectively.

Weather data were collected from the IRRI meteorological station measuring radiation (MJ m<sup>−2</sup>), daylight (h), rainfall (mm), evaporation (mm), average temperature (°C), vapor pressure (kPa), RH (%), and wind speed (m s<sup>−1</sup>). Average daily air temperature throughout the crop cycle was 25.6 ± 1.5°C. The average daily PAR for the whole crop cycle was 31.0 ± 11.3 mol m<sup>−2</sup> d<sup>−1</sup>, and the average RH was 84.2 ± 4.8%.

#### GH-IRRI

Experiment III was performed in an open-top greenhouse during the wet season (August to November 2014) to support the two experiments described above. This experiment adopted a split-plot design with three replications and was conducted at IRRI, Los Baños, Philippines (14°11′N, 121°15′E, 21 m altitude). The main factor was plant spacing: crowded density, 20 × 20 cm (Cr, 25 plants m<sup>−2</sup>), and isolated density, 60 × 60 cm (Is, 2.78 plants m<sup>−2</sup>), from PI up to flowering (FLO). The subsidiary factor was a pair of rice genotypes: the IRRI146 (NSIC Rc158) recipient line and its NIL (qTSN4.1–YP4).
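The densities quoted for the pot experiments follow from the spacing on a square grid. As a minimal sketch (the function name `plants_per_m2` and the dictionary labels are illustrative and not part of the study), the 14.8-cm pot spacing in GH-CNRS and the 20 × 20 cm and 60 × 60 cm spacings in GH-IRRI can be checked as follows:

```python
def plants_per_m2(row_spacing_cm: float, col_spacing_cm: float) -> float:
    """Plant density (plants per square metre) on a rectangular grid."""
    # Ground area occupied by one plant, converted from cm to m.
    area_per_plant_m2 = (row_spacing_cm / 100.0) * (col_spacing_cm / 100.0)
    return 1.0 / area_per_plant_m2


# Spacings taken from the text; the labels are only for readability.
spacings_cm = {
    "GH-CNRS pots": 14.8,
    "GH-IRRI crowded (Cr)": 20.0,
    "GH-IRRI isolated (Is)": 60.0,
}

densities = {
    name: round(plants_per_m2(s, s), 2) for name, s in spacings_cm.items()
}

for name, density in densities.items():
    print(f"{name}: {density} plants per m2")
```

Rounded to two decimals, this should reproduce the 45.65, 25, and 2.78 plants m<sup>−2</sup> reported for these treatments.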

The seeds were soaked for 24 h, drained and incubated for another 24 h, then sown in 6 l pots (four seeds per pot). Pots were filled to about 3/4 of their volume with Andaqueptic Haplaquoll soil with a topsoil of 39% clay, 46% silt, 14% sand, and pH of 6.38. The plant population was thinned to one plant per pot 2 weeks after sowing. The water level in the pots was always maintained at a height of about 8 cm. Ammonium sulfate, (NH<sub>4</sub>)<sub>2</sub>SO<sub>4</sub> (4 g), Single Super Phosphate (SSP; 2 g), and muriate of potash (KCl; 2 g) were applied and incorporated into all the pots 2 days before sowing, and 2 g of ammonium sulfate was applied at PI stage.

Weather data were collected from the IRRI meteorological station measuring radiation (MJ m<sup>−2</sup>), daylight (h), rainfall (mm), evaporation (mm), average temperature (°C), vapor pressure (kPa), RH (%), and wind speed (m s<sup>−1</sup>). Average daily air temperature throughout the crop cycle was 27.7 ± 0.9°C. The average daily PAR for the whole crop cycle was 29.8 ± 11.5 mol m<sup>−2</sup> d<sup>−1</sup>, and RH was 85.7 ± 5.3%.

#### Plant Measurements

#### Plant Development and Biomass Accumulation

In both GH-CNRS and field-IRRI, leaf appearance on the main tiller and green tiller number were measured every week from 2 weeks after sowing (in GH-CNRS) and 2 weeks after transplanting (in the field) up to flag leaf stage on 3 and 4 sampled plants in GH-CNRS and field, respectively, in every replication. Thermal time was calculated by daily integration of air temperature minus a base temperature of 12°C (Rebolledo et al., 2012).
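The thermal time calculation can be sketched as a running sum of daily mean air temperature minus the 12°C base. The function below is only an illustration: its name, the temperature series, and the choice to clamp days colder than the base temperature to zero are assumptions, since the text does not specify how such days were handled.

```python
def thermal_time(daily_mean_temps_c, base_temp_c=12.0):
    """Cumulative thermal time (degree-days) from daily mean air temperatures.

    Each day contributes (T - base). Clamping negative contributions to
    zero is an assumption of this sketch, not a detail given in the text.
    """
    total = 0.0
    cumulative = []
    for temp in daily_mean_temps_c:
        total += max(temp - base_temp_c, 0.0)
        cumulative.append(total)
    return cumulative


# Hypothetical three-day temperature series (degrees Celsius).
example = thermal_time([27.0, 28.5, 26.0])
print(example)
```

With this definition, a day at 27°C adds 15 degree-days, so events such as the onset of the tillering difference reported at about 400°C days can be located on a common developmental scale across trials.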

Dry weight (DW) of plant shoot organs (leaves, stems of the main tiller and the rest of the whole plant, and panicles) was measured after drying for 72 h in an oven at 70°C, during panicle development (PI + 3 weeks in GH-CNRS and GH-IRRI; PI + 2 weeks in the field) and at heading (H; in GH-CNRS) or flowering (FLO; in the field). Across these samplings and at maturity (MAT), the following were measured: main stem dry weight at H or FLO and at MAT (MS DW H/FLO, MS DW MAT), shoot dry weight at H or FLO and at MAT (Shoot DW H/FLO, Shoot DW MAT), main stem panicle dry weight at MAT (MS PDW MAT), and plant filled grain dry weight at MAT (Plant FGDW MAT).

At physiological maturity, the DW of all vegetative organs excluding root biomass was measured, as well as MS PDW (after drying under the sun). The five plants harvested at maturity in GH were separated into panicles (after pictures were taken for P-TRAP analysis), green leaf blades, senescent leaves, and productive stems (culms + sheaths). In the field, all the plants within a soil base area of 0.12 m<sup>2</sup> per plot were harvested, that is three plants under LD and 12 plants under HD. They were then separated into panicles, green leaf blades, dead tissues, and productive stems. The panicles were hand-threshed and then the filled spikelets were separated from the unfilled ones with a densitometric column (in GH) or by submerging the spikelets in water (in the field).
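The number of plants harvested per plot in the field is simply the planting density multiplied by the sampled ground area. A small check, assuming the densities given for the field trial (25 and 100 plants m<sup>−2</sup>); the helper name `plants_in_area` is hypothetical:

```python
def plants_in_area(density_per_m2: float, area_m2: float) -> int:
    """Number of plants expected within a sampled ground area."""
    # round() guards against floating-point noise in the product.
    return round(density_per_m2 * area_m2)


HARVEST_AREA_M2 = 0.12  # sampled soil base area per plot, from the text

harvested = {
    "LD": plants_in_area(25, HARVEST_AREA_M2),
    "HD": plants_in_area(100, HARVEST_AREA_M2),
}
print(harvested)
```

This agrees with the three plants (LD) and 12 plants (HD) stated above.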

PI was determined by dissecting and observing the main tiller of randomly collected plants (border plants for the field experiment) from each unit treatment every second day when PI was close. PI was considered to have occurred when the first row of floral primordia was visible on the shoot apex. Flowering (FLO) was determined within each unit treatment when an average of 75% of the spikelets per panicle of the main tiller had exserted their anthers. Plants were considered at physiological maturity when 75% of the grains of the panicles had turned yellow and the texture was at the dough stage.

#### Leaf Area

In GH-CNRS, individual leaf area on the main stem was measured using an LI-3100C Area Meter (Lincoln, NE, USA). In the field and GH-IRRI, the length and maximum width of individual green leaf blades on the main stem were measured manually with a ruler, then leaf area was estimated as length × maximum width × 0.725 (Tivet et al., 2001). The measurement was done at PI + 3 weeks (four sampled plants in GH-IRRI), and at FLO (four sampled plants in GH-CNRS and two sampled plants in the field) on the plants used for biomass measurement. The total area of all green leaves per plant was then measured with the LI-3100C Area Meter in GH-CNRS and in the field.
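The ruler-based estimate can be written as a one-line function. The sketch below uses the 0.725 coefficient given in the text; the function name and the example leaf dimensions are hypothetical.

```python
LEAF_SHAPE_COEFFICIENT = 0.725  # coefficient from the text (Tivet et al., 2001)


def leaf_blade_area_cm2(length_cm: float, max_width_cm: float,
                        coefficient: float = LEAF_SHAPE_COEFFICIENT) -> float:
    """Estimate leaf blade area as length x maximum width x coefficient."""
    return length_cm * max_width_cm * coefficient


# Hypothetical flag leaf: 40 cm long, 1.5 cm at its widest point.
area = leaf_blade_area_cm2(40.0, 1.5)
print(f"Estimated blade area: {area:.1f} cm2")
```

Because the estimate is a product of length and width, a change in either dimension scales the area proportionally, which is why the trials can attribute leaf area differences to length or to width.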

#### Internode Profile and Anatomy

The length of each internode of the main tiller was measured at maturity in GH-CNRS and GH-IRRI on the sampled plants used for biomass measurement. In GH-CNRS, for anatomical observation, a 2 cm-long middle section of the peduncle and of internode-3 was sampled at FLO stage from the same plants used for biomass measurements. The samples were fixed in paraformaldehyde fixative solution and kept in a desiccator overnight, followed by dehydration with 70% ethanol for at least 24 h, then conserved in the freezer prior to observation. The internodes were sliced into sections of 60–80 µm with a Thermo Scientific Microm HM 650 V vibration microtome. The peduncles were sliced into sections of 75–90 µm with the same microtome after being embedded in 7% agarose for 2 h.

In the field, internode profile was measured at FLO stage by sampling plants different from the plants used for biomass measurement. A 2 cm-long section of the peduncle and of the top internode was fixed in Formalin-Acetic-Alcohol (FAA) fixative solution until the date of sectioning. Prior to sectioning, the samples were dehydrated with 70% ethanol, then sliced manually with a razor blade.

The samples were then observed with a high resolution imagery system (Leica S8 APO microscope equipped with a QImaging MicroPublisher 3.3 camera in GH-CNRS, and Olympus SZX7 stereo microscope equipped with an Olympus DP71 camera in the field) to analyze the total and inner diameter as well as the number of peripheral bundles. In this study, the peduncle is the uppermost internode between the panicle neck node and node I, and the top internode is the internode just below the peduncle, which belongs to the same phytomer as the flag leaf.

#### Carbon Assimilation Measurements

Actual assimilation rate (at a homogenous level of light in the measurement chamber, i.e., 1500 or 1800 µmol m<sup>−2</sup> s<sup>−1</sup>) was measured using a portable photosynthesis system (GFS-3000, WALZ) during panicle development (2–3 weeks after PI), at the same time as biomass measurement for all experiments. The last ligulated leaf of the main tiller from three tagged plants (in GH-CNRS) and two tagged plants (in the field and GH-IRRI) in every replication was chosen for this measurement.

#### Non Structural Carbohydrates (NSC) Analyses

NSC was analyzed during panicle development (2–3 weeks after PI) at the same time as C assimilation and biomass measurement in all experiments. In GH-CNRS, three plants per treatment (as three replications) were chosen homogenously for NSC analyses. In the field and GH-IRRI, two of the sampled plants dedicated to biomass measurement were chosen for NSC analyses. In the GH-CNRS and GH-IRRI trials, plants were dissected to sample the leaf blade of the last ligulated leaf from the main stem. One base internode (only in GH-CNRS) and one top internode (the internode just below the peduncle) were sampled, immediately frozen in liquid nitrogen, and stored at −80°C until finely ground with a ball grinder (Mixer mill MM301, Retsch, Germany). In the field, all green leaves and internodes of the main tiller were sampled and dried for 72 h in an oven at 70°C before being finely ground with a grinder (Thomas-Wiley Laboratory Mill Model 4, Thomas Scientific USA).

The method used for sugar content analysis in GH-CNRS and GH-IRRI was based on high performance liquid chromatography (HPLC). The sugars were extracted three times from 20 mg of ground sample with 1 ml of 80% ethanol for 30 min at 75°C, and then centrifuged for 10 min at 10,000 rpm. Soluble sugars (sucrose, glucose, and fructose) were contained in the supernatant and starch in the sediment. The supernatant was filtered in the presence of polyvinyl polypyrrolidone and activated carbon to eliminate pigments and polyphenols. After evaporation of the solvent with a Speedvac (RC 1022 and RCT 90, Jouan SA, Saint Herblain, France), soluble sugars were quantified by high performance ionic chromatography (HPIC, standard Dionex) with pulsed amperometric detection (HPAE-PAD). The sediment was solubilized with 0.02 N soda (NaOH) at 90°C for 1 h 30 min and then hydrolyzed with α-amyloglucosidase at 50°C, pH 4.2 for 1 h 30 min. Starch was quantified as described by Boehringer (1984) with 5 µL of hexokinase/glucose-6-phosphate dehydrogenase, followed by spectrophotometry of NADPH at 340 nm (spectrophotometer UV/VIS V-530, Jasco Corporation, Tokyo, Japan).

In the field at IRRI, sugars were extracted twice from 200 mg of ground sample with 7 ml of 80% ethanol for 10 min at 80°C, and then centrifuged for 5 min at 3000 rpm. The residue was washed three times with 5 ml of hot 80% ethanol, and all washings were combined with the supernatant. The residue was dried at 70°C for 24 h prior to the starch assay. Total soluble sugars were determined colorimetrically by adding 5 ml of anthrone to a 0.5 ml aliquot (the sugar extract was diluted with 80% ethanol) and boiling for 10 min at 100°C. After vortex mixing and cooling in an ice bath for about 5 min, absorbance was read by spectrophotometry at 620 nm (DU 800 UV/Vis spectrophotometer, Beckman Coulter). Drops of absolute ethanol were added to the dried residue, followed by 2 ml of acetate buffer; the mixture was then boiled for 3 h at 100°C. The tubes were cooled to 55°C before the hydrolysis step, in which 1 ml of acetate buffer and 1 ml of amyloglucosidase were added and vortex mixed. After incubation for 24 h at 37°C, the hydrolysate (supernatant layer) was decanted and saved, then combined with the residue that had been washed with 3 ml of distilled water. Starch was determined colorimetrically by adding the enzyme reagent Peroxidase Glucose Oxidase (PGO) to a 0.6 ml aliquot (the starch hydrolysate was diluted with distilled water), followed by spectrophotometry at 450 nm (DU 800 UV/Vis spectrophotometer, Beckman Coulter) after incubation in the dark for 30 min.

#### Physiological Maturity and Yield Component

In GH-IRRI, four plants were harvested per replication and separated into panicles, green leaf blades, dead tissues, and productive tillers, then processed as in the two other experiments, as described in Adriani et al. (unpublished data). Yield components were determined as panicle number per plant and plant FGDW.

#### Response Rate

The rate of trait response to either qTSN introgression or reduced plant access to light was quantified as (ref\_value − mod\_value)/ref\_value, where ref\_value is the trait value for the NIL (in the case of response to qTSN introgression) or for the treatment with low plant access to light (shading in GH-CNRS and HD in field-IRRI, in the case of response to low light), and mod\_value is the trait value for the parent or for the full light treatment (i.e., control in GH-CNRS and LD in field-IRRI). Response rates are synthesized in **Table 3**.
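The response rate defined above can be sketched directly in code. The function name and the trait values in the example are hypothetical; only the formula itself comes from the text.

```python
def response_rate(ref_value: float, mod_value: float) -> float:
    """Relative response, (ref_value - mod_value) / ref_value.

    ref_value: trait value for the NIL, or for the low-light treatment.
    mod_value: trait value for the parent, or for the full-light treatment.
    """
    if ref_value == 0:
        raise ValueError("ref_value must be non-zero")
    return (ref_value - mod_value) / ref_value


# Hypothetical values: a trait that is larger in the NIL than in the parent...
gain = response_rate(ref_value=60.0, mod_value=40.0)
# ...and a trait that is smaller in the NIL than in the parent.
loss = response_rate(ref_value=8.0, mod_value=10.0)
print(f"{gain:+.1%}, {loss:+.1%}")
```

Under this convention a positive rate means the trait is larger in the NIL (or under low light) than in the parent (or under full light), and a negative rate means it is smaller. Note that the difference is normalized by the reference value, not by the parent or full-light value.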

#### Data Analyses

The graphs describing plant morphogenesis, individual leaf area, grain production, relative NIL-P, and tillering rate relation to growth were represented with mean values and standard error (standard deviation divided by the square root of the number of samples). Data of **Tables 2**, **3** and **Figures 3**, **4** were analyzed by an ANOVA procedure, and mean comparisons between parent vs. NIL and between treatments for each pair of genotypes were analyzed by Duncan's multiple range test using Microsoft® Excel 2010/XLSTAT-PRO statistical software (version 2014, Addinsoft, Inc., Brooklyn, NY, USA). SigmaPlot® Version 11.2 software (for Windows XP and below, copyright 2009–2010; Systat Software Inc., Chicago, IL, USA) was used for plotting data and nonlinear regressions.
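The standard error used in the graphs can be computed as below. This is a minimal sketch with hypothetical replicate values; it assumes the sample standard deviation (n − 1 denominator), which the text does not state explicitly.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error), with SE = SD / sqrt(n).

    Uses the sample standard deviation (n - 1 denominator); this choice
    is an assumption of the sketch.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two values are needed")
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return mean, sd / math.sqrt(n)


# Hypothetical tiller counts from three replicate plants.
mean, se = mean_and_standard_error([10, 12, 14])
print(f"mean = {mean:.2f}, SE = {se:.3f}")
```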

### RESULTS

#### QTL Effects on Plant Morphogenesis Under Full Light Conditions

Tiller dynamics were delayed in the presence of qTSN. This effect was observed at early tillering stage, i.e., starting at 400°C days (accumulation of thermal time from sowing), when eight leaves had appeared on the main stem. This was true for both genetic backgrounds and experiments (**Figures 1A,D**, **2A,D**; **Figures S1B**, **S2B**) and resulted in a reduced tiller number at PI that was, however, only significant in the field (P < 0.001, see **Table 2** for ANOVA). In IR64 background, this reduction rate ranged from 8.5 to 16.7% in GH-CNRS, and the highest reduction rate was observed for NIL1 in the field (between 26.3 and 32.7%; **Table 3A**). In IRRI146 background, the reduction rate ranged from 16 to 25.9% (GH-CNRS) and reached up to 29.5% in the field (**Table 3A**). This reduction was associated with a smaller rate of tiller abortion until MAT for the NILs compared to the parents, resulting in a progressive convergence of tiller number of parents and NILs at MAT (**Figures 1A,D**, **2A,D**; **Figure S1B**; **Tables 2**, **3**). Nevertheless, fertile tiller number in IRRI146 background in the field remained lower at MAT in the NIL compared to the parent (22.3 and 19% of reduction under LD and HD, respectively), as was the case in IR64 background in the HD treatment (21.3 and 24.4% fewer tillers in NIL and NIL1, respectively, compared to the parent).

The effect of qTSN on leaf appearance rate and final leaf number on the main stem was in general weak and not homogenous across treatments. The qTSN4 had a depressive effect on final leaf number on the main tiller in the field, although less pronounced in IRRI146 background (**Figures 2B,E**; **Figure S2A**). This could be related to a slightly lower rate of leaf appearance considering the similar duration of the vegetative phase. In contrast, in GH-CNRS, higher leaf number (one more on average) in the presence of qTSN4 was observed in IR64 background only, which was appreciable at the end of panicle development (about 1300°C days, at time of appearance of leaf 14; **Figure 1B**). This can be related to the fact that the vegetative phase was slightly longer in the NIL (later PI), by approximately one phyllochron, i.e., the duration between the appearance of two consecutive leaves (**Figures 1B,E**; **Figure S1A**).

TABLE 2 | ANOVA of flag leaf area, tiller number at PI (Panicle Initiation), FLO (flowering; heading in GH-CNRS), MAT (grain physiological maturity), biomass related traits (DW, Dry Weight) at plant and main stem level at FLO and MAT for vegetative DW and at MAT only for panicle DW and FGDW (FG, Filled Grain).

Similarly to that observed with leaf number, qTSN effect on stem length (considered here as the successive internodes excluding the peduncle) also differed from GH-CNRS to field-IRRI trials as well as between genetic backgrounds. An appreciable decrease in final stem length was observed in IR64 background in the presence of qTSN in GH-CNRS (**Figure 1C**), whereas in the field no qTSN effect on stem length was observed (**Figure 2C**; **Figure S2C**). No clear difference was observed in IRRI146 background (**Figures 1F**, **2F**; **Figure S1C**). The effect of qTSN on peduncle length differed with respect to the genetic background and the experiment. In GH-CNRS, the peduncle (the internode bearing the panicle) was significantly longer in the presence of qTSN4 in IR64 background (not presented) but it was the opposite in IRRI146 background (**Figure S1E**), which was confirmed in GH-IRRI (data not presented). In the field, the peduncle was shorter in the presence of qTSN12 in IR64 background (**Figure S2E**), whereas no effect was observed in IRRI146 background (not presented). In IRRI146 background in GH-CNRS, no significant difference in stem length was observed between parent and NIL until heading (**Figure S1C**), but thereafter qTSN4 positively affected the length of the top three internodes located just below the peduncle. The peduncle was, however, shorter in the NIL compared to the parent (**Figure S1E**).

More stable effect of qTSN could be observed on peduncle anatomy and thickness for both genetic backgrounds and trials. The increase of peduncle thickness was 44% in IRRI146 background in GH-CNRS (**Figure S1F**), and 14% in IR64 background (not presented). In the field, the increase of peduncle thickness in the presence of qTSN was 20% in IR64 background (**Figure S2F**) and 17% in IRRI146 background (not presented). The characteristics of the peduncle were associated with thicker top internode (in the third internode below the peduncle in GH-CNRS, **Figure S1F**; in top internode in the field, **Figure S2F**) and higher number of vascular bundles in the peduncle (data not presented) in the NILs compared to the parents.

TABLE 3 | Response rate of traits to QTLs introgression in each pair of isoline in a given treatment (in the field-IRRI, LD is for Low Density, HD is for high density) and genetic background (IR64 and IRRI146) (A), to access to light in each trial (field-IRRI, GH-CNRS) (B), for each genotype (parent, NIL, NIL1).

Gray columns indicate no treatment effect as shading in GH-CNRS just imposed at PI.
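The percentage effects quoted in this section (for example, the increase in peduncle thickness) can be read as relative changes of the NIL with respect to its recurrent parent. A minimal sketch of that calculation follows; it assumes the response rate is computed as (NIL − parent) / parent, which this section does not state explicitly, and the trait values are hypothetical.

```python
def response_rate(nil_value, parent_value):
    """Relative change (%) of a trait in the NIL compared with its parent.

    Assumed definition: 100 * (NIL - parent) / parent.
    """
    if parent_value == 0:
        raise ValueError("parent value must be non-zero")
    return 100.0 * (nil_value - parent_value) / parent_value


# Hypothetical peduncle thickness (mm) for a parent and its NIL.
parent_thickness = 1.00
nil_thickness = 1.44
print(f"{response_rate(nil_thickness, parent_thickness):.0f}% change")
```

The same function applies to the treatment response in part (B) of the table if the shaded or high-density value is passed in place of the NIL value and the control value in place of the parent.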

A positive effect of qTSN on leaf area was observed for the flag leaf (FL) (significant in IR64 background for both trials) and the two to three leaves below the flag leaf (FL-3 or FL-2), but it was more pronounced for FL (**Figure 3**). In GH-CNRS, the increase was 53% for IR64 background (**Figure 3A**; **Table 3A**) and 109% in IRRI146 background (**Figure 3B**; **Figure S1D**; **Table 3A**), which was mainly explained by an increase of leaf length, whereas the width was not affected (data not presented). In the field, qTSN effect on the leaf area of the top leaves was already expressed from FL-3 upward for both genetic backgrounds (and significant for all these leaves only in NIL1) (**Figures 3C,D**), with 34% of increase in IR64 background (**Figure S2D**; **Table 3A** for QTL effect in NIL1) and 29% of increase in IRRI146 (**Table 3A**). In GH-IRRI it was expressed and significant from FL-2 upward (**Figure 3E**). In the field and GH-IRRI, the positive effect of qTSN on individual leaf size was mainly supported by the width (data not presented) rather than the length.

#### Under Low Light Conditions

Traits related to plant growth and development were more affected by the treatment in the field. This can be easily explained by the fact that treatments were established by planting density from sowing onwards in the field, while shading treatment in GH-CNRS was imposed only from PI time to heading. Accordingly, early tillering (**Figures 1A,D**, **2A,D**; **Tables 2**, **3B** for the rate of trait plasticity in response to light treatment) and leaf appearance rates (**Figures 1B,E**, **2B,E**) were poorly affected by shading in GH-CNRS. By contrast stem length was more affected by shading in GH-CNRS, i.e., decreased, as observed in field-IRRI but only in HD treatment in IRRI146 isolines.

Peduncle and internode thickness were also reduced under low access to light in both experiments but only in the NILs, whereas they were not modified in the parents (not presented). Individual leaf size was barely affected by the reduction of incoming light. In GH-CNRS, main tiller FL size of the parents of both backgrounds increased by about 30% under shading, whereas under the same conditions it was maintained or reduced in the NILs. In the field, a slight increase in FL area due to high density was observed only for the parent of IR64 background (2%) (**Figures 3A,D**; **Table 3B**).

As observed under full light conditions, tiller number was reduced with qTSN under low plant access to light (shading in GH-CNRS, HD in the field); however, the difference between NIL and parents was not as strong as observed under higher plant access to light (**Figures 1A,D**, **2A,D**; **Table 3B**), and final tiller number was similar between NILs and parents at maturity. The qTSN effect on leaf appearance (positive for IR64 background in GH-CNRS; negative in the field; unchanged for IRRI146 background in both trials) and stem elongation (negative or unchanged) was similar to that observed in full light conditions (**Figures 1**, **2**). The qTSN positive effect on individual leaf size under low plant access to light was appreciable but less pronounced than that observed under high access to light, as the low access to light increased top leaf size of the parents but not of the NILs (**Figure 3**; **Table 3B**). No QTL effect was observed on the traits related to peduncle and internode anatomy under low light condition (not presented).

#### Biomass, Leaf Area and Grain Productions

At plant level, in GH-CNRS, qTSN4 increased plant final grain production (FGDW) which was significant in all cases (**Table 2**) except for IRRI146 background under shading (**Figure 4B**; **Table 3A**). Meanwhile, plant shoot biomass was not affected by qTSN4 either at PI and FLO (heading in GH-CNRS) (not presented) or at physiological maturity (**Figure 4C**; **Tables 2**, **3A**). In the field, qTSN poorly affected grain and straw biomass and not systematically in a positive way. An increase could be observed only in IR64 background in plant grain production but this was not significant. The qTSN effect was even significantly negative on plant grain production in IRRI146 under LD (**Figure 4E**; **Table 3A**). These contradictory results between GH-CNRS and field conditions regarding qTSN4 effect on grain production at plant level were also observed in GH-IRRI where no significant QTL effect on plant grain production was observed (results not presented). The leaf area per plant at flowering time was unaffected by qTSN but the distribution of leaf area was modified in a way that individual leaves were larger but fewer in the presence of qTSN. This was true for both genetic backgrounds in both trials (**Figures 4D,H**). However, plant leaf area was affected by light conditions in the field, where it was higher in full light (LD) compared to low light (HD) condition.

At main stem level, panicle dry weight at MAT was increased by qTSN in both genetic backgrounds and trials (**Figures 4B,F**; **Table 2**; P < 0.01). This increase was more pronounced and systematic in GH-CNRS. It was generally higher under low light conditions for IR64 background (115% under shading and 51% under HD in NIL1, but no effect observed in the NIL of IR64 in field-IRRI, **Table 3A**), whereas it was more homogeneous among treatments for IRRI146 (**Table 3A**). The main stem panicle DW was systematically reduced by low access to light and no qTSN × treatment interactions were observed (**Tables 2**, **3B**). The main stem DW at FLO was systematically increased by qTSN (P < 0.01, **Table 2**), and this was generally stronger in IR64 background compared to IRRI146 and more particularly in field-IRRI. This qTSN effect was not maintained until MAT as no significant difference for main stem DW between parents and NIL could be observed at that stage (**Tables 2**, **3**). Low plant access to light systematically reduced main stem DW at FLO, more particularly in GH-CNRS (**Table 3**). This was maintained until maturity only in the field-IRRI as no more significant treatment effect was observed in the GH-CNRS at this time (**Tables 2**, **3**). No qTSN × treatment interaction for main stem DW was observed, either at MAT or at FLO, except in field-IRRI at FLO (**Table 2**).

#### Relationship Between Tillering Dynamics and Main Stem Growth Rate

Overall, above-mentioned results pointed out two key nodes of regulation of plant phenotypes due to qTSN introgression, namely: (i) tillering and tiller number, generally reduced by qTSN and (ii) main stem biomass (either before or after FLO), generally increased by qTSN (**Table 3A**). In association with the opposite effect of qTSN4 on these two traits, no additional difference was observed between parents and NILs regarding the resulting plant shoot DW at FLO and MAT. Results of plant FGDW were, however, dependent on cropping conditions (**Figures 4A,D**; **Table 3A**). In order to further explore the relationship between tillering and main stem DW, the change of main stem growth rate during panicle development (PI–FLO period) was plotted against tillering rate before PI. A negative correlation could be observed between these variables, stronger in IRRI146 (R<sup>2</sup> = 0.5) than in IR64 (R<sup>2</sup> = 0.28) backgrounds (**Figures 5A,B**). Interestingly, the average value for the NIL was systematically at a higher position on the y-axis (with reference to main stem growth rate from PI to FLO) compared to that of the parents. However, the correlation disappeared when analyzing the same relationship in each experiment separately (**Figures 5C,D**).
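The R<sup>2</sup> values reported for these relationships are coefficients of determination of a linear fit. As a hedged sketch (the study's own statistical software and fitting options are not described in this section), the slope and R<sup>2</sup> of a simple least-squares regression can be computed as follows; the data points are hypothetical and only illustrate a negative relationship between tillering rate and main stem growth rate.

```python
def linear_fit(x, y):
    """Return (slope, r_squared) of an ordinary least-squares line y = a + b*x.

    For simple linear regression with an intercept, R^2 equals the squared
    Pearson correlation between x and y.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    slope = sxy / sxx
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, r_squared


# Hypothetical values: tillering rate before PI (x) and
# main stem growth rate from PI to FLO (y), arbitrary units.
tillering_rate = [0.8, 1.0, 1.2, 1.5, 1.7]
stem_growth_rate = [2.9, 2.7, 2.8, 2.2, 2.0]
slope, r2 = linear_fit(tillering_rate, stem_growth_rate)
print(f"slope = {slope:.2f}, R2 = {r2:.2f}")
```

Fitting each experiment separately, as in the per-trial panels, amounts to calling the same function on the subset of points from each trial.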

In order to evaluate whether this early trade-off between tillering and main stem growth rate (from PI to FLO) could impact grain production, main stem growth rate from PI to FLO was plotted against main stem panicle DW at maturity. This is presented in **Figures 6A,B**, showing a slightly positive correlation between these variables (R<sup>2</sup> = 0.12) when analyzing data from parents and NILs together, whereas there was no correlation in IRRI146 (R<sup>2</sup> = 0.02) background. This positive correlation became even stronger when considering trials separately (**Figures 6C,D**), in particular at GH-CNRS. In all situations, the average of NIL values showed a higher main stem growth rate from PI to FLO related to a bigger panicle DW on the main stem at maturity.

The relationship between main stem and plant shoot growth rates from PI to FLO was thereafter explored, and no significant correlation was observed (not presented). Meanwhile, plant grain production was positively correlated to the whole plant shoot growth rate from PI to FLO (**Figure 7**). This correlation was higher in IR64 (R<sup>2</sup> = 0.49) than in IRRI146 (R<sup>2</sup> = 0.37) background and in field-IRRI (R<sup>2</sup> = 0.50) than in GH-CNRS (R<sup>2</sup> = 0.28) trial. Interestingly, with respect to this correlation, the NILs performed better than parents only in GH-CNRS whatever the genetic background (**Figure 7C**), i.e., in the cropping situation where the trade-off between tillering and main stem growth rates was the lowest for the NIL and the closest to that of the parents (**Figure 5C**). It can be mentioned that plant grain production was not correlated to main stem growth rate from PI to FLO (**Figure S3**).

#### Carbon Assimilation and Sugar Related Traits

In order to identify whether the difference in main stem growth rate was associated with a particular metabolic pattern, starch and net assimilation rate at ambient CO<sub>2</sub> concentration of 400 ppm were quantified during panicle development. In GH-CNRS, qTSN4 enhanced assimilation only in shading treatment by 33% for IR64 and 24% for IRRI146 background (**Figure 8**). Shading significantly reduced assimilation in parent lines, 29 and 16% for IR64 and IRRI146 background, respectively, whereas in the NILs, assimilation was maintained under shading (**Figure 8A**). In the field, no significant qTSN and treatment effect was observed on assimilation, even if it was slightly decreased with qTSN in both backgrounds (**Figure 8B**). In GH-IRRI, qTSN4 increased assimilation under low light (crowded plants) conditions but it was the opposite under full light (isolated plants) conditions (**Figure S4A**).

Internode starch content increased in the presence of qTSN for both backgrounds across the trials (**Figures 8C,D**; **Figure S4B**). The qTSN4 effect was stronger under full light conditions (control in GH-CNRS, LD in the field, isolated in GH-IRRI), and significant in IRRI146 background in GH-CNRS (18%) (**Figure 8C**) and IR64 background in the field (11%) (**Figure 8D**). Similar trend was observed in leaf blade, with no significant effect of qTSN across the trials (not presented). However, in most cases, leaf starch was strongly reduced under low plant access to light.

## DISCUSSION

The isolines (NIL) used in this study carried qTSN4 or qTSN12, known to enhance leaf and panicle sizes but to reduce panicle number in some environmental situations (Fujita et al., 2013; Okami et al., 2015). The present study aimed at better characterizing this trade-off by comparing the NILs to their recurrent parents IRRI146 and IR64 regarding morphogenesis and C source-sink balance along the whole plant cycle and their behavior under low access to light.

#### The QTSN Affects Rice Morphogenesis and Physiology at Earlier Stage than Expected

A reduction of the rate of tiller emergence before PI (as early as 400°C days) was observed in this study in the presence of qTSN for both genetic backgrounds under both treatments. This was, however, not addressed in previous studies (Fujita et al., 2012, 2013) where the breeders mainly focused their attention on later traits measured between flowering and maturity. Nevertheless, Okami et al. (2015) confirmed that tiller number was reduced with qTSN4 under drought stress at vegetative stage as tillering rate per unit of above-ground biomass of the NIL was lower than that of the parent, which is in line with the present study. But in contrast to the present study, no difference between P and NIL was observed regarding tillering dynamics under well-watered conditions (Okami et al., 2015) and the ratio of main stem leaf area to tiller number (Okami et al., 2012). Interestingly, in the present study, main stem growth rate between PI and FLO was inversely proportional to tillering rate before PI (**Figure 5**). Considering that reduction in tillering rate is expected to provide more assimilate available to the growing stems, the difference in main stem growth rate appears as a consequence of the change in early tillering rate. Several hypotheses underlying this correlation can be raised. On the one hand, it can be hypothesized that qTSN implies a higher apical dominance due to hormonal signals, e.g., in relation to strigolactone (Jamil et al., 2012) or other hormones (ABA, IAA, GA3; Liu et al., 2011). This may be associated with a higher sensitivity to the red/far red ratio within the canopy, possibly conferred by qTSN. Indeed cessation of tillering in crops has been widely reported to be correlated with the increase, with crop age, in red/far-red ratio sensed by the plant within the canopy, even before any C limitation occurred within the plant (Ballaré and Casal, 2000; Ugarte et al., 2010). On the other hand, sink strength of growing stems should be higher at early stage in the presence of QTL, due to the initiation and pre-dimensioning of larger organs at meristem level, which may decrease tiller bud outgrowth. In this latter case, tillering would be reduced by a competition (or at least a signaling of competition) for resources between active meristems and organs and tiller bud outgrowth, as pointed out by Rebolledo et al. (2012) across japonica rice genetic diversity. Whatever the hypothesis, the reduction in early tillering rate coincides with the appearance of leaf 8 in GH (rather leaf 7 in the field) so with the initiation of leaf 10 or 11, based on the developmental pattern established by Nemoto et al. (1995). Interestingly in GH, the leaves with larger area in the presence of QTL were the 3–4 top leaves of the main stem, so those initiated right after leaf 11.

The size of organs was enhanced by qTSN4 on the main stem, from 3 to 4 leaves below the flag leaf up to the panicle. In addition to individual leaf area from the upper phytomers, internode length, and internode and peduncle thickness, were also enhanced with qTSN, while peduncle length and plant height were reduced. This finding was also reported in a recent study using the same genetic materials but comparing genotype behavior under drought and well-watered conditions (Okami et al., 2015). This is in line with Fujita et al. (2013) using the same genetic materials and Wu et al. (2011) using other genetic materials revealing that rice plants with larger culm diameter exhibited longer and wider flag leaf, more grains per panicle, as well as lower tiller number. Furthermore, Liu et al. (2008) confirmed that a thicker peduncle plays an important role in the determination of panicle size and grain yield potential.

This behavior in the presence of qTSN may be thus related to a stronger expression of apical dominance as the changes in organ size could not be detected at plant level: plant leaf area and shoot biomass at flowering were not different between parents and NILs, mainly due to the compensation between organ size and number. Nevertheless, the increase in C assimilation (in GH under shading) and in sugar (in particular starch) storage in stems (only in GH) suggested that qTSN enhances C availability within the main tiller at least during panicle development. Whether this is related to leaf anatomy and the elaboration of plant leaf area based on larger, thicker but fewer leaves needs to be confirmed. The fact that qTSN4 co-localizes with NAL1, a gene involved in leaf anatomy, veining pattern and carboxylation, provides further insight to this QTL (Qi et al., 2008).

#### The QTSN Early Trade-Off Between Tiller and Main Stem Growth Partially Explains its Environment Dependent Effect on Plant Grain Production

The reduction of early tillering and its benefits for main stem growth rate in the presence of qTSN was dependent on the genetic background and the environment: it was more pronounced in GH-CNRS and in IR64. In addition, the positive effect of qTSN on plant grain production was mostly revealed in GH-CNRS while it was weak or nonexistent in the field. This can be explained by the fact that in GH-CNRS fertile tiller (i.e., panicle) number was less reduced by qTSN4 compared to that observed in the field, while its effect on panicle size was strong. The low effect in the field cannot be totally interpreted, but it is in line with a previous study (Okami et al., 2014) that revealed the absence of difference in grain yield between parents and NILs of IR64 background during three summer seasons under flooded and aerobic conditions in Japan. The same authors, however, reported some differences under aerobic conditions in one season with low nitrogen supply (90 kg N ha<sup>−1</sup>). These results support the present study suggesting that a positive effect of the qTSN may be expressed depending on the cropping environment, where stressing environments (low light, low N, low soil exploration as in pots in greenhouse) should favor the benefits of qTSN on panicle size and, if ever, on grain production.

#### Earlier Tiller Cessation and its Implication for Yield Potential

Increasing yield potential by inducing an earlier cessation of tiller production was already proposed in previous studies comparing performances of hybrids and inbreds (Bueno and Lafarge, 2009; Lafarge et al., 2010). An earlier cessation of tillering was then related to an earlier biomass accumulation within reproductive stems and to a higher biomass remobilization from internodes to panicles, as also promoted by a higher sensitivity to the red/far red ratio within the canopy (Ballaré and Casal, 2000). Interestingly, the present study also highlighted the correlation between an earlier tillering cessation and higher main stem biomass growth, in association with a larger sink size. In the present study, however, these correlations were weak at plant level in the field particularly, highlighting the complexity of the GxE interactions and trade-offs underlying the qTSN effect on plant grain production. It will be interesting to pay more attention to other fertile tillers to better understand qTSN impact on the whole plant growth and grain production. Nevertheless, this study reinforces the interest of developing genotypes optimizing tillering dynamics as long as yield potential is concerned.

## CONCLUSION

The qTSN was confirmed in this study as a QTL potentially increasing panicle size due to an increase in stem growth rate, and in the size of the top leaves and internodes, at least at the main stem level. However, the trade-off between panicle size and panicle number was identified as the key node modulating the environment-dependent qTSN positive effect on plant grain production. This study revealed indeed that this trade-off was already visible at early stage through an earlier cessation of tiller production due to qTSN introgression, which coincided with the initiation at meristem level of phytomers with potentially larger leaves and internodes. Although it cannot be concluded whether this early effect directly impacts tillering or organ dimensioning at meristem level, it seems worthwhile to investigate further the physiological regulation of plant functioning by this allele, including under contrasting cropping conditions such as limited radiation (e.g., during the wet season in the tropics) or limited N availability.

## AUTHOR CONTRIBUTIONS

TL, SY, MD, and DL participated in the design of the study; DA, AD participated in performing the research; AF and AC participated in biochemistry analysis; DA, TL, and DL participated in data analysis and wrote the manuscript. All authors read and approved the final manuscript.

## ACKNOWLEDGMENTS

This work was a part of GRiSP (Global Rice Science Partnership) consortium and Yield Potential Project, in the scheme of double degree program between Bogor Agricultural University, Indonesia and Montpellier SupAgro, France. The authors would like to express their gratitude to the Faculty of Agriculture, University of Lambung Mangkurat, Indonesia for supporting the PhD program; and to the Directorate General of Higher Education of Indonesia, Agris Mundus (Agreenium) and CIRAD for providing a PhD scholarship. Lastly we thank Nicole Sonderegger, Armelle Soutiras, and John Julius Manuben for sugar content analyses; and Denis Fabre and Bermenito Punzalan for their expertise on ecophysiological measurements.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.01197

Figure S1 | The relative (NIL-P) of IRRI146 background in GH-CNRS under control. Leaf number on the main tiller (A) Tiller number per plant (B) Stem length of the main tiller (C) Individual leaf area at flowering and spikelet number per panicle at maturity (D) Peduncle and internode length at maturity (E) Peduncle and internode thickness at maturity (F).

Figure S2 | The relative (NIL1-P) of IR64 background in field-IRRI under low density. Leaf number on the main tiller (A) Tiller number per plant (B) Stem length of the main tiller (C) Individual leaf area at flowering and spikelet number per panicle at maturity (D) Peduncle and internode length at maturity (E) Peduncle and internode thickness at maturity (F).

Figure S3 | Relationship between plant shoot growth rate from panicle initiation (PI) to flowering (FLO) and plant grain dry weight at maturity of the parent (black symbol) and the NIL (gray symbol), in IR64 background in GH-CNRS and field trials (A) in IRRI146 background in GH-CNRS and field trials (B) in GH-CNRS in IR64 and IRRI146 backgrounds (C) in field-IRRI experiment in IR64 and IRRI146 backgrounds (D). The values are mean ± SE. Regression curves are associated with confidence interval at P = 0.05. n = 42 for IR64 background and field-IRRI trial, n = 34 for IRRI146 background and GH-CNRS trial for regression curve.

Figure S4 | Carbon assimilation (A) and internode starch concentration during panicle development (B) of parent (black), NIL (gray) under isolated and crowded population in GH-IRRI. The values are mean ± SE. Results of Duncan test for multiple comparisons of each genotype per treatment at 5% level are shown in the letters above the bars. n = 3.

#### REFERENCES


Indica-type rice variety, IR64, for unique agronomic traits and detection of the responsible chromosomal regions. Field Crops Res. 114, 244–254. doi: 10.1016/j.fcr.2009.08.004


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Adriani, Lafarge, Dardou, Fabro, Clément-Vidal, Yahya, Dingkuhn and Luquet. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# High-Throughput Non-destructive Phenotyping of Traits that Contribute to Salinity Tolerance in Arabidopsis thaliana

Mariam Awlia<sup>1</sup>, Arianna Nigro<sup>2</sup>, Jiří Fajkus<sup>3</sup>, Sandra M. Schmoeckel<sup>1</sup>, Sónia Negrão<sup>1</sup>, Diana Santelia<sup>2</sup>, Martin Trtílek<sup>3</sup>, Mark Tester<sup>1</sup>, Magdalena M. Julkowska<sup>1</sup>\* and Klára Panzarová<sup>3</sup>\*

<sup>1</sup> Division of Biological and Environmental Sciences and Engineering, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia, <sup>2</sup> Institute of Plant and Microbial Biology, University of Zurich, Zurich, Switzerland, <sup>3</sup> PSI (Photon Systems Instruments), Drásov, Czech Republic

#### Edited by:

John Doonan, Aberystwyth University, UK

#### Reviewed by:

Hao Peng, Washington State University, USA Konstantinos Vlachonasios, Aristotle University of Thessaloniki, Greece

#### \*Correspondence:

Klára Panzarová panzarova@psi.cz Magdalena M. Julkowska magdalena.julkowska@kaust.edu.sa

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

Received: 30 April 2016 Accepted: 05 September 2016 Published: 28 September 2016

#### Citation:

Awlia M, Nigro A, Fajkus J, Schmoeckel SM, Negrão S, Santelia D, Trtílek M, Tester M, Julkowska MM and Panzarová K (2016) High-Throughput Non-destructive Phenotyping of Traits that Contribute to Salinity Tolerance in Arabidopsis thaliana. Front. Plant Sci. 7:1414. doi: 10.3389/fpls.2016.01414

Reproducible and efficient high-throughput phenotyping approaches, combined with advances in genome sequencing, are facilitating the discovery of genes affecting plant performance. Salinity tolerance is a desirable trait that can be achieved through breeding, where most have aimed at selecting for plants that perform effective ion exclusion from the shoots. To determine overall plant performance under salt stress, it is helpful to investigate several plant traits collectively in one experimental setup. Hence, we developed a quantitative phenotyping protocol using a high-throughput phenotyping system, with RGB and chlorophyll fluorescence (ChlF) imaging, which captures the growth, morphology, color and photosynthetic performance of Arabidopsis thaliana plants in response to salt stress. We optimized our salt treatment by controlling the soil-water content prior to introducing salt stress. We investigated these traits over time in two accessions in soil at 150, 100, or 50 mM NaCl to find that the plants subjected to 100 mM NaCl showed the most prominent responses in the absence of symptoms of severe stress. In these plants, salt stress induced significant changes in rosette area and morphology, but less prominent changes in rosette coloring and photosystem II efficiency. Clustering of ChlF traits with plant growth of nine accessions maintained at 100 mM NaCl revealed that in the early stage of salt stress, salinity tolerance correlated with non-photochemical quenching processes and during the later stage, plant performance correlated with quantum yield. This integrative approach allows the simultaneous analysis of several phenotypic traits. In combination with various genetic resources, the phenotyping protocol described here is expected to increase our understanding of plant performance and stress responses, ultimately identifying genes that improve plant performance in salt stress conditions.

Keywords: high-throughput phenotyping, Arabidopsis thaliana, salt stress, salinity tolerance, shoot-ion independent tolerance, kinetic chlorophyll fluorescence imaging, color segmentation

## INTRODUCTION


Climate change and population growth place a twofold pressure on agricultural crop production. Crop yields need to be sustained and increased while grown in unfavorable environments (Godfray et al., 2010; Tester and Langridge, 2010; Tilman et al., 2011). To meet future food demands, breeding efforts are targeting more resource-efficient and stress-tolerant crops by combining large-scale plant phenotyping with genome sequencing in forward genetics studies. Phenotypic traits, including growth rate, size, shape, color, temperature and photosynthetic activity, are traditionally studied to evaluate plant performance under stress (Sirault et al., 2009; Zhang et al., 2012; Hairmansis et al., 2014; Chen et al., 2015; Ghanem et al., 2015). Plant breeding programs aimed at enhancing plant performance should investigate growth and photosynthetic activity in tandem because these processes are interdependent (Longenberger et al., 2009).

Advances in non-destructive image-based phenotyping technologies are enabling parallel studies of plant growth and photosynthetic performance over time (Munns et al., 2010; Dhondt et al., 2013; Hairmansis et al., 2014) using RGB and chlorophyll fluorescence (ChlF) imaging (Dhondt et al., 2013; Brown et al., 2014; Humplik et al., 2015). Traits related to plant growth, architecture and development have been quantified from digital color imaging, while leaf color is a simple, under-utilized trait that indicates plant health and leaf senescence (Berger et al., 2012). Kinetic ChlF imaging is a powerful tool for measuring plant photosynthetic capacity and provides valuable insights into the performance of the photosynthetic apparatus (Oxborough, 2004; Baker, 2008). Light energy, captured by chlorophyll molecules, can undergo one of three fates: (1) be used to drive photosynthesis by photochemistry, (2) be dissipated as heat, or (3) be re-emitted as fluorescence. Because these three processes co-exist in close competition, the ChlF yield provides information on both the quantum efficiency of the plant's photochemistry and on the amount of heat dissipated. Under conditions of stress, the photochemical yield decreases, which in turn causes heat dissipation and ChlF emissions to increase (Maxwell and Johnson, 2000; Murchie and Lawson, 2013). Although high-throughput phenotyping of photosynthetic performance has previously been employed to study plant response to cold (Jansen et al., 2009; Humplik et al., 2015) and drought (Bresson et al., 2015), few studies have established an integrative approach that simultaneously analyzes plant growth and photosystem II (PSII) efficiency (Humplik et al., 2015). Systems for kinetic ChlF imaging have not been widely integrated into high-throughput phenotyping platforms, in contrast to single-level steady-state ChlF imaging. The latter only reflects chlorophyll content and not PSII activity. 
This means that using single-level steady-state ChlF can only discriminate between healthy, senescing and stressed leaves by the amount of chlorophyll they degrade (Campbell et al., 2015).
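To make the competition between photochemistry, heat dissipation and fluorescence concrete, the sketch below computes three parameters commonly derived from kinetic ChlF measurements: the maximum quantum yield of PSII (Fv/Fm), the operating quantum yield of PSII in the light, and non-photochemical quenching (NPQ). These are the widely used textbook definitions; this section does not specify which parameters or formulas the authors' imaging software reports, and the function names and fluorescence values are hypothetical.

```python
def max_quantum_yield(f0, fm):
    """Fv/Fm = (Fm - F0) / Fm, measured on a dark-adapted sample."""
    return (fm - f0) / fm


def operating_quantum_yield(fs, fm_light):
    """PSII operating efficiency = (Fm' - Fs) / Fm', measured in the light."""
    return (fm_light - fs) / fm_light


def non_photochemical_quenching(fm, fm_light):
    """NPQ = (Fm - Fm') / Fm', an index of energy dissipated as heat."""
    return (fm - fm_light) / fm_light


# Hypothetical fluorescence levels (arbitrary units) for one rosette.
f0, fm = 200.0, 1000.0        # dark-adapted minimum and maximum
fs, fm_light = 300.0, 600.0   # steady-state and maximum in the light

print(f"Fv/Fm  = {max_quantum_yield(f0, fm):.2f}")
print(f"PhiPSII = {operating_quantum_yield(fs, fm_light):.2f}")
print(f"NPQ    = {non_photochemical_quenching(fm, fm_light):.2f}")
```

A single steady-state fluorescence level provides none of these ratios, which is why kinetic measurements are needed to assess PSII activity rather than chlorophyll content alone.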

Soil salinity is a key stress factor that affects agriculture on a global scale (Munns and Tester, 2008; Cabot et al., 2014; Roy et al., 2014). In saline soil, plants accumulate ions in their shoots, compromising plant growth with ion toxicity, for example by reducing the rate of photosynthesis (Munns and Tester, 2008). During the early phase of salt stress, before ions accumulate significantly in the shoots, the osmotic phase of salinity tolerance takes place, which is referred to as shoot ion-independent tolerance (SIIT; Roy et al., 2014). During this phase, growth reduction is caused by decreased leaf emergence and expansion (Fricke et al., 2006; Munns and Tester, 2008; Berger et al., 2012). Mechanisms underlying ion sensing, cell cycle, cell expansion, and stomatal conductance are likely underlying this response (Ma et al., 2006; Stephan and Schroeder, 2014). Thus, performing regular growth measurements from when salt stress is introduced to the plant across an extended period of time provides the opportunity to discriminate between early (osmotic) and late (ionic) growth-related responses to salt stress. During the early phase of salt stress, the capacity of photosynthetic machinery is reduced (James et al., 2006; Chaves et al., 2009). In later phases, excessive photonic energy causes photochemical inactivation (Muranaka et al., 2002) that reduces PSII stability (Stepien and Johnson, 2009), limits stomatal gas diffusion and causes changes in carbon assimilation rates (Chaves et al., 2009), ultimately resulting in decreased photosynthetic activity. Additional limits to photosynthesis may be caused by the accumulation of unused organic compounds from carbon assimilation (Munns and Tester, 2008). Early responses of plants to salinity have previously been quantified by measuring rosette area, color, temperature, and photosynthetic activity using steady-state ChlF (Rajendran et al., 2009; Sirault et al., 2009; Berger et al., 2012; Chen et al., 2015). 
Here, we quantify how traits related to plant morphology, color and photosynthetic activity respond to salinity in one experimental setup to establish significant correlations among individual phenotypes.

We developed a phenotyping method that monitors plant responses to salt stress by evaluating plant growth, color and photosynthetic traits using an automated, high-throughput system. We grew Arabidopsis plants in soil and maintained them at 40, 60, or 80% of the soil-water holding capacity to achieve approximately 150, 100, and 50 mM NaCl, respectively. We established 100 mM NaCl as the optimal condition for salt treatment. To characterize the early and late plant responses to salt stress, we investigated RGB, greenness and photosynthesis-related traits. ChlF traits were clustered with relative plant performance values into groups corresponding to early and late responses to salt stress. This work provides the means for screening natural diversity panels and mapping populations to identify candidate genes underlying plant development and stress tolerance.

#### MATERIALS AND METHODS

#### Plant Material and Growing Conditions

Accessions of Arabidopsis Columbia-0 (Col-0) and C24 were used to establish the cultivation, salt treatment and phenotyping protocol. Thereafter, nine accessions of Arabidopsis [Col-0, C24, Canary Islands (Can), Coimbra (Co), Cape Verde Islands (Cvi), Landsberg erecta (Ler), Niederzenz (Nd), Rschew (Rsch) and Tenela (Te)] were used to optimize this protocol and investigate the natural variation of plants in response to salt stress (Hannah et al., 2006). Seeds were sown into pots (70 mm × 70 mm × 65 mm, Poppelman TEKU DE) containing 60 g of freshly sieved soil (Substrate 2, Klasmann-Deilmann GmbH, Germany) and watered to full soil-water holding capacity. Seeds were stratified for 3 days at 4°C in the dark. All plants were grown in a climate-controlled growth chamber (FS\_WI, PSI, Czech Republic) with cool-white LED and far-red LED lighting. The protocol was set up with Col-0 and C24 grown in a 10 h/14 h 21°C/15°C light/dark cycle at a relative humidity of 60% and a photon irradiance of 250 µmol m<sup>−2</sup> s<sup>−1</sup>. The protocol was optimized for the nine accessions using a 12 h/12 h 22°C/20°C light/dark cycle with a relative humidity of 55% and an irradiance of 150 µmol m<sup>−2</sup> s<sup>−1</sup>. Seven days after stratification (DAS), seedlings of similar size were transplanted into soil that had been watered 1 day in advance to full soil-water holding capacity. Plants were cultivated in the growth chamber until most plants were at the 10-leaf stage (24 DAS for plants in the 10 h/14 h light/dark cycle and 21 DAS in the 12 h/12 h light/dark cycle). The growth timeline for Col-0 and C24 plants illustrates the implementation of the three watering regimes, the salt treatment and the phenotyping protocol (**Figure 1**).

#### Watering and Salt Treatment

Similar to Junker et al. (2015), we determined the soil water-holding capacity by filling 10 pots with 60 g of sieved soil and drying them for 3 days at 80°C to completely desiccate the soil. Soil was then saturated with water and left to drain for 1 day before weighing. Based on the soil-water content at 100% (130 g), 40, 60, and 80% of the soil-water holding capacity were found to weigh 52, 77, and 103 g, respectively. At 14 DAS, Col-0 and C24 seedlings were placed randomly in trays (5 × 4 pots per tray) and into the PlantScreen™ Compact System (PSI, Czech Republic) and automatically weighed and watered every other day to reach and maintain the reference weight corresponding to the desired soil-water contents (**Figure 1A**). Once plants reached the 10-leaf stage at 24 DAS, nine replicates per accession were placed in 250 mM NaCl or dH<sub>2</sub>O for 1 h to ensure full saturation of the soil. Pots were left to drain for 10 min before being placed in the phenotyping system. Effective NaCl concentration in the soil was estimated as 150, 100, and 50 mM NaCl in plants watered to 40, 60, and 80% soil-water holding capacity, respectively, representing conditions of severe, moderate, and mild salt stress (**Figure 1B**). For 11 days, with the exception of days 5 and 6 (**Figure 1C**), plants were transferred manually from the growth chamber to PlantScreen™ for image acquisition and then returned to the same positions inside the growth chamber. On the final day of imaging, the water content of the soil was found to be approximately 60–70%, indicating that plants had adequate soil moisture during the phenotyping period.
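The estimated effective concentrations are consistent with a simple dilution model, sketched below. The model is an assumption on our part rather than a calculation given in the text: the pot is taken to hold salt-free water at the stated fraction of its holding capacity, and the 250 mM solution is taken to fill the remainder and mix evenly. The function name `effective_nacl_mm` is hypothetical.

```python
def effective_nacl_mm(stock_mm, water_content_pct):
    """Estimate the soil NaCl concentration (mM) after saturating a pot.

    Assumption (not stated in the text): the pot already holds
    `water_content_pct` percent of its water-holding capacity as salt-free
    water, the saline solution fills the remaining capacity, and the two
    mix evenly.
    """
    if not 0 <= water_content_pct <= 100:
        raise ValueError("water content must be between 0 and 100 percent")
    return stock_mm * (100 - water_content_pct) / 100


# Watering regimes from the protocol, saturated with 250 mM NaCl.
for pct in (40, 60, 80):
    print(f"{pct}% water content -> {effective_nacl_mm(250, pct):.0f} mM NaCl")
```

Under this assumption the three regimes give 150, 100, and 50 mM NaCl, matching the values reported for severe, moderate, and mild stress.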

After analyzing the effects of the three watering regimes, 60% soil-water content and 100 mM NaCl were established as effective conditions for investigating early responses to salt stress. We then optimized the protocol, in terms of growth conditions and ChlF imaging, using nine accessions of Arabidopsis. At 18 and 20 DAS pots were watered to the target soil-water content. Once at the 10-leaf stage, at 21 DAS, eight replicates per accession were placed in a 250 mM NaCl solution or dH<sub>2</sub>O for 1 h to ensure saturation of the soil. Plants were imaged for 7 days with no additional watering. The final soil-water content was approximately 70–80%.

FIGURE 1 | Watering regime, salt stress treatment and phenotyping protocol. (A) Col-0 and C24 were sown, watered to full soil-water saturation, stratified, then germinated under short day conditions. Similar-sized seedlings were transplanted into freshly sieved soil 7 days after stratification (DAS). At 14 DAS, watering was controlled to reach 80, 60, or 40% of the soil-water holding capacity. (B) Seedlings at the 10-leaf stage (24 DAS) were saturated for 1 h in a 250 mM NaCl solution to reach concentrations of 50, 100, and 150 mM NaCl, while control pots were saturated in dH<sub>2</sub>O. (C) The PlantScreen™ Compact System performed chlorophyll fluorescence (ChlF) and RGB imaging, as well as automatically weighing and watering the plants. The lower panel depicts the timeline of the experiment.

#### High-Throughput Phenotyping


Control and salt-treated plants were automatically phenotyped for RGB and kinetic ChlF traits using PlantScreen™ (Supplementary Figures S1 and S2) from 1 h after introducing salt stress. The phenotyping was conducted for 11 days to develop the protocol and 7 days to investigate natural variation among the nine accessions. Trays were transported within PlantScreen™ on conveyor belts between the light-isolated imaging cabinets, weighing and watering station and the dark/light acclimation chamber. A single round of measuring consisted of an initial 15 min dark-adaptation period inside the acclimation chamber, followed by ChlF and RGB imaging, weighing and watering. Pixel count, color and fluorescence intensity were evaluated from the images. A total time of 1 h and 40 min was required to measure 10 trays (200 plants). The PlantScreen™ Analyzer software (PSI, Czech Republic) was used to automatically process the raw data.

#### RGB Imaging and Processing

Trays were loaded into the imaging cabinet of the PlantScreen™ platform with three RGB cameras (one top and two side views) mounted on robotic arms, each supplemented with an LED-based light source to ensure homogeneous illumination of the imaged object. To assess plant growth and morphological traits, RGB images (resolution 2560 × 1920 pixels) of 5 × 4 plants per tray were captured using the GigE uEye UI-5580SE-C/M 5 Mpx Camera (IDS, Germany) from the top view only. Light conditions, plant position and camera settings were fixed throughout the experiments. The PlantScreen™ software extracted features from the RGB images in three steps: (1) basic processing applied in real time, involving correction for barrel (fisheye) distortion, tray detection, cropping of individual pots, background subtraction to remove non-plant pixels from the images, and filtration and artifact removal to produce binary (black and white) and RGB representations of each plant [binary images represent the plant's surface (white) and background (black)]; (2) morphological analysis, requiring separation of the background from the plant shoot tissue so that the pixel number per plant and the rosette area could be determined; and (3) analysis of greenness using background-subtracted RGB images to evaluate the color. For this step, the images were color-segmented to represent and evaluate rosette coloring (Supplementary Figure S1). The morphometric parameters area, perimeter, roundness, compactness, rotational mass symmetry, eccentricity and slenderness of leaves were computed from the RGB image processing and are listed and defined in Supplementary Table S1.
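The step from a background-subtracted image to a projected rosette area can be illustrated with a minimal sketch. The segmentation rule used here (an excess-green threshold) and the names `plant_mask` and `projected_area` are illustrative assumptions; the text does not describe the algorithm used by the PlantScreen™ software.

```python
def plant_mask(rgb_image, margin=20):
    """Binary mask of an RGB image: 1 for plant pixels, 0 for background.

    `rgb_image` is a list of rows, each row a list of (R, G, B) tuples.
    A pixel counts as plant when its excess-green value 2G - R - B exceeds
    `margin`; both the index and the margin are illustrative assumptions.
    """
    return [[1 if 2 * g - r - b > margin else 0 for (r, g, b) in row]
            for row in rgb_image]


def projected_area(mask, area_per_pixel=1.0):
    """Projected rosette area: plant-pixel count times the area of one pixel."""
    return sum(sum(row) for row in mask) * area_per_pixel


SOIL, LEAF = (120, 100, 80), (60, 140, 50)  # made-up example colors
image = [
    [SOIL, LEAF, SOIL],
    [LEAF, LEAF, LEAF],
    [SOIL, SOIL, SOIL],
]
mask = plant_mask(image)
print(mask)                  # binary representation of the plant
print(projected_area(mask))  # area in pixel units
```

Passing a calibrated `area_per_pixel` would convert the pixel count to physical units, provided camera distance and settings stay fixed as in the protocol.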

#### Plant Growth-Related Parameters

To evaluate the effect of salt stress on early and late plant growth rates (GR), we examined the increase in projected rosette area over time by fitting a linear function to two time intervals. The regression coefficient of the fitted function was determined and used as a trait in the statistical analysis. Relative effects of salt stress were calculated by dividing the estimated growth rates in salt conditions by the average in control conditions (GRsalt/GRcontrol). The calculation was performed for each accession over two time intervals (0–4 days and then 7–11 days when developing the protocol, and 0–3 days and then 4–7 days when examining natural variation among accessions; **Figures 2D** and **5C**). This ratio has been termed the shoot ion-independent tolerance (SIIT) index, which is used as a measure of plant salinity tolerance (Roy et al., 2014). The effects of salt stress were determined by performing an analysis of variance (ANOVA) per accession and treatment with Tukey's post hoc test of significance for each RGB trait (p-value < 0.05).
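As a minimal sketch of this calculation, the code below fits a least-squares line to rosette area within one time interval and divides each salt-treated plant's slope by the mean control slope. The function names and the rosette areas are hypothetical and serve only to show the arithmetic.

```python
def slope(days, areas):
    """Least-squares slope of rosette area against time (linear growth rate)."""
    n = len(days)
    mean_day = sum(days) / n
    mean_area = sum(areas) / n
    sxy = sum((d - mean_day) * (a - mean_area) for d, a in zip(days, areas))
    sxx = sum((d - mean_day) ** 2 for d in days)
    return sxy / sxx


def growth_rate(days, areas, start, end):
    """Growth rate fitted only to measurements with start <= day <= end."""
    window = [(d, a) for d, a in zip(days, areas) if start <= d <= end]
    xs, ys = zip(*window)
    return slope(xs, ys)


def siit(salt_series, control_series, days, start, end):
    """SIIT per salt-treated plant: GRsalt divided by the mean GRcontrol."""
    control_rates = [growth_rate(days, s, start, end) for s in control_series]
    mean_control = sum(control_rates) / len(control_rates)
    return [growth_rate(days, s, start, end) / mean_control for s in salt_series]


# Hypothetical rosette areas for days 0 to 4 of one accession.
days = [0, 1, 2, 3, 4]
control = [[100, 110, 120, 130, 140], [100, 112, 124, 136, 148]]
salt = [[100, 105.5, 111, 116.5, 122]]
print(siit(salt, control, days, start=0, end=4))
```

Calling `siit` once per interval (for example 0–4 and 7–11 days) yields the early and late indices; a value below 1 means slower growth under salt than in the controls.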

#### Color Segmentation and Evaluation of Greenness

Using color segmentation, we analyzed the change in rosette coloring. We calibrated the analysis by using RGB images from both control and salt-stressed conditions and from the start, middle and end of the phenotyping period to obtain an unbiased color scale (Supplementary Figure S1). Values in the RGB channels, from each pixel corresponding to the plant's surface area, were extracted to serve as a dataset for k-means clustering. Nine clusters were sufficient for optimal color differentiation and all input pixels were partitioned according to their Euclidean distance in the RGB color space. The RGB coordinates of cluster centroids were used as base hues to evaluate greenness. Original pixel color was approximated from the nearest cluster centroid, yielding color-segmented images. To calculate the relative hue abundance independent of the rosette area, pixel counts of individual hue values were divided by the rosette area of the same plant on the same day (**Figure 3**). The effect of salt stress was determined by performing an ANOVA per accession and treatment with Tukey's post hoc test of significance for each greenness hue (p-value < 0.05).
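The segmentation described above can be sketched with a small k-means routine in RGB space followed by a per-plant tally of pixels assigned to each centroid. This is an illustration under stated assumptions: the deterministic initialization, the function names, and the pixel values are ours, and the example uses two clusters for brevity where the study used nine.

```python
def nearest(pixel, centroids):
    """Index of the centroid closest to `pixel` (squared Euclidean RGB distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((p - c) ** 2 for p, c in zip(pixel, centroids[i])))


def kmeans_rgb(pixels, k, iterations=20):
    """Cluster RGB pixels into k base hues and return the centroids."""
    # Assumption: start from k pixels spread evenly through the input list.
    centroids = [pixels[int(i * len(pixels) / k)] for i in range(k)]
    for _ in range(iterations):
        groups = [[] for _ in range(k)]
        for px in pixels:
            groups[nearest(px, centroids)].append(px)
        # New centroid = channel-wise mean; an empty cluster keeps its centroid.
        updated = [tuple(sum(ch) / len(g) for ch in zip(*g)) if g else centroids[i]
                   for i, g in enumerate(groups)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def hue_abundance(plant_pixels, centroids):
    """Fraction of a plant's pixels assigned to each base hue."""
    counts = [0] * len(centroids)
    for px in plant_pixels:
        counts[nearest(px, centroids)] += 1
    return [c / len(plant_pixels) for c in counts]


# Hypothetical calibration pixels (dark and light green) and one plant.
calibration = [(30, 90, 30), (34, 94, 28), (120, 200, 90), (124, 196, 94)]
centroids = kmeans_rgb(calibration, k=2)
plant = [(31, 91, 30), (33, 93, 29), (29, 89, 31), (121, 199, 91)]
print(centroids)
print(hue_abundance(plant, centroids))
```

Dividing by the plant's own pixel count is what makes the hue abundances comparable between rosettes of different size.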

#### Kinetic Chlorophyll Fluorescence Imaging and Processing

To assess the effect of salt stress on photosynthetic performance, ChlF measurements were acquired using an enhanced version of the FluorCam FC-800MF pulse amplitude modulated (PAM) chlorophyll fluorometer (PSI, Czech Republic). The ChlF imaging station was mounted on a robotic arm with an LED light panel and a high-speed charge-coupled device camera (pixel resolution of 720 × 560, frame rate 50 fps and 12-bit depth) positioned in the middle of the light panel (Supplementary Figure S2). The LED panel was equipped with 3 × 64 orange-red (618 nm) and 64 cool-white LEDs (6500 K), distributed equally over 75 × 75 cm. This resulted in a ±5% maximum deviation from the mean across the imaged area of 35 × 35 cm. Modulated light of known wavelength was applied to detect the ChlF signal. Three types of light sources were used: (1) PAM short-duration measuring flashes (33 µs) at 618 nm, (2) orange-red (618 nm) and cool-white (6500 K) actinic lights with maximum irradiance 440 µmol m<sup>−2</sup> s<sup>−1</sup> and (3) saturating cool-white light with maximum irradiance 3000 µmol m<sup>−2</sup> s<sup>−1</sup>.

Plant trays were automatically loaded into the light-isolated imaging cabinet of PlantScreen™ with a top-mounted LED light panel. After the 15 min dark-adaptation period, when PSII reaction centers open, the trays were automatically transported to the ChlF imaging cabinet. A 5 s flash of light was applied to measure the minimum level of fluorescence in the dark-adapted state (Fo), followed by a saturation pulse of 800 ms (with an irradiance of 1200 µmol m<sup>−2</sup> s<sup>−1</sup>) used to determine the maximum fluorescence in the dark-adapted state (Fm). Plants were relaxed in the dark for 17 s and then subjected to 70 s of cool-white actinic light to drive photosynthesis and measure the peak rise in fluorescence (Fp). These conditions were used in both ChlF imaging techniques, that is, the quenching kinetics and the light curve protocols.

FIGURE 2 | Col-0 and C24 growth-related responses to salt stress. (A) RGB images of control (upper panel) and 11 days salt-stressed (lower panel) plants maintained at 80, 60, and 40% of the soil water-holding capacity, which correspond to approximately 50, 100, and 150 mM NaCl, respectively. (B) Leaf slenderness and (C) projected rosette area over time for Col-0 (purple lines) and C24 (green lines) in control (solid lines) and salt-stressed (dashed lines) conditions. Values represent the averages of nine replicates per accession and treatment. Error bars represent standard error. The significant differences between control and salt treatment per accession are indicated with ∗ and ∗∗ for p-values below 0.05 and 0.01, respectively. (D) The shoot ion-independent tolerance (SIIT) index was calculated as the ratio of growth rates in salt-stressed conditions to average growth rates observed in control conditions per accession and treatment over two time intervals. Values represent the average of nine biological replicates per accession and treatment. Error bars represent standard error. Different letters are used to indicate the significant differences between the accessions and conditions as tested with one-way ANOVA with post hoc Tukey's test (p < 0.05).

For the quenching kinetics protocol, additional saturation pulses were applied at 8, 18, 28, 48, and 68 s during actinic illumination, corresponding to the L1, L2, L3, L4, and Lss states at a constant photon irradiance of 210 µmol m<sup>−2</sup> s<sup>−1</sup> (Supplementary Figure S3). These ChlF signals were used to acquire the maximum fluorescence in the light-adapted state (Fm′), and the level of ChlF measured just before the saturation pulse was considered the steady-state fluorescence in the light-adapted state (Ft). Further responses to dark-relaxation were measured by switching the actinic light off for 100 s and applying saturating pulses at 30, 60, and 90 s, corresponding to the D1, D2, and D3 states (Supplementary Figure S2). The PlantScreen™ Analyzer software performed the automated ChlF feature extraction by mask application, background subtraction and parameter calculation based on the fluorescence levels of Fo, Fm, Fp, Ft and Fm′, which were estimated by integrating pixel-by-pixel values across the entire rosette (Supplementary Figure S2; Supplementary Table S2). Minimum fluorescence in the light-adapted state (Fo′) was calculated according to Oxborough and Baker (1997).
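To make the parameter calculation concrete, the sketch below derives several quenching and quantum-yield parameters from the four measured levels. It uses definitions that are widely used in the ChlF literature, including the Oxborough and Baker (1997) estimate Fo′ = Fo / (Fv/Fm + Fo/Fm′); the exact definitions applied in this study are those of Supplementary Table S2, which is not reproduced here, so the formulas should be read as an assumption. The function name and the input values are hypothetical.

```python
def chlf_parameters(fo, fm, ft, fm_prime):
    """Derive common ChlF parameters from measured fluorescence levels.

    fo, fm   : minimum and maximum fluorescence in the dark-adapted state
    ft       : steady-state fluorescence just before a saturation pulse
    fm_prime : maximum fluorescence in the light-adapted state (Fm')
    """
    fv = fm - fo
    # Oxborough and Baker (1997) estimate of Fo' from Fo, Fm and Fm'.
    fo_prime = fo / (fv / fm + fo / fm_prime)
    return {
        "Fv/Fm": fv / fm,                               # max. quantum yield, dark-adapted
        "Fo'": fo_prime,
        "Fv'/Fm'": (fm_prime - fo_prime) / fm_prime,    # max. quantum yield, light-adapted
        "PhiP": (fm_prime - ft) / fm_prime,             # actual PSII quantum yield
        "qP": (fm_prime - ft) / (fm_prime - fo_prime),  # photochemical quenching
        "NPQ": (fm - fm_prime) / fm_prime,              # non-photochemical quenching
        "PhiNO": ft / fm,                               # constitutive dissipation
        "PhiNPQ": ft / fm_prime - ft / fm,              # regulatory heat dissipation
    }


# Hypothetical fluorescence levels in arbitrary units.
params = chlf_parameters(fo=200, fm=1000, ft=400, fm_prime=800)
for name, value in params.items():
    print(f"{name}: {value:.3f}")
```

With these definitions the three quantum yields (PhiP, PhiNO, PhiNPQ) sum to 1, which gives a quick consistency check on extracted values.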

For the examination of natural variation in the nine Arabidopsis accessions, we optimized ChlF imaging by quantifying the rate of photosynthesis at different photon irradiances using the light curve protocol (Henley, 1993; Rascher et al., 2000), which has been shown to provide detailed information on ChlF under stress (Brestic and Zivcak, 2013). A 5 s flash of light was applied to measure the minimum fluorescence, followed by a saturation pulse of 800 ms (with an irradiance of 1200 µmol m<sup>−2</sup> s<sup>−1</sup>) to determine the maximum fluorescence in the dark-adapted state. Next, 60 s intervals of cool-white actinic light at 95, 210, 320, and 440 µmol m<sup>−2</sup> s<sup>−1</sup>, corresponding to L1, L2, L3, and L4, respectively, were applied. A saturation pulse was applied at the end of each period of actinic light to acquire the maximal fluorescence in the light-adapted state (Supplementary Figure S6A). The ChlF signal measured just before the saturation pulse was taken as the steady-state fluorescence value in the light-adapted state. The ChlF parameters were extracted and processed as described above for data collected from day 0 to day 7 of salt treatment (**Figure 6**).

#### Statistical Analysis on Chlorophyll Fluorescence-Related Responses to Salt Stress

An ANOVA with Tukey's post hoc test of significance (p-value < 0.05) was used to evaluate the differences in ChlF between control and salt-stressed plants. Trait values specific to accession, day and condition were divided by the overall average per trait to analyze the fluctuations in the ChlF traits, which were due to both plant development and salt treatment (**Figures 4** and **6**). Principal component analysis (PCA) was performed on 20 ChlF traits collected from Col-0 and C24 under the different adapted states and saturating pulses (L1 to L4, Lss and D1 to D3) to reduce data dimensionality (Supplementary Figure S5; Supplementary Table S3). Eight ChlF traits (Fv/Fm, Fv′/Fm′, ΦP, qP, ΦNO, ΦNPQ, qN and NPQ; Lazar, 2015) measured on day 7 at 440 µmol m<sup>−2</sup> s<sup>−1</sup> were clustered with SIIT1 and SIIT2 (GRsalt/GRcontrol for each time interval) using the Ward linkage method. This was performed to study the relationships between the ChlF traits and relative changes in growth rate under salt stress among the nine accessions (**Figure 7**). Normalization was done by dividing the relative trait values by the overall average per trait. Mann–Whitney U-tests with a continuity correction were performed on the ChlF parameters captured by the light curve protocol, and p-values were calculated using treatment as a grouping variable per accession for each day of the phenotyping period (Supplementary Table S4).
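The per-trait normalization can be sketched in a few lines. The dictionary layout, the function name, and the qP values below are hypothetical and only illustrate dividing each observation by the overall average of that trait.

```python
def normalize_by_trait_mean(values):
    """Divide every observation of one trait by that trait's overall average.

    `values` maps a key such as (accession, day, condition) to a trait value.
    """
    overall = sum(values.values()) / len(values)
    return {key: value / overall for key, value in values.items()}


# Hypothetical qP values on one day for two accessions and two conditions.
qp = {
    ("Col-0", 7, "control"): 0.6,
    ("Col-0", 7, "salt"): 0.4,
    ("C24", 7, "control"): 0.7,
    ("C24", 7, "salt"): 0.3,
}
normalized = normalize_by_trait_mean(qp)
for key, value in normalized.items():
    print(key, round(value, 2))
```

After this step every trait has an average of 1, so traits measured on different scales can be shown together in one heat map or clustered jointly.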

#### RESULTS

#### Salt Stress Affected Growth and Morphology-Related Traits Over Time

To define the most suitable screening conditions to study early responses to salt stress, we first aimed to optimize the salt treatment (**Figure 1**). Three watering regimes were used to control the soil-water content (40, 60, and 80%; **Figure 1A**). The salt solution was diluted according to the watering regime to the final concentrations of 150, 100, or 50 mM NaCl in the soil corresponding to severe, moderate and mild salt stress, respectively (**Figure 1B**) and plants were phenotyped using RGB and ChlF imaging (**Figure 1C**). We then analyzed the phenotypes of Col-0 and C24 plants to discern the conditions most suitable for screening early salt-induced changes without greatly compromising plant health (**Figure 2**).

Investigation of rosette morphology revealed that Col-0 developed more slender leaves than C24 under control conditions (**Figure 2B**; Supplementary Table S1), but under severe salt stress, differences were less pronounced. Changes in roundness and compactness of Col-0 and C24 leaves were apparent after 2–3 days of salt treatment, whereas changes in rotational mass symmetry, eccentricity and slenderness of Col-0 leaves were recorded after 1 day of salt treatment (Supplementary Table S1; **Figure 2C**). C24 plants showed significant decreases in rosette area with salt treatment at day 8 in mild and moderate stress conditions and at day 7 in severe stress conditions. Mild stress levels did not cause a significant decrease in rosette area in Col-0 plants, but a marked decrease was observable at day 8 with moderate and at day 7 with severe salt stress (**Figure 2C**).

Growth rates were estimated in each accession and condition by splitting the growth period into two intervals, from 0 to 4 days and from 7 to 11 days, and fitting two linear functions to the increase in rosette area over time (**Figure 2C**). Growth rates of salt-treated plants were smaller in both intervals than those of control plants. This reduction was more pronounced in the second interval (Supplementary Table S1), allowing the discrimination between early and late responses to salt stress. The ratio of GRsalt to GRcontrol was used to describe the SIIT index and was calculated to assess salinity tolerance in the early (SIIT1) and late (SIIT2) phases of salt stress (**Figure 2D**). There was a clear decrease in SIIT values in both Col-0 and C24 plants with increasing salt stress levels. Both SIIT1 and SIIT2 of C24 were higher than those of Col-0 under moderate and severe salt stress conditions, indicating that C24 has higher salinity tolerance. Differences between control and salt-stressed plants and between Col-0 and C24 were most pronounced under moderate and severe conditions of salt stress. In addition, control plants grown in the 40% soil-water content were much smaller than control plants grown in other watering regimes (**Figure 2C**), suggesting that these plants were likely to be suffering from drought stress. Mild salt stress had no effect on the growth of Col-0 and only a slight effect was observed later in C24 plants (**Figure 2B**). We established that plant growth and performance were best assessed using a 60% watering regime, which resulted in moderate salt stress of 100 mM NaCl.

#### Color Segmentation of RGB Images Illustrates Changes in Rosette Greenness Over Time

We examined the features of pixel color in the RGB images to identify changes in rosette greenness under salt stress. Color information was extracted from pixels corresponding to the imaged rosettes. RGB images were color-segmented into nine green hues and analyzed for their relative abundance as a percentage of rosette area (**Figure 3A**). This strategy enabled the number of pixels representing each hue to be normalized for rosette area and compared between accessions and treatments (**Figure 3B**). Hues 1, 2, 5, and 8 changed over time without marked differences between treatments, while hues 3, 4, 6, 7, and 9 differed from control conditions after only 1 day of exposure to salt stress (**Figure 3B**). We calculated the ratio for each hue, between control and salt-treated plants, to observe the salt-induced changes among the accessions and treatments. We present the results for hue 4, showcasing the differences between Col-0 and C24 throughout the phenotyping period (**Figure 3C**).

FIGURE 5 | Natural variation in growth-related responses of nine Arabidopsis accessions under salt stress. (A) RGB images of control (upper panel) and 7 days salt-stressed (lower panel) plants. (B) Projected rosette area over time for plants grown in control (solid lines) and salt-stressed (dashed lines) conditions. Values represent the average of eight biological replicates per accession and treatment. Error bars represent standard error. Significant differences between control and salt stress treatment are indicated with ∗ and ∗∗ for p-values below 0.05 and 0.01, respectively. (C) Shoot ion-independent tolerance (SIIT1 and SIIT2) values were calculated from averages of eight replicates per accession and treatment. Error bars represent standard error. Different letters indicate significant differences between accessions as tested with one-way ANOVA with post hoc Tukey's test (p < 0.05).

#### Chlorophyll Fluorescence Imaging Captures the Early and Late Changes in Photosynthetic Performance in Response to Salt Stress

To further explore the photosynthetic performance of control and salt-treated plants, we used ChlF parameters measured by the PAM method and quenching kinetics. From the measured fluorescence transient states, the basic ChlF parameters were derived (i.e., Fo, Fm, Fm′, Ft, Fv, and Fp), which were used to calculate the quenching coefficients (i.e., qP, NPQ, PQ, and qN) and other parameters characterizing plant photosynthetic performance (i.e., Fo′, Fv/Fm, ΦP, Fv′/Fm′, ΦNO, ΦNPQ and Rfd; these parameters are summarized in Supplementary Table S2). The quenching kinetics protocol allowed us to detect shifts in the ChlF curves of Col-0 and C24 salt-stressed plants as early as 1 h after moderate or severe stress (Supplementary Figure S3).

We then selected six ChlF parameters that reflect the photosynthetic function of PSII (Lazar, 2015): the maximum quantum yield of PSII photochemistry in the dark-adapted (Fv/Fm) and the light-adapted (Fv′/Fm′) states, the coefficient of photochemical quenching that estimates the fraction of open PSII reaction centers (qP), the actual quantum yield of PSII photochemistry in the light-adapted state, that is, the proportion of light absorbed by the chlorophyll associated with PSII that is used in photochemistry (ΦP), the quantum yield of constitutive non-light-induced dissipation consisting of ChlF emission and heat dissipation (ΦNO) and the quantum yield of regulatory light-induced heat dissipation (ΦNPQ) for days 0, 1, 2, 3, 4, 7, and 11 of the phenotyping period (**Figure 4**). The traits related to maximum quantum yield (Fv/Fm and Fv′/Fm′) were not significantly different between control and salt-treated plants (**Figure 4**; Supplementary Figure S4; Supplementary Table S2), suggesting that PSII was not damaged during the course of the experiment. The other four ChlF parameters (qP, ΦP, ΦNO, and ΦNPQ) varied with time in both control and salt-stressed conditions (**Figure 4**).

FIGURE 7 | Clustering of ChlF parameters with SIIT values revealed the early and late responses to salt stress. (A) Using the Ward linkage method, the shoot ion-independent tolerance (SIIT) values, SIIT1 and SIIT2, and eight ChlF traits, measured at the highest photon irradiance (L4) at day 7, were clustered. Three cluster groups were identified for the nine accessions, while two clusters were observed to form across the phenotypic traits. The trait values for individual accessions presented in the heat map were normalized by the z-Fisher transformation per trait. (B) The effect of salt treatment on ChlF parameters and SIIT values across the three cluster groups identified in (A). SIIT1, SIIT2 and the selected ChlF parameters were calculated relative to control conditions and divided by the overall average per trait. Values represent the average of eight biological replicates per accession and treatment.

Change in nine ChlF parameters, with respect to control conditions, was used to indicate the effect of salt stress on quenching processes and PSII efficiency (Supplementary Table S2; Supplementary Figure S4): traits measured in the light-adapted state were most affected (L1, L2, L3, L4, and Lss), while traits measured in the dark-adapted state (D1, D2, and D3) did not vary between treatments or accessions, except in the case of NPQ, qN, Rfd, and ΦNPQ under severe stress (Supplementary Figure S4). Fv′/Fm′ was unchanged in all salt treatments and in both accessions, and ΦNO displayed only slight changes, indicating that those parameters were robust in response to salt stress. With this protocol we were able to detect rapid changes in ΦP, qP, PQ, NPQ, qN, Rfd, and ΦNPQ in C24, but not in Col-0, after only 1 day of salt treatment (Supplementary Figure S4). To explore the ChlF parameters even further, we used PCA to classify the observed trends (Supplementary Figure S5). PCA performed on the 20 ChlF traits, under the eight different adapted states and saturating pulses (L1 to L4, Lss and D1 to D3), showed that five principal components (PCs) explained 85% of the variation (Supplementary Figure S5). PC1 described accession-specific trends and PC2 contained traits relevant to treatment with salt (Supplementary Table S3).

#### Natural Variation of Growth-Related Traits was Quantified in Response to Salt Stress

Because moderate salt stress elicited significant changes between control and salt-treated plants without symptoms of severe stress, it was used to investigate natural variation among the nine accessions of Arabidopsis thaliana (**Figure 5A**). Plants were cultivated with a longer light period at higher temperatures and lower photon irradiance than the initial conditions for Col-0 and C24. Phenotyping of the plants was conducted in the same manner through 7 days after salt treatment using RGB (**Figure 5A**) and ChlF imaging with the light curve protocol (Supplementary Figure S6). The rosette area of Te was the most significantly reduced by salt stress starting from day 3, while Col-0, Can, Co and Ler showed significant reductions later. Rosette areas of C24, Nd, and Cvi were not significantly reduced (**Figure 5B**). Natural variation was evident across all SIIT values in both intervals, indicating differences in salinity tolerance among the nine accessions. Lower SIIT2 than SIIT1 values indicated that plants became less tolerant over time; however, this difference was only significant for Cvi. This demonstrates that we were able to assess natural variation, not only in the magnitude of growth reduction, but also in the timing of the responses to salt stress.

#### Light Curve Chlorophyll Fluorescence Imaging Captured Early Responses to Salt Stress

ChlF parameters were measured at four photon irradiances and were calculated as described previously (Supplementary Table S2). Differences in most ChlF parameters between control and salt-stressed plants were observed within 24 h of introducing salt at the highest photon irradiance (**Figure 6**). We found that the maximum quantum yield of photosynthesis in the dark-adapted state (Fv/Fm) was not affected by salt stress in Col-0. Rapid responses to salt stress were observed in photochemical and non-photochemical quenching, as ΦNPQ rapidly increased, followed by a decrease in non-regulatory heat dissipation ΦNO. The increase in heat dissipation via xanthophyll-mediated non-photochemical quenching (ΦNPQ) coincided with a significant decrease in the photochemical quenching coefficient (qP) and inhibition of the PSII operating efficiency (ΦP). Maximum quantum yield in the light-adapted state (Fv′/Fm′) decreased in response to salt stress, but not as severely as did the other ChlF traits (Supplementary Figure S6B; Supplementary Table S4).

Because the highest actinic photon irradiance (L4) provided the most discriminative power for the quantification of early salt-induced changes in ChlF parameters, it was chosen to assess the natural variation in photosynthetic activity. Comparison among the nine accessions using the six key ChlF traits revealed that Fv/Fm did not differ between control and salt-treated plants, with the exception of Cvi (**Figure 6**; Supplementary Table S4). Increased ΦNPQ, upon exposure to salt stress, was observed to varying degrees (**Figure 6**; Supplementary Table S4). We also observed natural variation in the salt-induced decrease of ΦP, qP, and ΦNO over time. Overall, we observed that salt stress results in a rapid and substantial increase in non-photochemical processes (i.e., the dissipation of heat in the PSII antennae), which correlates with reduced PSII quantum efficiency and photochemical quenching under stress (**Figure 6**; Supplementary Figure S6C).

Eight ChlF traits and two SIIT values were used to cluster the nine accessions using the Ward linkage method (**Figure 7A**). The accessions clustered into three groups (1–3) while the phenotypic traits were classified into two clusters (A and B). Clustering of traits revealed that non-photochemical quenching-related parameters (NPQ, ΦNO, ΦNPQ, and qN) are more prominent during the early stage (cluster A), while quantum yield-related parameters (Fv/Fm, Fv′/Fm′, ΦP, and qP) corresponded to the later stage of exposure to salt stress (cluster B). Eight accessions were grouped into two clusters (clusters 1 and 2) and the distinct responses of Cvi placed it separately (cluster 3; **Figure 7**). Accessions in cluster 2 (C24, Nd, and Col-0) showed the least pronounced responses during the early phase of exposure to salt stress in terms of SIIT1 (**Figure 5C**) and a significant decline in photosynthetic activity (ΦP and qP; **Figures 6** and **7**; Supplementary Table S4). Finally, the accessions belonging to cluster 1 (Rsch, Te, Ler, Can, and Co) were characterized by a rapid reduction in growth rate in the early phase of salt stress (**Figure 5C**) and less prominent changes in ChlF parameters (**Figures 6** and **7**; Supplementary Table S4). Hence, using ChlF parameters and SIIT values, we were able to distinguish between the processes affected in the early and late responses of plants to salt stress.

#### DISCUSSION

Recent advances in high-throughput phenotyping have allowed the parallel screening of multiple quantitative traits measuring plant growth and performance under stress conditions. In this study, we used RGB and ChlF measures, with rosette coloring, to dissect the complex responses of plants to salt stress. We developed a phenotyping protocol to monitor early physiological changes in response to salt stress involving growth, rosette morphology and photosynthetic performance. To make these evaluations, we determined the RGB, greenness and ChlF traits most responsive to salinity. The 60% soil-water content (100 mM NaCl) was the most suitable condition for studying early plant responses to salt stress without causing growth arrest (**Figure 2**) or premature leaf senescence (**Figure 3**).

Investigation of the RGB images revealed that salt stress caused little change in rosette morphology (**Figure 2B**; Supplementary Table S1); however, these phenotypes should not be overlooked as leaf slenderness and rosette compactness are known to play important roles in heat dissipation and transpiration rate (Bridge et al., 2013). Similarly, differences in rosette greenness were predominantly related to development and accession rather than to treatment with salt (**Figure 3**). Nevertheless, we did observe more of the darker hues in plants grown under severe and moderate salt stress than in those grown under mild stress, which could be due to accumulation of anthocyanin (Van Oosten et al., 2013). Pronounced changes in both lighter and darker hues were previously reported to occur in phases of the salt stress response later than the phenotyping period used in this study (Ben Abdallah et al., 2016). Therefore, color segmentation and quantification of green hues could provide valuable information regarding plant development and stress-related responses, especially when combined with quantitative pigment-content analysis.

We used automated image-processing pipelines to examine the salt-induced changes in rosette area over time by fitting a linear function to describe the growth over time (**Figures 2C** and **5B**; Supplementary Table S1). We used SIIT<sub>1</sub> and SIIT<sub>2</sub> as indicators of plant salinity tolerance in two time intervals (**Figures 2C** and **5B**), finding lower SIIT<sub>2</sub> than SIIT<sub>1</sub> values in all accessions studied. The SIIT<sub>2</sub> values of Cvi decreased dramatically, potentially due to its early increase in non-photochemical processes, which are represented by the traits NPQ, ΦNPQ and qN, with concurrent decreases in photochemical efficiency, represented by the traits Fv′/Fm′, ΦP, and qP. These steps are usually followed by an increase in constitutive non-light-induced dissipation (ΦNO) and a drop in the maximal quantum efficiency of PSII in the dark-adapted state (Fv/Fm; Mishra et al., 2012). The observed trend of lower SIIT<sub>2</sub> values could be due to accumulation of ions in photosynthetic tissues, putting an additional constraint on photosynthesis and subsequently on plant growth (Munns and Tester, 2008).
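As a minimal sketch of the linear growth fit described above, the following Python fits an ordinary least-squares slope to rosette area over time and compares the slopes of salt-treated and control plants. The ratio of slopes is used here only as an illustrative tolerance index; that definition, the function names and the area values are assumptions for the example, not the pipeline or data of this study.

```python
def linear_growth_rate(days, areas):
    """Ordinary least-squares slope of rosette area against time."""
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(areas) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, areas))
    return sxy / sxx


def tolerance_index(days, control_areas, salt_areas):
    """Assumed index: growth rate under salt divided by growth rate in control."""
    return linear_growth_rate(days, salt_areas) / linear_growth_rate(days, control_areas)


# Hypothetical rosette areas (arbitrary units) over one time interval
days = [0, 1, 2, 3]
control = [100.0, 120.0, 140.0, 160.0]
salt = [100.0, 112.0, 124.0, 136.0]

index = tolerance_index(days, control, salt)
print(f"tolerance index for this interval: {index:.2f}")
```

Computing the index separately for an early and a late interval would give two values per accession that can be compared in the way SIIT<sub>1</sub> and SIIT<sub>2</sub> are compared in the text.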

Although maximum quantum yield has commonly been used for assessing plant performance under stress (Jansen et al., 2009; Bresson et al., 2015), we found that Fv/Fm is a robust parameter (**Figures 4** and **6**), being affected only under severe stress and not reflecting early salt stress responses, which is in agreement with previous reports (Baker and Rosenqvist, 2004). Other parameters quantifying photochemical and non-photochemical processes displayed more dynamic responses to salt stress (**Figures 4** and **6**). To identify the ChlF traits most responsive to salt stress, we performed PCA and extracted PC1 and PC2, which corresponded to the differences observed between treatments and between accessions, respectively (Supplementary Figure S5). According to PC1, Fo, Fm, Fv, Ft, and Fp were the traits that changed the most in response to salt treatment. Based on traits contained in PC2, we found that the differences observed in Fv/Fm, NPQ, qN, and qP were more accession-specific (Supplementary Table S3). Based on our results, early ΦNPQ, qP, and ΦP responses to salt stress could be used to distinguish salt-tolerant from salt-sensitive plants (**Figures 4** and **6**). Similar results have been found for studies of drought and salt stress (Baker and Rosenqvist, 2004; Stepien and Johnson, 2009; Mishra et al., 2012). In conclusion, we observed largely different ChlF responses to salt stress among the accessions, indicating that different accessions use different strategies to tolerate salt stress.

In this study, we demonstrated that phenotyping multiple quantitative traits in one experimental setup can provide new insights into the dynamics of plant responses to stress. These traits can be used to assess plant natural variation and to cluster accessions based on the magnitude and timing of their response to stress (**Figure 7**). Our work identified a set of phenotypes that can serve as markers for early responses to salt stress. These phenotypic markers can be used to study mutant populations, natural diversity panels and responses to other stress conditions, such as drought, cold or nutrient-deficiency, which would reveal the scope of their influence on tolerance to stress. Integrating thermal imaging into the phenotyping pipeline, along with quantifying water-use and transpiration-use efficiency, would provide a more comprehensive understanding of plant responses and development under stress. The protocol presented here can also be used to study non-model plants and crop species with more complicated 3D morphology, ultimately capturing a broad range of phenotypic traits. These traits could then be used in combination with forward genetics studies to identify genes underlying early responses to salt stress with the goal of providing new target genes for crop improvement.

## AUTHOR CONTRIBUTIONS

MA performed the optimization protocol at PSI (Czech Republic) and most of the data analyses. MA, MJ, KP, and AN wrote the manuscript. With support from DS, AN performed the natural variation experiment at PSI. MA, MJ, and JF performed the phenotypic and statistical analyses. KP selected the nine accessions of Arabidopsis thaliana and designed the natural variation experiment. MTe, SN, and SS contributed to the original concept of the project and supervised the study. MTe, MTr, and KP conceived of the project and its components. All authors discussed the results and contributed to the manuscript.

#### FUNDING

The research reported in this publication was supported by funding from King Abdullah University of Science and Technology (KAUST) and from the European Union's Seventh Framework Program for research, technological development and demonstration under grant agreement no GA-2013-608422 – IDP BRIDGES.

## ACKNOWLEDGMENTS

We would like to thank Radka Mezulaniková for assisting with the optimization protocol, Kumud Mishra (CzechGlobe, Czech Republic) for providing the Arabidopsis thaliana seeds and Dušan Lazár for comments on the ChlF analysis. We also thank Carolyn Unck for reviewing and editing the manuscript and Dr. Xavier Sirault for suggesting the use of the saturation method for salt treatment.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.01414




**Conflict of Interest Statement:** MTr is the owner and CEO of PSI (Photon Systems Instruments), Drasov, Czech Republic, and Dr. KP and JF are employees of his company. The other authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Awlia, Nigro, Fajkus, Schmoeckel, Negrão, Santelia, Trtílek, Tester, Julkowska and Panzarová. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Genotype × Environment Interactions of Yield Traits in Backcross Introgression Lines Derived from *Oryza sativa* cv. Swarna/*Oryza nivara*

Divya Balakrishnan\*, Desiraju Subrahmanyam, Jyothi Badri, Addanki Krishnam Raju, Yadavalli Venkateswara Rao, Kavitha Beerelli, Sukumar Mesapogu, Malathi Surapaneni, Revathi Ponnuswamy, G. Padmavathi, V. Ravindra Babu and Sarla Neelamraju

*Crop Improvement Section, ICAR- National Professor Project, ICAR- Indian Institute of Rice Research, Hyderabad, India*

#### *Edited by:*

*John Doonan, Aberystwyth University, UK*

#### *Reviewed by:*

*Natalie Brezinova Belcredi, Mendel University, Czechia Yaunhuai Han, Shanxi Agricultural University, China*

#### *\*Correspondence:*

*Divya Balakrishnan divyab0005@gmail.com; divyabalakrishnan05@gmail.com*

#### *Specialty section:*

*This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science*

*Received: 11 March 2016 Accepted: 29 September 2016 Published: 19 October 2016*

#### *Citation:*

*Balakrishnan D, Subrahmanyam D, Badri J, Raju AK, Rao YV, Beerelli K, Mesapogu S, Surapaneni M, Ponnuswamy R, Padmavathi G, Babu VR and Neelamraju S (2016) Genotype* × *Environment Interactions of Yield Traits in Backcross Introgression Lines Derived from Oryza sativa cv. Swarna/Oryza nivara. Front. Plant Sci. 7:1530. doi: 10.3389/fpls.2016.01530*

Advanced backcross introgression lines (BILs) developed from crosses of *Oryza sativa* var. Swarna/*O. nivara* accessions were grown and evaluated for yield and related traits. Trials were conducted for three consecutive seasons under field conditions in a randomized complete block design with three replications. Data on yield traits under irrigated conditions were analyzed using the Additive Main Effect and Multiplicative Interaction (AMMI), Genotype and Genotype × Environment Interaction (GGE) and modified rank-sum statistic (*YSi*) for yield stability. BILs *viz.,* G3 (14S) and G6 (166S) showed yield stability across the seasons along with high mean yield performance. G3 is early in flowering with high yield and has good grain quality and medium height, and hence could be recommended for most of the irrigated locations. G6 is a late duration genotype, with strong culm strength, high grain number and panicle weight. G6 has higher yield and stability than Swarna but has the Swarna grain type. Among the varieties tested, DRRDhan 40 and the recurrent parent Swarna showed stability for yield traits across the seasons. The component traits thousand grain weight, panicle weight, panicle length, grain number and plant height explained the highest genotypic percentage over environment and interaction factors and can be prioritized to dissect stable QTLs/genes. These lines were genotyped using microsatellite markers covering the entire rice genome and also using a set of markers linked to previously reported yield QTLs. It was observed that wild derived lines with more than 70% of recurrent parent genome were stable and showed enhanced yield levels compared to genotypes with higher donor genome introgressions.

Keywords: BILs, stability, AMMI, GGE, *Oryza nivara,* yield traits

**Abbreviations:** AMMI, additive main effect and multiplicative interaction; G × E, genotype by environment; GGE, Genotype and Genotype × Environment Interaction; BILs, backcross introgression lines; GEI, Genotype Environment Interaction; AICRIP, All India Coordinated Rice Improvement Programme; PCA, principal component analysis; IIRR, Indian Institute of Rice Research; DFF, days to fifty percent flowering; DTM, days to maturity; PH, plant height; TN, tiller number; PTN, number of productive tillers; PL, panicle length; PW, panicle weight; FG, number of filled grains; TG, total number of grains; TGW, 1000 grain weight; GY, grain yield; BM, biomass; SF, spikelet fertility; TDM, total dry matter; HI, harvest index; TDMPD, total dry matter per day; YPD, per day productivity; BY, bulk yield.

## INTRODUCTION

Improving rice production per unit area and per unit time will be a major challenge in the future due to the expanding population of rice consumers in the world. The average yield of existing cultivars has reached a plateau, and research is now directed toward wild relatives of Oryza to explore novel genes that can improve yield traits. Wild relatives have been widely explored as donors for stress resistance but less exploited for yield improvement because of the undesirable agronomic traits linked with them. Wild rice genotypes provide a diverse range of allelic variation due to their adaptation to a wide range of environmental conditions. Wild and related genotypes are valuable resources to explore novel variations to widen the genetic background of cultivated rice (Brar and Khush, 1997; Tanksley and McCouch, 1997; Swamy and Sarla, 2008; Wickneswari et al., 2012). Introgression of chromosomal segments from wild species into cultivated species can also generate de-novo variations in the new genetic background (Wang et al., 2005).

Backcross introgression lines developed from wild and adapted genotypes are useful in diversifying existing germplasm in a more usable form and also in discovering novel genes/QTLs. As BILs carry most of the recurrent parent genome with only a few donor segments, it is advantageous to use them for precise estimation of quantitative traits. Fixed BILs can be replicated and can be used to study their environment interactions. The evaluation of BILs for stability is very important, especially when they are derived from an interspecific cross, as it takes more time to attain stability in the new background. Utilization of stable BILs will accelerate varietal development due to the presence of novel genes in an adapted parental background (Jeuken and Lindhout, 2004).

As grain yield is a complex quantitative trait with high environmental interaction, selection of genotypes based on performance in a single environment is not effective for varietal identification (Shrestha et al., 2012). It is essential to carry out selection based on yield stability evaluation rather than average performance in multiple environment conditions (Kang, 1993; Tariku et al., 2013; Islam et al., 2015). Selection of genotypes for stability and adaptability is required prior to recommendation in the case of a crop such as rice, which is grown in diverse ecologies. Stability is the suitability of a variety over a wide range of environments, while adaptability is the better survival of a genotype in a specific environment. This can be attained through either genetic or physiological homeostasis of genotypes for environmental fluctuations (Singh and Narayanan, 2006). For cultivation over a large area, stability for yield traits is desirable, but for achieving maximum productivity, adaptability to the best target environments is preferred.

Effects of genotype, environment and genotype × environment interaction determine the phenotypic performance and its general and specific adaptation to different environments (Falconer and Mackey, 1996). This information is required for planning better selection strategies and to identify the best environment to select genotypes for grain yield (Gauch and Zobel, 1996; Kang, 1998). Several studies have been conducted on stability performance for grain yield of rice for different ecosystems (Cooper et al., 1999; Wade et al., 1999; Ouk et al., 2007; Anandan et al., 2009; Kumar et al., 2012; Tariku et al., 2013; Liang et al., 2015; Katsura et al., 2016). Many such studies showed that genotype × environment interaction was more significant than genotypic main effects (Henderson et al., 1996; Cooper and Somrith, 1997; Wade et al., 1997, 1999; Cooper et al., 1999; Inthapanya et al., 2000).

There are several methods to study stability and genotype × environment interactions of traits through conventional analysis. Different models have been proposed based on stability variance, ecovalence, regression coefficient analysis or principal component analysis (PCA) (Finlay and Wilkinson, 1963; Eberhart and Russell, 1966; Perkins and Jinks, 1968; Freeman and Perkins, 1971; Shukla, 1972; Kang, 1993). Kang (1993) proposed the yield-stability statistic (YSi), which combines yield and stability into a single selection criterion by modifying the rank-sum method. However, the additive main effects and multiplicative interaction (AMMI) model and the genotype main effects and genotype × environment interaction effects (GGE) model are more popular methods. These methods quantify the genotype × environment interaction through PCA and graphical representation and have been widely applied in multi-environment cultivar trials (Kempton, 1984; Crossa et al., 1990; Gauch and Zobel, 1997).

A panel of 14 BILs derived from Swarna/Oryza nivara was studied along with 9 high yielding rice varieties of different duration, and these 23 lines were screened in three seasons. Genotypic characterization of these BILs was conducted with genome wide polymorphic markers and markers linked to yield QTLs. The objectives of this study were (1) to identify the yield potential of backcross introgression lines in comparison with existing popular varieties, (2) to identify stable high yielding BILs and their parental genome percentage, and (3) to prioritize the component traits important for further genetic dissection and improvement.

#### MATERIALS AND METHODS

#### Location

Field experiments were conducted at the Indian Institute of Rice Research, Hyderabad (17°19′ N and 78°29′ E), at an altitude of 549 m above mean sea level during two wet seasons, Kharif-2013 (E1) and Kharif-2014 (E2), and one dry season, Rabi-2014 (E3). The crop was grown in alkaline vertisol with a pH of 7.94 under irrigated field conditions. Details of meteorological conditions during the crop growth period are presented in **Table 1**.

#### Plant Material

Studies were conducted at IIRR to develop wild introgression lines between O. sativa cv. Swarna and accessions of the wild relative Oryza nivara (Kaladhar et al., 2008; Swamy et al., 2011). The developed BILs were advanced to the BC<sub>2</sub>F<sub>6</sub> generation and further purified by the single panicle selection method up to BC<sub>2</sub>F<sub>8</sub>. From two sets of BILs consisting of 94 lines from Swarna / O. nivara (81848) (S lines) and 104 lines from Swarna / O. nivara (81832) (K lines), a panel of 14 BILs at the BC<sub>2</sub>F<sub>8</sub> generation was selected based on their preferable phenotypic traits (**Table 2**).


TABLE 1 | Weather parameters during crop season.

As the BILs have a range of flowering duration from 77 to 120 days, the popular varieties IR64, Jaya, MTU1010, MTU1081, NLR34449, Sahbhagi Dhan, Swarna, Tellahamsa, and Tulasi, which differ in flowering duration, were grown as checks under irrigated conditions. These BILs were evaluated for yield and related traits in irrigated conditions over a period of three seasons (2013–2014) along with checks. As there is considerable variation in duration among the BILs, per day productivity was computed to compare genotypes with different duration.

#### Field Experimental Details

Seeds were sown in nursery beds, and 25-day-old seedlings were transplanted, with a single seedling per hill in all the field trials. The planting density was 33.3 hills m<sup>−2</sup>, with 20 cm row spacing and 15 cm intra-row spacing, with five rows of 21 plants each constituting a replication. The normal package of practices and fertilizer application was followed; weeds, insects, and diseases were controlled by using standard herbicides and pesticides as required to avoid yield loss. The experimental plots were arranged in a randomized complete block design with three replications each containing 105 plants. These same parameters were followed uniformly across the seasons.
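The density and plot-size figures quoted above follow directly from the spacing. A minimal Python sketch of that arithmetic (function names are illustrative only):

```python
def hills_per_square_metre(row_spacing_cm, hill_spacing_cm):
    """Hills per square metre for a rectangular planting grid."""
    area_per_hill_m2 = (row_spacing_cm / 100.0) * (hill_spacing_cm / 100.0)
    return 1.0 / area_per_hill_m2


def plants_per_replication(rows, plants_per_row):
    """Number of single-seedling hills in one replication."""
    return rows * plants_per_row


density = hills_per_square_metre(20, 15)   # 20 cm x 15 cm spacing
plot_size = plants_per_replication(5, 21)  # five rows of 21 plants

print(f"{density:.1f} hills per square metre, {plot_size} plants per replication")
```

Each hill occupies 0.20 m × 0.15 m = 0.03 m², which gives the 33.3 hills per square metre and 105 plants per replication stated in the text.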

#### Phenotyping

These genotypes were screened for various yield contributing traits in all the seasons following Standard Evaluation System (IRRI, 2013). The observations on yield and morpho-agronomic traits were recorded from the field experiments.

#### Statistical Analysis

Analysis of variance was computed for each individual environment, then a combined analysis of variance was performed, considering both environments and genotypes as fixed, using PB tools (Version 1.4, http://bbi.irri.org/products) and R (R Core Team, 2012). Significance of all effects was tested against the mean square of error. The performance of BILs was tested over three seasons and was assessed using three stability models, viz., (1) yield-stability statistic (YSi) (Kang, 1993), (2) Additive Main effects and Multiplicative Interaction (AMMI) (Gauch and Zobel, 1997), and (3) GGE Biplot or Site Regression model (Yan and Kang, 2003). These models were used to interpret and visualize the stability and GEI patterns. In the AMMI model, only the GEI term is absorbed in the multiplicative component, whereas in the GGE model, the main effects of genotypes (G) plus the GEI are absorbed into the multiplicative component. The yield-stability (YSi) statistic was developed by Kang (1993) to be used as a selection criterion when G × E interaction is significant. The stability variance was determined following the modified Shukla (1972) method, and genotypes with significant stability variance were considered unstable. The stability variance was integrated with yield to obtain the YSi statistic as outlined by Kang and Magari (1995). Simultaneous selection of high yielding and stable genotypes is possible through this method.
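To illustrate the stability variance that feeds into YSi, the sketch below computes Shukla's (1972) stability variance from a genotype × environment table of means, using the standard formula based on Wricke's ecovalence. It is a simplified illustration, not the PB tools implementation: it works on cell means, omits the covariate adjustment and significance test used in the full procedure, and the yield values are hypothetical.

```python
def shukla_stability_variance(means):
    """Shukla's stability variance for each genotype.

    means[i][j] is the mean of genotype i in environment j
    (p genotypes, q environments; needs p >= 3 and q >= 2).
    """
    p, q = len(means), len(means[0])
    if p < 3 or q < 2:
        raise ValueError("need at least 3 genotypes and 2 environments")
    grand = sum(sum(row) for row in means) / (p * q)
    g_mean = [sum(row) / q for row in means]
    e_mean = [sum(means[i][j] for i in range(p)) / p for j in range(q)]
    # Wricke's ecovalence: squared interaction residuals summed over environments
    ecovalence = [
        sum((means[i][j] - g_mean[i] - e_mean[j] + grand) ** 2 for j in range(q))
        for i in range(p)
    ]
    total = sum(ecovalence)
    denom = (p - 1) * (p - 2) * (q - 1)
    return [(p * (p - 1) * w - total) / denom for w in ecovalence]


# Hypothetical yields: three lines respond in parallel, the fourth crosses over
yields = [
    [5.0, 6.0, 7.0],
    [6.0, 7.0, 8.0],
    [4.0, 5.0, 6.0],
    [7.0, 6.0, 5.0],
]
sigma2 = shukla_stability_variance(yields)
print([round(v, 3) for v in sigma2])
```

A genotype whose response runs parallel to the environment means gets a stability variance near zero, while a crossover genotype gets a large one. The estimate can be negative for very stable genotypes, in which case it is usually treated as zero.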



The AMMI model (Gauch, 1988) was used in analyzing the stability and interaction for yield traits. The AMMI model is a combination of analysis of variance (ANOVA) and principal component analysis (PCA). The G × E interaction was evaluated with the AMMI model by considering the first two principal components. ANOVA model was used to analyze the trait data with main effects of genotype and environment without the interaction, then, a principal component analysis was integrated using the standardized residuals. These residuals include the experimental error and the effect of the GEI. The analytical model can be written as

$$Y_{ij} = \mu + \delta_i + \beta_j + \sum_{k=1}^{K} \lambda_k \delta_{ik} \beta_{jk} + \varepsilon_{ij}$$

where Y<sub>ij</sub> is the mean yield of the i-th genotype in the j-th environment, µ is the overall mean, δ<sub>i</sub> is the genotypic effect, β<sub>j</sub> is the environment effect, λ<sub>k</sub> is the singular value for PC axis k, δ<sub>ik</sub> is the genotype eigenvector value for PC axis k, β<sub>jk</sub> is the environment eigenvector value for PC axis k and ε<sub>ij</sub> is the residual error, assumed to be normally and independently distributed (0, σ<sup>2</sup>/r), where σ<sup>2</sup> is the pooled error variance and r is the number of replicates.
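The AMMI decomposition can be sketched in a few lines of Python with NumPy: fit the additive part from row and column means, then apply a singular value decomposition to the interaction residuals. This is a minimal illustration on a table of cell means. It uses raw rather than standardized residuals, the function names are assumptions for the example, and the yields are hypothetical.

```python
import numpy as np


def ammi_decompose(cell_means, n_axes=2):
    """Split a genotype x environment table of means into AMMI terms."""
    y = np.asarray(cell_means, dtype=float)
    mu = y.mean()
    genotype_effect = y.mean(axis=1) - mu        # delta_i
    environment_effect = y.mean(axis=0) - mu     # beta_j
    # Residual of the additive model: the interaction (plus error) part
    residual = y - mu - genotype_effect[:, None] - environment_effect[None, :]
    u, s, vt = np.linalg.svd(residual, full_matrices=False)
    k = min(n_axes, s.size)
    return {
        "mu": mu,
        "genotype_effect": genotype_effect,
        "environment_effect": environment_effect,
        "singular_values": s[:k],           # lambda_k
        "genotype_vectors": u[:, :k],       # delta_ik
        "environment_vectors": vt[:k].T,    # beta_jk
        "residual": residual,
    }


def ammi_fitted(fit):
    """Rebuild the table from the additive terms plus the retained PC axes."""
    additive = (fit["mu"]
                + fit["genotype_effect"][:, None]
                + fit["environment_effect"][None, :])
    interaction = ((fit["genotype_vectors"] * fit["singular_values"])
                   @ fit["environment_vectors"].T)
    return additive + interaction


# Hypothetical mean yields: 4 genotypes (rows) x 3 environments (columns)
yields = np.array([
    [5.1, 6.0, 4.2],
    [6.3, 6.1, 5.0],
    [4.0, 5.2, 4.9],
    [5.5, 7.0, 3.8],
])
fit = ammi_decompose(yields, n_axes=2)
print("singular values:", np.round(fit["singular_values"], 3))
```

With three environments the double-centered interaction matrix has at most two non-zero axes, so retaining two axes reproduces the table of means exactly; with more environments the discarded axes are treated as noise.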

GGE biplots display both G (genotype) and GE (genotype × environment) variation (Kang, 1993) for genotype evaluation. The GGE biplot is based on the sites regression (SREG) linear-bilinear model (Cornelius et al., 1996; Crossa and Cornelius, 1997; Crossa et al., 2002). The sites regression model is multiplicative in its bilinear terms, which capture the main effects of cultivars plus the cultivar × environment interaction (GGE); the model is

$$Y_{ij} - \mu_j = \sum_{k=1}^{t} \lambda_k \delta_{ik} \beta_{jk} + \varepsilon_{ij}$$

Here µ<sub>j</sub> is the mean of environment j, t is the number of PC axes retained, and the remaining terms are as defined for the AMMI model. The GGE biplot graphically represents the G and GEI effects present in multi-location trial data using environment-centered data. GGE biplots were used for (1) mega-environment analysis (which-won-where pattern), where genotypes can be recommended to specific mega-environments; (2) genotype evaluation, where stable genotypes can be recommended across all locations; and (3) location evaluation, which explains the discriminative power of target locations for the genotypes under study.
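The environment-centering step that distinguishes GGE from AMMI can be sketched as follows in Python with NumPy. Only the environment means are removed, so the genotype main effect stays in the matrix that is decomposed. The symmetric scaling of the singular values is one common choice among several and is an assumption here, as are the function name and the hypothetical yields.

```python
import numpy as np


def gge_biplot_scores(cell_means):
    """First two PC scores for a GGE biplot from a genotype x environment table."""
    y = np.asarray(cell_means, dtype=float)
    # Subtract each environment (column) mean; G and GE remain in the data
    centred = y - y.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centred, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    # Symmetric scaling: split each singular value equally between
    # genotype and environment scores (assumed scaling choice)
    root = np.sqrt(s[:2])
    genotype_scores = u[:, :2] * root
    environment_scores = vt[:2].T * root
    return genotype_scores, environment_scores, explained[:2]


# Hypothetical mean yields: 4 genotypes (rows) x 3 environments (columns)
yields = np.array([
    [5.1, 6.0, 4.2],
    [6.3, 6.1, 5.0],
    [4.0, 5.2, 4.9],
    [5.5, 7.0, 3.8],
])
g_scores, e_scores, explained = gge_biplot_scores(yields)
print("share of G + GE variation on PC1 and PC2:", np.round(explained, 3))
```

Plotting `g_scores` and `e_scores` on the same axes gives the basic biplot; the which-won-where polygon and the average-environment axis are drawn on top of these coordinates.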

The sum of squares percentage was computed as the sum of squares of each component of the stability analysis of variance divided by the total sum of squares, to quantify the contribution of each component, viz., genotype, environment and GEI. Correlation analysis was performed with the Statistical Tool for Agricultural Research (STAR) using Pearson's correlation coefficient method. Significance levels are indicated as: \*P < 0.05, \*\*P < 0.01, \*\*\*P < 0.001.
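A minimal Python sketch of the sum of squares percentage, computed here from a genotype × environment table of means. With a balanced design the number of replicates multiplies every component equally, so the percentages among genotype, environment and GEI are unchanged; the error term of the full ANOVA is not included in this simplified version, and the yields are hypothetical.

```python
def sum_of_squares_percentages(means):
    """Percentage of the G + E + GEI sum of squares due to each component.

    means[i][j] is the mean of genotype i in environment j.
    """
    p, q = len(means), len(means[0])
    grand = sum(sum(row) for row in means) / (p * q)
    g_mean = [sum(row) / q for row in means]
    e_mean = [sum(means[i][j] for i in range(p)) / p for j in range(q)]
    ss_genotype = q * sum((g - grand) ** 2 for g in g_mean)
    ss_environment = p * sum((e - grand) ** 2 for e in e_mean)
    ss_interaction = sum(
        (means[i][j] - g_mean[i] - e_mean[j] + grand) ** 2
        for i in range(p) for j in range(q)
    )
    total = ss_genotype + ss_environment + ss_interaction
    return {
        "genotype": 100.0 * ss_genotype / total,
        "environment": 100.0 * ss_environment / total,
        "interaction": 100.0 * ss_interaction / total,
    }


# Hypothetical yields: 4 genotypes (rows) x 3 environments (columns)
yields = [
    [5.0, 6.0, 7.0],
    [6.0, 7.0, 8.0],
    [4.0, 5.0, 6.0],
    [7.0, 6.0, 5.0],
]
shares = sum_of_squares_percentages(yields)
print({name: round(value, 1) for name, value in shares.items()})
```

Traits for which the genotype share dominates are the ones least affected by season, which is the basis for prioritizing them for genetic dissection.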

#### Genotyping

Molecular screening was conducted to identify the presence of reported QTLs in the BILs and also to identify recurrent parent genome percentage. Leaves of 20-day-old seedlings were collected from the field and the CTAB (Cetyl Trimethyl Ammonium Bromide) method was followed for DNA extraction (Doyle and Doyle, 1987). Polymorphic SSR markers with genome wide distribution (**Figure 1**) from the universal core genetic map (Orjuela et al., 2010) were used for genotyping (Supplementary Table 1). PCR reactions were carried out in a thermal cycler (Veriti PCR, Applied Biosystems, USA) with a total reaction volume of 10 µl containing 15 ng of genomic DNA, 1× assay buffer, 200 µM of dNTPs, 1.5 mM MgCl<sub>2</sub>, 10 pmol of forward and reverse primer and 1 unit of Taq DNA polymerase (Thermo Scientific, USA). PCR cycles were programmed as follows: initial denaturation at 94°C for 5 min followed by 35 cycles of 94°C for 45 s, 55°C for 30 s, 72°C for 45 s and a final extension of 10 min at 72°C. Amplified products were resolved in 4% metaphor agarose gels prepared in 0.5× TAE buffer and electrophoresis was conducted at 120 V for 2 h. Gels were stained with ethidium bromide and documented using a gel documentation system (Alfa imager, USA). Amplified fragments were scored for presence (1) or absence (0) for each primer-genotype combination. The SSR genotypic data generated in the population were analyzed using the software GGT ver. 2.0. Graphical representations and comparisons were made among the 23 lines on a linkage group basis and also at the entire genome level on an individual basis.
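As a rough illustration of how a recurrent parent genome percentage can be derived from marker calls, the Python sketch below counts markers by allele class. The A/B/H coding, the half weight for heterozygous loci and the example calls are assumptions for this sketch. It counts markers only, whereas graphical genotyping software typically also takes map distances between markers into account, so the values need not match those reported by GGT.

```python
def recurrent_parent_genome_percent(calls):
    """Marker-count estimate of recurrent parent genome percentage.

    calls: sequence of 'A' (homozygous recurrent parent allele),
    'B' (homozygous donor allele) or 'H' (heterozygous);
    any other value is treated as missing and ignored.
    """
    scored = [c for c in calls if c in ("A", "B", "H")]
    if not scored:
        raise ValueError("no scored markers")
    recurrent = scored.count("A")
    heterozygous = scored.count("H")
    return 100.0 * (recurrent + 0.5 * heterozygous) / len(scored)


# Hypothetical calls for one line at 11 SSR loci ('-' marks a missing call)
line_calls = ["A", "A", "B", "A", "H", "A", "A", "-", "B", "A", "A"]
rpg = recurrent_parent_genome_percent(line_calls)
print(f"recurrent parent genome: {rpg:.1f}%")
```

For reference, the expected recurrent parent contribution after two backcrosses without selection is 87.5%, so individual lines can fall above or below that value depending on which donor segments they retain.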

#### RESULTS

#### Yield and Yield Related Traits

A wide range of variation was observed for yield traits among the genotypes and across the environments. Combined analysis of variance of the data from three environments showed significant genotypic and genotype × environment interactions for all the traits except 1000 grain weight, for which the G × E interactions were not significant. Across the three environments (Kharif 2013, Rabi 2014 and Kharif 2014), variation in seasonal average was observed for DFF, GY, BY, PH, TN, and BM. Across the three seasons, broad genotypic variation was observed among the BILs under study, with genotypic averages ranging as follows: DFF (77.37 to 133.04); GY (4.73 to 24.63); BY (0.33 to 2.10); PH (65.98 to 148.43); TN (6.12 to 20.70); GN (93.70 to 314.55); PL (18.09 to 25.19); PW (0.98 to 3.87); TGW (12.50 to 26.07); SF (60.57 to 97.57); BM (11.26 to 54.12); HI (0.20 to 0.55) and per day productivity (0.04 to 0.19) (**Table 3**). The data obtained from the three replications were assessed and compared with high yielding checks in each season. Considering the three-season average, among the BILs G3 scored the highest grain yield, harvest index and per day productivity, and G6 scored the highest bulk yield; both were superior to Swarna and on par with other checks. G2 was of the shortest duration and showed desirable yield traits such as panicle length, 1000 grain weight, spikelet fertility and plant height compared with checks. It was early in flowering with the lowest number of unfilled grains in all the seasons. The G6-derived line G5 had the highest grain number, filled grains and panicle weight. Among the BILs, G8 had the highest average biomass and dry matter production, G13 and G1 had high tiller number, and G14 took the most days to maturity. The high yielding check MTU1010 showed higher grain yield, bulk yield, harvest index and per day productivity than the BILs for the three-season average.

The BILs were screened for seedling vigor in two seasons, Kharif and Rabi, under field conditions. Seedling vigor was assessed in both seasons in terms of plant height and tiller number from data taken 40 and 70 days after transplanting. Seeds of the BILs were also subjected to a germination test and vigor index analysis in vitro by the paper towel method (ISTA, 1999), using shoot length and root length at 7 and 14 days together with germination data. In both seasons G2 had the highest seedling vigor in terms of plant height. G14 was best for number of tillers in Kharif and G7 in the Rabi season. G13 (75S) showed the highest vigor in terms of tiller number consistently across the seasons. In terms of vigor, G6 and its derived lines were better compared to checks. Productive tillers were highest in G1 at the time of harvest. Among the checks, Tulasi and Sahbhagi Dhan showed comparatively higher vigor, and BILs outperformed popular checks. G2 showed the highest seedling vigor in the paper towel screening method.

TABLE 3 | The mean performance of genotypes under the study across the seasons.

#### Stability Analysis

Observations on the yield traits for all three seasons were then subjected to combined analyses through Yield-stability statistic (YSi) (Kang, 1993), AMMI and GGE biplot models. In the analysis, each combination of season with location was considered as an environment. Analysis of variance was first conducted for each environment. Pooled data of 3 seasons was subjected to stability analysis using PB tools and R software and specific genotypic adaptation, general genotypic adaptation and specific population adaptation to different seasons were identified.

Based on combined analysis of yield and stability using the YSi statistic, G3 was the most stable genotype in the selection ranks for GY, followed by G6, Swarna, G5, G7, and G14. Similarly for bulk yield, G6 scored the highest rank followed by G4, G7, G12, G3, G10, and G2. Among the BILs, G6 and G3 showed non-significant stability variance and high average yield, so they may be considered for further multilocation trials. The number of genotypes selected based on YSi ranking varied among the traits. Nine genotypes were found to be superior based on YSi scoring for GY and PH; eight for DTM, FG, TGW, BM, and TDMPD; seven for BY, DFF, TN, PTN, GN, PL, SF, and YPD; and five genotypes were selected based on high trait mean and stability for HI. G6 was found to be stable for 14 traits under study, the exceptions being DFF, TN, PTN, and BM, followed by Swarna, which was stable for 13 yield contributing traits. G14 (7K) showed stability and high mean for 12 traits, G3 (14S) for 11 traits, G2 (148S) and G8 (24K) for 10 traits and G7 (248S) for 9 traits (**Table 4**).

TABLE 4 | *YSi* ranking of each genotype based on trait means and significance of stability variance.

The sum of YSi scores across yield and its contributing traits was computed to identify the overall ranking of genotypes; G6 scored highest, followed by Swarna, G3, G14, G8, and G2. The overall ranking varied when selection was based on individual contributing traits. The varieties that scored highest for DFF and DTM were of late duration, because the highest values were used in the calculation. Similarly, for plant height, the tallest varieties received the highest YSi ranking. The choice of rank and its direction can therefore be decided based on the requirements of the target ecosystem. The ranking by YSi statistic based on predicted means for stability parameters for grain yield is shown in **Table 5**.
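The overall ranking described above amounts to summing each genotype's YSi score over the traits and sorting the totals. The following is a minimal Python sketch of that step only, assuming the per-trait YSi scores have already been computed; `overall_ranking` is a hypothetical helper, and the scores shown are made-up placeholders, not values from Table 4.

```python
def overall_ranking(ysi_by_trait):
    """Sum per-trait YSi scores for each genotype and sort from best to worst.

    ysi_by_trait maps trait name -> {genotype: YSi score}.
    """
    totals = {}
    for scores in ysi_by_trait.values():
        for genotype, score in scores.items():
            totals[genotype] = totals.get(genotype, 0) + score
    # A higher total means better combined mean performance and stability.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Placeholder scores for illustration only (not taken from Table 4).
example = {
    "GY": {"G3": 18, "G6": 16, "Swarna": 14},
    "BY": {"G3": 9, "G6": 17, "Swarna": 10},
}
ranking = overall_ranking(example)
```

Where a lower trait value is the breeding target (for example, early flowering), the per-trait ranks would need to be reversed before summing, which is the "direction" choice mentioned in the text.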

### General Genotypic Adaptation

AMMI and GGE biplots explained the general genotypic adaptation, or stability, of the genotypes (**Figure 2**). Biplots were used to visualize the performance of different genotypes in a given environment. The relative ranking of genotypes on the AMMI biplot is based on their projection onto the O-axis, and the GGE biplot was used to diagnose the G × E interaction effects on each yield contributing trait. The results of the AMMI model analysis are interpreted on the basis of the AMMI1 biplot, in which the main effect is plotted against the first multiplicative axis term (PC1) for both genotypes and environments. Large PC1 scores, whether negative or positive, indicate specific adaptation of a genotype to certain environments. The closer the PC1 score is to zero, the more stable the genotype across the environments under study. The AMMI model showed a fit of 81.3% for grain yield and 60.9% for bulk yield. Among the BILs, G8, G2, G3, G14, G11, and the check Swarna (G15) exhibited high yield with high main (additive) effects and positive PC1 scores. BIL G10 showed less environmental interaction, while the three environments showed high interaction for GY. For BY, Kharif 2014 (E2) showed high interaction, but genotypes G2, G6, and G12 were identified with low environmental interactions and were considered best across the seasons for the trait. Based on AMMI analysis, G10(3-1K) was the most stable genotype for BM; G6 and G3 for BY; G14 for DFF and DTM; G5 for FG, GN, and TN; G3 for HI, GY, and YPD; G2 for PH, TN, SF, and TGW; G3 for PTN; G14 for TDM and TDMPD; and G12 for TN. The GGE biplot showed similar results for stability of genotypes in trait expression across environments (Supplementary Figure 1). Genotypic variation in adaptability to specific environments was observed for each trait. The Kharif environment was most favorable for high yielding BILs such as G3, G5, G2, and G14, while Rabi was favorable for G8 and G6, as they appeared most responsive for yield contributing traits in these respective environments. Environments E1 and E2 were more responsive for the traits BM, FG, GN, PL, PW, and SF, and environment E3 was responsive for the traits GY, TDM, YPD, TDMPD, DFF, DTM, and HI.

\**P* < *0.05,* \*\**P* < *0.01.*

From the AMMI biplot, it was inferred that the environmental interactions were highly varied and that all three environments were highly interactive for most of the yield traits. E3 (Rabi season) appeared to be a favorable environment for BM and SF; E1 for BY, GN, and PL; and E2 for PTN and GY. Genotypes G3(14S), G6(166S), and G9(250K) showed low interaction effects and hence can be considered stable. For GY, G3 and G6 had high mean values and hence can be recommended for all the environments. Genotypes with high interaction are suitable for specific environments: those with high mean and positive interaction are suited to favorable environments, and those with high mean and negative interaction are suited to unfavorable environments for the respective traits. A line that passes through the origin and is perpendicular to the O-axis in the biplots separates genotypes that yielded above the mean (G3, G4, G15, G7, G6, G8, G12, and G4), which would possibly yield above average in all the seasons, from genotypes that yielded below average (G1, G14, G5, G10, G14, G9, and G1). The released varieties Tulasi, MTU1010, Swarna and Sahbhagi Dhan used as checks performed well across the seasons.
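The AMMI1 quantities discussed in this section (additive main effects plus PC1 scores of the interaction) can be illustrated with a short NumPy sketch. It is not the PB tools or R workflow used in the study: `ammi1` is a hypothetical function, the yield table is synthetic, and splitting the singular value symmetrically between genotype and environment scores is only one of several scaling conventions.

```python
import numpy as np


def ammi1(cell_means):
    """AMMI1 decomposition of a genotype x environment table of means.

    Returns genotype main effects, genotype PC1 scores, environment PC1
    scores, and the share of interaction sum of squares captured by PC1.
    """
    y = np.asarray(cell_means, dtype=float)
    grand = y.mean()
    g_eff = y.mean(axis=1) - grand  # additive genotype effects
    e_eff = y.mean(axis=0) - grand  # additive environment effects
    # Interaction residuals left after removing the additive part.
    resid = y - grand - g_eff[:, None] - e_eff[None, :]
    u, s, vt = np.linalg.svd(resid, full_matrices=False)
    # Symmetric scaling: split the first singular value between both score sets.
    g_pc1 = u[:, 0] * np.sqrt(s[0])
    e_pc1 = vt[0] * np.sqrt(s[0])
    total = (s ** 2).sum()
    share = s[0] ** 2 / total if total > 0 else 0.0
    return g_eff, g_pc1, e_pc1, share


# Synthetic table: 4 genotypes x 3 environments (illustrative values only).
yields = [
    [20.0, 22.0, 18.0],
    [19.0, 25.0, 15.0],
    [21.0, 21.5, 20.0],
    [17.0, 18.0, 19.0],
]
g_eff, g_pc1, e_pc1, share = ammi1(yields)
# Genotype indices ordered from most to least stable: smallest |PC1| first.
stability_order = np.argsort(np.abs(g_pc1))
```

Plotting `g_eff` (or the genotype means) against `g_pc1` gives the AMMI1 biplot layout described above, where genotypes with PC1 scores close to zero are read as stable.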

#### Specific Genotypic Adaptation

Genotypic evaluation based on the which-won-where pattern of the GGE biplot showed specific genotypic adaptation to limited environmental conditions, that is, the adaptability of genotypes to each environment (**Figure 3**). The same genotype performed best across the three seasons for plant height (G2), 1000 grain weight (G2) (Supplementary Figure 2), harvest index (G3) and per day productivity (G3). This shows that these traits have stable expression across the seasons with limited environmental influence and that the selected genotypes are the most stable for the particular trait. For traits such as biomass, days to flowering, days to maturity, filled grains, grain number, panicle weight, spikelet fertility, total dry matter, and tiller number, the same genotype performed best in the two Kharif (wet) seasons under study, but another genotype was best in Rabi (dry season). These traits therefore show seasonal variation, and genotypic performance depends on environmental conditions. Traits such as bulk yield, productive tiller number and grain yield showed no seasonal dependence of genotypic performance.

The polygon was drawn joining cultivars that are located farthest from the origin so that all other cultivars are contained in the polygon. Perpendicular lines to the sides of the polygon divide the biplot into sectors. Each sector has a vertex cultivar which is present in the corner of the polygon. The vertex cultivar is the best performing cultivar in the environments that share the sector with it. Vertex genotypes are G3, G4 and G7 at E1, E2 and E3 respectively. In case of checks, it is inferred that cultivar Sahbhagi Dhan is suited to Rabi season and Tulasi to Kharif season. The analysis indicated that G3, G12, and G4 were suitable BILs for cultivation in irrigated environment as they had the highest ranking in biplot and in predicted means.
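The polygon construction described above is a convex hull of the genotype markers in the biplot plane. As a hedged illustration, the sketch below finds the vertex genotypes from a set of two-dimensional scores using Andrew's monotone chain algorithm; the function names and the score values are hypothetical, and the sector lines perpendicular to the polygon sides are not computed here.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def polygon_vertices(scores):
    """Return the genotypes on the convex hull of their (PC1, PC2) scores.

    scores maps genotype -> (pc1, pc2). Uses Andrew's monotone chain.
    """
    pts = sorted((xy, name) for name, xy in scores.items())
    if len(pts) <= 2:
        return [name for _, name in pts]

    def half_hull(points):
        hull = []
        for p in points:
            # Drop points that would make a clockwise or straight turn.
            while len(hull) >= 2 and cross(hull[-2][0], hull[-1][0], p[0]) <= 0:
                hull.pop()
            hull.append(p)
        return hull

    lower = half_hull(pts)
    upper = half_hull(reversed(pts))
    # The last point of each half repeats the first point of the other half.
    return [name for _, name in lower[:-1] + upper[:-1]]


# Hypothetical biplot coordinates (not read from Figure 3).
scores = {
    "G3": (2.0, 1.0),
    "G4": (-1.5, 2.0),
    "G7": (-1.0, -2.0),
    "G6": (0.2, 0.1),
    "G10": (0.0, -0.3),
}
vertices = polygon_vertices(scores)
```

With these placeholder coordinates the hull consists of G3, G4 and G7, while G6 and G10 fall inside the polygon.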

### Association Analysis

Correlations among yield and yield related traits were computed for all three seasons (**Figure 4**). Grain yield had a highly significant association with panicle weight, 1000 grain weight, total dry matter, per day productivity and harvest index. Days to fifty percent flowering was negatively correlated with bulk yield, grain yield, spikelet fertility, 1000 grain weight, per day productivity and harvest index. The numbers of primary and secondary branches were positively associated with both filled and unfilled grains. Harvest index depended directly on grain yield, per day productivity, filled grains and panicle weight. In the season-wise correlations among the yield traits, DFF showed a highly significant association with DTM; TN with productive tiller number, panicle length and filled grain number; and total number of grains with panicle weight in all three seasons. Single plant yield showed a stable and significant association with 1000 grain weight, biomass, harvest index and per day productivity across the seasons.

### Genotyping the BILs

All the BILs were screened using the universal core genetic map for rice (Orjuela et al., 2010), and the genotypic data were analyzed using the Graphical Genotypes software (GGT 2.0) (van Berloo, 2008). Seventy-four polymorphic loci out of 165 genome-wide core set microsatellite (SSR) markers were used for characterisation of the BILs. The percentage of recurrent parent genome in the BILs varied from 36.8% (G2) to 90.6% (G14). The most stable and high yielding BILs, G6 (70.8%) and G3 (72.6%), had about 70% recurrent parent genome and 20% donor parent genome, with less than 10% heterozygous segments. G2 showed lower percentages of both recurrent parent genome and donor genome, with the maximum number of null alleles and recombination. On average, the genotypes had 74.7% recurrent parent genome, 12.5% donor genome, 1.7% heterozygous segments, 9.7% null alleles, and 18.5% recombination (**Figure 5**).
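The genome composition percentages reported above can be approximated by counting marker calls per class. The sketch below is an assumption-laden simplification: it weights every locus equally, whereas graphical genotyping software may weight by map distance, and the allele coding (`A`, `B`, `H`, `-`) and the example calls are invented for illustration.

```python
from collections import Counter


def genome_composition(calls):
    """Percentage of marker loci in each class for one line.

    calls is a sequence of codes: 'A' recurrent parent allele, 'B' donor
    allele, 'H' heterozygous, '-' null allele. This coding is an assumption
    made for the sketch, not the format used by the authors.
    """
    counts = Counter(calls)
    n = len(calls)
    return {code: 100.0 * counts.get(code, 0) / n for code in "ABH-"}


# Hypothetical calls at 10 loci for one BIL (illustration only).
line_calls = list("AAAAAAABH-")
composition = genome_composition(line_calls)
```

For this made-up line, seven of ten loci carry the recurrent parent allele, so `composition["A"]` is 70%.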

Another panel of SSR markers linked to QTLs previously reported from the same population was also used to screen these BILs. Many of the BILs carried these reported QTLs in either the homozygous or the heterozygous condition. The yield QTL yldp1.4 from O. nivara was present in most of the BILs. G6 had the O. nivara allele of yldp1.4, and its derived lines G4 and G5 carried this QTL in the completely heterozygous state. G2 had four QTLs (yldp1.4, yldp2.3, nsp1.2, and dtm2.7) and G1 had three QTLs (yldp9.1, dtm9.3, and nfg1.2).

## DISCUSSION

Pre-breeding and utilization of wild accessions are gaining importance in plant breeding programs for the identification of novel genes to improve yield levels of existing cultivars. Complex quantitative traits such as yield, with multiple contributing traits are highly influenced by environment interaction effects.

FIGURE 3 | Polygon views of the GGE biplot based on symmetrical scaling for "which-won-where" pattern of rice genotypes in three environments. (A), Polygon view of single plant yield. (B), which-won-where plot single plant yield. (C), Polygon view bulk yield. (D), which-won-where plot bulk yield. (E), Polygon view per day productivity. (F), which-won-where plot per day productivity.

Widespread cultivation of rice in various agro-ecological environments and the unpredictable effects of climate change make the cultivation of stable and adaptable genotypes more desirable (Bose et al., 2012; Vanave et al., 2014). Stability and GEI studies are very important for efficient breeding and adoption in multi-environment conditions (Kempton et al., 1997; Atlin et al., 2000; IRRI, 2006; Liang et al., 2015).

The yield of rice genotypes fluctuates considerably with change in environmental conditions (Bose et al., 2014). Segregation and appearance of wild traits in advanced generations are common phenomena in interspecific crosses. Keeping this in view, the present study was aimed at identification and characterization of a set of BILs developed from the same parental cross, for three seasons and stability was analyzed for key component traits for yield. Most of the multi location trials and multi-year testing focus only on the stability of grain yield but here we studied other yield contributing and component traits also.

Stability analysis models such as the YSi statistic, AMMI and GGE biplots are very useful for selecting lines with high homeostasis for broad target environments and have been used in multilocation trials and in coordinated variety testing programmes. Prasad et al. (2001) studied the stability and yield performance of mega varieties using data from the All India Coordinated Rice Improvement Programme (AICRIP) as well as international trials over a period of 25 years and identified four mega environments for testing varieties for yield potential in India. Many studies have used GGE biplot analysis mainly for mega-environment evaluation, cultivar evaluation, and assessment of varietal stability (Kang, 1993; Yan and Hunt, 2001; Yan and Kang, 2003; Dehghani et al., 2006; Navabi et al., 2006; Blanche et al., 2007; Ding et al., 2007; Jalata, 2011; Mohammadi et al., 2012; Rakshit et al., 2012; Amiri et al., 2015). Balestre et al. (2010) and Nassir (2013) studied the stability and adaptability of upland rice genotypes by the GGE biplot method based on predicted genotypic and phenotypic values. Simultaneous selection for high mean and stability results in the selection of superior genotypes with non-significant stability variance and enhances the quality of selection. This method has been used successfully in most crops, including rice (Wade et al., 1999; Ouk et al., 2007; Tariku et al., 2013), especially for assessing grain yield.

In this study, three seasons data was subjected to correlation analysis and the traits which are associated significantly were discussed for stability analysis. Stability analysis models helped identification of superior genotypes with both high mean yield and stability. Different stability analysis models showed that G3 is the most stable genotype for grain yield followed by G6 and Swarna. G5 and G6 were identified to be stable and ideal genotypes for bulk yield, grain number and number of filled grains, followed by G3, G7, G15, G8, and G12. G14 was identified as most stable genotype for biomass, flowering duration, panicle length and total dry matter production. G2 showed stability and high mean value for 1000 grain weight, spikelet fertility, plant height and panicle length; it was identified as stable genotype for days to fifty percent flowering and days to maturity with lowest mean value indicating the most stable short duration BIL followed by G3. Some of the superior genotypes for yield specific traits with less stability across the seasons can be stabilized with limited back cross approach (Singh and Huerta-Espino, 2004) with an adapted cultivar.

An ideal genotype would be one that has both high mean yield and high stability. The position of an "ideal" genotype is closer to the direction of the mean environment and has a zero projection onto the perpendicular AEC ordinate. G2 and G4 showed high mean ranking and were identified as the best performing lines in terms of both mean yield and stability across environments in the irrigated ecosystem. Based on adaptation map G2 is best adapted for environment E1, G4 for E2 and G7 for E3. Response plot for mean yield also indicated the same results across the seasons.

Among the genotypes, the recurrent parent Swarna was stable across the seasons for traits such as bulk yield, biomass, days to maturity, number of filled grains, panicle length, productive tiller number, and panicle weight. The mega variety Swarna, which is popular in major rice growing countries such as India, Bangladesh, the Philippines and Thailand, is known for its adaptability to a wide range of environments (Prasad et al., 2001). The performance of the BILs was on par with or above that of mega varieties such as Swarna and the most popular cultivars of different durations. As the BILs and checks belong to different maturity groups, per day productivity (YPD) was used to compare their yield. The frequency distribution of the pooled data of three seasons for each trait followed a bell shaped curve with Swarna placed at the peak of the curve (**Figure 6**). However, G6 and G3 proved significantly superior in yield to the recurrent parent Swarna and on par with the best check, MTU1010. Graphical representation of the molecular marker data is relevant for studying the genome constitution of the recombinant population (Young and Tanksley, 1989). It was observed that the BILs had more than 70% of recurrent parent genome. Tian et al. (2006) reported that the high-yielding ILs contained relatively fewer introgressed segments than the low-yielding ILs in a set of 159 ILs derived from Oryza rufipogon in the background of the indica cultivar Guichao2, using 126 polymorphic SSRs.

Plant height and 1000 grain weight were the most stable traits across the seasons, with minimal genotypic variation and with PC1 values of 96.5% and 97.8%, respectively, in the GGE biplot (**Figure 7**). The explained SS (%) factor was calculated from the sums of squares (SS) in the AMMI ANOVA and shows the percentage contribution of genotype, environment and interaction effects to the phenotypic expression of each trait. Grain yield was influenced mainly by genotype (41.28%), followed by environment (31.92%) and their interaction (26.81%). The share of phenotypic variation explained by genotype was high for 1000 grain weight (90.40%) and plant height (85.99%); the environment effect was high for tiller number (47.56%), days to flowering (47.24%) and grain yield (31.92%); and the interaction effect was high for bulk yield and filled grains per panicle (**Figure 8**).
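The explained SS (%) values partition the variation in a genotype × environment table into genotype, environment and interaction shares. A minimal sketch of that partition is given below, assuming a table of cell means only (so no replicate-level error term, unlike a full AMMI ANOVA); `explained_ss_percent` is a hypothetical helper and the input table is synthetic.

```python
def explained_ss_percent(cell_means):
    """Split the sum of squares of a genotype x environment table of means
    into genotype, environment and interaction parts, as percentages."""
    g = len(cell_means)
    e = len(cell_means[0])
    grand = sum(sum(row) for row in cell_means) / (g * e)
    g_means = [sum(row) / e for row in cell_means]
    e_means = [sum(row[j] for row in cell_means) / g for j in range(e)]

    ss_g = e * sum((m - grand) ** 2 for m in g_means)
    ss_e = g * sum((m - grand) ** 2 for m in e_means)
    ss_total = sum((x - grand) ** 2 for row in cell_means for x in row)
    ss_ge = ss_total - ss_g - ss_e  # what the additive model leaves over

    return {
        "genotype": 100.0 * ss_g / ss_total,
        "environment": 100.0 * ss_e / ss_total,
        "interaction": 100.0 * ss_ge / ss_total,
    }


# Synthetic table: 4 genotypes x 3 environments (illustrative values only).
table = [
    [20.0, 22.0, 18.0],
    [19.0, 25.0, 15.0],
    [21.0, 21.5, 20.0],
    [17.0, 18.0, 19.0],
]
shares = explained_ss_percent(table)
```

The three shares always add up to 100%, which matches the pattern of the grain yield figures quoted above (41.28%, 31.92% and 26.81%, up to rounding).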

The G × E interactions for most of the yield traits under study were significant, but some traits showed stable genotypic performance across the environments. Seasonal variation between the highest and lowest values was observed for the traits BM, BY, DFF, DTM, TN, and GY, but the difference was minimal for the traits GN, PL, PW, TGW, SF, HI, and YPD (Supplementary Figure 3). For the widely varying traits with high GEI, additional agronomic management is also required along with crop improvement for trait stability. Crossover GE and dissimilarity between environments for discriminating genotypes were very low for GN, PH, PL, PW, SF, and TGW but were moderate for all other traits. Identifying stably expressed contributing traits is more important for the improvement of any major trait than combining multiple traits that fluctuate across environments. It was observed that 1000 grain weight is the trait contributing most to yield and is stably inherited as well as stably associated with grain yield. So when grouping genotypes for multi location yield trials, this trait should also be considered along with duration for a reliable comparison and analysis. Future breeding programs need to focus on the improvement of stably performing traits with high heritability to develop stable high yielding genotypes. Dalvi et al. (2007), Panwar et al. (2008), Waghmode and Mehta (2011), and Padmavathi et al. (2013) studied the GE interaction for grain quality in rice; the stability of grain quality is also important in multilocation trials if quality is the selection criterion.

In this study, we used data from two Kharif seasons and one Rabi season, and significant seasonal variation was observed for most of the traits. For yield, GEI was higher in the Kharif season than in the Rabi season, and similar results were reported by Atlin et al. (2000). GGE biplots indicated that Kharif 2013 was the ideal season to select genotypes for BM, HI, PL, PTN, SF, GY, TDM, and TGW; Kharif 2014 for DFF, FG, GN, PH, and PW; and Rabi 2014 for BY and TGW. The Kharif seasons were the most suitable for selecting BILs for yield contributing traits; Kharif 2014 in particular was both discriminating and representative among the seasons and was suitable for selecting genotypes with general adaptation. The Rabi season was the most discriminating and least representative environment for testing genotypes and is useful only for selecting specifically adapted genotypes. Kharif 2013 was found to be the least discriminating but most representative among the seasons for most of the traits. The highest environmental averages for all the yield traits were observed in one of the two Kharif seasons. The maximum value was also observed in the Kharif seasons, except for traits such as SF, BM, and YPD, which showed their maximum value in Rabi. The significant difference due to environment indicated the existence of genotypic differences in adaptability. Genotypes also differed considerably with respect to their stability for yield traits. Similar observations on GEI were made by Gauch and Zobel (1996); Wade et al. (1999); Ouk et al. (2007); Das et al. (2009); Sreedhar et al. (2011); Tariku et al. (2013); Akter et al. (2014) in multi environment studies using rice genotypes. All three models showed similar results, and the YSi statistic is advantageous and complements the AMMI and GGE methods for selecting stable and high yielding genotypes (Nassir and Ariyo, 2011). Kumar et al. (2012) used different models for stability analysis, and the correlations between the stability rankings of entries produced by the GGE model and the parameters of Shukla and AMMI showed very high rank correlation coefficients.

Advanced BILs with stable yield traits can be grown in several environments to study QTL × environment interactions, and these lines can be used in breeding programmes as well as to develop varieties in a relatively short time (Jeuken and Lindhout, 2004). Further studies will focus on (i) development of a BIL × BIL mapping population from the identified stable genotypes, (ii) identification of QTL for stable contributing traits for yield, and (iii) development of varieties from selected stable BILs through multi location variety trials.

## CONCLUSIONS

The study showed the importance of genotype × environment interaction and stability analysis for evaluation of genotypic yield potential. Wild introgression lines derived from O. nivara in the Swarna background were studied for stability of yield related traits in the Kharif and Rabi seasons. The stability and adaptability studies using AMMI, GGE biplot and YSi statistics indicated G3(14S) and G6(166S) as the most stable BILs with high yield performance. The share of phenotypic variation explained by genotype was high for 1000 grain weight and plant height; the environment effect was high for tiller number, days to flowering and grain yield; and the interaction effect was high for bulk yield and filled grains per panicle. DRR Dhan 40, an elite BIL and recently released variety, showed yield stability with high mean performance, and the mega varieties used as checks also showed yield stability across the seasons. It was observed that wild derived lines with about 70% of recurrent parent genome were more stable and showed enhanced yield levels. Thus, more emphasis should be devoted in future breeding programs to pre-breeding and to developing genotypes with wider adaptation. Stability analysis and GEI studies may be further extended to stress resistance, quality and nutrient composition for precise identification of superior genotypes.

#### ETHICS STATEMENT

The authors declare that the experiments comply with the current laws of the country in which they were performed and in compliance with ethical standards.

## AUTHOR CONTRIBUTIONS

DB and SN conceived and planned the work. Phenotypic and genotypic screening was performed by JB, AR, RY, KB, SM, MS, DB, RP, and GP. DS and DB analyzed the data. DB wrote the manuscript. DS and SN revised and proofread the manuscript. VB provided facilities at the Indian Institute of Rice Research.

#### ACKNOWLEDGMENTS

This research was carried out as part of the ICAR-National Professor Project (F.No: Edn/27/4/NP/2012-HRD) funded by the Indian Council of Agricultural Research, New Delhi, India, under the sub project on Mapping QTLs for yield and related traits using BILs from elite x wild crosses of rice (ABR/CI/BT/11). These lines were initially developed in a Department of Biotechnology (DBT) funded project (BT/AB/FG-2 (Ph-II) 2009), New Delhi, India. The authors are highly grateful to the Director, ICAR-IIRR, for providing facilities.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.01530

## REFERENCES

Crop Sci. 37, 406–415. doi: 10.2135/cropsci1997.0011183X003700020017x

environment interactions for yield in rainfed lowland rice," in Plant Adaptation and Crop Improvement, eds M. Cooper, and G. L. Hammer (Wallingford: CAB International, in Association with IRRI and ICRISAT), 443–464.

Young, N. D., and Tanksley, S. D. (1989). Restriction fragment length polymorphism maps and the concept of graphical genotypes. Theor. Appl. Genet. 77, 95–101. doi: 10.1007/BF00292322

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Balakrishnan, Subrahmanyam, Badri, Raju, Rao, Beerelli, Mesapogu, Surapaneni, Ponnuswamy, Padmavathi, Babu and Neelamraju. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Temporally and Genetically Discrete Periods of Wheat Sensitivity to High Temperature

Henry M. Barber<sup>1</sup>\*, Martin Lukac<sup>1,2</sup>, James Simmonds<sup>3</sup>, Mikhail A. Semenov<sup>4</sup> and Mike J. Gooding<sup>5</sup>

*<sup>1</sup> School of Agriculture, Policy and Development, University of Reading, Reading, UK, <sup>2</sup> Faculty of Forestry and Wood Sciences, Czech University of Life Sciences, Prague, Czechia, <sup>3</sup> Department of Crop Genetics, John Innes Centre, Norwich, UK, <sup>4</sup> Computational and Systems Biology Department, Rothamsted Research, Harpenden, UK, <sup>5</sup> Institute of Biological, Environmental and Rural Sciences, University of Aberystwyth, Aberystwyth, UK*

Successive single day transfers of pot-grown wheat to high temperature (35/30°C day/night) replicated controlled environments, from the second node detectable to the milky-ripe growth stages, provide the strongest available evidence that the fertility of wheat can be highly vulnerable to heat stress during two discrete peak periods of susceptibility: early booting [decimal growth stage (GS) 41–45] and early anthesis (GS 61–65). A double Gaussian fitted simultaneously to grain number and weight data from two contrasting elite lines (Renesansa, listed in Serbia, *Ppd-D1a*, *Rht8*; Savannah, listed in UK, *Ppd-D1b*, *Rht-D1b*) identified peak periods of main stem susceptibility centered on 3 (s.e. = 0.82) and 18 (s.e. = 0.55) days (mean daily temperature = 14.3°C) pre-GS 65 for both cultivars. Severity of effect depended on genotype, growth stage and their interaction: grain set relative to that achieved at 20/15°C dropped below 80% for Savannah at booting and Renesansa at anthesis. Savannah was relatively tolerant to heat stress at anthesis. A further experiment including 62 lines of the mapping, doubled-haploid progeny of Renesansa × Savannah found tolerance at anthesis to be associated with *Ppd-D1b*, *Rht-D1b*, and a QTL from Renesansa on chromosome 2A. None of the relevant markers were associated with tolerance during booting. *Rht8* was never associated with heat stress tolerance, a lack of effect confirmed in a further experiment where *Rht8* was included in a comparison of near isogenic lines in a cv. Paragon background. Some compensatory increases in mean grain weight were observed, but only when stress was applied during booting and only where *Ppd-D1a* was absent.

Keywords: heat stress, meiosis, anthesis, Ppd-D1, Rht, wheat

#### Edited by:

*Soren K. Rasmussen, University of Copenhagen, Denmark*

#### Reviewed by:

*Robert John French, Department of Agriculture and Food, Australia Muhammad Zahid Ihsan, King Abdulaziz University, Saudi Arabia*

#### \*Correspondence:

*Henry M. Barber hmbarber28@gmail.com*

#### Specialty section:

*This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science*

Received: *10 May 2016* Accepted: *10 January 2017* Published: *25 January 2017*

#### Citation:

*Barber HM, Lukac M, Simmonds J, Semenov MA and Gooding MJ (2017) Temporally and Genetically Discrete Periods of Wheat Sensitivity to High Temperature. Front. Plant Sci. 8:51. doi: 10.3389/fpls.2017.00051*

## INTRODUCTION

Improving crop resilience to more frequent extreme weather events is required to maintain or improve crop yields across Europe (Semenov et al., 2014). Wheat, a major contributor to human diet and health (Shewry and Hey, 2015), is particularly susceptible to heat stress around meiosis and anthesis (Barnabas et al., 2008). Yield loss due to heat stress at these growth stages is primarily due to disruption of reproductive processes (Saini and Aspinall, 1982; Saini et al., 1983), as evidenced by a reduction in fertility and grain number (Dolferus et al., 2011). Previous reports on heat stress in wheat usually concern only one of the susceptible timings i.e., meiosis (Saini and Aspinall, 1982; Saini et al., 1984) or anthesis (Tashiro and Wardlaw, 1990; Ferris et al., 1998; Lukac et al., 2012; Pradhan et al., 2012; Steinmeyer et al., 2013; Liu et al., 2016). Fewer studies have attempted to quantify the response to stress at both of these timings: Alghabari et al. (2014) suggest meiosis is the most vulnerable stage, but Prasad and Djanaguiraman (2014) report that it is anthesis that is particularly susceptible. Previous work has often assumed that these growth stages represent two separate, discrete periods of susceptibility but there is currently little evidence to support this. Single experiments on rice and wheat suggest that there may be a period between meiosis and anthesis that is relatively tolerant to heat stress (Satake and Yoshida, 1978; Craufurd et al., 2013), but it is unclear as to the specific growth stages when this tolerance occurs. Genotypic interactions with heat stress timing also require clarification. Although some recent work has compared the heat stress response at anthesis across multiple genotypes (Liu et al., 2016), little work has quantified how genotype influences susceptibility across both stages, even though consecutive exposure of both stages to stress seems likely to occur in field conditions (Wardlaw et al., 1989).

Here, we investigate firstly whether periods of vulnerability to heat stress during reproductive phases can truly be differentiated temporally, in association with growth stage development. Secondly we investigate whether the effect of genotype on heat stress vulnerability interacts with timing of stress. We pay particular attention to the effects of three alleles reported to influence heat stress tolerance and have adaptive significance in wheat grown in European regions with different frequencies and severities of heat stress, namely Rht8, Ppd-D1a, and Rht-D1b (Worland, 1996; Worland et al., 1998; Rebetzke et al., 2007; Gasperini et al., 2012; Alghabari et al., 2014; Barber et al., 2015; Kowalski et al., 2016; Jones et al., 2017). We also assess associations with the 1BL/1RS translocation (Schlegel and Korzun, 1997) which introduced a number of race-specific disease resistance genes (Snape et al., 2007). The translocation has also been variously associated with increased above ground biomass, spikelet fertility, delayed senescence, and drought tolerance (Villareal et al., 1998; Rajaram, 2001), but there is apparently little information with regards to its influence on heat stress tolerance.

This paper describes the use of 1-day transfers of pot-grown wheat to replicated controlled environments to identify and characterize any periods of heat susceptibility during external growth stages extending from the second node detectable growth stage (GS 32; Zadoks et al., 1974) to the grain milky-ripe stage (GS 77) and hence encompassing meiosis and anthesis (Barber et al., 2015). An initial study compared the Southern European wheat Renesansa (Ppd-D1a, Rht-D1a, Rht8) to the UK-adapted wheat Savannah (Ppd-D1b, Rht-D1b, 1BL/1RS). Once susceptible growth stages were identified, further experiments compared the heat stress responses of near isogenic lines (NILs) of a Paragon background varying for presence and absence of Rht8, and also the responses of a mapping population of 62 doubled haploid progeny of Renesansa × Savannah, at appropriate timings.

## MATERIALS AND METHODS

#### Plant Material

Savannah has a high yield potential in North West Europe with low bread making quality and was recommended in the UK in 1998. Renesansa, a Serbian winter wheat listed in 1995, has high yield potential and high bread making quality in southern Europe. Sixty-two lines were selected from a recombinant doubled haploid (DH) population of Savannah × Renesansa based on their alleles at Ppd-D1, Rht-D1, 1BL/1RS, and Rht8 (Xgwm261; Simmonds et al., 2006; Snape et al., 2007). NILs varying for the presence and absence of Rht8, both of which remained sensitive to photoperiod, were developed in a Paragon background (Kowalski et al., 2016). Paragon is a photoperiod sensitive spring wheat that can also be sown in autumn and was first listed in the UK in 1999 with good bread making quality.

#### Growing Conditions and Post-harvest Analysis

Plants used in these experiments were grown in pots (180 mm diameter) at the Plant Environment Laboratory at the University of Reading, UK (51°27′ N latitude, 00°56′ W longitude). Each pot contained 2.8 kg of growing media comprising 4:2:4:1 of vermiculite: sand: gravel: compost mixed with Osmocote slow release granules (2 kg m<sup>−3</sup>) containing a ratio of 15:11:13:2 of N:P<sub>2</sub>O<sub>5</sub>:K<sub>2</sub>O:MgO. Seven seeds were sown per pot and thinned to four plants per pot at the two leaf stage. The pots were maintained outside in prevailing conditions (**Table 1**) under a protective net cage in four randomized blocks with guard pots of wheat placed around the perimeter of experimental blocks. Fungicide was applied as and when required. Pots were watered up to twice daily by an automatic drip irrigation system to maintain field capacity. All treatments consisted of transfers to Saxil growth cabinets: transfers began between 10:20 and 11:20 h (BST), and pots remained in the cabinets for 24 h (16 h day, night time between 22:00 and 06:00 h) before being returned outside to their original randomized block position. Average daily temperature during the treatment period was 14.3°C in 2013/14 and 13.5°C in 2014/15. Two temperature regimes were used in all experiments: day/night temperatures of 20/15°C for the control treatment and 35/30°C for the heat stress treatment. Pots were irrigated to field capacity before transfer, but were not irrigated whilst in the cabinets. Eight growth cabinets were used, which allowed the two temperature treatments to be replicated for the four blocks. On the day of transfer, main stems in each pot were tagged and assessed for growth stage (GS; Zadoks et al., 1974). Pots were weighed immediately before and after transfer to monitor water loss. Main stems and tillers were harvested separately after physiological maturity (GS 89) and dried (48 h at 80°C). Ears and spikelets per ear were counted, after which grain was threshed from ears, then re-dried, weighed, and counted by a Kirby Lester K18 tablet counter.

TABLE 1 | Outside temperatures under which plants were grown in the 2013/14 season.

#### Experiment 1

Experiment 1, sown on the 16th December 2013, comprised a complete factorial of: the two DH parent winter wheat cultivars, Savannah and Renesansa; day of transfer to Saxil growth cabinets (31 separate timings between May 2nd and June 13th 2014); and the two temperature regimes within growth cabinets. Confounding effects associated with temperature included water loss. The mean weight of pots on entry was 3.40 kg, whilst mean weights of pots on withdrawal were 3.19 and 2.98 kg (SED = 0.016) for the 20/15 and 35/30°C treatments, respectively. More detailed studies on the water relations within this growing medium and system suggest that this degree of water loss would equate to 78 and 56% field capacity (FC; oven dry = 0% FC; Gooding et al., 2003), respectively, and that a FC of <70% maintained for 14 days during grain filling was required to reduce grain yield. A further confounded environmental variate was mean relative humidity [73% for 20/15°C and 47% for 35/30°C (SED = 4.4)] whilst in the cabinets.
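The field capacity percentages quoted above can be reproduced with a simple linear water-balance sketch. It assumes that pots were at 100% FC on entry (they were irrigated to field capacity before transfer) and that % FC scales linearly between the oven-dry weight (0% FC) and the entry weight. The oven-dry pot weight is not reported here; the value used below is back-calculated from the 78% figure and is an assumption, as are the function names.

```python
def percent_fc(pot_kg, oven_dry_kg, fc_kg):
    """Percent of field capacity, taking oven dry as 0% FC and fc_kg as 100% FC."""
    return 100.0 * (pot_kg - oven_dry_kg) / (fc_kg - oven_dry_kg)


def implied_oven_dry(pot_kg, fc_kg, pct_fc):
    """Oven-dry weight implied by a single (pot weight, % FC) pair."""
    fraction = pct_fc / 100.0
    return (pot_kg - fraction * fc_kg) / (1.0 - fraction)


ENTRY_KG = 3.40  # mean pot weight on entry, ASSUMED to represent 100% FC

# ASSUMPTION: oven-dry weight back-calculated from the reported 78% FC at 3.19 kg;
# it is not a value reported in the text.
OVEN_DRY_KG = implied_oven_dry(3.19, ENTRY_KG, 78.0)

control_fc = percent_fc(3.19, OVEN_DRY_KG, ENTRY_KG)  # 20/15 °C treatment
heat_fc = percent_fc(2.98, OVEN_DRY_KG, ENTRY_KG)     # 35/30 °C treatment
```

Under these assumptions the implied oven-dry weight is roughly 2.45 kg, and the 2.98 kg withdrawal weight of the heat stress treatment comes out at about 56% FC, in line with the figure quoted in the text.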

#### Experiment 2

Experiment 2 was also sown on the 16th December 2013. The treatment structure comprised a complete factorial design of: three genotypes [Paragon, Rht8 NIL, and Tall NIL (Kowalski et al., 2016)]; day of transfer to Saxil growth cabinets (5 separate days between 19th May and 10th June 2014); and the two temperature regimes within growth cabinets.

#### Experiment 3

Experiment 3 was sown on 3rd December 2014. The treatment structure comprised a complete factorial of 62 DH Lines, three growth stages at transfer to Saxil growth cabinets, and two temperature regimes within growth cabinets. The three timings targeted specific stages of growth: early booting (GS 39–41); mid booting (GS 43–45); and early anthesis (GS 63–65). Due to variable rates of development within a 24 h period, and differential rates of progression, not all lines were transferred within target. Nonetheless, GS at transfer was always recorded.

#### Statistical Analysis

The primary statistical approach was a factorial analysis of variance (ANOVA) with a blocking structure of Block/Cabinet/Pot (GenStat 14th edn., VSN International Ltd.). For Experiments 1 and 2, polynomial regressions were fitted across day of transfer to growth cabinet using orthogonal polynomial contrasts in the ANOVA, i.e., the treatment structure was pol(Day; n) * Temperature * Genotype, where n was the maximum order of polynomial to be fitted. Where quartic effects or deviations from them were significant in Experiment 1, fits were compared with the double Gaussian model (Equation 1) on an adjusted r<sup>2</sup> basis. The maximal double Gaussian model permits the estimation of two "bell-shaped" curves:

$$\text{Relative Effect (\%)} = 100 + b\,(2\pi s_1^2)^{-0.5}\, e^{-(t-m)^2/(2 s_1^2)} + c\,(2\pi s_2^2)^{-0.5}\, e^{-(t-n)^2/(2 s_2^2)} \tag{1}$$

Where: Relative Effect is the result at 35°C (day temperature) expressed as a percentage of that achieved at 20°C; b and c are the size of the two peaks; m and n are when, in time t, they are centered; and s<sub>1</sub> and s<sub>2</sub> are the Gaussian shape factors (standard deviation) for the two peaks. This double Gaussian approach has previously been used to detect other phenologically-dependent responses in wheat time series data sets (Lu et al., 2014). The FITNONLINEAR routine in GenStat 14 was used to compare regressions and allow a parsimonious approach to the inclusion of various parameters in the model fits. Additionally, the routine allowed simultaneous fits to different response variates (weighted for the inverses of their variances). Here, it was used to investigate potential compensation in mean grain weights at the time when grain numbers were reduced by heat stress.
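As an illustration of Equation 1, the following minimal sketch evaluates the double Gaussian curve for a given set of parameters. It does not reproduce the GenStat fitting procedure. The peak centres (18 and 3 days before GS 65) follow the timings reported in the Results; the values of b, c, s<sub>1</sub>, and s<sub>2</sub> are hypothetical and are not the fitted estimates.

```python
import math


def gaussian_term(t, centre, sd):
    """Normalised Gaussian: (2*pi*sd^2)^-0.5 * exp(-(t - centre)^2 / (2*sd^2))."""
    return math.exp(-((t - centre) ** 2) / (2.0 * sd ** 2)) / math.sqrt(2.0 * math.pi * sd ** 2)


def relative_effect(t, b, m, s1, c, n, s2):
    """Equation 1: result at 35 °C as a percentage of the result at 20 °C."""
    return 100.0 + b * gaussian_term(t, m, s1) + c * gaussian_term(t, n, s2)


# HYPOTHETICAL parameter values, for illustration only.
# t is in days relative to GS 65; negative b and c give reductions below 100%.
params = dict(b=-600.0, m=-18.0, s1=3.0, c=-300.0, n=-3.0, s2=2.0)

# Curve from 30 days before to 10 days after GS 65
curve = [(t, relative_effect(t, **params)) for t in range(-30, 11)]
```

Because each Gaussian term is normalised, b and c scale the area of each peak, and the departure from 100% at the centre of the first peak is b/(s<sub>1</sub>√(2π)). Far from both peaks the curve returns to 100%, that is, no effect of the higher temperature.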

Experiment 3 was analyzed by ANOVA with a treatment structure of Genotype × Target Growth Stage × Temperature. A regression analysis was conducted in an attempt to control the effects of varying growth stages within the target GS cohorts. Main and interacting effects of Rht-D1b, Rht8, Ppd-D1a, and 1BL/1RS were tested for their significance in the model (P < 0.05). In addition, after correcting for the linear effect of GS within target GS cohort, a QTL analysis was conducted from the effects of the high temperature treatment on individual lines within each target GS. A framework genetic map was constructed from 93 lines of the population as previously described by Snape et al. (2007), containing 107 simple sequence repeat (SSR) markers and perfect markers for Ppd-D1, Rht-D1, and 1BL/1RS. Linkage map construction was performed using JoinMap® 3.0 (Kyazma BV) with default settings. Linkage groups were determined using a divergent log-of-odds (LOD) threshold of 3.0 and genetic distances were computed using the Kosambi mapping function. The genetic map consisted of 25 linkage groups with 45 unlinked markers. QTL Cartographer 2.5 (North Carolina State University) was used for QTL detection using single marker analysis and composite interval mapping (CIM). Estimates of the additive effects and percentage of total variation for identified QTL were calculated using the multiple interval mapping (MIM) function.
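For readers unfamiliar with the Kosambi function used for the genetic distances above, the sketch below converts a recombination fraction into a map distance in centimorgans. This is the standard textbook formula; how JoinMap applies it internally is not shown here, and the function name is illustrative.

```python
import math


def kosambi_cm(r):
    """Kosambi map distance (cM) for a recombination fraction r, with 0 <= r < 0.5."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    # d (Morgans) = 0.25 * ln((1 + 2r) / (1 - 2r)); multiply by 100 for cM
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))
```

For small r the distance is close to 100r cM, and it grows without bound as r approaches 0.5, which corresponds to unlinked markers.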

#### RESULTS

#### Experiment 1

Grain yield per pot indicated a three factor interaction between day of transfer, temperature and cultivar (P = 0.002; deviation from quartic P = 0.007; **Figures 1A,B**). Most of the interaction was due to changes in grain number per pot (P < 0.001 for the three factor interaction; deviation from quartic P < 0.001), with some modification through partial compensatory increases in mean grain weight, particularly after some of the earlier transfers (e.g., P < 0.001 for cubic.Day × Cultivar). There were no (P > 0.05) main or interacting effects of temperature on ear number per pot (mean for Renesansa and Savannah = 9.2 and 9.5, respectively; S.E.D. = 0.12; 345 d.f.) or spikelet number per ear (Renesansa = 20.3, Savannah = 20.0; S.E.D. = 0.09).
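The linear, quadratic, cubic, and quartic Day terms referred to above come from orthogonal polynomial contrasts. As a minimal sketch of the idea, the code below builds such contrasts by Gram-Schmidt orthogonalisation of centred powers of the day values. The scaling that GenStat applies to its contrasts is not reproduced, and the function names are illustrative.

```python
def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def orthogonal_polynomials(x, degree):
    """Return orthogonal contrasts of order 1..degree over the values in x."""
    n = len(x)
    mean = sum(x) / n
    basis = [[1.0] * n]  # constant term, dropped from the result
    for d in range(1, degree + 1):
        v = [(xi - mean) ** d for xi in x]
        # Remove the components already explained by lower-order terms
        for u in basis:
            coef = dot(v, u) / dot(u, u)
            v = [vi - coef * ui for vi, ui in zip(v, u)]
        basis.append(v)
    return basis[1:]


# Five equally spaced transfer days, purely as an example
linear, quadratic = orthogonal_polynomials([0, 1, 2, 3, 4], 2)
```

Because each contrast is orthogonal to all lower-order ones, the sum of squares for Day can be partitioned into independent linear, quadratic, and higher-order components, with any remainder tested as a deviation.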

With regards to timing of susceptibility to heat stress, the grain yields from the main stems provided better clarity than the yields from the whole pot, presumably because of the broader spectrum of the growth stages deriving from the tillers (Jones et al., 2017) and as growth stage assessments focussed primarily on main stems. On the main stems, yields of Renesansa appeared to be repeatedly compromised by day transfers to the higher temperature from 6 to 12 May, and again from 22 to 30 May (**Figure 1C**). In Savannah there was a significant period of susceptibility from 17 to 21 May, and possibly a second period from 4 to 9 June (**Figure 1D**). Variation in growth stage amongst mainstems appeared to be greater for Renesansa (**Figure 1E**) than for Savannah (**Figure 1F**). Nonetheless, on average, for much of the period of transfers, the growth stage development of Savannah appeared to be about 10 days later than that for Renesansa. This difference could be identified with accuracy at mid anthesis, as over 80% of mainstems were scored as at GS 65 on 28 May for Renesansa and on 7 June for Savannah.

When Day of transfer was expressed as relative to GS 65, there was strong evidence for two peak timings of susceptibility, but there was no evidence that timing of the peaks for susceptibility varied for the two cultivars, or that the standard deviation of the two peaks varied (Gaussian s). With regards to grain numbers on the mainstem (**Table 2**; **Figure 2**), a first peak was centered about 18 days before GS 65 when 50% of Renesansa mainstems were at GS 43–45, and 50% of Savannah mainstems were at GS 41– 43 (**Figure 1**). Both cultivars appeared comparatively tolerant of the heat stress during late booting and ear emergence. A second period of susceptibility, however, was detected during late ear emergence and early phases of anthesis, centered on 3 days before GS 65 (**Table 2**; **Figure 2**), when most of the ears would have been at GS 61. Grain set in Renesansa appeared equally susceptible to the heat stress during booting and anthesis (**Table 2**; **Figure 2**). Grain set in Savannah was significantly more susceptible during booting than at anthesis, but the only time when grain set was significantly compensated by increased mean grain weight was at the earlier timing (**Table 2**; **Figure 2**). There was no statistical evidence for compensation for grain set failure through mean grain weight by Renesansa during either period of susceptibility.

#### Experiment 2

There was a significant interaction between the time of transfer and temperature on mainstem grain number (P = 0.005 for Temperature × quadratic Day). As in Experiment 1, a significant reduction in grain numbers from the main stems resulted from a day transfer to 35/30°C rather than 20/15°C, 18 days before mid anthesis (GS 65; **Figure 3**), whilst the plants were in the early to mid-stages of booting (c. GS 43). There were smaller reductions in grain numbers following heat stress during late ear-emergence and early anthesis, commensurate with the effects on grain numbers of Savannah at similar timings in Experiment 1. Plants appeared tolerant of the higher temperature at the start of booting (c. GS 40) and by mid anthesis (GS 65). There was no statistical evidence in Experiment 2 that reductions in grain numbers were mitigated by increases in mean grain weight; neither was there any evidence that Rht8 influenced tolerance to heat stress during booting or anthesis (P = 0.997 for Temperature × Day × Genotype on mainstem grain numbers).

#### Experiment 3

Within the doubled haploid population, when using the "target" growth stages for transfer as a fixed effect, there was a very highly significant interaction (P < 0.001) between temperature, growth stage, and DH line for grain number. When making some allowance for actual growth stages within target stress timings, there was evidence of increasing susceptibility from GS 37 to 41 (**Figure 4D**) and from GS 59 to 65 (**Figure 4F**). There was wide variation in susceptibility of lines within the doubled-haploid population, particularly at the mid-booting growth stage (**Figures 4B,E**). None of this variation was significantly associated with the markers for Rht8 or the 1BL/1RS translocation. At anthesis, however, main effect associations with both Rht-D1b (P < 0.001) and Ppd-D1a (P = 0.006) were significant. The tall allele at Rht-D1 (Rht-D1a) and Ppd-D1a were associated with increased susceptibility during anthesis (**Figure 4F**). The QTL analysis confirmed the protective nature of the Savannah alleles (Rht-D1b and Ppd-D1b), but in addition identified a further, and stronger, protective QTL from Renesansa on chromosome 2A (**Table 3**). None of these alleles could be detected as being protective against heat stress applied during booting. There was, however, a weak protective QTL from Renesansa for heat applied during early booting on 2B (nearest marker = Xgwm120; LOD = 1.85; additive effect = −3.75).

TABLE 2 | Parameter values for simultaneous double Gaussian fit (Figure 2) to the effects of increasing day temperature from 20 to 35°C over successive single days for grain yield components on main stems of two cultivars of winter wheat.

In addition to effects on fertility, there was a significant three factor interaction on mean grain weight (P = 0.032). Increased mean grain weight at the higher temperature during the early stages of booting (**Figure 4A**) occurred in the lines not marked for Ppd-D1a, and was most evident in lines containing Rht-D1b. As anthesis progressed, the higher temperature caused progressively greater reduction in the mean grain weights of lines containing Ppd-D1a (**Figure 4C**).

#### DISCUSSION

This study clarifies the effect of heat stress on wheat yield during reproductive development, as well as the influence of growth stage and potentially adaptive genotypic effects. We have identified two discrete periods at which grain set in wheat is susceptible to high temperature: the first in early to mid-booting, presumably commensurate with susceptible meiotic stages (Barber et al., 2015), and the second during the early phases of anthesis. We have demonstrated that genotypic effects on tolerance to heat stress vary with the particular period of vulnerability.

Reductions in grain number caused by heat-induced losses of fertility, found across all experiments in this study, are in agreement with previous work (Saini and Aspinall, 1982; Ferris et al., 1998; Dolferus et al., 2011; Liu et al., 2016). There is some evidence to suggest that grain size can increase and partially compensate for losses caused by abiotic stresses (Semenov et al., 2014); however, this was mostly confined to the booting period of susceptibility and was not consistently observed across genotypes. Grain size increases found at booting but not at anthesis support the lack of grain size compensation found by Liu et al. (2016). This variation in compensatory increases in mean grain weight over genotype and growth stage should be accounted for when attempting to improve the response of crop models to abiotic stress (Stratonovitch and Semenov, 2015; Liu et al., 2016). Consistent with previous literature, the peak periods of susceptibility appear to be early to mid-booting (Saini and Aspinall, 1982; Alghabari et al., 2014) and early flowering (Ferris et al., 1998; Craufurd et al., 2013; Prasad and Djanaguiraman, 2014). There is some evidence that the period between meiosis and anthesis is relatively tolerant to short durations of heat stress, similar to what has been observed in rice (Satake and Yoshida, 1978, 1981; Craufurd et al., 2013), with indications that this could also be true in wheat (Prasad and Djanaguiraman, 2014). Responses to heat stress are strongly influenced by genotype, as shown by variation within these experiments, especially between Savannah and Renesansa. Genotypic differences, especially at anthesis, as observed here, have been identified previously (Stone and Nicolas, 1994; Alghabari et al., 2014; Lobell et al., 2015; Liu et al., 2016). This suggests that there is potential for identifying heat tolerant traits within the current genetic diversity of wheat, which will be crucial for crop production in future climates (Godfray et al., 2010; Semenov et al., 2014).

It is necessary to acknowledge the possible confounding effects between heat stress tolerance and water deficit (Barnabas et al., 2008; Alghabari et al., 2014) in these experiments. However, the deficits below FC reported here at the end of pot transfer, and the durations over which significant deficits could have occurred, are considered to be relatively minor compared with the results from experiments with longer periods of stress (Gooding et al., 2003; Alghabari et al., 2014). Nonetheless, booting is known to be a period particularly susceptible to drought (Barber et al., 2015) and future work on identifying tolerant traits to abiotic stresses will require consideration of the combination of drought and heat stress.

There has previously been some suggestion that the semi dwarfing allele Rht8, commonly found in southern European genotypes of wheat (Worland, 1996; Gasperini et al., 2012), could also increase tolerance to heat and drought stress compared to other semi dwarfing alleles (Alghabari et al., 2014). However, our study found no effect of Rht8 on susceptibility to heat stress. This suggests that even in future climates, Rht8 would not be of benefit to northern European genotypes due to its lower yield in comparison to other semi dwarfing alleles (Rebetzke et al., 2007). Furthermore, Ppd-D1a, to which Rht8 is closely linked (Gasperini et al., 2012), was shown to increase susceptibility to heat stress. Photoperiod insensitivity caused by the allele Ppd-D1a, a mechanism used to avoid abiotic stress (Gomez et al., 2014), is widely considered to be a beneficial trait in future climates due to reducing thermal time to senescence (Barber et al., 2015), thereby avoiding late season heat and drought stress. It was also suggested by Jones et al. (2017) that the increase in flowering duration associated with Ppd-D1a would add further resilience by increasing diversity of flowering timing within a field. However, the increase in susceptibility to heat stress associated with this allele, as well as lower overall grain yield in non-stressed seasons (Addisu et al., 2010), casts doubt over the benefits that Ppd-D1a might bring under future northern European climates. Although the introduction of Rht-D1b into Northern European wheats has increased yield through increased harvest index and reduced lodging in fertile conditions (Flintham et al., 1997), it has also been associated with some negative traits, including decreases in fertility (Law et al., 1981). Preliminary work by Law and Worland (1985) suggested that the decrease in GA sensitivity caused by Rht-D1b increases susceptibility to heat stress. This is supported by later work in other cereals, such as barley, which shows that reducing sensitivity to GA increases susceptibility to heat stress (Vettakkorumakankav et al., 1999; summary provided by Maestri et al., 2002). However, our study shows evidence to the contrary. Here, Rht-D1b was associated with greater tolerance of high temperatures at anthesis than the other alleles associated with stature. In particular, the tall allele at the Rht-D1 locus was associated with susceptibility to heat stress at anthesis. This contrasts with the effects of Rht-D1 dwarfing alleles in some, but not all, backgrounds reported by Alghabari et al. (2014). We have found no genetic explanation for the poor performance of the Northern European genotype at booting. However, this can likely be attributed to the previous lack of selection pressure for this trait in breeding programmes.

TABLE 3 | Quantitative trait loci for relative fertility (%) in response to heat stress during anthesis (grain numbers following 1 day transfer to 35°C as a percentage of that achieved at 20°C).

With respect to the QTL analyses, others have also found regions on chromosomes 2A and 2B to be associated with differential responses to heat stress (Mason et al., 2010; Talukder et al., 2014). Given the strength of the protective effect associated with the QTL on 2A, further investigation is warranted for alleles in the relevant region from Renesansa. What is very clear from this study is that the detection of alleles and QTL associated with heat stress tolerance is highly dependent on the precise growth stage of the plant when excessive heat is experienced.

#### CONCLUSIONS

In conclusion, this paper provides the strongest existing evidence that the key phases susceptible to heat stress at booting and anthesis in wheat are discrete and that genotypes vary with regards to the most susceptible growth stage. Periods of susceptibility are repeatedly observed during GS 41–45 and again from GS 61–65. In the prevailing conditions (mean daily temperature 14.3°C) periods of peak susceptibility could be separated by 15 days. We found no evidence that the southern European semi dwarfing allele Rht8 adds tolerance to heat stress within NILs or a DH population. In contrast, the north European allele Rht-D1b was associated with increased tolerance to heat stress at anthesis. The photoperiod insensitivity allele Ppd-D1a was associated with increased susceptibility to heat stress.

#### AUTHOR CONTRIBUTIONS

HB, MG, and MS contributed to experimental design, HB and MG conducted analysis on the data with assistance from ML on interpretation of the data, whilst JS conducted QTL and genetic analysis. HB and MG drafted the work with revisions from MS, ML, and JS. HB, MG, MS, ML, and JS approve of the final version of the manuscript and all agree to be accountable for all aspects of the work.

#### ACKNOWLEDGMENTS

John Innes Centre, Institute of Biological Environmental and Rural Sciences, and Rothamsted Research receive strategic funding from the Biotechnology and Biological Sciences Research Council (BBSRC) of the UK. Henry Barber acknowledges financial support from BBSRC DTP Grant BB/J014451/1. The authors are grateful to Mr. J. L. Hansen, Mr. L. G. Doherty, and Ms. C. J. Hadley for technical assistance with the controlled environment experiments, and to Dr. Simon Griffiths for supplying the near isogenic lines.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Barber, Lukac, Simmonds, Semenov and Gooding. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Determining Phenological Patterns Associated with the Onset of Senescence in a Wheat MAGIC Mapping Population

Anyela V. Camargo<sup>1</sup> \*, Richard Mott <sup>2</sup> , Keith A. Gardner <sup>3</sup> , Ian J. Mackay <sup>3</sup> , Fiona Corke<sup>1</sup> , John H. Doonan<sup>1</sup> , Jan T. Kim<sup>4</sup> and Alison R. Bentley <sup>3</sup> \*

*<sup>1</sup> National Plant Phenomics Centre, Institute of Biological Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK, <sup>2</sup> UCL Genetics Institute, University College London, UK, <sup>3</sup> The John Bingham Laboratory, National Institute of Agricultural Botany, Cambridge, UK, <sup>4</sup> The Pirbright Institute, Surrey, UK*

#### Edited by:

*Diego Rubiales, Spanish National Research Council, Spain*

#### Reviewed by:

*Philippa Borrill, John Innes Centre, UK Freddy Mora, University of Talca, Chile*

\*Correspondence: *Anyela V. Camargo avc1@aber.ac.uk Alison R. Bentley alison.bentley@niab.com*

#### Specialty section:

*This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science*

Received: *15 July 2016* Accepted: *30 September 2016* Published: *24 October 2016*

#### Citation:

*Camargo AV, Mott R, Gardner KA, Mackay IJ, Corke F, Doonan JH, Kim JT and Bentley AR (2016) Determining Phenological Patterns Associated with the Onset of Senescence in a Wheat MAGIC Mapping Population. Front. Plant Sci. 7:1540. doi: 10.3389/fpls.2016.01540*

The appropriate timing of developmental transitions is critical for adapting many crops to their local climatic conditions. Therefore, understanding the genetic basis of different aspects of phenology could be useful in highlighting mechanisms underpinning adaptation, with implications in breeding for climate change. For bread wheat (*Triticum aestivum*), the transition from vegetative to reproductive growth, the start and rate of leaf senescence and the relative timing of different stages of flowering and grain filling all contribute to plant performance. In this study we screened under Smart house conditions a large, multi-founder "NIAB elite MAGIC" wheat population, to evaluate the genetic elements that influence the timing of developmental stages in European elite varieties. This panel of recombinant inbred lines was derived from eight parents that are or recently have been grown commercially in the UK and Northern Europe. We undertook a detailed temporal phenotypic analysis under Smart house conditions of the population and its parents, to try to identify known or novel Quantitative Trait Loci associated with variation in the timing of key phenological stages in senescence. This analysis resulted in the detection of QTL interactions with novel traits such as the time between "half of ear emergence above flag leaf ligule" and the onset of senescence at the flag leaf, as well as traits associated with plant morphology such as stem height. In addition, strong correlations between several traits and the onset of senescence of the flag leaf were identified. This work establishes the value of systematically phenotyping genetically unstructured populations to reveal the genetic architecture underlying morphological variation in commercial wheat.

Keywords: wheat, senescence, data science, phenology, phenotyping, MAGIC

#### INTRODUCTION

Wheat is a pillar of global food security, providing 20% of protein and calories consumed worldwide and up to 50% in developing countries. It is the main food staple in Central Asia, West Asia and North Africa, which have the world's highest per capita wheat consumption (Valluru et al., 2015). Global wheat production is at risk due to climate change, population growth, changing food preferences and the plant health challenges associated with its widespread cultivation. In order to maintain optimal production and profitability, wheat producers and processors must prepare for and adapt to these challenges. The current emphasis on food security has focused research attention on two avenues to improve wheat yield (Valluru et al., 2015): (1) increasing photosynthetic capacity and efficiency (Reynolds et al., 2009); and (2) increasing partitioning of assimilates to the developing spike and grain.

The timing of key developmental transitions is critical for many crops, but is particularly important in the temperate small grain cereals. For example, the transition from vegetative to reproductive growth can have major effects on biomass accumulation and harvest index that profoundly affect either the locales in which a variety can be profitably grown, or its ultimate use. Thus, crops destined for grain production should transition early, relative to the length of the growing season, to allow ripening, avoid stress, and achieve a high harvest index of grain to total biomass. Forage, biofuel or dual purpose crops could usefully transition later to allow greater total biomass accumulation, but this has to be tempered with the likelihood of deleterious stress/weather events. Flowering time, therefore, has been a key selection target since the beginning of domestication (Izawa, 2007), initially inadvertently but since modern breeding began, very directly.

Understanding the extent and basis of other aspects of phenological variation may be useful in breeding for yield potential and stress adaptation. In wheat, leaves can contribute up to 40% of the nitrogen incorporated by the grains on the fifteenth day after anthesis (Simpson et al., 1983). Therefore, lifespan of the leaves (hence yield) is a trade-off with N remobilization. Delayed leaf senescence (as in the stay-green effect), which maintains active photosynthesis for a longer period, can increase grain yields under certain circumstances (Gregersen et al., 2008). Conversely, accelerated senescence leads to low carbon (C) but high N remobilization, indicating plasticity of C and N remobilization during development, perhaps correlated with senescence. A better understanding of the genetic and environmental factors affecting these processes would help optimize C and N remobilization to the actively developing grains under different growth/stress conditions.

While several patterns of senescence have been proposed (Thomas and Howarth, 2000), an ideal senescence phenotype in wheat, and in cereals in general, still needs to be identified (Gregersen et al., 2008), perhaps because of strong and variable environmental effects. In monocarpic crops such as wheat the initiation of senescence typically leads to a massive remobilization of phloem-mobile nutrients from the senescing plant parts to developing sinks, such as seeds or grains (**Figure 1**; Gregersen et al., 2008; Distelfeld et al., 2014). Pathogen infection also interacts with developmental processes in a complex way and symptoms of senescence often accompany the progression of disease, although senescence can also be delayed in response to pathogen infection (Häffner et al., 2015).

A major constraint to progress in breeding for high yield varieties is access to appropriate and consistent selection environments. The selection environment plays a key role in the efficiency of the selection process. Since environmental variables are almost impossible to control under field conditions, the identification of specific genetic factors associated with crop yield becomes more challenging (Bentley et al., 2013).

Modern controlled environment (CE) facilities allow control and/or recording of variable environmental parameters such as temperature and watering, which eliminates or reduces uncontrollable influences. Having control of, and access to, an experiment's environmental parameters allows for reproducibility and decreases uncertainty, because it is easier to reverse-engineer an experiment in order to identify, or at least justify, the causes of a given phenotype. It also minimizes the amount of replication needed per subject because of the low variability of the environment. Therefore, CE phenotyping offers closely defined conditions compared with the relatively homogeneous but less controllable growing conditions in a field plot.

To understand the genetic control of phenology and the onset of senescence in wheat, we screened a core set of the NIAB elite MAGIC wheat population (Mackay et al., 2014) across time. The eight founders of this MAGIC population were selected in partnership with UK wheat breeders to sample trait variation and germplasm important to current UK breeding programmes (Bentley et al., 2013). MAGIC populations combine high levels of genetic diversity, recombination and homozygosity (Mackay et al., 2014) to create a panel of recombinant inbred lines (RILs). A well-designed MAGIC population captures and immortalizes the variation released by intercrossing, thereby providing a stable well-defined population to be shared across sites and used across years.

In this study, plants from the elite MAGIC core set and its parents were scored throughout their life cycle to capture traits, including decimal growth stages, biomass and plant height. We discuss the use of a subset of MAGIC lines in the Smarthouse as a proof of concept that the combination of MAGIC + Smarthouse phenotyping should be repeated on a grander scale.

## MATERIALS AND METHODS

#### Plant Material

A subset of the NIAB Elite eight-founder MAGIC population described in Mackay et al. (2014) was used for all phenotypic screening. The complete population consists of approximately 1000 recombinant inbred lines (RILs) generated from three cycles of recombination between eight elite United Kingdom wheat varieties (Alchemy, Brompton, Claire, Hereward, Rialto, Robigus, Soissons, Xi-19) followed by five rounds of selfing to derive RILs. Further information about the population, including pedigree, genotype, and existing phenotype data can be found at www.niab.com/MAGIC.

The core set used in this study was selected to represent all funnels of the 210 8-way crosses within the population. Two funnels were not represented due to limited seed availability making a total of 208 RILs in the core set.

## Glasshouse Cultivation

Plants were grown between mid-January 2015 and mid-April 2015 in the National Plant Phenomics Centre facilities in Aberystwyth, UK. The eight parents of the MAGIC population and four additional elite varieties (Avalon, Santiago, Cadenza, and Zircon) were grown with the 208 RILs (see Table S1) under well-watered conditions, with two replicates per genotype. Two seeds were sown in 8 × 8 cm pots of Levington F2 compost. After germination (on approximately 30/10/2014) the seedlings were thinned to one per pot and transferred to a controlled environment room for vernalization (5 °C, 16 h daylength) for 9 weeks. Following vernalization, plants were transferred to 15 × 15 × 20 cm pots of M2 compost. Field capacity and dry matter content of the compost were determined. Plants were transferred to the growth chamber, where each pot was placed into a cart on a conveyor system. Pots were weighed and watered automatically to 75% gravimetric water content daily. Growth conditions were 14 h daylength using 600 W sodium lamps to supplement (350 µmol/m<sup>2</sup>/s) natural lighting, with temperature settings of 18 °C (day) and 15 °C (night). Plant hygiene was monitored by visual inspection throughout the experiment, with an appropriate prophylactic and responsive spraying regime. Once the ears started to ripen, plants were removed from the system and allowed to finish ripening naturally with reduced watering. Reduced watering only occurred after all plants had passed flag leaf senescence.

#### Phenotyping

Plants were manually scored for developmental stages according to the Zadoks scale (Zadoks et al., 1974) three times per week, and scored as days after sowing (DAS) when plants reached growth stage 39 (GS39; flag leaf fully emerged), GS55 (ear 50% emerged), GS65 (50% anthesis), and the onset of flag leaf senescence (FLS). At the end of the experiment, plants were harvested and above ground biomass (PW), tiller number (TN), plant height (PH), stem height (SH), top internode length (TIL), first/second/third ear length (FEL, SEL, TEL), first/other ear weight (FEW/OEW), and first flag leaf length (FFLL) were scored. The number of days between GS39 and GS55 (d1), GS55 and GS65 (d2) and GS55 and FLS (d3) were also determined. A multiple linear regression model (MLRM) was fitted to identify predictors of FLS among all the traits used in the analysis. A list of traits, abbreviations and Crop Ontology terms (Shrestha et al., 2012) is provided in **Table 1**.
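The derived intervals d1, d2, and d3 follow directly from the DAS scores. The short Python sketch below illustrates the arithmetic; the function name `stage_intervals` and the example scores are hypothetical and are not taken from the study data.

```python
from typing import Dict


def stage_intervals(das: Dict[str, int]) -> Dict[str, int]:
    """Return d1, d2 and d3 (in days) from days-after-sowing (DAS) scores."""
    return {
        "d1": das["GS55"] - das["GS39"],  # flag leaf emerged -> ear 50% emerged
        "d2": das["GS65"] - das["GS55"],  # ear 50% emerged -> 50% anthesis
        "d3": das["FLS"] - das["GS55"],   # ear 50% emerged -> flag leaf senescence
    }


# Hypothetical plant; the DAS values are invented for illustration only.
example = {"GS39": 120, "GS55": 131, "GS65": 137, "FLS": 160}
intervals = stage_intervals(example)
```

Because all three intervals are differences of scored dates, any scoring error in GS55 propagates to d1, d2, and d3 simultaneously.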

## Genotyping

The lines were genotyped using the Illumina Infinium iSelect 80,000 SNP wheat array ("80K array," http://www.illumina.com/), described in Wang et al. (2014). 20,639 SNP markers were scorable and polymorphic, of which 18,601 were successfully mapped in the MAGIC population (Mackay et al., 2014; Gardner et al., 2016). The linkage map generated with mpMap is reported in Gardner et al. (2016).

#### Plant Stress

The young plants showed symptoms that included chlorosis and necrosis (Figure S1). The chlorotic symptoms consisted of yellow areas surrounding lesions on the leaf blades. The necrotic symptoms comprised brown spots, lens-shaped lesions, surrounded by yellow borders. Although symptoms were controlled by routine spraying (Priori Xtra, Syngenta), we speculated that it constituted an undiagnosed disease (possibly Septoria) and the degree of infection was scored manually using the seedling infection type (IT) score shown in Table S1. Plant visual stress symptoms were scored first at GS31–39 and for the second time around GS70–GS80, and the average was calculated (SM). These qualitative IT scores were converted to a numerical scale for statistical analysis.

#### Statistical and Quantitative Trait Locus Analysis

Statistical analyses were performed in the R environment (R Core Team, 2013). Quantitative Trait Locus (QTL) analysis was performed using the R package HAPPY for multi-parental population analyses (Mott et al., 2000). The genetic analysis of multi-parental populations requires a haplotype-based approach because single marker association or interval mapping can fail to detect a QTL if the causative alleles are not dispersed among the founders with the same strain distribution pattern as the linked markers (Mott et al., 2000).

## QTL Mapping

HAPPY's analysis is essentially a two-stage process: ancestral haplotype reconstruction using dynamic programming, followed by QTL testing by linear regression. Let F<sub>iLst</sub> be the probability that individual i carries ancestral alleles s, t at locus labeled L, conditional upon all the genotype data for the individual, and let T<sub>st</sub> be the unknown contribution of the allele pair s, t to the phenotype. Then the expected phenotype is

$$y = \sum_{s,t} T_{st} F_{iLst},$$

and the T's are estimated by a linear regression of the observed phenotypes on these expected values across all individuals, followed by an analysis of variance to test whether the progenitor estimates differ significantly.

#### TABLE 1 | Plant trait descriptions.

*TO, Trait Ontology (Liang et al., 2008).*

*CO, Crop Ontology (Shrestha et al., 2012).*

*\*DAS, Days after sowing.*

*Wheat ontology (Leo Valette, Bioversity, France, Personal Communication).*

*Unit is the metric of the trait. Trait id is the TO or CO reference id.*


The models are presented here in the linear model framework (i.e., least-squares estimation, with ANOVA F-tests).

For an additive QTL, the parameters are the strain effect sizes; for a full interaction model there is a parameter for every possible strain combination. Then the one-QTL model is

$$E(y) = X_L t_L.$$

There are S(S − 1)/2 + S parameters (where S is the number of strains) to be estimated in a full model allowing for interactions between the alleles within the locus, and S − 1 parameters in an additive model. For the full model, the i, j'th element of the design matrix X is related to the strain probabilities thus:

$$X_{Lij} = F_{iLst},$$

where

$$j(s,t) = \min\big(s + S(t-1),\; t + S(s-1)\big),$$

and for the additive model

$$X_{Lij} = \sum_{s} F_{iLsj}.$$

We used an additive model, in which the contributions of the alleles at the locus are assumed to act additively.
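To make the design-matrix construction concrete, the following Python sketch builds one row of X for a single individual at a single locus under both the full and the additive model. It is an illustration only and not the HAPPY implementation: the function names are hypothetical, the 3-founder probability matrix is invented, and it assumes that the strain probabilities are supplied as a symmetric S × S array over ordered founder pairs, so that the two orderings of a pair are accumulated in the same full-model column.

```python
from typing import Dict, List


def pair_index(s: int, t: int, S: int) -> int:
    """Column label min(s + S(t-1), t + S(s-1)) for 1-based founders s, t."""
    return min(s + S * (t - 1), t + S * (s - 1))


def full_row(F: List[List[float]], S: int) -> Dict[int, float]:
    """Full-model row: one entry per unordered founder pair (assumed layout)."""
    row: Dict[int, float] = {}
    for s in range(1, S + 1):
        for t in range(1, S + 1):
            j = pair_index(s, t, S)
            row[j] = row.get(j, 0.0) + F[s - 1][t - 1]
    return row


def additive_row(F: List[List[float]], S: int) -> List[float]:
    """Additive-model row: entry j is the sum over s of F[s][j]."""
    return [sum(F[s][j] for s in range(S)) for j in range(S)]


# Invented probabilities for S = 3 founders (entries sum to 1).
S = 3
F = [[0.50, 0.10, 0.00],
     [0.10, 0.20, 0.05],
     [0.00, 0.05, 0.00]]
full = full_row(F, S)
add = additive_row(F, S)
n_full_params = S * (S - 1) // 2 + S  # number of columns in the full model
```

With the eight founders used here, the same count gives S(S − 1)/2 + S = 36 columns for the full model, which is one practical reason to prefer the additive model in a core set of about 200 lines.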

Furthermore, when mapping QTLs in structured populations the evidence for the existence of a QTL has to be considered in the context of other QTLs, which might explain some of the same component of variation. Population structure can produce long-range correlations between genotypes and hence ghost QTL, although the LD analysis suggests that the MAGIC population is relatively immune to this phenomenon. Although the MAGIC population is relatively unstructured, and therefore can be analyzed one locus at a time, in order to ensure the evidence for a given QTL was not confounded with that for others, statistical significance was assessed based on permuting the phenotypes (1000 times) between individuals, repeating the model fit, and finding the top-scoring marker interval. The empirical distribution of the max −logP values was then used to assess statistical significance. This technique is useful for non-normally distributed phenotypes and for estimating region-wide significance levels. We used −logP = 4 as a threshold in the multiple QTL modeling to test for association and FDR = 0.05 to identify significantly differential markers. The dashed line in QTL plots corresponds to an FDR rate of 0.05 and is calculated using the qvalue package (Storey and Tibshirani, 2003). The p-value corresponding to a q-value of 0.05 is determined by interpolation. When there are no q-values less than 0.05, the dashed line is omitted.
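The permutation procedure can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions rather than the analysis used in the study: the squared Pearson correlation between a single genotype column and the phenotype stands in for the −logP of the haplotype regression, and the data, function names, and seeds are all invented.

```python
import random
from typing import List, Sequence, Tuple


def r_squared(x: Sequence[float], y: Sequence[float]) -> float:
    """Squared Pearson correlation, used here as a stand-in test statistic."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0.0 or syy == 0.0:
        return 0.0
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy * sxy / (sxx * syy)


def max_statistic(pheno: Sequence[float], loci: Sequence[Sequence[float]]) -> float:
    """Genome-wide maximum of the per-locus statistic."""
    return max(r_squared(locus, pheno) for locus in loci)


def permutation_threshold(pheno: Sequence[float], loci: Sequence[Sequence[float]],
                          n_perm: int = 1000, alpha: float = 0.05,
                          seed: int = 1) -> Tuple[float, List[float]]:
    """Permute phenotypes between individuals and collect the null maxima."""
    rng = random.Random(seed)
    shuffled = list(pheno)
    null_max = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks the genotype-phenotype link
        null_max.append(max_statistic(shuffled, loci))
    null_max.sort()
    threshold = null_max[min(n_perm - 1, int((1 - alpha) * n_perm))]
    return threshold, null_max


# Simulated data: 60 lines, 5 loci, with a real effect at the first locus only.
rng = random.Random(42)
n_lines = 60
loci = [[i % 2 for i in range(n_lines)]]
loci += [[rng.randint(0, 1) for _ in range(n_lines)] for _ in range(4)]
pheno = [2.0 * g + rng.gauss(0.0, 0.5) for g in loci[0]]

observed = max_statistic(pheno, loci)
threshold, null_max = permutation_threshold(pheno, loci)
p_emp = (1 + sum(v >= observed for v in null_max)) / (len(null_max) + 1)
```

Taking the maximum over all loci within each permutation is what makes the resulting threshold genome-wide rather than per-marker, and it requires no distributional assumption about the phenotype.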

In addition, a multiple linear regression model (MLRM) was fitted to identify predictors of FLS among all the traits used in the analysis. We selected FLS because the trait is used as an indicator of crop yield and biomass accumulation (Gan, 2014).

Principal component analysis (PCA) of normalized trait data was carried out to identify patterns between traits and genotypes. Biplots were used to show information on samples in a graphical manner (Kempton, 1984). PCA of marker data was carried out separately to test for population structure.
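As an illustration of how the share of variance captured by the leading components can be obtained from unit-variance traits, the sketch below runs PCA on the trait correlation matrix using power iteration with deflation. It uses only the Python standard library; the function names and the three toy traits are hypothetical, and a production analysis would normally rely on an established numerical library instead.

```python
import math
import random
from typing import List, Tuple

Matrix = List[List[float]]


def standardize(column: List[float]) -> List[float]:
    """Centre a trait and scale it to unit (sample) variance."""
    n = len(column)
    mean = sum(column) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / sd for v in column]


def correlation_matrix(columns: List[List[float]]) -> Matrix:
    """Correlation matrix of the traits (one list per trait)."""
    z = [standardize(c) for c in columns]
    n = len(columns[0])
    return [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]


def top_eigenpair(m: Matrix, n_iter: int = 500, seed: int = 0) -> Tuple[float, List[float]]:
    """Largest eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    rng = random.Random(seed)
    v = [rng.random() + 0.1 for _ in m]
    for _ in range(n_iter):
        w = [sum(mij * vj for mij, vj in zip(row, v)) for row in m]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return 0.0, v
        v = [x / norm for x in w]
    mv = [sum(mij * vj for mij, vj in zip(row, v)) for row in m]
    return sum(a * b for a, b in zip(v, mv)), v  # Rayleigh quotient


def explained_variance(columns: List[List[float]], k: int = 2) -> List[float]:
    """Fraction of total variance captured by each of the first k PCs."""
    m = correlation_matrix(columns)
    p = len(m)
    total = sum(m[i][i] for i in range(p))  # trace = sum of eigenvalues
    shares = []
    for _ in range(k):
        lam, v = top_eigenpair(m)
        shares.append(lam / total)
        # Deflation removes the component just found.
        m = [[m[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    return shares


# Toy traits: the second is a linear function of the first, the third is uncorrelated.
trait_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
trait_b = [2.0 * v + 1.0 for v in trait_a]
trait_c = [1.0, -2.0, 1.0, 1.0, -2.0, 1.0]
shares = explained_variance([trait_a, trait_b, trait_c], k=2)
```

In this toy case the two perfectly correlated traits collapse onto PC1, which therefore carries two thirds of the total variance, while the uncorrelated trait forms PC2.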

### QTL Validation

We used the R package mpMap to confirm QTL mapping results (Huang and George, 2011) and to analyse the effect of including marker covariates. QTL analysis is performed using interval mapping, then selected marker covariates are included in the linear model in a forward selection process.

#### Data Processing

Data were pre-processed using standard methods. Data corresponding to one replicate of the MAGIC line MEL 086-1 (note, all MAGIC lines are named with prefix "MEL") and another from one replicate of the elite line Avalon were removed due to seed infection. A small number of outliers (data points with suspicious values) were checked and, where possible, corrected. Missing values (2.19%) were imputed using multivariate imputation by chained equations (MICE) (van Buuren and Groothuis-Oudshoorn, 2011). Briefly, MICE operates under the assumption that, given the variables used in the imputation procedure, the missing data are Missing At Random (MAR), which means that the probability that a value is missing depends only on observed values and not on unobserved values (Schafer and Graham, 2002). MICE creates a number of datasets by imputing missing values. That is, one missing value in the original dataset is replaced by m plausible imputed values. We set m = 5 as the number of imputations. These values take imputation uncertainty into consideration. Statistics of interest are estimated from each dataset and then combined into a final estimate, and replicates were averaged (Zhang, 2016).
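The combining step can be illustrated with Rubin's rules, a standard way of pooling estimates across multiply imputed datasets. The text above does not state which pooling rule was applied, so the Python sketch below is an assumption-laden illustration for a trait mean only; the function name `pool_means` and all data values are invented.

```python
from statistics import mean, variance
from typing import List, Tuple


def pool_means(imputed: List[List[float]]) -> Tuple[float, float]:
    """Pool a trait mean over m imputed datasets using Rubin's rules.

    Returns the pooled estimate and its total variance.
    """
    m = len(imputed)
    estimates = [mean(d) for d in imputed]          # one estimate per dataset
    within = [variance(d) / len(d) for d in imputed]  # squared SE of each mean
    q_bar = mean(estimates)                          # pooled estimate
    w = mean(within)                                 # within-imputation variance
    b = variance(estimates)                          # between-imputation variance
    total = w + (1.0 + 1.0 / m) * b
    return q_bar, total


# Three observed values plus one missing value, imputed m = 5 times (invented).
observed = [10.0, 12.0, 14.0]
imputations = [11.0, 12.0, 13.0, 12.0, 12.0]
datasets = [observed + [value] for value in imputations]

pooled_mean, total_variance = pool_means(datasets)
mean_within = mean(variance(d) / len(d) for d in datasets)
```

The between-imputation term is what carries the imputation uncertainty mentioned above: the more the m imputed values disagree, the larger the total variance of the pooled estimate.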

## RESULTS

In this experiment, RILs, MAGIC parents (illustrated in **Figure 2**) and 4 other elite genotypes were grown to maturity over a single time span and within a single glasshouse chamber with controlled watering and supplementary lighting and heating. Phenotype data were curated, and missing values (accounting for 2.19% of data points) imputed. Figure S2 shows a comparison between original data (red dots) and imputed data (blue dots), which suggested a high similarity between the two distributions as indicated by the overlapping dots.

#### Analysis of Traits

The distributions of the traits are shown in Figure S3. Most traits showed similar distributions with the exception of the discrete trait SM, which was skewed (see **Table 1** for trait description); the skewness reflects that most RILs' scores were in the 0–2 range. **Figure 3** shows the frequency distributions for GS55, FLS, SH and d3. Pair-wise correlation analysis between all traits (Figure S3) identified strong correlations between FLS and GS39 (0.79), GS55 (0.73), GS65 (0.69) and d3 (0.70); between FEL, SEL, and TEL (>0.86); between TIL and SH (0.76); between PW and OEW (0.87); and between TEW and PW (0.87).

To determine if there was variation in duration between key developmental stages that was not simply a result of variation in overall developmental progression, we examined the time in days taken to progress from GS39 to GS55 (d1), from GS55 to GS65 (d2) and from GS55 to FLS (d3). Our MLRM identified d3 as a strong predictor (P < 0.05) of FLS, indicating that the time lapse between GS55 and FLS is a good candidate to predict FLS (**Figure 4**). In addition to d3, our MLRM identified other important predictors of FLS (P < 0.05). For example, the size of the flag leaf on the primary shoot, FFLL, was significantly (P < 0.05) associated with the timing of senescence (FLS). To illustrate this result, dot size in **Figure 4** represents FFLL, which was longer in RILs that senesced earlier. Among the MAGIC founders, Brompton, Hereward and Rialto senesced after Xi-19 and the two elite controls Zircon and Cadenza. The latter also had the shortest duration between GS39 and GS55. Previously, Mackay et al. (2011) reported Cadenza as the most environmentally sensitive variety detected in 8 years in Recommended List trials, showing a linear increase in yield with increasing summer rainfall. This supports the observations from the trait analysis that progression through different developmental processes, e.g., flowering vs. senescence, is controlled independently. **Figure 5** shows d3 in relation to FLS (the contrast of d2 with FLS is shown in Figure S5). Plants which senesced earlier took less time between GS55 and GS65 (d2) and between GS55 and FLS (d3).

To further evaluate the relationship between these traits, PCA was conducted over trait data scaled to have unit variance. Results of the analysis are shown on the biplot in **Figure 6**. The plot shows that PC1 and PC2 account for 48% of the total variance of the traits. Four clearly defined trait groups (anticlockwise) can also be seen in the plot: the first contains FLS, GS39, GS55, GS65, d1, d3, and TN; the second contains SEL, FEL and TEL; the third contains OEW, PW, TIL, SH, FEW, and FFLL; and the fourth contains HI and SM. Since (1) the smaller the angle between the trait vectors, the higher the correlation, and (2) trait values are smaller toward the middle of the plot and higher toward the edge, we can deduce that SM is negatively correlated with group one, which is confirmed by the correlations shown in Figure S4. The same argument applies to group one and group three, which is also confirmed by Figure S4.

Looking at the relationship between traits and MAGIC parents, Brompton and Hereward are positively correlated, tending to be slow to FLS (late to senesce) as indicated by their proximity to the FLS vector. In contrast, Soissons and Xi-19 show an opposite effect, rapidly reaching FLS as indicated by their location at the opposite side of FLS. Through this analysis we can also see that, in general, most RILs and parents have a similar overall phenome, as represented by their location close to the center of the plot. There are also a number of divergent phenotypes, such as the one at the bottom of the plot (**Figure 6**, right panel), which corresponds to MAGIC line MEL 091-1a, or the one at the top, which corresponds to MEL 089-1a. Looking closer, MEL 089-1a is proximal to PW and SH while MEL 091-1a is further away, which indicates these two lines contrast strongly for these particular traits. A picture of both lines taken on 27/04/2015 was added to the plot to facilitate interpretation. The plants show clearly contrasting differences in height and biomass.

#### Plant Stress Analysis

A low level of chlorotic and necrotic lesions was observed on the leaves early during the growth period. Symptoms (SM) were scored independently by two people at two time points and scores were averaged. When comparing symptom scores against the onset of FLS, we found that the more severely affected plants started senescence earlier than those plants that were mildly affected (**Figure 5**). Our MLRM also identified SM as a predictor of FLS (P < 0.05).

**Figure 6** confirms the negative correlation between FLS and SM, as indicated by the location of SM opposite FLS. This correlation is consistent with that of Mycosphaerella graminicola, where infection induces senescence by manipulating signaling pathways in plants (Mengiste, 2012). However, it should be noted that the precise identity of the putative pathogen could not be confirmed.

#### Quantitative Trait Loci

We evaluated whether trait variation could be ascribed to underlying genetic variation. The lines were genotyped using the Illumina Infinium iSelect 80,000 SNP wheat array ("80K array," http://www.illumina.com/), described in Wang et al. (2014). 20,639 SNP markers were scorable and polymorphic, of which 18,601 were successfully mapped in the MAGIC population (Mackay et al., 2014; Gardner et al., 2016); the linkage map for this population was produced using mpMap and reported in Gardner et al. (2016). First, we checked for signs of population structure. To do this, the marker-based relationship matrix (A) was calculated using the R package rrBLUP (Endelman, 2011), and a PCA was then performed by eigenvalue decomposition of A. Results are shown in Figure S7. The first PC accounts for less than 4% of the total spectrum, which confirms the expected absence of population structure. Genome mosaics corresponding to MEL 15-2, MEL 091-1a, and MEL 209-1 are shown in Figure S8. These decompose the lines' genomes into mosaics of founder haplotypes. The lines appear to be a random mix of the founders, which indicates an absence of gross population structure.

After confirming the absence of population structure, we used HAPPY (Mott et al., 2000) to test for association between each phenotype and the predicted founder haplotypes at each locus in the genome. We used the P-value threshold <10<sup>−4</sup> to call QTLs [−logP = 4, corresponding to a false discovery rate (FDR) = 0.05]. This analysis identified loci associated with two phenological traits, GS39 and GS55, and a number of traits such as SEL, SH, TIL, TEW, and HI. For GS39, three significant QTLs were found on chromosome 5A, at 201.36, 212.52, and 224.64 cM, corresponding to the markers BS00009369\_51, BS00021942\_51 and wsnp\_Ex\_c5978\_10478584, respectively (**Table 2**, **Figure 7**). For GS55, three QTLs were found on chromosome 5A, at 201.36, 216.05 and 227.66 cM, which correspond to BS00009369\_51, wsnp\_Ex\_rep\_c66689\_65011117, and Excalibur\_c7729\_144, respectively (**Figure 8**). In both cases, these three close peaks are likely to represent a single QTL. To confirm this hypothesis, we performed composite interval analysis using 5A as covariate and identified a single clear and strong marker on 5A (Figures S11A–E).

For SEL, four QTLs were identified on chromosome 5B at 60.65, 66.38, 90.8, and 92.31 cM, corresponding to wsnp\_Ex\_c6548\_11355524, BS00001101\_51, wsnp\_Ku\_c2185\_4218722, and RAC875\_c19099\_434, respectively (**Figure 9**).

For SH, one QTL was identified on chromosome 4D at 32.24 cM which corresponded to the marker RAC875\_c1673\_193 (**Figure 10**). A QTL for TIL was also identified in the same location (**Figure 11**). QTLs for OEW and PW were also identified on 4D at 26.97 cM corresponding to RAC875\_c6922\_291 (Figure S10A) and 4D at 32.24 cM corresponding to RAC875\_c6922\_291 (Figure S10B), respectively. We identified one QTL for TEW on chromosome 4D at 26.97 cM, corresponding to RAC875\_c6922\_291 (Figure S10D). All of these QTLs co-located with the semi-dwarfing gene Rht-D1 (Rht2) (Ellis et al., 2002). The Rht (Reduced height) genes Rht-B1 (Rht1) or Rht-D1 (Rht2) are present in many high-yielding, semi-dwarf varieties, where they offer simple genetic control of high harvest index and resistance to lodging (Flintham et al., 1997).

For HI, one weak QTL peak was identified on chromosome 2D at 55.4 cM, which corresponded to RAC875\_c6922\_291 (Figure S10E). Figure S9 shows a contrast between FLS and HI across all the MAGIC and elite lines. Cadenza and Soissons have some of the highest HIs and the shortest time to senesce. **Table 2** also shows that Soissons has the highest contribution (0.6) to that particular marker.

#### TABLE 2 | QTLs mapped for different traits.

*cM is the marker position. −logP is −log10(P) at the QTL peak; h<sup>2</sup> is the fraction of variance accounted for by QTL, after removing covariates. Chr is the chromosome. P-value is the genome-wide P-value for the QTL based on permutations. Alchemy, Brompton, Claire, Hereward, Rialto, Robigus, Soissons.*

#### DISCUSSION

This study screened a core set of lines derived from the NIAB wheat MAGIC population under Smarthouse conditions as a strategy to understand the physical and genetic relationship between different phenological traits. This strategy resulted in the detection of QTL interactions with novel traits, suggesting that the methodology should be taken further in the future.

Pair-wise correlation between all traits identified high positive (≥0.69) correlations between GS39, GS55, GS65 and senescence at the flag leaf (FLS), and between TEW and PW. There was also a negative correlation between FLS and length of the first flag leaf (FFLL), indicating that the shorter the flag leaf the more delayed was the start of senescence. Short flag leaves provide less nutrient assimilation; therefore the plant has to compensate by either living longer or producing a large number of tillers. Consistent with this idea, PC analysis also indicated a correlation between these traits. Suggestions that delayed leaf senescence leads to increased yield have been thrown into doubt (Borrill et al., 2015) but it may contribute under certain conditions. It will be interesting to see whether the correlation between leaf length and senescence is maintained under other environmental conditions. Genetically unstructured populations such as the MAGIC collection will be ideal to test whether experimental manipulation of the size of the sink (grain mass) can further modulate flag leaf senescence.

In order to see if there was any clustering of individuals according to phenotype, a projection plot of the MAGIC lines onto the first two PCs was generated (Figure S6). The map shows lines grouped around 3 clusters of traits, each with similar trait profiles: Group 1 contained Cadenza, Zircon, Xi-19, Claire, Alchemy, Soissons, and Robigus; Group 2 contained Brompton and Hereward; and Group 3 contained Avalon. Group 1 was early to senesce and Group 2 later. Group 3 contained smaller plants, as indicated by their location opposite the FEL, SEL, and TEL vectors.

FLS was also negatively correlated with disease resistance (SM), indicating that highly susceptible plants were more likely to trigger senescence early. Support for this perspective may come from the observation that Xi-19 had the earliest FLS of all the parents and controls (Figure S9) and the joint highest disease score. In the field, Xi-19 flowers considerably later than the earliest flowering parent, Soissons, which carries the Ppd-D1a allele for early flowering (Scarth et al., 1985).

After confirming the absence of population structure with PCA of the kinship matrix, QTL mapping identified a very strong marker (−logP > 8.00) on chromosome 5A associated with GS39 and GS55, which we believe is likely to correspond to the vernalization gene VRN-A1. This gene plays an important role in the vernalization process in diploid (Triticum monococcum) and polyploid wheat (Triticum aestivum) (Loukoianov et al., 2005; Kiss et al., 2014). However, using HAPPY, no significant QTL was found on chromosome 2D, the location of the Ppd-D1 locus, for GS39 or GS55. In the field, the presence of the Ppd-D1 allele in Soissons results in this line flowering 7–14 days earlier than the other MAGIC founder lines. These contrasting results between the field and the controlled environment room (CER) for Ppd-D1 and Vrn-A1 associated QTL suggest that the plants in this experiment might have experienced reduced vernalization as a result of a lack of cold treatment. However, this idea was discarded because inspection of CER records did not show any temperature discrepancies during vernalization. Another possibility is that plants displayed disease-like symptoms at an early stage. We also noticed that Cadenza, one of the 4 elites and a genotype that does not need vernalization, was one of the first to senesce, had the shortest duration between GS39 and GS55 and was more disease susceptible than the similarly early-flowering Soissons (**Figures 4**, **5**). The fact that Cadenza has no vernalization requirement might indeed suggest that plants were not fully vernalized or encountered a de-vernalizing effect. Whatever the cause, the results of the experiment appear to have been strongly affected by a vernalization issue. This may explain some of the "anomalous" behavior of Cadenza, Xi-19 and Zircon, all of which do not require vernalization.

Further insight is provided by QTL validation analyses using mpMap. With no covariates included in the QTL model, the mpMap interval mapping approach produces very similar results to HAPPY. However, many more QTLs (−logP > 10) are detected using a model with 10 covariates in mpMap, as can be seen in Figure S11. For GS39 and GS55, although the 5A QTL is still the highest peak, the Ppd-D1 marker is significant and detected as the third (GS39) or second (GS55) highest QTL. For FLS, 5A is also the most prominent QTL, but there is no evidence for a QTL around the Ppd-D1 locus. This supports the observations from the trait analysis that progression through different developmental processes, e.g., flowering vs. senescence, is controlled independently. Furthermore, QTL detected using mpMap for the length of the interval (d3) from flowering to senescence (Figure S11C) show a distinct pattern from both GS55 and FLS, although some loci are in common (e.g., 4D). As well as phenological traits, markers associated with morphological traits were also identified. For example, a strong marker on chromosome 4D was associated with stem height (SH) and top internode length (TIL), as well as ear weight (OEW) and above ground biomass (PW). In all these cases, the QTL interval includes the Rht-D1 (Rht2) locus. The Rht-D1b allele at this locus causes a semi-dwarfing phenotype in wheat, is strongly correlated with a reduction in height and several other morphological traits, and is segregating in the MAGIC population (dwarfing alleles are present in all parent lines except Robigus and Soissons). Interestingly, this locus also shows up in the highest QTL interval for the d3 developmental interval in the mpMap covariate analysis (Figure S11D), suggesting that there may be an independent effect on the timing of progression through development. The Rht-D1b allele has a premature stop codon resulting in reduced sensitivity to gibberellic acid, which has been associated with reduced plant height and earlier heading date (Wilhelm et al., 2013). Our analysis indicates there are differential effects on the duration of other developmental processes not directly related to height or flowering per se.

In addition to FLS, GS55, and GS39, the multiple covariate analysis also identified a strong (−logP > 18) peak on chromosome 7B (Figure S11E). Chromosome 7 has previously been associated with Septoria leaf blotch in an analysis of wheat-barley disomic addition lines. The highest level of resistance to infection by S. tritici was found in the H. vulgare chromosome addition line 7, followed by 4 and 6 (Rubiales et al., 2001).

Another interesting result is related to HI, for which a weak QTL peak was identified on chromosome 2D, the location of Ppd-D1. In the field, the presence of the Ppd-D1 allele in Soissons results in flowering 7–14 days earlier than the other MAGIC founder lines. In our analysis, Soissons has some of the highest HIs and the shortest time to senesce, but it also has the highest contribution (0.6) to that particular marker.

This study provides the first systematic phenological characterization of a wheat MAGIC population under controlled environment conditions. With careful developmental staging and end-of-life measurements of the MAGIC core set, we were able to identify previously detected QTL on chromosomes 5A and 4D associated with the onset of senescence at the flag leaf. This powerful multi-founder population captures much of the genetic variation present in elite cultivars, and a more detailed knowledge of fine-scale developmental and physiological patterns can be exploited for fine-tuning wheat's response to the environment. We have shown that the combination of MAGIC + Smarthouse can help extend the current understanding of developmental plasticity in elite wheat varieties, with potential application for responding to the adaptation challenges facing agriculture in a changing climate.

### AUTHOR CONTRIBUTIONS

Conceived and designed the study: AB and AC; analyzed the data: AC; assisted with QTL analysis: RM, KG, and IM; provided genetic data: KG and IM; provided scoring data: FC; wrote the paper: AC; provided comments and corrected the manuscript: All authors.

#### ACKNOWLEDGMENTS

Access to the National Plant Phenomics Centre was provided by a National Capability for Crop Phenotyping grant (BBSRC ref number BB/J004464/1). The creation of the NIAB MAGIC population was supported by BB/E007201/1. We are grateful to the team of National Plant Phenomics Centre for carrying out the experiments, particularly Julie Pruvost. Thanks are extended to Prof. Luis A. J. Mur (Aberystwyth University, UK) and Dr. Flavio M. Santana (Embrapa Wheat) for critical discussions.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.01540


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer FM and handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2016 Camargo, Mott, Gardner, Mackay, Corke, Doonan, Kim and Bentley. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Four Tomato FLOWERING LOCUS T-Like Proteins Act Antagonistically to Regulate Floral Initiation

Kai Cao<sup>1,2</sup>, Lirong Cui<sup>1</sup>, Xiaoting Zhou<sup>1</sup>, Lin Ye<sup>1</sup>, Zhirong Zou<sup>1</sup>\* and Shulin Deng<sup>2</sup>\*

*<sup>1</sup> State Key Laboratory of Crop Stress Biology for Arid Areas, Horticulture College, Northwest A&F University, Yangling, China, <sup>2</sup> Laboratory of Plant Molecular Biology, Rockefeller University, New York, NY, USA*

The transition from vegetative growth to floral meristems in higher plants is regulated through the integration of internal cues and environmental signals. We examined the molecular mechanism of flowering in the day-neutral plant tomato (*Solanum lycopersicum* L.) and the effect of environmental conditions on tomato flowering. Analysis of the tomato genome uncovered 13 PEBP (phosphatidylethanolamine-binding protein) genes, six of which were *FT*-like genes, named *SlSP3D*, *SlSP6A*, *SlSP5G*, *SlSP5G1*, *SlSP5G2,* and *SlSP5G3*. These six *FT*-like genes were analyzed to clarify their functional roles in flowering using transgenic and expression analyses. We found that SlSP5G, SlSP5G2, and SlSP5G3 proteins were floral inhibitors whereas only SlSP3D/SFT (*SINGLE FLOWER TRUSS*) was a floral inducer. *SlSP5G* was expressed at higher levels in long day (LD) conditions compared to short day (SD) conditions, while *SlSP5G2* and *SlSP5G3* showed the opposite expression patterns. The silencing of *SlSP5G* by VIGS (virus-induced gene silencing) resulted in tomato plants that flowered early under LD conditions, and the silencing of *SlSP5G2* and *SlSP5G3* led to early flowering under SD conditions. The higher expression levels of *SlSP5G* under LD conditions were not seen in *phyB1* mutants, and the expression levels of *SlSP5G2* and *SlSP5G3* were increased in *phyB1* mutants under both SD and LD conditions compared to wild type plants. These data suggest that *SlSP5G*, *SlSP5G2,* and *SlSP5G3* are controlled by photoperiod, and the different expression patterns of *FT*-like genes under different photoperiods may contribute to tomato being a day-neutral plant. In addition, PHYB1 mediates the expression of *SlSP5G*, *SlSP5G2,* and *SlSP5G3* to regulate flowering in tomato.

#### Edited by:

*John Doonan, Aberystwyth University, UK*

#### Reviewed by:

*Richard Macknight, University of Otago, New Zealand; Hao Peng, Washington State University, USA*

#### \*Correspondence:

*Zhirong Zou zouzhirong2005@hotmail.com; Shulin Deng sdeng@rockefeller.edu*

#### Specialty section:

*This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science*

Received: *08 November 2015* Accepted: *17 December 2015* Published: *11 January 2016*

#### Citation:

*Cao K, Cui L, Zhou X, Ye L, Zou Z and Deng S (2016) Four Tomato FLOWERING LOCUS T-Like Proteins Act Antagonistically to Regulate Floral Initiation. Front. Plant Sci. 6:1213. doi: 10.3389/fpls.2015.01213*

Keywords: tomato, floral repressor, floral activator, PEBP protein, FT-like genes, phytochromes

## INTRODUCTION

In flowering plants, the timing of the transition from the vegetative to the reproductive phase is a major event in the plant life cycle. Both physiological and genetic studies have revealed the complexity of the mechanisms that tightly control the switch from vegetative to reproductive growth in the apical meristem (Bernier et al., 1993; Shalit et al., 2009). The phosphatidylethanolamine-binding proteins (PEBPs), found in both angiosperms and gymnosperms, have evolved to become both activators and repressors of flowering and can be classified into three clades (Gyllenstrand et al., 2007; Karlgren et al., 2011). An example of this functional diversification is seen in the six PEBP family members of Arabidopsis. FLOWERING LOCUS T (FT) and TWIN SISTER OF FT (TSF), which belong to the FT-like clade, function as flowering activators; TERMINAL FLOWER 1 (TFL1), BROTHER OF FT AND TFL1 (BFT), and ARABIDOPSIS THALIANA CENTRORADIALIS (ATC), which belong to the TFL1-like clade, are usually flowering repressors; and MOTHER OF FT AND TFL1 (MFT), which defines the MFT-like clade, is predominantly a floral promoter (Karlgren et al., 2011).

In Arabidopsis, a long-day plant, FT is expressed in leaf phloem companion cells. This protein, which triggers floral development in the shoot apical meristem (SAM) under long day (LD) conditions, is a major output of the photoperiod pathway and controls the floral transition in response to changes in day length (Kardailsky et al., 1999; Kobayashi et al., 1999). CONSTANS (CO) encodes a zinc finger protein and promotes flowering under LD conditions (Putterill et al., 1995). In LD conditions, FT is activated by CO (Samach et al., 2000), and the FT protein then interacts with a novel endoplasmic reticulum membrane protein called FT-INTERACTING PROTEIN 1 (FTIP1; Liu et al., 2012). Following this interaction, FT is transported from the companion cells to the sieve elements and enters the SAM by mass flow, where it associates with the basic leucine zipper domain (bZIP) transcription factor FD to activate downstream targets such as SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1 (SOC1) and the floral meristem identity gene APETALA 1 (AP1; Abe et al., 2005; Wigge et al., 2005; Corbesier et al., 2007). TSF, another PEBP family protein, probably acts in a similar way to FT (Yamaguchi et al., 2005). In short day (SD) conditions, Arabidopsis flowering is controlled by a gibberellin pathway, which promotes flowering through the activation of the floral meristem identity gene LEAFY (LFY) with no involvement of any PEBP family proteins (Moon et al., 2003). In rice, an SD plant, the FT homolog Hd3a promotes flowering under SD conditions (Komiya et al., 2008, 2009). In the day-neutral plant tomato, the FT homolog SlSP3D/SFT (SINGLE FLOWER TRUSS) has been shown to encode the mobile florigen signal and to promote flowering (Molinero-Rosales et al., 2004).

Although almost all FT-like proteins act as floral activators, an antagonistic functional switch has occurred in Beta vulgaris (sugar beet) and Nicotiana tabacum (tobacco) because of gene duplication event(s) that generated other paralog(s). In sugar beet, the BvFT1 protein acts as an inhibitor of floral development whereas another FT-like protein, BvFT2, works as a promoter (Pin et al., 2010). Substitutions of specific amino acids can convert BvFT1 into a floral inducer and BvFT2 into a floral repressor (Pin et al., 2010). Of the four FT-like proteins in tobacco, NtFT1, NtFT2, and NtFT3 are floral inhibitors whereas only NtFT4 is a floral inducer (Harig et al., 2012). These data suggest that some FT-like proteins, which are evolutionarily more closely related to FT than to TFL1/CEN, have evolved into flowering repressors.

Phytochromes are the primary photosensory receptors of higher plants that perceive red and far-red light. These photochromic proteins exist in two photo-interconvertible isomeric forms: the red light absorbing form and the far-red light absorbing form (Hughes and Lamparter, 1999). Arabidopsis has five phytochrome genes, PHYA to PHYE, which encode the apoproteins of PHYA to PHYE, respectively (Quail et al., 1995). PHYB plays an inhibitory role in floral initiation in Arabidopsis; the phyB mutant flowers earlier than the wild type (WT) in both LD and SD conditions, but its early-flowering phenotype is more pronounced in SD than in LD conditions (Goto et al., 1991; Mockler et al., 1999). phyB mutants of pea, an LD plant (Weller and Reid, 1993), and of the SD plants sorghum (Childs et al., 1997) and rice (Izawa et al., 2002) showed early flowering and decreased photoperiodic sensitivity. PHYB delays flowering by suppressing the expression of FT in Arabidopsis (Endo et al., 2005) and of Hd3a in rice (Izawa et al., 2002). Tomato contains five phytochrome genes, named PHYA, PHYB1, PHYB2, PHYE, and PHYF (Hauser et al., 1997). The tomato PHYB1 is mainly involved in the de-etiolation response of seedlings, unfolding of the hypocotyl hook, cotyledon expansion, hypocotyl elongation, and anthocyanin accumulation (Kerckhoffs et al., 1997; Weller et al., 2000). However, the function of phytochromes in tomato flowering has not yet been reported.

Tomato is a photoperiod-insensitive plant that is perennial in its native habitat. The flowering time of tomato is measured by the number of leaves in the initial segment, which is rather stable under various environmental conditions (Kinet, 1977). Here, we performed expression and transgenic studies to clarify the functional roles of the four expressed FT-like genes in tomato. One of these genes has already been identified by Molinero-Rosales et al. (2004), whereas the other three have not been studied. We demonstrate the functional differentiation between these genes in controlling flowering through overexpression in Arabidopsis and VIGS-mediated knockdown in tomato. Our data suggest that, among the four expressed FT-like proteins, three act as floral repressors and only one functions as a floral promoter. We also show the expression profiles of tomato FT-like genes under LD and SD conditions in tomato WT and phy mutants. The evolution of antagonistic FT-like paralogs may be a common strategy in Solanaceous plants to fine-tune floral development in response to internal and environmental cues.

## MATERIALS AND METHODS

#### Plant Material and Growth Conditions

We used wild-type (WT) tomato (Solanum lycopersicum L.) cv. MoneyMaker as the control in this study. The phyA, phyB1, phyB2, and phyB1B2 mutants in the MoneyMaker background were provided by the Tomato Genetic Resource Center (Department of Vegetable Crops, University of California, Davis); their TGR accession numbers were LA4356, LA4357, LA4358, and LA4364, respectively. Tomato seeds were soaked in 50% bleach for 30 min. After the treatment, seeds were rinsed thoroughly in running water, then sown directly on germination paper and incubated at 25°C. After germination, seedlings were transferred onto commercial substrate and grown in a growth chamber under LD (16 h of light/8 h of dark) or SD (8 h of light/16 h of dark) conditions at 300 µmol m<sup>−2</sup> s<sup>−1</sup> and 25°C (both day and night).

To study the spatial expression patterns of FT-like genes, we extracted total RNA from leaf, apex, stem, flower, and root tissues, pooled from three 7-week-old plants. To follow diurnal changes in the expression of FT-like genes, leaves were harvested every 4 h for 24 h (0, 4, 8, 12, 16, 20, and 24 h) and pooled from three third leaves of 5-week-old plants. To study the effect of photoperiod on the expression of these genes, uniform 5-week-old plantlets were transferred from LD conditions to SD conditions and vice versa. Three different leaves at the same level were harvested 1, 2, and 3 days after the transfer.

#### Phylogenetic Analysis

Tomato protein sequences of the PEBP family members were downloaded from https://solgenomics.net/, Arabidopsis thaliana PEBP family members from https://www.arabidopsis.org/, the tobacco FT-like proteins reported by Harig et al. (2012) from https://solgenomics.net/, and the sugar beet FT-like proteins reported by Pin et al. (2010) from http://www.ncbi.nlm.nih.gov/. Protein sequences were aligned using the maximum-likelihood method implemented in ClustalW software (Thompson et al., 1994). A neighbor-joining (N-J) tree was produced from the results of 1000 bootstrap replicates using the ClustalW program.
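The bootstrap step mentioned above can be illustrated with a minimal Python sketch. It only shows the resampling idea (alignment columns drawn with replacement, one draw per replicate); it is not ClustalW's code, tree building is not shown, and the sequences and the function name `bootstrap_alignment` are hypothetical.

```python
import random


def bootstrap_alignment(alignment, rng):
    """Return one bootstrap replicate: alignment columns resampled with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    n_columns = lengths.pop()
    # Draw column indices with replacement; the same draw is applied to every sequence
    picked = [rng.randrange(n_columns) for _ in range(n_columns)]
    return {name: "".join(seq[i] for i in picked) for name, seq in alignment.items()}


# Toy alignment with made-up sequences (hypothetical, not from the paper)
toy_alignment = {
    "seqA": "MSRDPL-V",
    "seqB": "MSREPLIV",
    "seqC": "MPRDPLVV",
}

rng = random.Random(0)  # fixed seed so the sketch is reproducible
replicates = [bootstrap_alignment(toy_alignment, rng) for _ in range(1000)]
print(len(replicates), "replicates of", len(replicates[0]["seqA"]), "columns each")
```

In a full analysis, a tree would be inferred from each replicate, and the support for a branch would be the fraction of replicate trees that contain it.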

#### Gene Expression Studies

Total RNA was extracted using an RNeasy Plant Mini Kit (Qiagen) following the manufacturer's instructions. cDNA was synthesized using the SuperScript III First-Strand Synthesis System (Invitrogen) following the manufacturer's instructions. Real-time PCR was performed using SYBR Premix Ex Taq (TAKARA) in a Bio-Rad CFX96 real-time PCR system. ACTIN was used as an internal control. The primers used are listed in **Supplementary Table S1**. Real-time quantitative PCR was performed with three biological replicates, and each sample was assayed in triplicate.
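The text does not state how relative expression was calculated from the qPCR readout. As a hedged illustration, the sketch below assumes the widely used 2^−ΔΔCt approach with ACTIN as the reference gene, averages technical replicates within each sample, and reports the mean ± SE across biological replicates. All Ct values and function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference, ct_target_cal, ct_reference_cal):
    """Return the 2^-ddCt fold change of one sample relative to a calibrator.

    Each argument is a list of technical-replicate Ct values.
    """
    d_ct_sample = mean(ct_target) - mean(ct_reference)
    d_ct_calibrator = mean(ct_target_cal) - mean(ct_reference_cal)
    return 2 ** -(d_ct_sample - d_ct_calibrator)


def mean_and_se(values):
    """Mean and standard error of the mean across biological replicates."""
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical Ct values for illustration only (not data from the paper)
calibrator = {"target": [26.0, 26.1, 25.9], "actin": [18.0, 18.1, 17.9]}
biological_replicates = [
    {"target": [24.0, 24.1, 23.9], "actin": [18.0, 18.1, 17.9]},
    {"target": [24.3, 24.2, 24.4], "actin": [18.1, 18.0, 18.2]},
    {"target": [23.8, 23.9, 23.7], "actin": [17.9, 18.0, 17.8]},
]

fold_changes = [
    relative_expression(rep["target"], rep["actin"], calibrator["target"], calibrator["actin"])
    for rep in biological_replicates
]
avg, se = mean_and_se(fold_changes)
print(f"relative expression = {avg:.2f} ± {se:.2f}")
```

The 2^−ΔΔCt form assumes roughly equal amplification efficiency for the target and reference primers; if that does not hold, an efficiency-corrected model would be needed instead.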

#### Plasmid Constructs and Plant Transformation

The ORFs of SlSP3D (Solyc03g063100), SlSP5G (Solyc05g053850), SlSP5G2 (Solyc11g008640), and SlSP5G3 (Solyc11g008650) were amplified by PCR, cloned into the pENTR/3C vector (Invitrogen), and then transferred into pBCO-DC by recombination (Jang et al., 2007) using LR Clonase enzyme (Invitrogen). The resultant plasmids were used to transform A. thaliana (Col-0) plants by the floral dip method mediated by Agrobacterium tumefaciens strain GV3101 (Zhang et al., 2006). Transformed plants were selected on 0.8% agar media containing Murashige and Skoog salts, 0.5 g/L MES, 10 g/L sucrose, and 10 µg/L basta. Arabidopsis plants were grown in a growth chamber under LD conditions at a light intensity of 100 µmol m<sup>−2</sup> s<sup>−1</sup> at 20°C (day and night).

#### Virus-Induced Gene Silencing (VIGS) in Tomato Plants

The pTRV1 (pYL192) and pTRV2 (pYL156) vectors have been described by Liu et al. (2002). The pYL170 TRV2 vector was derived by cloning a PstI-blunt-DraIII fragment of pYL156 into EcoRI-blunt-DraIII-cut pCAMBIA3301. This vector is identical to pYL156 except for the plant selection marker. To generate pTRV2-SlPDS, pTRV2-SlSP5G, pTRV2-SlSP5G2, and pTRV2-SlSP5G3, cDNA fragments were PCR-amplified from a tomato cv. MoneyMaker cDNA library using the primers described in **Supplementary Table S1**. The resulting PCR products were cloned into EcoRI-BamHI-cut pTRV2 (pYL170).

One-week-old tomato seedlings were used for the VIGS assay. pTRV1 and pTRV2 or its derivatives were introduced into A. tumefaciens strain GV3101, and the Agrobacterium strains were mixed. A 5-mL culture was grown for 16 h at 28°C in 50 mg/L gentamycin and 50 mg/L kanamycin. The next day, the culture was inoculated into 30 mL of Luria-Bertani medium containing antibiotics, 10 mM MES, and 20 mM acetosyringone. The culture was grown for 16 h in a 28°C shaker (200 r.p.m.). A. tumefaciens cells were harvested and resuspended in infiltration media (10 mM MgCl<sub>2</sub>, 10 mM MES, and 200 mM acetosyringone), adjusted to an OD<sub>600</sub> of 1.5, and left at room temperature for 3–4 h. Agroinfiltration was performed with a needleless 1-mL syringe into two tomato cotyledons (Velásquez et al., 2009).

## RESULTS

#### Identification and Phylogenetic Classification of Tomato FT-Like Genes

To identify FT-like proteins encoded by the tomato genome, the amino acid sequence of the Arabidopsis FT protein was used to perform a BLAST search against the tomato whole-genome database (https://solgenomics.net/). A total of 13 predicted PEBP genes were identified and annotated. A previous study classified the plant PEBP family into three main clades, described as FT-like, TFL1-like, and MFT-like (Chardon and Damerval, 2005). To evaluate the evolutionary relationships among the tomato, tobacco, sugar beet, and Arabidopsis FT-like proteins, specific and combined phylogenetic analyses based on their amino acid sequences were performed. We created a maximum-likelihood tree from an alignment of the 13 tomato PEBP proteins, the six Arabidopsis PEBP proteins, the tobacco FT-like proteins (NtFT1-NtFT4), and the sugar beet BvFT1 and BvFT2 proteins. **Figure 1** shows that there were six FT-like genes, five TFL1-like genes, and two MFT-like genes in the tomato PEBP family. PEBP family proteins contain two key motifs, a putative ligand-binding pocket and an external loop. Protein sequence alignment also revealed a change of an amino acid residue from Tyr in tomato FT-like proteins to His in TFL1-like proteins at the entrance of the binding pocket (**Supplementary Figure S1**). This amino acid residue in part determines the functional difference between FT and TFL1 in Arabidopsis (Hanzawa et al., 2005). Another amino acid residue was changed from Gln in tomato FT-like proteins to Asp in TFL1-like proteins in the external loop encoded by the fourth exon (**Supplementary Figure S1**). This is another critical residue for the functional difference between FT and TFL1 in Arabidopsis (Ahn et al., 2006). These results suggest that SlSP3D, SlSP6A, SlSP5G, SlSP5G1, SlSP5G2, and SlSP5G3 are FT-like genes.

Nucleotide sequence comparisons between the genomic and predicted CDS sequences allowed the identification of the exon-intron structures of the tomato PEBP genes. Tomato PEBP genes showed conserved genomic organization, and the exons were placed in identical positions relative to the amino acid sequence of the Arabidopsis PEBP gene family, except for SlSP5G1 and SlSP5G3 (**Supplementary Figure S2**). Exon length was quite conserved among the tomato FT-related genes themselves and in comparison with the Arabidopsis FT-related genes, but the introns differed in length. Among the FT-like genes, SlSP6A and SlSP5G1 were truncated by a premature stop codon in their last exon, and SlSP5G3 had only one 222 exon without any intron. In sugar beet and tobacco, FT-like proteins could be further divided into floral promoters and floral repressors.

#### Expression Pattern of FT-Like Genes in Different Organs Under LD and SD Conditions

To investigate the roles of the six tomato FT-like genes in flowering, we first monitored their expression levels in different organs. We isolated total RNA from the leaf, cotyledon, apex, stem, flower, and root tissues of 7-week-old tomato plants grown under LD and SD conditions. We compared the expression levels of SlSP3D, SlSP6A, SlSP5G, SlSP5G1, SlSP5G2, and SlSP5G3 with those of the housekeeping gene ACTIN by qRT-PCR. No expression of SlSP6A or SlSP5G1 was detected in any tissue. Considering that there are premature stop codons in their last exons (Carmel-Goren et al., 2003; Consortium, 2012), these two genes probably do not encode functional proteins and are in fact pseudogenes. SlSP3D, SlSP5G, SlSP5G2, and SlSP5G3 were mainly expressed in leaf and cotyledon under both LD and SD conditions (**Figures 2A,B**). A much higher expression level of SlSP5G was observed under LD conditions than under SD conditions, whereas SlSP5G2 and SlSP5G3 displayed the opposite expression pattern (**Figures 2A,B**). Under SD conditions, the number of leaves on the tomato main stem at flowering was eight on average, while this number increased to nine under LD conditions (**Figure 5A**).

#### Diurnal Rhythmic Expression Patterns of FT-Like Genes

To investigate the relationships among the four expressed FT-like genes, SlSP3D, SlSP5G, SlSP5G2, and SlSP5G3, we examined their diurnal expression patterns using the third leaves of 5-week-old seedlings. We performed qRT-PCR analyses using RNA from tomato plants grown in an LD or an SD diurnal cycle. The expression of SlSP3D peaked at 4 h after dawn under LD conditions (**Figure 3A**), confirming previous results (Shalit et al., 2009). Our results also revealed that SlSP5G was transcribed at dawn and that its expression peaked at the end of the day under LD conditions (**Figure 3B**). Under SD conditions, SlSP5G was constantly expressed at a lower level than under LD conditions (**Figure 3B**). The expression pattern of SlSP5G2 differed from that of SlSP5G: SlSP5G2 showed a higher expression level under SD conditions, with expression peaking after 4 h of light (**Figure 3C**). SlSP5G3 showed nearly the same diurnal oscillation pattern as SlSP5G2 and also peaked after 4 h of light under SD conditions (**Figure 3D**).

#### Tomato FT-Like Genes Have Antagonistic Functions in Floral Development in Transgenic Arabidopsis Plants

According to previous studies, SlSP6A and SlSP5G1 are not expressed in tomato plants (Abelenda et al., 2014; **Figures 2A,B**) and, consistent with this result, we failed to clone SlSP6A and SlSP5G1 from our tomato cDNA library. To investigate the functions of the other FT-like genes in tomato flowering, SlSP3D, SlSP5G, SlSP5G2, and SlSP5G3 were transferred into Arabidopsis plants under the control of the cauliflower mosaic virus (CaMV) 35S promoter.

Overexpressing SlSP3D led to early flowering in transgenic Arabidopsis (**Figure 4B**), confirming a previous report that SlSP3D is a flowering promoter (Molinero-Rosales et al., 2004). Overexpression of SlSP5G, SlSP5G2, and SlSP5G3 delayed flowering in transgenic Arabidopsis plants compared to wild-type controls (**Figures 4A,C–E**). The number of rosette leaves before flowering was seven in Col-0 under LD conditions. This number decreased to four in SlSP3D-overexpressing plants (line 1) and increased to 9.5 in SlSP5G3-overexpressing plants (line 1), 12.5 in SlSP5G-overexpressing plants (line 2), and 15.5 in SlSP5G2-overexpressing plants (line 2) under LD conditions (**Figure 4F**). There were four overexpression lines for each of SlSP3D, SlSP5G, SlSP5G2, and SlSP5G3; the numbers of rosette leaves before flowering in the other lines are shown in **Supplementary Figure S3**. These results indicate that SlSP3D is a floral promoter, and that SlSP5G, SlSP5G2, and SlSP5G3 are floral repressors.
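The rosette leaf counts quoted above can be turned into a small worked comparison. The Python sketch below uses only the values reported in the text (Col-0 and one line per construct) and labels each construct by the direction of the change. The variable and function names are illustrative, and the sketch does not test statistical significance.

```python
COL0_ROSETTE_LEAVES = 7.0  # Col-0 under LD conditions, as reported in the text

# Rosette leaves before flowering for the lines quoted in the text
rosette_leaves = {
    "SlSP3D (line 1)": 4.0,
    "SlSP5G3 (line 1)": 9.5,
    "SlSP5G (line 2)": 12.5,
    "SlSP5G2 (line 2)": 15.5,
}


def classify(leaves, control=COL0_ROSETTE_LEAVES):
    """Label a construct by the direction of the change relative to the control."""
    if leaves < control:
        return "promoter"   # fewer leaves at flowering, i.e., earlier flowering
    if leaves > control:
        return "repressor"  # more leaves at flowering, i.e., later flowering
    return "no effect"


summary = {
    line: (leaves - COL0_ROSETTE_LEAVES, classify(leaves))
    for line, leaves in rosette_leaves.items()
}

for line, (difference, label) in summary.items():
    print(f"{line}: {difference:+.1f} leaves vs. Col-0 -> floral {label}")
```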

#### The Effect of Photoperiod on the Expression of SlSP3D, SlSP5G, SlSP5G2, and SlSP5G3 Genes

**Figure 5B** shows that SlSP5G expression increased under LD conditions, while SlSP5G2 and SlSP5G3 expression increased under SD conditions. These results suggested that SlSP5G, SlSP5G2, and SlSP5G3 were targets of photoperiodic regulation. Therefore, we determined the expression levels of SlSP3D, SlSP5G, SlSP5G2, and SlSP5G3 in tomato plants grown under LD conditions for 4 weeks and then transferred to SD conditions for 3 days, and vice versa. SlSP3D expression did not change when tomato plants were transferred from LD conditions to SD conditions or from SD conditions to LD conditions (**Figure 5C**). Downregulation of SlSP5G and upregulation of SlSP5G2 and SlSP5G3 were apparent after tomato plants were transferred from LD conditions to SD conditions (**Figures 5D–F**). When tomato plants were transferred from SD conditions to LD conditions, we found a direct increase in SlSP5G expression and a decrease in SlSP5G2 and SlSP5G3 expression after only one LD photoperiod (**Figures 5D–F**). These results indicated that SlSP5G, SlSP5G2, and SlSP5G3 are directly regulated by day length.

#### Silencing of the Tomato SlSP5G, SlSP5G2, and SlSP5G3 Genes Using TRV-VIGS Vector

To study the function of SlSP5G, SlSP5G2, and SlSP5G3 in tomato flowering under LD and SD conditions, we constructed TRV-VIGS vectors to suppress the expression of the endogenous SlSP5G, SlSP5G2, and SlSP5G3 genes. A mixture of Agrobacterium cultures containing pTRV1 and pTRV2 carrying tomato SlSP5G (pTRV2-SlSP5G), SlSP5G2 (pTRV2-SlSP5G2), or SlSP5G3 (pTRV2-SlSP5G3) was infiltrated into the cotyledons of 1-week-old tomato plants. We also used the TRV-VIGS vector to suppress the expression of the endogenous phytoene desaturase gene (PDS) as a control. Tomato plants infected with pTRV-SlPDS developed a photo-bleached phenotype in the upper leaves 10 days post-agro-infiltration (**Supplementary Figure S4**). Under LD conditions, the number of leaves on the tomato main stem upon flowering was nine on average. However, this number was reduced to seven when the tomato plants were infected with pTRV1/pTRV2-SlSP5G (**Figure 6A**). Sixteen out of twenty tomato plants showed early flowering after infiltration with pTRV1/pTRV2-SlSP5G compared with tomato plants infiltrated with pTRV1/pTRV2. We also extracted RNA from leaves of early-flowering tomato plants to confirm by qRT-PCR that SlSP5G was indeed silenced. Primers that anneal to the SlSP5G gene outside the region targeted for silencing were used. In early-flowering tomato plants infiltrated with pTRV2-SlSP5G, SlSP5G expression was reduced significantly compared with the TRV-infected controls (**Figure 6B**). These results suggest that SlSP5G is a flowering repressor under LD conditions.

Under SD conditions, the number of leaves on the tomato main stem at flowering was eight on average; however, this number was reduced to 6.5 in tomato plants infected with pTRV1/pTRV2-SlSP5G2 and pTRV1/pTRV2-SlSP5G3 (**Figure 6C**). Fourteen out of twenty tomato plants showed slightly early flowering after infiltration with pTRV1/pTRV2-SlSP5G2 and pTRV1/pTRV2-SlSP5G3. RT-PCR also confirmed the decreased expression of SlSP5G2 and SlSP5G3 in infiltrated tomato plants (**Figure 6D**). These data suggest that SlSP5G2 and SlSP5G3 are factors that control tomato flowering under SD conditions.

#### Effects of Phytochrome B1 on the Expression of SlSP3D, SlSP5G, SlSP5G2, and SlSP5G3 Genes

As phytochromes are very important photoreceptors mediating flowering in both LD plants and SD plants (Izawa et al., 2002; Endo et al., 2005), we examined whether these photoreceptors have an effect on tomato flowering. We determined the number of leaves at flowering and the expression of the four expressed FT-like genes, SlSP3D, SlSP5G, SlSP5G2, and SlSP5G3, in seedlings of phyA, phyB1, phyB2, and phyB1B2 tomato mutants. We found that the number of leaves at flowering in phyA and phyB2 mutants was the same as that in WT under both LD and SD conditions. However, the number of leaves at flowering in phyB1 and phyB1B2 mutants was 6 and 6.5, respectively, under LD conditions, and this number was 8.5 on average under SD conditions (**Figures 7A–E**). In phyB1 and phyB1B2 mutants, there was a constant low expression level of SlSP5G under both LD and SD conditions (**Figures 7H,J**). However, there was a stable high expression level of SlSP5G in WT, phyA, and phyB2 mutants under LD conditions. Under SD conditions, there was a higher expression of SlSP5G2 and SlSP5G3 mRNA in phyB1 and phyB1B2 mutants compared to WT (**Figures 7F,H,J**). No difference was detected between the phyA and phyB2 mutants and WT in the expression levels of SlSP5G, SlSP5G2, and SlSP5G3 under both LD and SD conditions (**Figures 7F,G,I**). Together, these results clearly demonstrate that PHYB1 has a significant influence on the expression of SlSP5G, SlSP5G2, and SlSP5G3, and on the flowering time of tomato plants under both LD and SD conditions.

*SlSP5G* (D), *SlSP5G2* (E), and *SlSP5G3* (F) of tomato plants transferred from SD to LD conditions, and vice versa. All data are expressed as means ± SE of three independent pools of extracts. Three technical replicates were performed for each extract.

## DISCUSSION

#### FT-Like Genes Act Antagonistically to Regulate Floral Initiation in Tomato

Plant PEBP family proteins are divided into three major clades, with the FT-like and MFT-like clades primarily acting to promote and the TFL1-like clade primarily acting to repress floral development. In this study, we queried the complete tomato genome sequence and identified 13 PEBP genes, six of which belong to the FT-like clade, five to the TFL1-like clade, and two to the MFT-like clade. Of the six FT-like genes, SlSP3D, SlSP6A, SlSP5G, SlSP5G1, SlSP5G2, and SlSP5G3, two (SlSP6A and SlSP5G1) were not expressed in tomato plants. It has already been demonstrated that SlSP3D/SFT, the tomato ortholog of FT, induces flowering in day-neutral tomato and that sft mutants show a late-flowering phenotype (Molinero-Rosales et al., 2004; Lifschitz et al., 2006). Here, we show that transgenic Arabidopsis plants overexpressing SlSP3D flowered much earlier than control plants. FT-like proteins that promote flowering have been identified in many species such as Populus spp. (poplar; Böhlenius et al., 2006), Malus domestica (apple; Hättasch et al., 2008), B. vulgaris (sugar beet; Pin et al., 2010), Solanum tuberosum (potato; Navarro et al., 2011), N. tabacum (tobacco; Harig et al., 2012), and Oryza sativa (rice; Kojima et al., 2002).

Based on phylogenetic data, SlSP5G, SlSP5G2, and SlSP5G3 have been postulated to be orthologous to FT-like genes (Abelenda et al., 2014). However, overexpression of SlSP5G, SlSP5G2, or SlSP5G3 in Arabidopsis resulted in a late-flowering phenotype compared with control plants. In sugar beet and tobacco, FT-like genes can act as flowering promoters and repressors. The two sugar beet FT-like genes, BvFT1 and BvFT2, differ in three amino acid residues within the critical region encoded by the fourth exon (**Supplementary Figure S1**). Tyr (134), Gly (137), and Trp (138) are the three most important amino acids of the external loop of the BvFT2 protein. Substitution of these three amino acid residues in BvFT2 was sufficient to convert it into a repressor (Pin et al., 2010). Changing BvFT1 Asn (138) into Tyr, Gln (141) into Gly, and Gln (142) into Trp could completely revert its repressing function to a promoting function in flowering. Four FT-like proteins have been reported in tobacco. NtFT4 is a flowering activator, and its amino acid residues at the three conserved positions match those of Arabidopsis FT and BvFT2, whereas the corresponding positions in the floral repressors NtFT1-3 are not conserved (Harig et al., 2012). Through protein sequence alignment, we also found that in tomato SlSP3D the amino acid residues at the three critical positions were Tyr (133), Gly (136), and Trp (137), matching those of the other floral activators, such as FT in Arabidopsis, BvFT2 in sugar beet, and NtFT4 in tobacco (**Supplementary Figure S1**). However, the amino acid residues of SlSP5G at these three conserved positions were the same as those found in floral repressors such as NtFT1-3 in tobacco. The residues of SlSP5G2 and SlSP5G3 at these positions were not conserved compared with other floral activators and repressors (**Supplementary Figure S1**). These results suggest that SlSP5G, SlSP5G2, and SlSP5G3 were initially promoters of flowering but that mutations within the external loop converted their function to flowering repression. The three amino acids are critical for the activator vs. repressor function.
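The three-residue comparison described above can be written as a small lookup. This Python sketch uses only residues and positions quoted in the text (reading "Try" as Tyr): BvFT2, BvFT1, and SlSP3D. It checks each protein against the activator pattern Tyr/Gly/Trp and lists the substitutions needed to reach it. The data structure and function names are illustrative, and the residues of SlSP5G, SlSP5G2, and SlSP5G3 are not given in the text, so they are left out.

```python
ACTIVATOR_MOTIF = ("Tyr", "Gly", "Trp")  # external-loop residues of floral activators

# (position, residue) at the three critical sites, as quoted in the text
external_loop = {
    "BvFT2": ((134, "Tyr"), (137, "Gly"), (138, "Trp")),
    "BvFT1": ((138, "Asn"), (141, "Gln"), (142, "Gln")),
    "SlSP3D": ((133, "Tyr"), (136, "Gly"), (137, "Trp")),
}


def matches_activator_motif(sites):
    """True if the three critical residues are Tyr, Gly, Trp (in that order)."""
    return tuple(residue for _, residue in sites) == ACTIVATOR_MOTIF


def substitutions_to_activator(sites):
    """List the (position, from, to) changes needed to reach the activator motif."""
    return [
        (position, residue, wanted)
        for (position, residue), wanted in zip(sites, ACTIVATOR_MOTIF)
        if residue != wanted
    ]


for protein, sites in external_loop.items():
    print(protein, matches_activator_motif(sites), substitutions_to_activator(sites))
```

For BvFT1 the listed changes are the same three substitutions that the text reports as converting it from a repressor to a promoter. A match at these three sites is only an indicator, since the text notes that other residues also contribute to the FT versus TFL1 difference.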

#### The Expression Profiles of FT-Like Genes Are Influenced by Photoperiod

The expression of FT-like genes in many species is regulated in a photoperiod-dependent manner (Samach et al., 2000; Kojima et al., 2002). Termination and flowering in cultivated tomato are not sensitive to day length, but flower initiation occurs earlier and inflorescence development is far better in SD conditions than in LD conditions (Kinet, 1977). All four tomato FT-like genes were expressed exclusively in leaf tissue (**Figure 2**), as were the tobacco FT-like genes (Harig et al., 2012). In tobacco, NtFT1, NtFT2, and NtFT4 showed higher expression levels under SD conditions than under LD conditions (Harig et al., 2012). In sugar beet, the floral repressor BvFT1 was expressed at high levels when plants were grown in SD or in non-vernalized biennial plants that were not competent to flower (Pin et al., 2010). We also found that SlSP5G mRNA expression was up-regulated under LD conditions, while SlSP5G2 and SlSP5G3 mRNA increased under SD conditions (**Figure 5B**). The expression of SlSP3D was similar under both LD and SD conditions (**Figure 5B**). Although tomato is day-neutral with respect to flowering, the expression of the SlSP5G, SlSP5G2, and SlSP5G3 genes identified here seems to be photoperiod dependent. SlSP5G most likely controls tomato flowering under LD conditions, while SlSP5G2 and SlSP5G3 seem to regulate flowering under SD conditions. Tomato plants thus appear to have an adaptive mechanism that adjusts flowering according to photoperiod using a combination of different FT-like genes.

In this study, silencing of SlSP5G by the TRV-VIGS vector under LD conditions resulted in early flowering of tomato plants, and silencing of SlSP5G2 and SlSP5G3 by the TRV-VIGS vector under SD conditions also led to early flowering. These results likewise showed that SlSP5G, SlSP5G2, and SlSP5G3 are floral repressors. SELF PRUNING (SP) is a TFL1-like gene, and the SP protein functions as an anti-terminator, maintaining vegetative growth (Pnueli et al., 1998). Mutant sp plants form progressively shorter sympodial units, until the shoots terminate in two successive inflorescences. In many species the ratio of floral activators to repressors, e.g., the local ratio of SFT/SP3D (FT-like) to SP (TFL1-like), has been proposed to regulate local growth termination equilibria in all meristems of the tomato shoot system (Shalit et al., 2009; McGarry and Ayre, 2012). The three tomato FT-like floral repressors appear to have taken on the role usually played by TFL1 homologs in most other plants. Additional research is required to clarify how FT-like floral activators and repressors and SP set the timing of the developmental switch from vegetative to reproductive growth. Both SFT/SP3D and SP of tomato bind to 14-3-3 and bZIP (SPGB, a homolog of FD) proteins in yeast, but each protein also has its own specific binding proteins (Pnueli et al., 2001). In Arabidopsis, the FT protein is first transferred into the sieve elements and then transported by mass flow to the apex, where it interacts with FD to promote flowering (Abe et al., 2005; Wigge et al., 2005; Corbesier et al., 2007). In tomato, SlSP5G, SlSP5G2, and SlSP5G3 may act like SlSP3D/SFT and interact with SPGB to control flowering.

#### Phytochrome B1 Regulates FT-Like Genes

Phytochromes are photochromic proteins that regulate light responses under different light conditions (quantity, quality, and timing). Our data showed that, in the tomato phyB1 mutant, the expression of SlSP5G was very low under both LD and SD conditions, whereas the expression of SlSP5G2 and SlSP5G3 was always at a fairly high level compared with WT under both LD and SD conditions. Based on these results, we conclude that PHYB1 could promote the expression of SlSP5G under LD conditions but suppress the expression of SlSP5G2 and SlSP5G3 under both LD and SD conditions. It has been shown that PHYB has a general inhibitory effect on flowering in both LD plants and SD plants (Lin, 2000; Yanovsky and Kay, 2003). An inhibitory effect of PHYB on FT expression has been shown in Arabidopsis (Valverde et al., 2004; Endo et al., 2005). In rice, the phyB mutation abolishes the night break effect on flowering and Hd3a mRNA, and PHYB suppresses the expression of Hd3a (Izawa et al., 2002; Ishikawa et al., 2005).

Phytochromes need to interact with the circadian clock to regulate flowering time in different day-lengths, but the molecular details of such interactions remain unclear (Valverde et al., 2004; Song et al., 2012). phyB mutations of the SD plant sorghum and the LD plant Arabidopsis both caused an early flowering phenotype; the tomato phyB1 mutant also has an early flowering phenotype under LD conditions. One interpretation of this observation is that PHYB action may suppress floral initiation regardless of photoperiod, but the signal transduction or the plant's responsiveness to PHYB signaling is gated by the action of the circadian clock, resulting in different day-length responses in the flowering time of different plants.

FIGURE 8 | Model of the photoperiod effect on flowering in tomato. The expression of FT-like genes was regulated by photoperiod and mediated by phytochrome B1. In LD conditions, the expression of *SlSP5G* was induced, and the expression of *SlSP5G2* and *SlSP5G3* was inhibited. In SD conditions, the expression of *SlSP5G* was inhibited, and the expression of *SlSP5G2* and *SlSP5G3* was induced. The different expression patterns of tomato FT-like genes under different photoperiods may contribute to tomato being a day-neutral plant. Phytochrome B1 could promote the expression of *SlSP5G* and inhibit the expression of *SlSP5G2* and *SlSP5G3*.

Based on the results of this study, we propose a model to explain the photoperiod effect on tomato flowering (**Figure 8**). This model is consistent with all of our results and suggests that the expression pattern of these FT-like genes is regulated by photoperiod and mediated by PHYB1. In addition, the four tomato FT-like genes appear to act antagonistically to regulate floral initiation. Understanding the molecular mechanism of flowering in a day-neutral plant has important implications for agriculture. Further studies are required to integrate the knowledge obtained from model species such as the LD plant Arabidopsis and the SD plant rice to provide further insight into the mechanisms regulating flowering in day-neutral plants.

## AUTHOR CONTRIBUTIONS

Conceived and designed the experiments: SD and ZZ. Performed the experiments: KC. Analyzed the data: LC and KC. Contributed reagents/materials/analysis tools: XZ. Amplified the seed: LY. Wrote the paper: KC.

## ACKNOWLEDGMENTS

We thank the Tomato Genetic Resource Center (Department of Vegetable Crops, University of California, Davis) for providing all the tomato mutants, Dr. Nam-Hai Chua for guiding this project and editing the manuscript, and Dr. Haixi Sun for his help with the phylogenetic analysis.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.01213

Supplementary Table S1 | Sequences of primers used in this study for plasmid construction, quantitative RT-PCR and VIGS.

Supplementary Figure S1 | Multiple amino acid alignment of the PEBP domain of Arabidopsis, sugar beet, tobacco, and tomato PEBP family proteins. The vertical arrowhead indicates the crucial amino acid change responsible for the difference between FT-like and TFL-like functions identified by Hanzawa et al. (2005). Amino acid residues conserved in FT-like proteins that promote flowering, identified by Pin et al. (2010) and Harig et al. (2012), are shaded in red.

Supplementary Figure S2 | The exon-intron structures of tomato PEBP genes resemble that of AtFT. Boxed areas depict the exons and lines represent introns. Numbers represent exon and intron lengths (bp).

Supplementary Figure S3 | The number of rosette leaves before flowering in the other FT-like gene overexpression lines. All data are shown as mean ± SE of eight plants in each overexpression line.

Supplementary Figure S4 | Silencing of the PDS control gene causes photobleaching in tomato plants. Photographs were taken 4 weeks after silencing. (A) Tomato plant infected by *TRV-SlPDS* vectors. (B) Tomato plant infected by empty *TRV* vectors.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Cao, Cui, Zhou, Ye, Zou and Deng. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Physiological Traits Associated with Wheat Yield Potential and Performance under Water-Stress in a Mediterranean Environment

Alejandro del Pozo<sup>1</sup>\*, Alejandra Yáñez<sup>1,2</sup>, Iván A. Matus<sup>3</sup>, Gerardo Tapia<sup>3</sup>, Dalma Castillo<sup>3</sup>, Laura Sanchez-Jardón<sup>4</sup> and José L. Araus<sup>5</sup>

<sup>1</sup> Programa de Investigación de Excelencia Interdisciplinaria, Adaptación de la Agricultura al Cambio Climático (A2C2), Facultad de Ciencias Agrarias, Centro de Mejoramiento Genético y Fenómica Vegetal, Universidad de Talca, Talca, Chile, <sup>2</sup> Departamento de Ciencias Agrarias, Facultad de Ciencias Agrarias y Forestales, Universidad Católica del Maule, Curicó, Chile, <sup>3</sup> Centro Regional Investigación Quilamapu, Instituto de Investigaciones Agropecuarias, Chillán, Chile, <sup>4</sup> Centro Universitario de la Patagonia, Universidad de Magallanes, Coyhaique, Chile, <sup>5</sup> Unitat de Fisiologia Vegetal, Facultat de Biologia, Universitat de Barcelona, Barcelona, Spain

#### Edited by:

Edmundo Acevedo, University of Chile, Chile

#### Reviewed by:

Agata Gadaleta, University of Bari, Italy Cándido López-Castañeda, Colegio de Postgraduados, Mexico

> \*Correspondence: Alejandro del Pozo adelpozo@utalca.cl

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

> Received: 17 May 2016 Accepted: 21 June 2016 Published: 07 July 2016

#### Citation:

del Pozo A, Yáñez A, Matus IA, Tapia G, Castillo D, Sanchez-Jardón L and Araus JL (2016) Physiological Traits Associated with Wheat Yield Potential and Performance under Water-Stress in a Mediterranean Environment. Front. Plant Sci. 7:987. doi: 10.3389/fpls.2016.00987

Different physiological traits have been proposed as key traits associated with yield potential as well as performance under water stress. The aim of this paper is to examine the genotypic variability of leaf chlorophyll, stem water-soluble carbohydrate (WSC) content and carbon isotope discrimination (Δ<sup>13</sup>C), and their relationship with grain yield (GY) and other agronomical traits, under contrasting water conditions in a Mediterranean environment. The study was performed on a large collection of 384 wheat genotypes grown under water stress (WS, rainfed), mild water stress (MWS, deficit irrigation), and full irrigation (FI). The average GY of two growing seasons was 2.4, 4.8, and 8.9 Mg ha<sup>−1</sup> under WS, MWS, and FI, respectively. Chlorophyll content at anthesis was positively correlated with GY (except under FI in 2011) and the agronomical components kernels per spike (KS) and thousand kernel weight (TKW). The WSC content at anthesis (WSCCa) was negatively correlated with spikes per square meter (SM2), but positively correlated with KS and TKW under WS and FI conditions. As a consequence, the correlations between WSCCa and GY were low or not significant. Therefore, selecting for high stem WSC would not necessarily lead to genotypes with high GY potential. The relationship between Δ<sup>13</sup>C and GY was positive under FI and MWS but negative under severe WS (in 2011), indicating higher water use under yield potential and MWS conditions.

Keywords: carbohydrate, carbon isotope discrimination, chlorophyll, drought, stem reserves

## INTRODUCTION

Since the Green Revolution the yields of wheat and other cereals have increased considerably in many regions of the world, including Chile (Calderini and Slafer, 1998; Engler and del Pozo, 2013; del Pozo et al., 2014), as a result of genetic improvement and better agronomic practices. The yield potential, i.e., the yield achieved when the best available technology is used, has also increased almost linearly since the sixties, particularly in more favorable environments where soil water availability is not limited (Zhou et al., 2007; Fischer and Edmeades, 2010; Matus et al., 2012; del Pozo et al., 2014). Yield under water-limiting conditions, such as those of the rainfed Mediterranean environments, has also increased during the past decades (Sánchez-García et al., 2013). Notwithstanding the possible need for phenological adjustment (earliness), a higher yield potential may also translate into a higher performance under water stress (Nouri et al., 2011; Hawkesford et al., 2013). However, the potential yield and water-limited yield of wheat need to continue increasing in order to cope with future demand for food, which is a consequence of the growing population and changes in social habits (Fischer, 2007; Hawkesford et al., 2013), and also to reduce the negative impacts of global climate change on crop productivity (Lobell et al., 2008; Lobell and Gourdji, 2012).

The increase in the yield potential and stress adaptation of wheat has been attained mainly through empirical selection for grain yield (GY). However, there is evidence that phenotyping using physiological traits, as a complement to agronomic traits, may help in identifying selectable features that accelerate breeding for yield potential and performance under drought (Araus et al., 2002, 2008; Fischer, 2007; Foulkes et al., 2007; Cattivelli et al., 2008; Fleury et al., 2010). The increases in yield potential of wheat since the sixties have been positively correlated with both shoot dry matter and harvest index (HI), the latter also being positively associated with water-soluble carbohydrate (WSC) content of stems at anthesis (Foulkes et al., 2007). Under water-limiting conditions, various physiological processes and traits have been associated with GY (e.g., Araus et al., 2002, 2008; Condon et al., 2004; Reynolds et al., 2006; Tambussi et al., 2007). Among them are traits related to pre-anthesis accumulation of WSC in stems and its further use during grain filling (Ehdaie et al., 2006a,b; Reynolds et al., 2006), delays in senescence during grain filling assessed via changes in leaf color (Lopes and Reynolds, 2012), and those related to water use efficiency, in particular carbon isotope discrimination (Δ<sup>13</sup>C) in kernels (Richards et al., 2002; Araus et al., 2003, 2008).

WSCs are accumulated in stems prior to anthesis and are then remobilized to the grain during the grain-filling period (Blum, 1998; Bingham et al., 2007). Under water-limiting conditions, where canopy photosynthesis is inhibited, the contribution of stem carbohydrate to grain growth could be very significant (Ehdaie et al., 2006a,b; Reynolds et al., 2006). Both spring and winter wheat lines have been shown to vary significantly for WSC concentration and WSC content in stems around anthesis (Ruuska et al., 2006; Foulkes et al., 2007; Yang et al., 2007), and positive correlations have been observed between accumulated WSC at anthesis and GY in winter wheat genotypes (Foulkes et al., 2007), as well as with kernel weight in recombinant inbred lines (RILs) from the Seri/Babax population (Dreccer et al., 2009). However, stem WSC concentrations can be negatively correlated with stem number m<sup>−2</sup> (Dreccer et al., 2013).

Drought increases senescence, by accelerating chlorophyll degradation, leading to a decrease in leaf area and canopy photosynthesis. There is evidence that stay-green phenotypes with delayed leaf senescence can improve their performance under drought conditions (Rivero et al., 2007; Lopes and Reynolds, 2012).

2012 (C,D). Each soil depth value is the mean of two sensors (replicates). Abbreviations H and PM refer to the dates of heading and physiological maturity, respectively. Dates of irrigation (i) at Santa Rosa are marked with arrows.

Δ<sup>13</sup>C can be used as a selection criterion for high water use efficiency (Condon et al., 2004; Richards, 2006), but can also provide an indirect determination of the effective water used by the crop (Araus et al., 2002, 2008; Blum, 2009). In fact, kernel Δ<sup>13</sup>C can be positively or negatively correlated with GY depending on soil water availability. Indeed, under moderate stress to well-watered Mediterranean conditions Δ<sup>13</sup>C has been reported to be positively correlated with GY in wheat (Araus et al., 2003, 2008) and barley (Acevedo et al., 1997; Voltas et al., 1999; del Pozo et al., 2012), whereas the opposite trend has been reported under severe drought conditions (but see Araus et al., 1998).

In this study we investigated the genotypic variability of flag leaf chlorophyll content (measured with a portable leaf meter), stem WSC accumulation at anthesis and the Δ<sup>13</sup>C of mature kernels, as well as the relationship of these traits with GY and its agronomical components, in spring bread wheat under contrasting water conditions in a Mediterranean environment. It is hypothesized that within a large set (384 genotypes) of cultivars and advanced lines of spring bread wheat there is high genotypic variability for agronomic and physiological traits. In addition, it is hypothesized that the yield performance of genotypes under drought conditions is associated with stem WSC accumulation, delayed leaf senescence, and carbon isotope discrimination in grains.

## MATERIALS AND METHODS

#### Plant Material and Growing Conditions

A collection of 384 cultivars and advanced semidwarf lines of spring bread wheat (Triticum aestivum L.) was evaluated. It included 153 lines from the wheat breeding program of the Instituto de Investigaciones Agropecuarias (INIA) in Chile, 53 from the International Maize and Wheat Improvement Center (CIMMYT) that were previously selected for adaptation to Chilean environments (these lines share common ancestors with the INIA-Chile breeding program), and 178 lines from INIA in Uruguay (Table S1). The objective with this set of lines was to create a germplasm base to breed for drier areas in Chile and subsequently other countries within the projects involved.

This large set of genotypes was evaluated at two Mediterranean sites in Chile: Cauquenes (35° 58′ S, 72° 17′ W; 177 m.a.s.l.) under the water stress (WS) typical of rainfed conditions at this site, and Santa Rosa (36° 32′ S, 71° 55′ W; 220 m.a.s.l.) under full irrigation (FI) and moderate water stress (MWS) conditions achieved through support irrigation. Trials were conducted during two consecutive crop seasons (2011 and 2012), except for the MWS trial, which was only set up during 2011. Cauquenes corresponds to the Mediterranean drought-prone area of Chile; the average annual temperature is 14.7 °C, the minimum average is 4.7 °C (July) and the maximum is 27 °C (January). The evapotranspiration is 1200 mm (del Pozo and del Canto, 1999) and the annual precipitation was 410 and 600 mm in 2011 and 2012, respectively. Santa Rosa corresponds to a high-yielding area; the average annual temperature in this region is 13.0 °C, the minimum average is 3.0 °C (July) and the maximum is 28.6 °C (January; del Pozo and del Canto, 1999). The annual precipitation was 736 and 806 mm in 2011 and 2012, respectively.

The experimental design was an α-lattice with 20 incomplete blocks per replicate, each block containing 20 genotypes. In each replicate two cultivars (Don Alberto and Carpintero) were included eight times. Two replicates per genotype were used, except at Cauquenes and Santa Rosa MWS (support irrigation) in 2011, where a single replicate was established. Plots consisted of five rows of 2 m in length with 0.2 m between rows. The sowing rate was 20 g m<sup>−2</sup> and sowing dates were 07 September 2011 and 23 May 2012 at Cauquenes, and 31 August 2011 and 7 August 2012 at Santa Rosa. Because the sowing date in 2011 at Cauquenes was much later than in 2012, the water stress was more severe in the first year. Plots were fertilized with 260 kg ha<sup>−1</sup> of ammonium phosphate (46% P<sub>2</sub>O<sub>5</sub> and 18% N), 90 kg ha<sup>−1</sup> of potassium chloride (60% K<sub>2</sub>O), 200 kg ha<sup>−1</sup> of sul-po-mag (22% K<sub>2</sub>O, 18% MgO, and 22% S), 10 kg ha<sup>−1</sup> of boronatrocalcite (11% B), and 3 kg ha<sup>−1</sup> of zinc sulfate (35% Zn). Fertilizers were incorporated with a cultivator before sowing. During tillering an extra 153 kg ha<sup>−1</sup> of N was applied. Weeds were controlled with a pre-emergence application of Flufenacet + Flurtamone + Diflufenican (96 g a.i.) and a post-emergence application of MCPA (525 g a.i.) + Metsulfuron-methyl (5 g a.i.). Cultivars were disease resistant and no fungicide was used.

Furrow irrigation was used in Santa Rosa: one irrigation at the end of tillering (Zadoks stage 21; Zadoks et al., 1974) in the MWS trial, and four irrigations, at the end of tillering, the flag leaf stage (Z37), heading (Z50), and middle grain filling (Z70), in the FI trial. Soil moisture at 10–20, 20–30, 30–40, and 40–50 cm depth was determined using 10HS sensors (Decagon Devices, USA) connected to an EM-50 data logger (Decagon Devices, USA). The 10HS sensor determines volumetric water content by measuring the dielectric constant of the soil using capacitance/frequency domain technology. Two sets of sensors were set up in each environment and mean values of two sensors per depth are presented in **Figure 1**.

#### Agronomical Traits

Days from emergence to heading (DH) were determined in Santa Rosa through periodic (twice a week) observations, with heading recorded when approximately half of the spikes in the plot had extruded. At maturity, plant height (PH), up to the tip of the spike (excluding awns), was measured in each plot of the different trials; the number of spikes per m<sup>2</sup> (SM2) was determined from a 1 m length of an inside row; and the number of kernels per spike (KS) and the thousand kernel weight (TKW) were determined on 25 spikes taken at random. Grain yield was assessed by harvesting the whole plot.
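The scaling from a sampled row segment to per-area yield components can be illustrated with a short sketch. It assumes that a 1 m row segment represents an area equal to row length times the 0.2 m row spacing given for the plots, and that kernels per m<sup>2</sup> (KM2) is the product of SM2 and KS; neither assumption is stated explicitly in the text, and the function names and counts are hypothetical.

```python
# Sketch only: function names and example counts are hypothetical.
ROW_SPACING_M = 0.2          # distance between rows (from the plot description)
SAMPLED_ROW_LENGTH_M = 1.0   # length of inside row used for spike counts


def spikes_per_m2(spike_count, row_length_m=SAMPLED_ROW_LENGTH_M,
                  row_spacing_m=ROW_SPACING_M):
    """Scale a spike count from a sampled row segment to spikes per m2.

    Assumes the segment represents an area of row length x row spacing.
    """
    sampled_area_m2 = row_length_m * row_spacing_m
    return spike_count / sampled_area_m2


def kernels_per_m2(sm2, ks):
    """KM2 taken as spikes per m2 times kernels per spike (assumption)."""
    return sm2 * ks


def component_yield_mg_ha(sm2, ks, tkw_g):
    """Yield implied by the components, converted from g m-2 to Mg ha-1."""
    grain_g_m2 = kernels_per_m2(sm2, ks) * tkw_g / 1000.0
    return grain_g_m2 / 100.0  # 1 g m-2 equals 0.01 Mg ha-1


if __name__ == "__main__":
    sm2 = spikes_per_m2(80)  # 80 spikes in 1 m of row (made-up value)
    print(round(sm2), round(kernels_per_m2(sm2, 40)),
          round(component_yield_mg_ha(sm2, 40, 45.0), 2))
```

The component-based yield is only a consistency check; in the study GY was measured directly by harvesting whole plots.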

#### Leaf Chlorophyll Content and Water-Soluble Carbohydrates

Chlorophyll content (SPAD index) was determined at anthesis and then during grain filling, about 2 weeks after anthesis (both measured on given calendar dates), in five flag leaves per plot using a SPAD 502 (Minolta Spectrum Technologies Inc., Plainfield, IL, USA) portable leaf chlorophyll meter. WSC concentration in stems (harvested at ground level and excluding leaf laminas and sheaths) was determined at anthesis and maturity, on five main stems per plot, using the anthrone reactive method (Yemm and Willis, 1954). Stem length was measured, and the stems were then dried for 48 h at 60 °C, weighed and ground. Next, a 100 mg subsample was used for WSC extraction with 3 ml of extraction buffer containing 80% ethanol and 10 mM Hepes-KOH (pH = 7.5), and incubated at 60 °C overnight. Then, to separate the debris, the samples were centrifuged at 60 rpm for 30 min. The anthrone reagent was added to each supernatant and the mixture was placed over a hotplate at 80 °C for 20 min. Finally, the absorbance of the sample was measured at 620 nm in an EPOCH microplate UV-Vis Spectrophotometer (Biotek) using COSTAR 3636 96-well plates (Corning) for the UV range. WSC was expressed as concentration per unit stem weight (mg CHO g<sup>−1</sup> stem), and WSC content was calculated per whole stem (mg CHO stem<sup>−1</sup>) and per unit land area (g CHO m<sup>−2</sup>). In addition, the apparent WSC remobilization was calculated as the difference in WSC content between anthesis and maturity, on a stem and land area basis.
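As a minimal sketch of the WSC calculations described above, the following assumes that content per stem is concentration multiplied by stem dry weight, and that the per-area value is obtained by multiplying by the number of stems per m<sup>2</sup>. The text does not state how the area scaling was done, so that step, the function names, and the numbers are illustrative assumptions.

```python
# Sketch only: names and example values are hypothetical.

def wsc_content_per_stem(conc_mg_per_g, stem_dry_weight_g):
    """WSC content per stem (mg CHO per stem) from concentration and stem weight."""
    return conc_mg_per_g * stem_dry_weight_g


def wsc_content_per_area(content_mg_per_stem, stems_per_m2):
    """WSC content per unit land area (g CHO m-2); area scaling is an assumption."""
    return content_mg_per_stem * stems_per_m2 / 1000.0


def apparent_remobilization(content_anthesis, content_maturity):
    """Apparent remobilization as anthesis content minus maturity content."""
    return content_anthesis - content_maturity


if __name__ == "__main__":
    anthesis = wsc_content_per_stem(200.0, 1.5)   # mg CHO per stem
    maturity = wsc_content_per_stem(40.0, 1.0)    # mg CHO per stem
    print(anthesis, maturity, apparent_remobilization(anthesis, maturity))
    print(wsc_content_per_area(anthesis, 400))    # g CHO m-2
```

The term "apparent" matters here: the difference also includes carbohydrate lost to respiration, so it is an upper bound on what reached the grain.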

#### Stable Carbon Isotope Analysis

TABLE 1 | F-values of ANOVA for agronomic and physiological traits, for 378 genotypes of wheat grown under severe water stress (Cauquenes WS) and full irrigation (Santa Rosa FI) in two growing seasons.

GY, grain yield; DH, days to heading; PH, plant height; SM2, spikes m<sup>−2</sup>; KS, kernels per spike; TKW, thousand kernel weight; KM2, kernels m<sup>−2</sup>; SPADa, SPAD anthesis; SPADgf, SPAD grain filling; SWa, stem weight at anthesis; SWm, stem weight at maturity; WSCa, WSC at anthesis; WSCm, WSC at maturity; WSCCa, WSC content anthesis; WSCCm, WSC content maturity; Δ<sup>13</sup>C, kernel Δ<sup>13</sup>C.

<sup>a</sup> In 2011, 10 genotypes were discarded from the analysis due to low spike numbers.

\*P < 0.05; \*\*P < 0.001; \*\*\*P < 0.0001.

The stable carbon (<sup>13</sup>C/<sup>12</sup>C) isotope ratio was measured in mature kernels using an elemental analyser (ANCA-SL, PDZ Europa, UK) coupled with an isotope ratio mass spectrometer, at the Laboratory of Applied Physical Chemistry at Ghent University (Belgium). The <sup>13</sup>C/<sup>12</sup>C ratios were expressed in δ notation (Coplen, 2008), determined by (Farquhar et al., 1989):

$$\delta^{13}\mathrm{C} = \frac{(^{13}\mathrm{C}/^{12}\mathrm{C})_{\mathrm{sample}}}{(^{13}\mathrm{C}/^{12}\mathrm{C})_{\mathrm{standard}}} - 1$$

where sample refers to plant material and standard to the laboratory standards that have been calibrated against international standards from Iso-Analytical (Crewe, Cheshire, UK). The precision of δ<sup>13</sup>C analyses was 0.3‰ (SD, n = 10). Further, the carbon isotope discrimination (Δ<sup>13</sup>C, in ‰) of kernels was calculated as (Farquhar et al., 1989):

$$\Delta^{13}\mathrm{C} = \frac{\delta^{13}\mathrm{C}_a - \delta^{13}\mathrm{C}_p}{1 + \delta^{13}\mathrm{C}_p/1000}$$

where a and p refer to air and the plant, respectively. δ<sup>13</sup>C<sub>a</sub> of the air was taken as −8.0‰.
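The two isotope expressions above can be sketched in a few lines. The air value of −8.0‰ is the one given in the text; the isotope ratios in the usage example are made-up numbers chosen only to give a plant δ<sup>13</sup>C near −26‰, not calibrated standard values, and the function names are hypothetical.

```python
# Sketch only: function names and example ratios are hypothetical.
DELTA13C_AIR_PERMIL = -8.0  # air value used in the text


def delta13c_permil(ratio_sample, ratio_standard):
    """delta13C in per mil: (R_sample / R_standard - 1) x 1000."""
    return (ratio_sample / ratio_standard - 1.0) * 1000.0


def discrimination_permil(delta_plant, delta_air=DELTA13C_AIR_PERMIL):
    """Delta13C in per mil from plant and air delta13C (both in per mil)."""
    return (delta_air - delta_plant) / (1.0 + delta_plant / 1000.0)


if __name__ == "__main__":
    r_standard = 0.0112                  # illustrative ratio, not a real standard
    r_sample = r_standard * (1 - 0.026)  # sample depleted by 26 per mil
    d_plant = delta13c_permil(r_sample, r_standard)
    print(round(d_plant, 2), round(discrimination_permil(d_plant), 2))
```

The factor of 1000 in the first function only converts the fractional δ value to per mil so that it matches the units expected by the second expression.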

#### Yield Tolerance Index

The yield tolerance index (YTI), which combines the relative performance of a genotype under drought with its potential yield under irrigated conditions (Ober et al., 2004), was calculated as:

$$YTI = \left(\frac{Y_D}{\overline{Y}_D}\right)\left(\frac{Y_I}{\overline{Y}_I}\right)\left(\frac{\overline{Y}_D}{\overline{Y}_I}\right) = \frac{Y_D Y_I}{\overline{Y}_I^2} \tag{1}$$

where $Y_D$ and $Y_I$ are the genotype mean yield under drought (Cauquenes) and irrigated conditions (Santa Rosa, full irrigation), respectively, and $\overline{Y}_D$ and $\overline{Y}_I$ are the mean yield of all genotypes growing under drought and irrigated conditions, respectively.
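A small sketch of Equation (1) follows, using the simplified right-hand form. The function name and the yields in the usage example are hypothetical and serve only to show the arithmetic.

```python
# Sketch only: function name and example yields are hypothetical.

def yield_tolerance_index(y_drought, y_irrigated):
    """YTI per genotype: Y_D * Y_I / mean(Y_I)^2, for paired yield lists."""
    if len(y_drought) != len(y_irrigated) or not y_irrigated:
        raise ValueError("need paired, non-empty yield lists")
    mean_irrigated = sum(y_irrigated) / len(y_irrigated)
    return [yd * yi / mean_irrigated ** 2
            for yd, yi in zip(y_drought, y_irrigated)]


if __name__ == "__main__":
    drought = [2.0, 3.0]      # Mg ha-1, made-up genotype means
    irrigated = [8.0, 10.0]   # Mg ha-1, made-up genotype means
    print([round(v, 3) for v in yield_tolerance_index(drought, irrigated)])
```

Because the drought mean cancels out of the three-factor form, only the irrigated mean is needed as a normalizer.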

#### Statistical Analysis

In 2011, 10 genotypes were discarded from the analysis due to low emergence. In addition, six genotypes from Uruguay were discarded from the analysis for having a late heading time (more than 100 days) and plant height >120 cm. ANOVAs for physiological and yield-related traits were performed for the whole set of genotypes using PROC MIXED in SAS (SAS Institute Inc.). Genotypes and environment (Cauquenes WS and Santa Rosa FI) were considered fixed effects, whereas blocks and incomplete blocks within each replication (in an α-lattice design) were considered random effects. Data from Santa Rosa MWS were not considered in the ANOVAs because there was no replication and only one year (2011) of observations. Correlation analysis was performed between agronomic and physiological traits, and stepwise regressions were fitted between grain yield and related agronomical and physiological traits. Principal component analysis (PCA) was carried out for the 378 genotypes using the mean values for physiological and agronomical traits evaluated under severe water stress in Cauquenes and full irrigation in Santa Rosa, in two growing seasons, using IBM SPSS Statistics 19.
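The correlation step can be illustrated with a plain Pearson coefficient. This is a sketch in Python rather than the SAS and SPSS procedures used in the study, it does not compute P-values, and the trait values in the usage example are made up.

```python
# Sketch only: illustrates a Pearson correlation, not the authors' software.
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    km2 = [6000, 9000, 12000, 15000]   # kernels m-2, made-up genotype means
    gy = [2.1, 3.4, 4.0, 5.6]          # Mg ha-1, made-up genotype means
    print(round(pearson_r(km2, gy), 3))
```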

## RESULTS

#### Agronomical and Physiological Traits

TABLE 2 | Means ± standard deviation and ranges (minimum–maximum) for chlorophyll content in SPAD units, stem weight, water-soluble carbohydrate (WSC) concentration and content per stem at anthesis and maturity, and carbon isotope discrimination (Δ<sup>13</sup>C) in kernels, for 378 genotypes of wheat grown under severe water stress (Cauquenes WS), mild water stress (Santa Rosa MWS) and full irrigation (Santa Rosa FI) in two growing seasons, except at MWS.

For SM2, KS, and TKW the genotype × environment (GxE) interaction was highly significant (P < 0.001) in both growing seasons, whereas for GY, PH, and KM2 it was significant in only one growing season (**Table 1**). Among the physiological traits, the SPAD index exhibited a significant (P < 0.001) GxE interaction in both growing seasons, but stem weight, WSC concentration and content, and Δ<sup>13</sup>C of kernels did so only in 2012 (**Table 1**).

Under FI in Santa Rosa, the average GY of the three sets of wheat genotypes (378 in total) was 8–10 Mg ha<sup>−1</sup>, but some genotypes produced up to 12 Mg ha<sup>−1</sup> (**Figure 2A**). Under MWS in Santa Rosa the average GY was 4.8 Mg ha<sup>−1</sup>. Under WS in Cauquenes, GY was significantly (P < 0.0001) reduced, by 79 and 68% in 2011 and 2012, respectively, compared to Santa Rosa under FI (**Figure 2A**). Plant height was also reduced under WS, by 40 and 9% in 2011 and 2012, respectively (**Figure 2B**).

The reduction in PH, SM2, KS, and TKW under WS compared with FI was in general more pronounced in the first growing season; on average (of the two growing seasons) these traits were reduced by 25, 41, 21, and 18%, respectively, whereas KM2 was reduced by 53% (**Figure 3**).
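The percentage reductions reported here are relative to the fully irrigated value. As a hedged illustration, the sketch below applies that definition to the two-season average yields given in the abstract (8.9 Mg ha<sup>−1</sup> under FI and 2.4 Mg ha<sup>−1</sup> under WS); the function name is hypothetical, and the result is a two-season figure rather than the per-season values quoted above.

```python
# Sketch only: function name is hypothetical; yields are the abstract's averages.

def percent_reduction(reference, stressed):
    """Reduction of a trait under stress, as a percentage of the reference value."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return (reference - stressed) / reference * 100.0


if __name__ == "__main__":
    gy_fi = 8.9   # Mg ha-1, two-season average under full irrigation
    gy_ws = 2.4   # Mg ha-1, two-season average under water stress
    print(round(percent_reduction(gy_fi, gy_ws), 1))
```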

GY under FI and GY under WS were not significantly correlated in either year (P > 0.05). The yield tolerance index (YTI) of the 378 genotypes, based on GY under WS and FI, presented a wide range of values in both years, from 0.05 (very susceptible) to 0.65 (very tolerant genotypes). The frequency distribution of YTI was shifted toward lower values in 2011 (mean YTI = 0.21) compared with 2012 (mean YTI = 0.32).

Days to heading, determined under FI, differed by about 20 days between the earliest and latest genotypes (**Table 2**). A wide range of SPAD index values among genotypes was observed in all environments (WS, MWS, and FI) and growing seasons (**Table 2**). A significant reduction (P < 0.001) in the SPAD index at anthesis and during grain filling was observed under WS in 2012.

Stem weight and stem WSC concentration and content were much higher at anthesis compared to maturity. Their average reductions over two growing seasons were about 43, 77, and 87%, respectively, under WS at Cauquenes, and 23, 79, and 84%, respectively, in Santa Rosa under FI (**Table 2**). The apparent WSC remobilization was on average 279, 220, and 170 mg per stem under WS, MWS, and FI, respectively (data not shown).

The WSC concentration and content per stem at anthesis and maturity presented large genotypic variability in all the environments (**Table 2**). The stem WSC per unit area (g m<sup>−2</sup>) at anthesis was highly correlated with the WSC concentration (r = 0.66 and 0.84, P < 0.001, for WS and FI, respectively, in 2012) and the stem biomass (g m<sup>−2</sup>; r = 0.81 and 0.66, P < 0.001, for WS and FI, respectively, in 2012).

The Δ<sup>13</sup>C also exhibited genotypic variability under WS, MWS, and FI (**Table 2**), but lower values were found under WS compared to FI.


#### Relationships between Yield, Agronomical, and Physiological Traits

GY was positively correlated with SM2 and KM2, but negatively correlated with TKW, in both water regimes and growing seasons (**Figure 4**). GY was also positively correlated (r = 0.3–0.52, P < 0.001) with plant height in all the environments.

Days to heading (determined at FI) was not correlated with GY, but it was positively correlated with SM2 and negatively correlated with TKW, except under FI in 2012 (**Table 3**). The SPAD index was positively and significantly correlated with GY (except under FI in 2011) and the agronomical components KS and TKW (**Table 3**). The WSC content at anthesis (WSCCa) was negatively correlated with SM2, but positively correlated with KS and TKW under WS and FI conditions (**Figure 5**). As a consequence, GY exhibited a low positive correlation with WSCCa under WS in 2012, and no or negative correlation under FI (**Table 3**).

The relationship between Δ<sup>13</sup>C and GY was slightly negative under WS in 2011, but positive and highly significant in 2012, and also positive under MWS and FI in 2011 and 2012 (**Table 3**; **Figure 6A**). Indeed, the Pearson correlation between Δ<sup>13</sup>C and GY depended on the environment, increasing from low to medium yields and then declining at higher GY (**Figure 6B**). The correlation between Δ<sup>13</sup>C and YTI under WS was not significant in 2011 but was positive and significant in 2012 (r = 0.51; P < 0.01).

PCA indicated that the first two principal components (PC) explained >50% of the observed variability under WS and FI conditions (**Figure 7**). KS was the agronomical component most closely related to GY under WS and FI (except in 2011). Among the physiological traits, Δ<sup>13</sup>C presented the strongest association with GY, except under the severe WS in 2011 (**Figure 7**). The SPAD index at anthesis was closely associated with GY under WS in 2011, but with TKW under WS in 2012 and under FI. WSCCa was also closely related to TKW in all the environments, and days to heading was associated with SM2.

The stepwise regression analysis between GY and related agronomical (SM2, TKW, and KS) and physiological (SPADa, WSCCa, and Δ<sup>13</sup>C) traits indicated that under water stress conditions the contribution of the agronomical traits was greater than that of the physiological ones, but under full irrigation conditions WSCCa and Δ<sup>13</sup>C contributed to GY to a similar extent as the agronomical traits (**Table 4**).

## DISCUSSION

of 378 genotypes of wheat grown under water stress (WS) in Cauquenes and mild water stress (MWS) and full irrigation (FI) in Santa Rosa, in 2011 and 2012. Pearson correlation values are shown in the table above.

The set of 378 wheat genotypes tested in this work exhibited high phenotypic variability for physiological and agronomic traits. The water stress in Cauquenes was very severe, as reflected in the low average GY (1.7 Mg ha<sup>−1</sup> in 2011). However, some genotypes were able to produce more than 4 Mg ha<sup>−1</sup> under such WS conditions and showed high values of YTI (>0.50). In fact, YTI was highly correlated (r > 0.92; P < 0.0001 in both years) with GY under WS in Cauquenes. Under the full irrigation conditions of Santa Rosa some genotypes achieved extremely high yields (12 Mg ha<sup>−1</sup>) for a Mediterranean environment. Large genotypic variability in GY and its agronomical components has also been found in 127 recombinant inbred lines (Dharwar Dry × Sitta) of wheat growing under severe water stress in Obregon, Mexico (Kirigwi et al., 2007), and in 105 lines of the double-haploid population (Weebil × Bacanora) in four contrasting high-yielding environments (García et al., 2013).

The strong reduction in GY under WS was mainly a consequence of the decline in SM2 (41%), followed by KS (21%), and as a consequence the number of kernels per m<sup>2</sup> was reduced (53%; **Table 2**). Thus, kernels per m<sup>2</sup> is the agronomical component most affected by drought, as previously reported by other authors (Estrada-Campuzano et al., 2012). In addition, the TKW also decreased, but to a lesser extent (18%). As a consequence, GY was positively correlated with the number of kernels m<sup>−2</sup> (**Figure 4**; r = 0.81, P < 0.0001 for all the environments), but the correlation coefficients for each environment were not as high as has been reported by several authors (see Sinclair and Jamieson, 2006). In fact, a trade-off among the agronomical components was observed, where SM2 was negatively correlated with KS under FI (r = −0.50 and −0.58 in 2011 and 2012, respectively) and with TKW under WS (r = −0.36 and −0.49 in 2011 and 2012, respectively) and FI (r = −0.60 and −0.58 in 2011 and 2012, respectively) conditions. The PCA indicated that KS was the component most closely associated with GY in both WS and FI conditions (**Figure 7**). Other studies have also shown that KS but not TKW was associated with GY under water stress conditions (Denčić et al., 2000) and also in a high-yielding environment (García et al., 2013).

#### Chlorophyll Content

Chlorophyll content at anthesis was positively correlated with GY and the agronomical components KS and TKW, particularly under WS (**Table 3**). Drought increases senescence by accelerating chlorophyll degradation leading to a decrease in leaf area and photosynthesis. There is evidence that stay-green phenotypes with delayed leaf senescence can improve their performance under drought conditions (Rivero et al., 2007; Lopes and Reynolds, 2012). In wheat and sorghum, genotypic variability has been detected in chlorophyll content as well as in the rate of

FIGURE 5 | Relationships between stem water-soluble carbohydrate content at anthesis (WSCCa) and grain yield (A), spikes per m<sup>2</sup> (B), kernels per spike (C), and thousand kernel weight (D) in 378 genotypes of wheat grown under water stress (WS) in Cauquenes and mild water stress (MWS) and full irrigation (FI) in Santa Rosa, in 2011 and 2012. Pearson correlation values are in Table 3.


TABLE 4 | Stepwise regression analysis between grain yield (GY) and related agronomical (TKW, SM2, and KS) and physiological (SPADa, WSCCa, and Δ<sup>13</sup>C) traits of 378 genotypes of wheat grown under water stress (WS) in Cauquenes and full irrigation (FI) in Santa Rosa, in 2011 and 2012.

leaf senescence (measured with a portable leaf chlorophyll meter) during grain-filling (Harris et al., 2007; Lopes and Reynolds, 2012). In durum wheat (Triticum turgidum ssp. durum) stay-green mutants growing under glasshouse conditions remained green for longer and had higher rates of leaf photosynthesis and seed weight (Spano et al., 2003). These mutants with the stay-green characteristic also had higher levels of expression of the Rubisco small subunit (RBCS) and chlorophyll a/b binding protein (Rampino et al., 2006). Bread wheat genotypes with functional stay-green characteristics have also shown higher GY and total biomass in field conditions (Chen et al., 2010). Another study on Canadian spring wheat revealed that GY was positively correlated with green flag leaf duration and total flag leaf photosynthesis (Wang et al., 2008). Studies on spring wheat in the USA found a positive correlation between the stay-green trait and GY and grain weight in both water-limited and

well-watered conditions (Blake et al., 2007). Therefore, a delay in leaf senescence would increase the amount of fixed carbon available for grain filling.

#### Stem Water-Soluble Carbohydrate

Large genotypic variability in stem WSC concentration and content was found at anthesis and maturity, in both environments (**Table 2**; **Figure 5**). Other studies conducted in spring and winter wheat lines have also found large variability in WSC concentration and WSC content on an area basis in stems around the time of anthesis (Ruuska et al., 2006; Foulkes et al., 2007; Yang et al., 2007). WSCs are accumulated in stems prior to anthesis and are then remobilized to the grain during the grain-filling period (Blum, 1998; Bingham et al., 2007). Indeed, under water-limiting conditions, where canopy photosynthesis is inhibited, the contribution of stem carbohydrate to grain growth could be very significant (Ehdaie et al., 2006a,b; Reynolds et al., 2006). In our study, more carbohydrate was accumulated at anthesis under WS than under FI, and the decline in stem WSC from anthesis to maturity was greater under WS, particularly in 2012 (360 vs. 130 mg per stem under WS and FI, respectively). This suggests that there was a larger remobilization of reserves during grain filling under WS. However, there were no clear relationships between stem WSCCa, or the apparent WSC remobilization, and GY, with correlation values varying from non-significant to negative across the different environments (**Table 3**; **Figure 5**). Zhang et al. (2015) also found no significant correlation between stem WSC and GY in 20 genetically diverse double haploids derived from the cross of cvs. Westonia × Kauz, growing under drought and irrigated conditions in Western Australia. These results differ from those found by Foulkes et al. (2007) in winter wheat under non-water-stressed conditions in England.

It seems that there is a trade-off between the stem WSCCa and some of the agronomical yield components. In fact, negative correlations exist with SM2 in all the environments, but the correlations were positive with KS and TKW (**Table 3**; **Figure 5**). The PCA also showed a high association between WSCCa and TKW (**Figure 7**). This negative relationship between WSC and either number of stems or number of spikes per m<sup>2</sup> at maturity has also been reported for other wheat genotypes (Rebetzke et al., 2008a; Dreccer et al., 2009, 2013). Why do genotypes with a lower number of stems have higher stem WSC concentration and content? A possible explanation is that genotypes with a lower number of stems per unit area have bigger stems; in fact, our results indicated a significant (p < 0.001) negative correlation (r = −0.29 and −0.36 under WS, and −0.58 and −0.56 under FI, in 2011 and 2012, respectively) between SM2 and stem weight at anthesis. Thus, genotypes with a lower number of stems probably have more light transmission through the canopy and therefore higher rates of photosynthesis per stem, leading to higher stem weight and WSC content (more reserves), and greater numbers of grains per spike and kernel size. A significant and positive correlation between accumulated WSC at anthesis and kernel weight has also been observed in recombinant inbred lines (RILs) from the Seri/Babax population (Dreccer et al., 2009). Another hypothesis (complementary to the previous one) may be that those genotypes that produce fewer tillers (because of poorer adaptation to growing conditions such as water stress) are those which accumulate more carbohydrate, since these photoassimilates are not used for growth. Therefore, selecting for high stem WSC, either under near optimal agronomical conditions or under water stress, would probably lead to genotypes with lower tillering capacity and GY potential. The study conducted by Dreccer et al. (2013) in RILs of contrasting tillering and WSC concentration in the stem, grown at different plant densities or on different sowing dates, indicates that genotypic rankings for stem WSC persisted when RILs were compared at similar stem density.

#### Carbon Isotope Discrimination

The genotypic differences in carbon isotope discrimination found among the 384 genotypes (**Table 3**) agree with other studies conducted in Mediterranean conditions. For example, higher Δ<sup>13</sup>C (or lower carbon isotope composition, δ<sup>13</sup>C) in modern cultivars compared with old varieties has been found in bread (del Pozo et al., 2014) and durum wheats (Araus et al., 2013).

The relationship between Δ<sup>13</sup>C and GY was positive under MWS or FI but was negative under WS (**Figure 5**). Other studies in wheat (Araus et al., 2003, 2008) and barley (del Pozo et al., 2012) have also shown that Δ<sup>13</sup>C in kernels can be positively or negatively correlated with GY depending on soil water availability. Positive relationships between Δ<sup>13</sup>C (or negative with δ<sup>13</sup>C) and GY have been frequently reported for cereals under Mediterranean conditions (see Rebetzke et al., 2008b for bread wheat and Araus et al., 2003, 2013 for durum wheat), and this can be explained by the fact that genotypes maintaining a larger transpiration and thus water use during the crop cycle will be the most productive (Araus et al., 2003, 2008, 2013; Blum, 2005, 2009). In fact, negative relationships between kernel oxygen isotope composition (δ<sup>18</sup>O) or enrichment (Δ<sup>18</sup>O) and grain yield have been reported in bread wheat under fully irrigated conditions (Cabrera-Bosquet et al., 2011; del Pozo et al., 2014) as well as for durum wheat under Mediterranean conditions (Araus et al., 2013) and subtropical maize under well irrigated and moderate stress (Cabrera-Bosquet et al., 2009). Indeed, carbon isotope composition can be used as a selection criterion for high water use efficiency (Condon et al., 2004; Richards, 2006), but can also provide an indirect determination of the effective water used by the crop (Araus et al., 2002, 2008; Blum, 2009). An effect of phenology on Δ<sup>13</sup>C (earlier genotypes exhibiting higher Δ<sup>13</sup>C) can be ruled out, since heading date was not correlated with Δ<sup>13</sup>C (P > 0.05) in any of the environments. Indeed, the positive correlation between Δ<sup>13</sup>C and GY was also found when the relationship was studied within a subset of 212 genotypes with similar heading duration (80–85 days); r = 0.50 for WS and 0.42 for WI in 2012.

## CONCLUSIONS

The identification of genotypic variability for agronomical and physiological traits under water stress conditions and full irrigation is of great interest for breeders because selected genotypes with favorable traits can be used as parents in future crosses. Among these, genotypes with higher numbers of fertile tillers would lead to higher numbers of kernels per m<sup>2</sup> and GY under terminal water stress and non-stress conditions. Additionally, genotypes with delayed leaf senescence (a higher SPAD index) would lead to higher KS and TKW, particularly under water stress, and to a lesser extent at full irrigation. In the case of yield potential conditions, this is probably the consequence of greater amounts of fixed carbon available for grain filling, whereas under water stress, stay-green is an indicator of resilience to stress conditions. In addition, genotypes with higher carbon discrimination values are associated with higher GY under MWS and full irrigation, indicating that more water is used by the crop. Conversely, selection for a higher WSC at anthesis may bring negative consequences in terms of yield potential and adaptation to MWS conditions. This study clearly illustrates the importance of defining the target environment for wheat breeding before determining the set of phenotyping traits for selection.

#### AUTHOR CONTRIBUTIONS

AD and IM designed the experiments, selected the germplasm and participated in field evaluations. AY and GT were in charge of carbohydrate determinations. DC was in charge of the management of the experiments and evaluation of agronomic traits. LS and JA contributed to analysis of the data. AD was in charge of writing the manuscript, and all the authors contributed to it.

#### REFERENCES


#### ACKNOWLEDGMENTS

This work was supported by the research CONICYT grants FONDECYT N° 1150353 and program "Atracción de Capital Humano Avanzado del Extranjero" N° 80110025. Participation of JA was supported through the Spanish project AGL2013-44147-R. We thank CIMMYT and the National Research Program of Rainfed Crops of INIA-Uruguay for providing wheat germplasm, Alejandra Rodriguez and Alejandro Castro for technical assistance in field experiments, and Boris Muñoz for the analysis of soluble carbohydrates.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00987



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 del Pozo, Yáñez, Matus, Tapia, Castillo, Sanchez-Jardón and Araus. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Linking Dynamic Phenotyping with Metabolite Analysis to Study Natural Variation in Drought Responses of Brachypodium distachyon

#### Edited by:

Puneet Singh Chauhan, National Botanical Research Institute (CSIR), India

#### Reviewed by:

Kemal Kazan, Commonwealth Scientific and Industrial Research Organisation, Australia Iker Aranjuelo, Agribiotechnology Institute (IdAB)-CSIC-UPNA, Spain

\*Correspondence:

Maurice Bosch mub@aber.ac.uk Luis A. J. Mur lum@aber.ac.uk

#### †Present address:

Lorraine H. C. Fisher, Natural Resources Institute, University of Greenwich at Medway, Central Avenue, Chatham, Kent, UK

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

Received: 11 July 2016 Accepted: 07 November 2016 Published: 29 November 2016

#### Citation:

Fisher LHC, Han J, Corke FMK, Akinyemi A, Didion T, Nielsen KK, Doonan JH, Mur LAJ and Bosch M (2016) Linking Dynamic Phenotyping with Metabolite Analysis to Study Natural Variation in Drought Responses of Brachypodium distachyon. Front. Plant Sci. 7:1751. doi: 10.3389/fpls.2016.01751

Lorraine H. C. Fisher<sup>1</sup>†, Jiwan Han<sup>2</sup>, Fiona M. K. Corke<sup>2</sup>, Aderemi Akinyemi<sup>1</sup>, Thomas Didion<sup>3</sup>, Klaus K. Nielsen<sup>3</sup>, John H. Doonan<sup>2</sup>, Luis A. J. Mur<sup>1</sup>\* and Maurice Bosch<sup>1</sup>\*

<sup>1</sup> Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK, <sup>2</sup> The National Plant Phenomics Centre, Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK, <sup>3</sup> DLF Seeds A/S, Store Heddinge, Denmark

Drought is an important environmental stress limiting the productivity of major crops worldwide. Understanding drought tolerance and possible mechanisms for improving drought resistance is therefore a prerequisite to develop drought-tolerant crops that produce significant yields with reduced amounts of water. Brachypodium distachyon (Brachypodium) is a key model species for cereals, forage grasses, and energy grasses. In this study, initial screening of a Brachypodium germplasm collection consisting of 138 different ecotypes exposed to progressive drought highlighted the natural variation in morphology, biomass accumulation, and responses to drought stress. A core set of ten ecotypes, classified as being either tolerant, susceptible or intermediate in response to drought stress, were exposed to mild or severe (respectively, 15 and 0% soil water content) drought stress, and phenomic parameters linked to growth and color changes were assessed. When exposed to severe drought stress, phenotypic data and metabolite profiling combined with multivariate analysis revealed a remarkable consistency in separating the selected ecotypes into their different pre-defined drought tolerance groups. Increases in several metabolites, including the phytohormones jasmonic acid and salicylic acid, and TCA-cycle intermediates, were positively correlated with biomass yield and with reduced yellow pixel counts, the latter suggestive of delayed senescence; both are key target traits for crop improvement under drought stress. While metabolite analysis also separated ecotypes into the distinct tolerance groupings after exposure to mild drought stress, similar analysis of the phenotypic data failed to do so, confirming the value of metabolomics to investigate early responses to drought stress. The results highlight the potential of combining the analyses of phenotypic and metabolic responses to identify key mechanisms and markers associated with drought tolerance in both the Brachypodium model plant and agronomically important crops.

Keywords: Brachypodium distachyon, drought, grasses, hormones, metabolite profiling, natural variation, phenotyping, stress

## INTRODUCTION


Drought, the sub-optimal supply of water, is an important environmental stress that limits the productivity of major crops worldwide. Climate change models predict greater variability in rainfall patterns and increased periods of summer drought, posing serious risks for food production. The cereals, wheat, rice, and maize, provide fifty percent of human dietary energy supply and climate change is projected to negatively impact on their production (IPCC, 2014). Likewise, a decline in the productivity of cereal crops and forages will impact on animal feed supplies and therefore affect the amount and price of milk and meat available for human consumption (Wheeler and Reynolds, 2013). As the global population increases, and hence the demand for staple cereal crops, drought will have widespread implications both on the environment and related socio-economic factors, agronomy, employment, migration, food security, health, and mortality (Stanke et al., 2013). Understanding drought tolerance and possible mechanisms for improving drought resistance in agronomically important crops is therefore a prerequisite to develop drought-tolerant crops that produce significant yields with reduced amounts of water.

Drought affects morphological, physiological, biochemical, and molecular processes in plants, resulting in growth inhibition. The extent of these changes is dependent on time, stage, and intensity of the drought stress (Chaves et al., 2009). Drought tolerance is therefore considered a complex trait under polygenic control and involving complex morpho-physiological mechanisms. Progress in improving drought tolerance in cereal crops and forages has been hampered by the lack of genetic and genomic tools, their large, and often polyploid, genomes and, critically, their large physical size and relatively long life cycles. Brachypodium distachyon (Brachypodium) contains many of the desirable properties required for a model system for the monocotyledon grasses (Opanowicz et al., 2008) and a number of genetic and genomic tools have been developed (Mur et al., 2011), including the genome sequence of accession Bd21 (The International Brachypodium Initiative, 2010). This has positioned Brachypodium as a powerful model species to accelerate trait improvement in cereals, forages and biomass crops (Rancour et al., 2012; Slavov et al., 2013; Girin et al., 2014). The Bd21 accession has been the main focus for experiments focussing on molecular analysis of drought stress (Bertolini et al., 2013; Verelst et al., 2013). The exploitation of genetic diversity in wild relatives and diversity germplasm of crop species holds the key to improving important agronomic traits. However, the domestication of cereal and forage crops has resulted in a decrease in genetic diversity compared to their wild ancestors (Buckler et al., 2001), decreasing our capacity for trait improvement in response to environmental stress including drought. Brachypodium grows in a wide variety of habitats and was never domesticated. This provides an excellent opportunity to identify drought-associated gene-trait associations and transfer this knowledge to improve drought-associated agronomic traits in important food and forage crops. The natural variation in aboveground and belowground physiology upon drought-induced stress has been described for Brachypodium germplasm collections containing mainly material originating from Turkey (Luo et al., 2011; Chochois et al., 2015).

Conventional methods to phenotype drought-related traits are time-consuming, labor-intensive, and low-throughput, and often involve destructive harvesting of plants, making repeated measurements on the same plant impossible. Dynamic plant phenotyping methods enable controlled irrigation combined with automated imaging. These are particularly suited for studies on drought tolerance as these require accurate watering regimes and monitoring of responses over time to dissect the dynamic nature of drought development and the resulting stress response (Berger et al., 2010; Chen et al., 2014; Honsdorf et al., 2014).

While such phenotyping facilities have gained much interest in the context of increasing our understanding of the genetics of drought tolerance and for making gene-trait associations, the advantages they offer for metabolite–phenotype associations have yet to be explored. This is the case even though metabolite levels can be more closely linked to the macroscopic phenotype than genes as they reflect the integration of gene expression, enzyme activity, and other processes (Arbona et al., 2013) and, as such, are often used as predictive biomarkers. While metabolite profiling, as a diagnostic and predictive tool to identify novel markers for diseases, is widely used in the medical field, the application of biomarkers is not a common strategy to assist crop improvement (Steinfath et al., 2010). Drought stress leads to the accumulation of metabolites that function as osmolytes, antioxidants, or scavengers that help plants to avoid or tolerate stresses (Seki et al., 2007). Given the importance of drought-induced biochemical changes, metabolic analyses provide a powerful approach to elucidate tolerance mechanisms and identify metabolic markers that may assist in developing drought-tolerant crops. For instance, a metabolomics study in oats (Avena sativa) has defined key processes involved in drought tolerance (Sanchez-Martin et al., 2015), while the combination of metabolite and phenotypic analysis identified correlations between certain metabolites and important fruit quality and yield traits in tomato (Schauer et al., 2006), highlighting the potential of metabolic markers for crop selection, evaluation and improvement.

In this study, we combine metabolic profiling with dynamic phenotyping of a diversity panel of Brachypodium accessions. We utilized an expanded germplasm collection of 138 Brachypodium diploid inbred lines (**Supplementary Table S1**) including ecotypes collected from a range of geographies and ecological niches from Northern Spain (Mur et al., 2011) to assess the natural variation in drought responses. Following initial drought screens, we used the facilities at the UK National Plant Phenomics Centre (NPPC-Aberystwyth) for high-resolution temporal imaging of 10 ecotypes with contrasting responses to drought stress to determine phenotypic trait associations to drought tolerance. Metabolite profiling of these samples not only informed on metabolic pathways involved in conferring drought tolerance in these selected Brachypodium ecotypes, but also enabled identification of key metabolite–phenotype associations. Our results identify several novel drought induced correlations between hormone pathways and phenotypic traits and provide a platform for improving our understanding of the genetics underlying drought associated metabolite–phenotype correlations essential for improving drought tolerance in cereal crops and forage grasses.

#### MATERIALS AND METHODS


#### Plant Material

One hundred and thirty-eight Brachypodium diploid inbred lines were selected for the initial large drought screen, including ecotypes collected from a range of geographies and ecological niches (**Supplementary Table S1**), with 101 lines collected in Spain, 31 in Turkey, 3 in Iraq and one each from France, Italy, and Croatia.

#### Initial Drought Screens

For the large drought screen, six replicates of 138 Brachypodium ecotypes were sown in small pots (7.5 cm diameter, 9 cm height) in 4:1 John Innes No 1 potting compost: grit mix. The pots were placed in a glasshouse at 21–22°C with 16 h of light (natural light supplemented with artificial light from 400-W sodium lamps). Two weeks after sowing, plants were fully randomized. As it was our aim to induce drought stress in actively growing plants, well before they normally would start flowering, plants were not vernalized (this also applies to the second drought screen and the detailed plant phenotyping experiment). One ecotype, Arc23, was excluded as only three out of the six replicates had germinated. For the remaining 137 ecotypes, at least four replicate plants were available (114 with six replicates, 17 with five replicates, and 6 with four replicates). All seedlings were watered with equal amounts prior to treatment. Those replicates (2–3 plants per ecotype) to be exposed to water deficit were raised on circular blocks, approximately 1.5 cm in height, to prevent contact with any water draining through the soil of well-watered control plants. Total water withdrawal began 28 days after sowing and lasted 6 days. Leaf wilting was scored by two individuals according to a 1–6 scale for this trait (1 = no effect, 6 = severe effect) based on the visual assessment categories described by Engelbrecht et al. (2007). Plants were weighed immediately after harvest of the above ground biomass.

A second, refined drought screen was performed on 48 selected Brachypodium ecotypes (see **Supplementary Table S1**). Seedlings were germinated in square pots (9 cm × 9 cm × 7.5 cm), containing the same soil mix as before, in replicates of 12 (six controls and six to be stressed). Plants were grown under controlled environment conditions in a growth chamber at 21°C and under a 16 h photoperiod with 176 µmol m<sup>−2</sup> s<sup>−1</sup> photon flux density supplied by white fluorescent tubes (OSRAM, Garching, Germany). All plants germinated within 1–2 days of each other (with the exceptions of ecotypes Kah6 and Uni14, which developed approximately 5 days later). Plants were randomized in a block design with buffer plants placed on the outskirts to reduce changes in stress resulting from interactions with microclimate variations. Plants were all given equal volumes of water and water levels were monitored throughout, averaging 0.25 m<sup>3</sup> m<sup>−3</sup> prior to stress treatment and 0.29 m<sup>3</sup> m<sup>−3</sup> in soils of control plants during the stress treatment period. Total water withdrawal of six replicates per ecotype started 33 days after sowing and lasted for 12 days. The average soil moisture content upon drought treatment dropped to 0.024 m<sup>3</sup> m<sup>−3</sup> on day 42, while at the end of the experiment on day 45 no reading could be obtained.

Leaf wilting observations were taken after 12 days of drought treatment, using the same criteria as used in the large screen. On day 45, plants were harvested and fresh weight and dry weight biomass recorded; the latter after drying at 70°C for 3 days. Fresh and dry weight (DW) figures were then used to calculate above ground plant water content (PWC) on a fresh weight (FW) basis using the following equation: PWC (%) = [(FW(g) − DW(g))/FW(g)] × 100%.
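As a minimal illustration of the PWC equation, the following Python sketch computes plant water content from a pair of weights. The function name and the example weights are hypothetical and are not taken from the study.

```python
def plant_water_content(fresh_weight_g, dry_weight_g):
    """Above-ground plant water content (%) on a fresh weight basis."""
    if fresh_weight_g <= 0:
        raise ValueError("fresh weight must be positive")
    if not 0 <= dry_weight_g <= fresh_weight_g:
        raise ValueError("dry weight must lie between 0 and the fresh weight")
    return (fresh_weight_g - dry_weight_g) / fresh_weight_g * 100.0


# Hypothetical plant: 2.0 g fresh weight, 0.5 g after drying.
print(plant_water_content(2.0, 0.5))  # 75.0
```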

#### Phenotyping of Drought-Induced Physiological Changes in Selected Brachypodium Ecotypes

Ten Brachypodium ecotypes were selected for detailed dynamic plant phenotyping using the facilities of the National Plant Phenomics Centre (NPPC) at Aberystwyth University. Seedlings were grown in 7.5 cm square pots, each containing 225 ± 0.5 g of 4:1 Levington F2: grit sand, in a regular glasshouse (for conditions see large drought screen). Twenty-eight days after sowing, the plants were transferred to the NPPC Smarthouse. Four plants of each ecotype were placed in a single tray lined with blue germination paper to ensure even distribution of water between the pots. Dividers were placed between the plants to enable imaging of individual plants, and each tray was placed on a cart of the conveyor belt system (**Figure 2A**). All plants were watered to a calculated 75% field capacity for an additional 2 days, with the start of the treatments 33 days after sowing (das). Treatments comprised two levels of drought stress: 24 plants of each ecotype (six trays/carts) were exposed to 15% soil water content (SWC), and another set of 24 plants was not watered at all (0% SWC). Twelve control plants per ecotype (three trays/carts) continued to be watered to 75% SWC. Weighing and watering of the plants were fully automated. Samples of the compost were soaked to determine field capacity or dried to determine dry matter content. The water content was the difference between these two values. Target watering weight was calculated based upon a percentage of the water content. Plants from a parallel experiment were weighed at maximum size and this showed that plant fresh weight accounted for around 2% of the daily water usage, so no adjustment for plant biomass was deemed necessary. The different treatments lasted for 12 days, with the aboveground biomass of four plants (one tray) per ecotype being harvested after 12 days for each of the treatments. Growth conditions in the Smarthouse were 22°C (16 h day)/20°C (night) with supplementary top-up lighting. Watering was calculated based on the field capacity and dry matter content of the compost. Automated, target-weight-based watering was provided on a daily basis. Imaging of all plants was also performed daily, from the imposition of the drought treatment (34 das) until the last day of the treatments (45 das).
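The target-weight calculation described above can be sketched as follows. This is one plausible reading of the text, not the NPPC control software: it assumes the target is the dry compost mass plus the chosen fraction of the water held at field capacity, with an optional tare for pot and tray. All function names, parameters and masses are hypothetical.

```python
def target_weight_g(dry_mass_g, field_capacity_mass_g, swc_fraction, tare_g=0.0):
    """Pot weight at which the compost holds `swc_fraction` of its water capacity."""
    if field_capacity_mass_g < dry_mass_g:
        raise ValueError("field capacity mass cannot be below dry mass")
    if not 0.0 <= swc_fraction <= 1.0:
        raise ValueError("swc_fraction must lie between 0 and 1")
    water_capacity_g = field_capacity_mass_g - dry_mass_g  # water held at field capacity
    return tare_g + dry_mass_g + swc_fraction * water_capacity_g


def water_to_add_g(current_weight_g, target_g):
    """Water to dispense; never negative, since water cannot be removed."""
    return max(0.0, target_g - current_weight_g)


# Hypothetical compost: 150 g when dry, 300 g at field capacity.
for label, fraction in (("control", 0.75), ("mild", 0.15), ("severe", 0.0)):
    print(label, target_weight_g(150.0, 300.0, fraction))
```

In this sketch a pot that is already heavier than its target simply receives no water, which corresponds to letting the soil dry down towards the 15% or 0% level.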

The timing for the initiation of the treatments for the two drought screens and for the phenotyping experiment (28, 33, and 33 das, respectively), as well as for the end-point of the treatments (34, 45, and 45 das, respectively), was based on our observations that Bd21, which does not require vernalization, started flowering ∼45 das. The chosen time-points therefore represented a reasonable equivalent for all ecotypes to be at a pre-flowering stage had they been vernalized.

#### Feature Extraction


Images of each tray were taken once a day from four side view angles at intervals of 90° and from one top view. The camera used to image the plants exports 24-bit RGB color images, i.e., each channel has 256 color levels. **Figure 2B** shows the position of four plants in each tray and the images acquired from side view and top view cameras. Images were processed to segment the plant from the background and to extract plant height, top view and side view projection area and color information. In order to acquire relatively uniform processing results for further analysis, images were filtered by a retina filter (Benoit et al., 2010), which enhances the contrast between plant and background, improves color consistency and provides spatial noise removal and luminance correction. Each side view image contains two individual plants, which are separated by a panel. Therefore, after this pre-processing step, regions of interest (ROI) containing only one single plant were acquired, with the bottom position of each ROI fixed just beyond the top of the pot, from where the plant height and side view area are measured. **Figure 2C** illustrates the image processing on a side view image. In this image, the ROIs are fixed and height and total pixels (plant area) are acquired for each individual.

Within each ROI, plants were segmented according to the pixel colors in RGB color space. Although the pre-processing improved image quality and removed some noise, some small background pixel patches (clusters of around 10–20 pixels) with the same colors as a plant remained; these were removed by morphological operations.

Four features were extracted directly from the side view images of segmented plants: height, total pixels, yellow pixels, and grey pixels. The yellow color is defined as a color range where the red channel value is 10 levels higher than that of the green channel. The grey color range is defined by the following formula in which V denotes the value:

$$\frac{V_{\text{Green}}}{V_{\text{Blue}}} < 2 \text{ and } \frac{V_{\text{Green}}}{V_{\text{Blue}}} > 1.15 \text{ and } \frac{V_{\text{Green}}}{V_{\text{Red}}} < 1.4 \text{ and } V_{\text{Green}} < 150$$
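The yellow and grey definitions can be expressed as per-pixel tests. The study used C++ and OpenCV; the pure-Python sketch below is only an illustration of the thresholds. It reads "10 levels higher" as a red-minus-green difference of at least 10, rewrites the ratios as products to avoid division by zero, and reports the two colors independently because the text does not say how a pixel meeting both definitions was counted. Function names and sample pixels are hypothetical.

```python
def is_yellow(r, g, b, margin=10):
    """Yellow: red exceeds green by at least `margin` levels (our reading of the text)."""
    return r - g >= margin


def is_grey(r, g, b):
    """Grey: 1.15 < G/B < 2, G/R < 1.4 and G < 150, written without divisions."""
    return (1.15 * b < g < 2 * b) and (g < 1.4 * r) and (g < 150)


def count_side_view_colors(pixels):
    """Count total, yellow and grey pixels among the (R, G, B) tuples of one segmented plant."""
    counts = {"total": 0, "yellow": 0, "grey": 0}
    for r, g, b in pixels:
        counts["total"] += 1
        counts["yellow"] += is_yellow(r, g, b)
        counts["grey"] += is_grey(r, g, b)
    return counts


# Three hypothetical plant pixels: green, yellowish and greyish.
sample = [(60, 140, 40), (200, 180, 50), (90, 100, 70)]
print(count_side_view_colors(sample))  # {'total': 3, 'yellow': 1, 'grey': 1}
```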

The top view image processing extracted total pixel numbers for each of the four plants in a tray separated by two panels. For top view image processing, the color description of the plants needed to be more complex to extract as many of the plant pixels as possible while minimizing noise. **Figure 2D** shows the top view segmentation and pixel count. The colors of the plants were defined as:

$$\begin{aligned} &V_{\text{Green}} < 200 \text{ and } V_{\text{Green}} > 20 \text{ and } V_{\text{Red}} > 20 \text{ and } V_{\text{Red}} < 200\\ &\text{and } (V_{\text{Green}} - V_{\text{Blue}}) > 12 \text{ and } \frac{V_{\text{Red}}}{V_{\text{Green}}} < 1.5 \text{ and } \frac{V_{\text{Green}}}{V_{\text{Blue}}} > 1.5 \end{aligned}$$
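The top-view color definition translates directly into a per-pixel test. The following Python sketch is illustrative only; the function names are hypothetical, and the ratio conditions are again expressed as multiplications to avoid dividing by zero.

```python
def is_plant_top_view(r, g, b):
    """Top-view plant pixel test, following the color definition given above."""
    return (20 < g < 200 and 20 < r < 200
            and (g - b) > 12
            and r < 1.5 * g      # V_Red / V_Green < 1.5
            and g > 1.5 * b)     # V_Green / V_Blue > 1.5


def count_plant_pixels(pixels):
    """Total number of plant pixels among an iterable of (R, G, B) tuples."""
    return sum(1 for p in pixels if is_plant_top_view(*p))
```

Applied to the pixels of one tray compartment, `count_plant_pixels` gives the top-view projected area of the plant in that compartment.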

All extracted values for height, estimated plant area and color information were written into a csv file for further analysis.

Image processing and feature extraction were performed using C++ and OpenCV, an open source computer vision library<sup>1</sup>.

#### Stomatal Conductance

Stomatal conductance was measured for 9 of the selected Brachypodium ecotypes (Bd21 was excluded as it started to flower during the course of the drought experiment) over a period of 7 days following drought treatment as well as for well-watered control plants at 5 weeks after sowing. Plants were grown under a 12 h light period (8 am – 4 pm) at 250 µmol m<sup>−2</sup> s<sup>−1</sup> photon flux density supplied by white fluorescent tubes (OSRAM) in a growth chamber at 20°C. Measurements of stomatal conductance were made on the second fully emerged true leaf between 12 noon and 1 pm using an AP4 porometer (Delta-T devices Ltd, Cambridge, UK).

#### Metabolite Profiling

For each treatment (0%, 15%, and 75% SWC), leaf samples harvested 12 days after initiation of the drought treatments were pooled for each of the ecotypes, and metabolites were extracted following the procedure described by Allwood et al. (2006). Glass vials were capped and analyzed in random order on an LTQ linear ion trap (Thermo Electron Corporation). Data were acquired in alternating positive and negative ionization modes over four scan ranges (15–110, 100–220, 210–510, and 500–1200 m/z), with an acquisition time of 5 min. Discriminatory metabolites were selected and tentatively identified following statistical analyses and interrogation of KEGG: Kyoto Encyclopedia of Genes and Genomes<sup>2</sup> and MZedDB<sup>3</sup>. To substantiate these identifications, nominal mass signals were investigated further by targeted nano-flow Fourier Transform-Ion Cyclotron Resonance Ultra-Mass-Spectrometry (FT-ICR-MS) using a TriVersa NanoMate (Advion BioSciences Ltd) on an LTQ-FT-ULTRA (Thermo Scientific) to obtain ultra-high accurate mass information and MSn ion-trees. Based on an accuracy of 1 ppm for the FT-ICR-MS, the top-ranking metabolite within this range was taken as the identification for each discriminatory negative ionization mode flow injection electrospray (FIE)-MS metabolite.
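The 1 ppm criterion can be made concrete with the standard definition of mass error in parts per million, (measured − theoretical) / theoretical × 10<sup>6</sup>. The Python sketch below filters candidate database entries to those within a given tolerance. The candidate names and masses in the usage note are hypothetical, and ranking by smallest absolute error is an assumption, since the ranking criterion is not stated in the text.

```python
def ppm_error(measured_mz, theoretical_mz):
    """Mass error in parts per million relative to the theoretical m/z."""
    return (measured_mz - theoretical_mz) / theoretical_mz * 1e6


def candidates_within(measured_mz, candidates, tol_ppm=1.0):
    """Return (name, ppm error) pairs within tol_ppm, closest match first.

    `candidates` maps a metabolite name to its theoretical m/z.
    Ranking by absolute error is an assumption made for illustration.
    """
    hits = [(name, ppm_error(measured_mz, mz)) for name, mz in candidates.items()]
    hits = [(name, err) for name, err in hits if abs(err) <= tol_ppm]
    return sorted(hits, key=lambda hit: abs(hit[1]))
```

With a measured signal at m/z 200.0001 and two hypothetical candidates at 200.0000 and 200.0010, only the first lies within 1 ppm (an error of about 0.5 ppm); the second is about 4.5 ppm away and is rejected.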

#### Statistical Analysis

FIE-MS data were normalized to the total ion count: after removal of metabolites below 50 m/z, the intensity value of each metabolite was expressed as a percentage of the total ion count of its sample. ANOVA, Pearson's correlation analyses, Principal Component Analyses (PCA), and Hierarchical Cluster Analyses (HCA) were completed using the R-based MetaboAnalyst 2.0 interface (Xia et al., 2015).
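The normalization step can be sketched in a few lines of Python. This is an illustration rather than the MetaboAnalyst implementation; it assumes that the total ion count is computed after signals below 50 m/z have been removed, and the function name and data layout (a dictionary mapping m/z to intensity) are hypothetical.

```python
def tic_normalize(spectrum, min_mz=50.0):
    """Express each intensity as a percentage of the sample's total ion count.

    `spectrum` maps m/z to raw intensity for one sample. Signals below
    `min_mz` are dropped before the total is computed (assumed order).
    """
    kept = {mz: inten for mz, inten in spectrum.items() if mz >= min_mz}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("total ion count is zero after filtering")
    return {mz: 100.0 * inten / total for mz, inten in kept.items()}
```

For a sample with intensities 500, 300, and 100 at m/z 45, 100, and 200, the signal at m/z 45 is discarded and the remaining two become 75% and 25% of the total ion count.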

<sup>1</sup>http://www.opencv.org/

<sup>2</sup>http://www.genome.jp/kegg/

<sup>3</sup>http://maltese.dbs.aber.ac.uk:8888/hrmet/index.html


## RESULTS

#### Brachypodium Drought Screens


Two progressive drought screens were conducted with the aim of identifying a core set of Brachypodium ecotypes for further detailed analysis. The initial large drought screen of 138 different Brachypodium ecotypes highlighted the considerable variation in size and stature between the ecotypes, both in well-watered and drought-stressed conditions. Wilting scores are the most widely used indicator for plant drought stress and have been shown to allow for robust ranking of survival when exposed to drought (Engelbrecht et al., 2007). Hence, visual assessment of plant wilting (**Supplementary Table S2**) formed the basis for the selection of those ecotypes to be included in a second, more refined drought screen. Other considerations taken into account for further selection were the inclusion of ecotypes for resequencing<sup>4</sup>, available seed stock, and developmental aspects.

A total of 48 different ecotypes were selected for the second drought screen, featuring bigger pots and controlled environment conditions as well as an increased number of replicates compared to the first screen. **Figure 1A** shows the distribution of the above-ground plant water content (PWC) for the 48 ecotypes after withholding water for 12 days. During this period, the soil moisture content dropped from an average of ∼0.25 m<sup>3</sup> m<sup>−3</sup> before the treatment to ∼0.024 m<sup>3</sup> m<sup>−3</sup> 9 days after withholding water, while 12 days after treatment no readings could be obtained. PWC ranged from just below 30% for Koz1 to almost 80% for ABR8. The ranking of the PWC was used as the main criterion for the selection of 10 ecotypes for further detailed analysis. Ecotypes ABR8, Pal6, and Mur3 were selected as drought resistant ecotypes as they showed the highest PWC upon drought treatment (**Figure 1A**). These ecotypes also showed a low wilting score in the second screen (**Figure 1B**). Koz1, Gal10, Per3, and ABR3 were selected as drought-susceptible ecotypes, since they featured amongst the lowest PWC scores and ranked highly in the wilting scores (**Figures 1A,B**). ABR4, Bd21, and Luc21 were included as having an intermediate (INT) response. As expected, comparison of the PWC data (**Figure 1A**) with the wilting scores (**Figure 1B**) shows a strong negative correlation between these two measures (r = −0.869).
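The correlation coefficient quoted above is Pearson's r. As a reminder of how it is obtained from paired observations such as per-ecotype PWC and wilting score, a minimal Python sketch follows. The function is generic, and it is not run here on the study data, which are not reproduced in the text.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

A value close to −1, as reported here, means that ecotypes with a high PWC tend to have a low wilting score, and vice versa.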

#### Phenotypic Analyses of 10 Selected Brachypodium Lines

Having selected 10 Brachypodium lines that were classified as tolerant (TOL), INT, or susceptible (SUS) to the drought conditions imposed during the preceding drought screens (**Figure 1**), we next performed an integrated phenotype and metabolite analysis. Plants were exposed to two levels of drought targeting 15 and 0% SWC over a period of 12 days. RGB images were acquired daily from side views at different angles and from the top view, and phenotypic features (height, area, and color) were extracted from the plant images (**Figure 2**). Plant area, as estimated from side-view images, provides a proxy for growth (Neilson et al., 2015) and response to drought. An example of side area over time for the 10 ecotypes is shown in **Figure 3**. Tolerant genotypes (blue lines) show higher accumulation of side area under water stress when compared to SUS (green) and INT (red) genotypes. Under well-watered control conditions there is no obvious distinction in side area accumulation according to the pre-classified tolerance groups (**Figure 3**). It should be noted that Bd21, included as an INT ecotype, started to flower during the course of the drought experiment and, therefore, was excluded from further phenotypic and metabolomic data analyses. To allow balanced multivariate analyses of our data, ABR3 was re-classified (as a line with a borderline phenotype) and moved from the SUS to the INT drought tolerance group. The targeted genotypes were also assessed for stomatal performance under conditions tending toward 0% SWC (**Supplementary Figure S1**). Examination of stomatal performance under drought for each genotype and phenotypic class suggested that none exhibited significantly different responses (P = 0.982). Thus, differential drought tolerance across the Brachypodium genotypes did not arise from stomatal effects.

Unsupervised PCA was applied to the phenotypic data obtained at day 12 for the plants exposed to 15 and 0% SWC and compared to the fully watered 75% SWC controls. Data obtained from different genotypes were classified as TOL, INT, or SUS phenotypes based on our previous screens (**Figure 1**) and the re-classification of ABR3 mentioned before. PCA of phenotypic data obtained for 75% SWC controls suggested that most genotypes were morphologically similar, with the possible exception of two TOL lines, Pal6 and ABR8, which lay outside the 95% confidence interval circles (**Figure 4A**). However, further analyses of these genotypes using ANOVA suggested that their most prominent feature ("area-side") was not significantly different (P = 0.41) from the other genotypes. With imposition of drought to 15% SWC, phenotypic features again did not discriminate between the SUS and INT groups but did separate two TOL genotypes, Pal6 and ABR8, across PC2 (**Figure 4B**). ANOVA suggested that these two, and indeed Mur3, were significantly taller (P = 0.04), as also indicated by box and whisker plots for plant height (**Figure 4B** inset). Considering phenotypic data for plants exposed to 0% SWC, clear differentiation was observed between TOL and SUS genotypes (**Figure 4C**). This was also observed using HCA, where each phenotypic group was clustered (**Figure 4D**). HCA and ANOVA (**Figure 4E**) demonstrated that drought tolerance was associated with significantly increased height and "area side" but with reduced grey and yellow pixels.

#### Metabolite Analyses Indicate Differential Responses to Drought within Each Phenotypic Group

Metabolite analyses based on FIE-MS were performed on leaf samples collected 12 days after initiation of the two drought treatments and compared to well-watered controls. Images of the 9 genotypes recorded immediately before harvest for metabolite analysis are shown in **Supplementary Figure S2**. Non-supervised PCA of the well-watered control samples revealed no distinct clusters in the metabolite profiles of the three pre-defined phenotypic groups (**Figure 5A**). However, distinct clusters linked to relative drought tolerance emerged for the metabolites extracted from plants exposed to 15% SWC (**Figure 5B**), in particular for the TOL ecotypes. More severe drought to 0% SWC resulted in three clearly distinct clusters corresponding to TOL, SUS, and INT phenotypes (**Figure 5C**). The major sources of variation were extracted for treatments to 15 and 0% SWC based on PCA loading vectors and significant differences as identified using ANOVA. These were tentatively identified by database interrogation based on high-resolution MS of the targeted m/z. For the 15% SWC samples the major sources of variation were tentatively associated with the tricarboxylic acid (TCA) cycle intermediates malate and citrate, the phospholipid-derived hormone jasmonate, and unidentified metabolites with m/z of 236.01 and 386.9. HCA suggested that these metabolites alone could discriminate between the TOL and other groups at 15% SWC (**Figure 5D**). All, with the exception of 236.01 m/z, were relatively increased in the TOL grouping. Upon more severe drought (0% SWC) more metabolites were targeted, which again could discriminate the TOL genotypes from the others (**Figure 5E**). These metabolites included jasmonate as well as the other plant hormones gibberellin (GA17) and salicylate, metabolites associated with amino acid metabolism (aspartate, glutamate), chorismate, and lipid metabolism (caprylate). In each case, accumulation of the metabolites was relatively increased in the TOL lines.

<sup>4</sup>http://jgi.doe.gov/our-science/science-programs/plant-genomics/brachypodium/

FIGURE 1 | Assessment of drought stress for selected Brachypodium ecotypes exposed to water deficit. Plant water content (PWC) measures are ranked from high to low (A) and wilting scores from low (no sign of wilting) to high (B) for the 48 ecotypes included in the screen. Arrows indicate the ecotypes selected for further detailed analysis; dashed lines are included to compare the relative rankings of the selected ecotypes between (A) and (B). Both PWC and wilting score data are based on six biological replicates, except for Foz8, Sar24, and Tek3 (five replicates) and Kah6 (four replicates). Origin: dark grey = Spain, light grey = Turkey, black bars = Other. Horizontal bars: PWC > 70% (blue) = ecotypes classified as tolerant (TOL); PWC 55–70% (red) = ecotypes classified as intermediate (INT); PWC < 55% (green) = ecotypes classified as susceptible (SUS). Error bars indicate standard deviation.

FIGURE 2 | Plant phenotyping and image processing. Setup of the high-throughput phenotyping system. A total of 600 Brachypodium plants (10 genotypes) were phenotyped over 12 days (A). Each tray holds four replicate individuals. The tray is rotated four times by an angle of 90°, so that each individual plant has two side views from two orthogonal angles. The top-view image is taken when the tray is at 0° (B). The image processing segments the plant from the background, such as the panels and wall. The program automatically finds the central separating panel to set up two regions of interest (ROIs). The measurements are made within each ROI. The program copes with various lighting conditions and image quality (C). The top-view image processing finds the panels and uses them to separate the four plants in the tray. The total pixels of each plant are then acquired. These plants are denoted by different colors (D).

To further highlight the possible importance of the pathways, m/z corresponding to metabolites within the pathways were extracted (**Figure 6**). In the case of the TCA cycle (**Figure 6A**), the putative TCA metabolites allowed the separate clustering of the TOL genotypes from other phenotypic classes. These data, as well as box and whisker plots, suggested significant increases in the TCA cycle in TOL genotypes. The TCA intermediates 2-oxoglutarate and oxaloacetate also contribute carbon skeletons for amino acid biosynthesis, and examination of m/z tentatively linked to amino acids suggested that lower levels of amino acid accumulation might be linked to a SUS phenotype (**Figure 6B**). Indeed, glutamate, glutamine, and the drought-responsive amino acid proline were significantly elevated in both TOL and INT phenotypes. With alanine, significant increases were only associated with the TOL phenotypes.

Considering m/z tentatively linked with the biosynthesis of jasmonate or salicylate, HCA clustered the TOL genotypes separately (**Figures 6C,D**). The TOL phenotypes exhibited significant increases in key oxylipins, linolenic acid, and the active jasmonate hormones, jasmonic acid (JA) and jasmonate-isoleucine (JA-Ile) (**Figure 6C**). Within the salicylate pathway, only salicylic acid (SA) itself (**Figure 6D**) was significantly increased in TOL phenotypes.

#### Correlation between Metabolite and Phenotypic Traits

Since both the phenotypic data and the metabolite analyses revealed distinct clusters that match the designation of the different drought tolerance groups established during the initial screening, we wished to determine potential correlations between the metabolite pathway data and phenotypic variables. To assess this, a new matrix was derived which incorporated both metabolite and phenotypic data. Given the very different types of data, the data were mean-centered and divided by the standard deviation of each variable. The data from the resulting matrix were established to be normally distributed.
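Mean-centering followed by division by the standard deviation (often called autoscaling) puts variables measured on very different scales, such as pixel counts and relative ion intensities, on a common footing. A minimal Python sketch is given below; the use of the sample standard deviation (n − 1 denominator) and the dictionary-of-columns layout are assumptions made for illustration.

```python
import statistics


def autoscale(values):
    """Mean-center a variable and divide by its sample standard deviation."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # n - 1 denominator (assumed)
    if sd == 0:
        raise ValueError("cannot scale a constant variable")
    return [(v - mean) / sd for v in values]


def autoscale_matrix(columns):
    """Autoscale every variable in a {name: [values per sample]} mapping."""
    return {name: autoscale(vals) for name, vals in columns.items()}
```

After scaling, every variable has a mean of zero and a standard deviation of one, so that no single variable dominates the subsequent correlation analysis simply because of its units.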

The outcome of multivariate correlation analyses (based on Pearson's r as a distance measure) is provided in **Supplementary Figure S3**. These analyses suggested that the putative jasmonate pathway correlated only with itself and with no phenotypic measure. However, a key cluster (arrowed in **Supplementary Figure S3**) suggested positive correlations between phenotypic and metabolite indicators and a negative correlation with "Grey"/"Yellow" pixel counts.

To facilitate visualization of these trends, key variables in this cluster were extracted and separately analyzed (**Figure 7**). This highlighted that plant area ("top" and "side") and height positively correlated with the relative concentrations of TCA intermediates, alanine, and the hormone salicylate. This would suggest that tolerance in these Brachypodium genotypes was related to the accumulation of these metabolites. Linked to this, negative correlations were observed with grey and yellow pixel contents, features linked to susceptibility to drought.

## DISCUSSION

The Poaceae (or grasses), including cereals and grasses of grasslands and pastures, constitute the major source of dietary calories for humans and livestock, are increasingly important as lignocellulosic biomass for biofuels, and define many natural ecosystems and agricultural landscapes. Drought is an important environmental factor limiting the productivity of crops worldwide. The greater the diversity within a species, the better the potential for the species to overcome and adapt to changes in the environment, such as drought. Brachypodium displays considerable phenotypic variation across its geographic range, and is a key model species for cereals, forage grasses and energy grasses (Brkljacic et al., 2011). Its extensive natural variation encompasses many traits of agronomic or adaptive significance and, together with the genetic and genomic tools available, is therefore well positioned for improving agronomic traits.

#### Screening and Ecotype Selection

To assess the natural variation to drought stress and select ecotypes for further detailed studies, we performed two successive screens in which, respectively, 138 and 48 diploid Brachypodium ecotypes, sourced from different geographical locations, were exposed to progressive drought stress. These screens demonstrated the natural variation among Brachypodium ecotypes, in terms of morphology, biomass accumulation and response to drought stress. For instance, final dry weight biomass of the watered control replicates ranged from 0.032 to 0.317 g in the first screen and from 0.183 to 0.318 g in the second screen (data not shown). This was in line with the large phenotypic variation in response to drought stress observed amongst 57 Brachypodium ecotypes that mostly originated from Turkey (Luo et al., 2011).

When assessed for wilting, leaf water content (LWC) and chlorophyll fluorescence (Fv/Fm), these Brachypodium ecotypes were classified into four different groups ranging from TOL to most-SUS to drought stress (Luo et al., 2011). Assessment of genotypic variation has suggested greater genetic diversity in wild Brachypodium individuals from the Western Mediterranean region compared to those from the Eastern (Mur et al., 2011). Here we screened geographically more diverse ecotypes (101 lines collected in Spain, 31 in Turkey, 3 in Iraq, and 1 each from France, Italy, and Croatia), with our second screen only having six ecotypes (Adi10, Bd21, Koz1, Gaz8, Kah6, and Bd2-3) in common with the study by Luo et al. (2011).

Ecotypes were tentatively classified as TOL when exhibiting a PWC higher than 70% and SUS with a PWC lower than 55% with the ecotypes in-between classified as INT (**Figure 1A**). Brachypodium is found in many climate zones throughout temperate regions, spanning habitats near sea level to over 1800 m altitude (Des Marais and Juenger, 2016), suggesting that populations are locally adapted. It was therefore anticipated that similarly classified ecotypes might share similarities with regards to ecological niche. However, based on highland/lowland descriptions or altitude, such shared responses were not apparent in our screens. For example, two out of the three selected drought resistant ecotypes (Mur3 and Pal6) were from the highlands in Spain (ABR8 was from a hillside origin close to Siena in Italy), as were three of the four selected drought-susceptible ecotypes (ABR3, Per3 and Gal10; Koz1 is from the highlands in Turkey).
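The PWC thresholds used for the tentative classification at the start of this paragraph can be expressed as a simple rule. The Python sketch below is illustrative; the function name is hypothetical, and values exactly at 55% or 70% are assigned to the INT class, following the "higher than" and "lower than" wording of the text.

```python
def classify_pwc(pwc_percent):
    """Tentative drought tolerance class from plant water content (%)."""
    if pwc_percent > 70:
        return "TOL"  # tolerant
    if pwc_percent < 55:
        return "SUS"  # susceptible
    return "INT"      # intermediate
```

Under this rule an ecotype with a PWC of almost 80%, such as ABR8, is classed as TOL, and one with a PWC just below 30%, such as Koz1, as SUS.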

A study looking at root phenotypes amongst 81 Brachypodium accessions also found no correlation between the phenotypic diversity and geographical origins (Chochois et al., 2015). This lack of correlation may suggest that the phenotypic plasticity of Brachypodium to changing environmental conditions (i.e., acclimation responses, Des Marais and Juenger, 2016) prevails over ecotypic differentiation (local adaptation, or within-species niche divergence leading to ecotypic differentiation, Liancourt and Tielbörger, 2009). It should be noted, however, that other features associated with ecological niches, such as soil type, acidity, and climate, may still contribute to a shared drought response, but these parameters were not identified in this study.

#### Phenotypical Characteristics of the Different Tolerance Groups

Maintaining plant growth and yield under drought represents a major objective for plant breeding. Growth, caused by cell division and cell elongation, is reduced under drought owing to impaired enzyme activities, loss of turgor, and decreased energy supply (Farooq et al., 2012). Most studies on drought are performed by withholding water, leading to progressive drought stress. However, under natural conditions plants are often exposed to moderate drought stress, in particular in temperate climates. As such, enhanced survival of Arabidopsis thaliana (Arabidopsis) under severe drought was shown not to be a good indicator for improved growth performance under mild drought conditions (Skirycz et al., 2011). In addition to progressive drought (0% SWC), we therefore exposed Brachypodium ecotypes to mild drought stress (15% SWC), utilizing the gravimetric watering control of the NPPC. Image-based projected shoot area allows estimation of above-ground biomass and, when assessed over time, can serve as a useful proxy for overall plant growth (Honsdorf et al., 2014; Neilson et al., 2015). On average, based on projected shoot area, exposure to moderate drought stress resulted in 85% more biomass accumulation during the treatment compared to exposure to severe stress (ranging from 43% for Pal6 to 211% for Gal10). Similarly, moderate drought imposed an average yield penalty of 38% (ranging from 28% for Mur3 to 46% for Gal10) when compared to well-watered controls. Results suggest that under mild drought stress, assessment of plant height, another parameter used to estimate plant biomass in several crop species (Tilly et al., 2015), could be a good predictor for ecotypes that perform well under more severe drought stress. However, the phenotypic features measured after exposure to moderate drought were unable to separate the Brachypodium ecotypes in the TOL, INT, and SUS groups.

In contrast to mild drought, PCA on the phenotypic data after progressive drought showed a clear differentiation between TOL and SUS genotypes. The TOL group was not only characterized by having significantly increased measures for height and side area, but also showed distinct color-related properties with significantly reduced proportions of yellow and grey pixels. While leaf 'greenness' has been shown to correlate with foliar nitrogen and chlorophyll, the relative proportion of yellow pixels is indicative of leaf senescence characterized by yellowing or chlorosis (Rajendran et al., 2009; Li et al., 2014; Neilson et al., 2015). Interestingly, we also identified a "grey" pixel color range that reflected the relationship between greenness and red and blue color pixels. The physical basis of this trait was not determined but could be due to altered pigmentation or water content. Thus, our results suggest reduced or delayed drought induced senescence in the TOL lines. Overall, our phenotypic data analyses upon exposure to progressive drought confirm the classification of the selected ecotypes into TOL, SUS, and INT, and therefore the robustness of the preceding drought screens.

#### Metabolites Associated with Drought Stress

The interaction between genotype and environment is complex. The ability of metabolites to integrate these two components is reflected in an increasing tendency to use metabolites as selection markers in crop breeding programs to accelerate the development of improved cultivars tolerant to drought (Sanchez-Martin et al., 2015). Metabolites with a higher or indeed lower relative accumulation in the TOL lines when exposed to drought not only provide insights into the regulation of metabolic networks under drought stress, but could also lead to metabolite marker development. In this study, we used our well-established metabolomic analysis pipeline (e.g., Lloyd et al., 2011) to mine high-resolution FIE-MS datasets using multivariate approaches to identify major sources of variation in the experimental parameters. Through interrogation of the loading vectors linked to PCA, coupled with ANOVA of the datasets, we identified metabolites which exhibited differential accumulation in the TOL class in response to 0% SWC. Database interrogation with the highly resolved (to 1 ppm) m/z suggested that these metabolites included three phytohormones, metabolites associated with amino acid metabolism, and putative TCA cycle metabolites.

Phytohormones play critical roles in regulating plant responses to stress. Expression profiling of markers for defense-related phytohormones showed that Brachypodium has phytohormone responses more similar to those of rice than of Arabidopsis, suggesting that monocots share a common defense system that is different from that of dicots (Kouzai et al., 2016). Our analyses identified relatively higher levels of GA, SA, and JA in the TOL ecotypes when exposed to severe drought, with the latter also higher in TOL lines under mild drought. A central role for the GA class of growth hormones in the response to abiotic stress is becoming increasingly evident (Colebrook et al., 2014). For instance, exogenous application of GA can alleviate drought-imposed adverse effects in maize (Akter et al., 2014), although application of GA has also been shown to increase shoot height of dwarf barley lines, negating the increased stress tolerance exhibited by the dwarf plants (Vettakkorumakankav et al., 1999). SA accumulation has been reported to improve drought tolerance in Arabidopsis by inducing stomatal closure (Khokon et al., 2011) and inhibiting stomatal opening (Okuma et al., 2014). Similarly, in oats, tolerance to drought has been at least partially associated with the accumulation of SA, again by influencing stomatal opening (Sanchez-Martin et al., 2015). However, as for GA, the effect of SA on drought tolerance is complex, and others have reported a reduction of drought tolerance by SA application (Miura and Tada, 2014). Since no differential effect on stomatal closure was observed upon progressive drought between the Brachypodium ecotypes, SA may be involved in alternative drought-induced regulatory mechanisms.

An interesting question is how far SA could be influencing the two primary metabolism pathways targeted in our study: the TCA cycle and amino acid metabolism. The TCA cycle is a crucial component of respiratory metabolism and is often altered in plants experiencing stress. Drought-induced accumulation of TCA cycle metabolites (Urano et al., 2009) and the upregulation of TCA cycle-related genes in Arabidopsis shoots (Cavalcanti et al., 2014) have been reported. Metabolic differences in the stress tolerance of four different lentil genotypes were related to a reduction in the levels of TCA cycle intermediates, suggesting an impaired energy metabolism with consequences for the ability of seedlings to acquire water and to support transport processes (Muscolo et al., 2015). SA has a well-characterized role in maintaining mitochondrial electron flow under stress conditions by inducing the expression of alternative oxidase (AOX) (Feng et al., 2009). This will influence the NADH oxidation by mitochondrial complex I, which is coupled to the TCA cycle (Vanlerberghe, 2013). Thus, TOL Brachypodium genotypes could be exhibiting an SA-AOX mechanism of maintaining bioenergetic metabolism during drought.

Amino acid accumulation, in particular of proline, is considered a protective mechanism in many water-stressed plants (Rai, 2002). Proline has been shown to increase in several different plant species under drought stress, including maize, wheat, and Miscanthus (Rampino et al., 2006; Witt et al., 2012; Ings et al., 2013), and is thought to function primarily as an osmoprotectant, thereby protecting cells from damage caused by stress (Delauney and Verma, 1993). Indeed, overproduction of proline has been shown to result in increased tolerance to osmotic stress in transgenic plants (Kishor et al., 1995; Zhu et al., 1998; Yamada et al., 2005). The lower levels of both proline and glutamate, which can act as the precursor for proline, in the SUS class suggest increased sensitivity to drought-induced osmotic damage in the ecotypes within this class. In this context, there is evidence suggesting SA can influence the accumulation of some amino acids upon drought stress. A recent study focusing on drought in creeping bentgrass (Agrostis stolonifera) showed that SA conferred drought tolerance and also the accumulation of proline, serine, threonine, and alanine (Li et al., 2016). However, an SA-influenced increase in the accumulation of carbohydrates, as noted in the Li et al. (2016) study, was not prominent in our results.

Despite increasing evidence for the involvement of JA in drought stress, there is little knowledge about its actual role in drought stress signaling, particularly when compared to its involvement in the response to biotic stresses (Du et al., 2013). In some studies, JA seems to improve drought tolerance, while in others it has been reported to cause a reduction in growth and yield (Riemann et al., 2015). Interestingly, JA was one of the five metabolites contributing to the discrimination between TOL and SUS/INT under mild stress. A JA-synthesizing lipoxygenase was among the most interacting genes in a regulatory interaction network analysis of Arabidopsis exposed to mild drought stress (Clauw et al., 2015), in agreement with a role of JA in the response to mild drought. Higher levels of GA, SA, and JA in the TOL Brachypodium ecotypes when exposed to severe drought might suggest a protective role for these hormones. Clearly, cross-talk between these and other hormones (including ABA, ethylene, and auxins) and their associated signal transduction elements may play important roles in the response to drought stress. Further studies are necessary to confirm that increases in GA, SA, and JA are indeed involved in establishing the drought tolerance phenotype exhibited by the TOL Brachypodium ecotypes. In particular, the possible role of JA in influencing primary metabolism to confer drought tolerance in grasses needs to be assessed.

Since analysis of both phenotypic and metabolite data, obtained from the same plant material upon progressive drought, revealed the TOL-SUS-INT clusters, there was an opportunity to assess potential correlations between the two data-sets to suggest metabolic events that could be contributing to the phenotypic changes. Biomass yield and delayed leaf senescence (stay-green) rank among the most important traits for improvement of crop plants under drought stress (Rivero et al., 2007; Salekdeh et al., 2009). Importantly, our analyses identified a number of metabolites (TCA-cycle intermediates, alanine, and SA; as well as JA; see **Supplementary Figure S3**) that correlate positively with phenotypic measures for biomass yield (area and height) and negatively with measures for stress (yellow and grey pixels). These observations would strongly suggest that SA and JA stress hormone signaling is important for maintaining some plant growth and reducing stress phenotypes. Although there is limited biological replication of the metabolite data, such conclusions highlight the value of integrative 'omic' approaches in providing novel insights into plant phenomena.

## CONCLUSION

Drought screens of a diverse Brachypodium ecotype collection, integrated with phenotypic and metabolite profiling of selected Brachypodium ecotypes, highlight the variation in the response of Brachypodium ecotypes to water stress. Combined with its genotypic diversity (Gordon et al., 2014), this confirms the value of Brachypodium as a powerful model for the improvement of cereal, bioenergy, forage, and turf grasses to changing environmental conditions, including drought stress. The combination of phenotypic analysis and metabolite profiling revealed a remarkable consistency in separating the selected ecotypes into their pre-defined drought tolerance groups, highlighting the value of multivariate analysis as a robust approach to analyze complex phenotypic and metabolite data sets. The relative abundance of several metabolites, including the phytohormones SA and JA, appears to correlate with biomass yield and reduced stress features. Although further studies are necessary to validate these findings, the results highlight the potential advantage of combining the analyses of phenotypic and metabolic responses to identify key mechanisms and markers associated with drought tolerance in crops. Overall, this work shows that different phenotyping assays could reliably identify drought-tolerant and sensitive Brachypodium lines, while metabolite analysis may be predictive. Whilst the current study only provides correlations, future work will be aimed at gaining a better understanding of the genetic basis of the observed differences in the responses to drought stress between selected Brachypodium ecotypes and at investigating causation using recombinant inbred populations (Bettgenhaeuser et al., 2016) and reverse genetics approaches.

### AUTHOR CONTRIBUTIONS


MB and LM conceived and designed the experiments. LF carried out drought screening and associated analyses and helped with phenotyping. FC and JD supervised plant phenotyping and JH performed image processing and extraction of phenotypic measures. AA performed stomatal conductance measurements. LM supervised metabolite profiling and performed data analyses. TD and KN helped with project design and provided project resources. MB and LM wrote the manuscript and LF, FC, TD, KN, and JD provided critical comments for manuscript improvement. All authors have read the manuscript and agree with its content.

## FUNDING

Funding for this research was provided by the Biotechnology and Biological Sciences Research Council (BBSRC) in the form of an industrial CASE Ph.D. studentship (BB/I016872/1) with DLF Seeds A/S as industrial partner. Access to the National Plant Phenomics Centre (BB/J004464/1) was enabled through a Transnational Access European Plant Phenotyping Network (EPPN) grant (EPPN, Grant Agreement No. 284443) to TD and KN funded by the FP7 Research Infrastructures Programme of the European Union.

#### ACKNOWLEDGMENTS

The authors wish to thank Manfred Beckmann and Kathleen Taillard (IBERS, Aberystwyth University) for excellent technical assistance with FIE-MS. We would also like to acknowledge the support of Ray Smith and Tom Thomas (Aberystwyth, UK) for maintaining many of the plants used in the work and staff at the National Plant Phenomics Centre.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.01751/full#supplementary-material

FIGURE S1 | Stomatal performance in selected Brachypodium ecotypes. At 5 weeks old, the nine Brachypodium accessions were either watered on a daily basis or droughted through non-watering. Stomatal conductance was determined at mid-point in the light period for each plant using a porometer every other day. Results are presented as mean conductance (n = 5 plants ± SE). Results are presented in three registers corresponding to susceptible (SUS), intermediate (INT) and tolerant (TOL) genotypes, respectively.

FIGURE S2 | Brachypodium ecotypes 12 days after initiation of treatments (45 das). Side image photographs show two out of the four plants harvested for metabolite profiling for each of the ecotypes. TOL, tolerant; INT, intermediate; SUS, susceptible.

FIGURE S3 | Correlation analyses of phenotypic and metabolomic data. Metabolites tentatively linked to four key pathways as targeted by unbiased chemometric analyses were correlated with phenotypic datasets. The degree of Pearson's correlation is indicated as the intensity of red (positive) or green (negative) color. r<sup>2</sup> values are provided in Supplementary Table S4.

TABLE S1 | Overview of the Brachypodium distachyon germplasm collection used for the two drought screens. Origin and further geographical information available is listed for each of the 138 ecotypes. All accessions are from the IBERS collection, except <sup>∗</sup>Benavente collection; <sup>∗∗</sup>ABR collection; <sup>ˆ</sup>Garvin collection; <sup>ˆˆ</sup>Vogel collection. Ecotypes included in the second drought screen are highlighted in grey.

TABLE S2 | Mean wilting scores obtained for each ecotype included in the first drought screen. Scores are averages of three individual plants except <sup>∗</sup>, based on two plants. Ecotypes included in the second drought screen are highlighted in grey. SD, standard deviation.

TABLE S3 | Normalized shoot area data. Normalized shoot area data, derived from side images, used to prepare graphs shown in Figure 3. Standard deviation of the measurements is included (SD). Abbreviations: das, days after sowing; S-area, side area; SWC, soil water content.

TABLE S4 | Multivariate regression analyses (R<sup>2</sup>) of phenomic and metabolomics data.

#### REFERENCES

Brachypodium distachyon. BMC Plant Biol. 16:59. doi: 10.1186/s12870-016-0749-9

introgression lines for tomato improvement. Nat. Biotechnol. 24, 447–454. doi: 10.1038/nbt1192

drought stress in Brachypodium distachyon leaves. Mol. Plant 6, 311–322. doi: 10.1093/mp/sss098


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Fisher, Han, Corke, Akinyemi, Didion, Nielsen, Doonan, Mur and Bosch. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

fpls-07-00619 May 4, 2016 Time: 13:43 # 1

# Public Availability of a Genotyped Segregating Population May Foster Marker Assisted Breeding (MAB) and Quantitative Trait Loci (QTL) Discovery: An Example Using Strawberry

James F. Hancock<sup>1</sup> \*, Suneth S. Sooriyapathirana<sup>2</sup> , Nahla V. Bassil<sup>3</sup> , Travis Stegmeir<sup>4</sup> , Lichun Cai<sup>1</sup> , Chad E. Finn<sup>5</sup> , Eric Van de Weg<sup>6</sup> and Cholani K. Weebadde<sup>7</sup>

#### Edited by:

Alejandro Del Pozo, Universidad de Talca, Chile

#### Reviewed by:

Richard Jonathan Harrison, East Malling Research, UK; Peter Douglas Savaria Caligari, Verdant Bioscience, Indonesia

#### \*Correspondence:

James F. Hancock hancock@msu.edu

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

Received: 04 December 2015 Accepted: 22 April 2016 Published: 09 May 2016

#### Citation:

Hancock JF, Sooriyapathirana SS, Bassil NV, Stegmeir T, Cai L, Finn CE, Van de Weg E and Weebadde CK (2016) Public Availability of a Genotyped Segregating Population May Foster Marker Assisted Breeding (MAB) and Quantitative Trait Loci (QTL) Discovery: An Example Using Strawberry. Front. Plant Sci. 7:619. doi: 10.3389/fpls.2016.00619

<sup>1</sup> Department of Horticulture, Michigan State University, East Lansing, MI, USA, <sup>2</sup> Department of Molecular Biology and Biotechnology, Faculty of Science, University of Peradeniya, Peradeniya, Sri Lanka, <sup>3</sup> United States Department of Agriculture – Agricultural Research Service, National Clonal Germplasm Repository, Corvallis, OR, USA, <sup>4</sup> Lassen Canyon Nursery Breeding Program, Redding, CA, USA, <sup>5</sup> United States Department of Agriculture – Agricultural Research Service, Horticultural Crops Research Unit, Corvallis, OR, USA, <sup>6</sup> Wageningen UR Plant Breeding, Wageningen, Netherlands, <sup>7</sup> Department of Plant, Soil and Microbial Sciences, Michigan State University, East Lansing, MI, USA

Much of the cost associated with marker discovery for marker assisted breeding (MAB) can be eliminated if a diverse, segregating population is generated, genotyped, and made available to the global breeding community. Herein, we present an example of a hybrid, wild-derived family of the octoploid strawberry that can be used by other breeding programs to economically find and tag useful genes for MAB. A pseudo test cross population between the two wild species Fragaria virginiana and F. chiloensis (FVC 11) was generated and evaluated for a set of phenotypic traits. A total of 106 individuals in FVC 11 were genotyped for 29,251 single nucleotide polymorphisms (SNPs) utilizing a commercially available, genome-wide scanning platform (Affymetrix Axiom IStraw90TW). The marker trait associations were deduced using TASSEL software. The FVC 11 population segregated for daughters per mother, inflorescence number, inflorescence height, crown production, flower number, fruit size, yield, internal color, soluble solids, fruit firmness, and plant vigor. Coefficients of variation ranged from 10% for fruit firmness to 68% for daughters per mother, indicating an underlying quantitative inheritance for each trait. A total of 2,474 SNPs were found to be polymorphic in FVC 11 and strong marker trait associations were observed for vigor, daughters per mother, yield and fruit weight. These data indicate that FVC 11 can be used as a reference population for quantitative trait loci detection and subsequent MAB across different breeding programs and geographical locations.

Keywords: Fragaria × ananassa, F. chiloensis, F. virginiana, association mapping, quantitative trait analysis (QTL)

## INTRODUCTION


The expense of developing a dense linkage map of simple sequence repeats (SSRs) (Sargent et al., 2012) and/or single nucleotide polymorphisms (SNPs) (Bassil et al., 2015) to conduct marker assisted breeding (MAB) (Collard and Mackill, 2008) is prohibitive for many breeding programs, particularly in developing countries. Much of this cost can be circumvented, however, if a diverse, segregating population that has already been genotyped using markers from a published linkage map is made available to the global breeding community. That segregating population could be evaluated by the recipient breeding program in situ for the traits of interest, and the published linkage map could then be used to search for markers associated with those traits. The only costs incurred by the recipient breeding program would be those associated with phenotyping, computer analysis, and use of only the relevant markers. The high costs associated with the generation of a dense linkage map of a segregating population could be avoided, saving the recipient breeding program hundreds of thousands of dollars in development costs. Costs could be further reduced by utilizing publicly available association mapping software such as 'TASSEL' (Bradbury et al., 2007; Khan, 2011).

In this paper, we describe a hybrid population of the octoploid strawberry that should be a rich source of novel genes for the global breeding community. We have genotyped this population for 29,251 SNPs that were previously mapped in another segregating population (Bassil et al., 2015). The objective of this paper is to show how this population can be used by other breeding programs to economically find and tag useful genes for MAB. A similar approach could be used in all crops to make MAB much more available to breeding programs with limited resources.

## MATERIALS AND METHODS

#### Segregating Population

We have generated a hybrid population of the octoploid strawberry consisting of four subspecies. The primary cultivated strawberry, Fragaria × ananassa Duchesne ex Rozier, initially arose from a hybridization between F. chiloensis (L.) Miller subsp. chiloensis forma chiloensis and F. virginiana Miller subsp. virginiana in Europe 250 years ago (Hancock, 1999). Wild-collected clones of both species have been evaluated in multiple locations to identify the possible beneficial traits that could be incorporated into the cultivated strawberry and thereby select elite germplasm (Hancock et al., 2001a,b). Elite selections of F. virginiana and F. chiloensis were intercrossed in 23 combinations and evaluated in the field in Michigan and Oregon (Hancock et al., 2010). The most impressive family was FVC-11 [(Frederick 9 × LH 50-4) × (Scotts Creek × 2 MAR 1A)], which had the best combination of fruit size, color, and yield and was composed of four different subspecies: F. virginiana ssp. virginiana from Ontario (Frederick 9, PI 612493), F. virginiana ssp. glauca from Montana (LH 50-4, PI 612495), F. chiloensis ssp. chiloensis forma patagonica from Chile (2 MAR 1A, PI 602567), and F. chiloensis ssp. pacifica from California (Scotts Creek, PI 612490). This population likely contains as much diversity as is possible in a single octoploid strawberry family, as it is composed of four different subspecies from four distant ecological regions spanning two continents.

#### Marker Development

We genotyped 106 individuals in the FVC 11 family utilizing a commercially available, genome-wide scanning platform (Affymetrix Axiom IStraw90TW), according to the manufacturer's instructions (Affymetrix, Inc., Santa Clara, CA, USA). This platform was developed as part of the international RosBREED project, focused on enabling marker-assisted breeding through identification and validation of QTLs for traits of importance to breeders (Iezzoni et al., 2010; Bassil et al., 2015). Short-read sequences from one diploid and 19 octoploid accessions were aligned to the diploid F. vesca 'Hawaii 4' reference genome to identify SNPs and indels for incorporation into the array. A total of 95,062 marker loci (SNPs, indels, and haploSNPs) were included on the array. After removing monomorphic, segregation-distorted, and ambiguous markers by applying the necessary data filters in Excel, we identified polymorphisms in the FVC 11 family in a subset of 6,594 SNPs that had already been placed in a dense genetic linkage map of the full-sib family 'Holiday' × 'Korona' (van Dijk et al., 2014; Bassil et al., 2015). This gave us the physical positions of our segregating markers for the subsequent association analysis.

#### Phenotypic Data

Seventy-eight genotypes of the FVC 11 population were evaluated at Benton Harbor, MI in 2007 and 2008 for their daughter plant production, inflorescence number, inflorescence height, crown production, flower number, fruit size, yield, internal color, soluble solids, and fruit firmness. In June 2007, two to three replicates (runner plants) of each genotype were planted in the field in Benton Harbor, MI, in a randomized complete block design. Plants were set in rows at 1.2 m × 1.2 m spacing and all runners were trained by cross-cultivation into a 0.5 m wide square.

A total of three random inflorescences were selected per mother and daughter plant; their heights were measured from crown to tip and their flowers were counted. The number of crowns was also counted on each mother plant and the three daughter plants, as was the total number of daughters produced by each mother plant (the original plants set in the field). Overall plant vigor was estimated on a 1–7 (least to most vigorous) scale based on plot fill and individual plant vigor. The first five ripe fruit were harvested to determine an average fruit weight, and after another five fruits had ripened all ripe and unripe fruits were picked to determine yield. Fruit firmness (g mm<sup>−2</sup>) was measured on five ripe fruit per plot (when available) using the compression test of BioWorks' FirmTech 2 (Wamego, KS, USA). Two ripe fruits from each replication were cut in half and percent internal color was estimated visually based on how deep the color penetrated the flesh. Soluble solids were measured by squeezing one drop of juice from each of the two fruits onto a handheld refractometer, giving two separate readings. These data were previously reported by Stegmeir et al. (2010).

#### Association Analysis


Marker-trait associations were determined using TASSEL 5 software (Bradbury et al., 2007). Diallelic SNP markers were coded numerically: the two homozygous classes were designated 0 and 1, and the heterozygous class 0.5. Monomorphic markers, markers with ≥15% missing data, and markers whose genotypes were genetically ambiguous were removed prior to the analysis. The numerical marker and trait data were uploaded to TASSEL and kinship was calculated as in a classical association analysis. A mixed linear model (MLM) was then used to find significant marker-trait associations. The strength of each association was expressed as −log<sub>10</sub>(p), where p is the probability estimate, and the threshold for reporting the most significant marker-trait associations was set to 3.00.
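The coding, filtering, and reporting rules described above can be sketched in a few lines of Python. This is a minimal illustration rather than the TASSEL workflow itself: the call labels (`AA`, `AB`, `BB`), the function names, and the choice to treat any unrecognized (ambiguous) call as missing are assumptions made for the example.

```python
import math

# Hypothetical call labels: "AA" and "BB" are the two homozygous classes,
# "AB" is the heterozygous class. Anything else is treated as missing.
CODES = {"AA": 0.0, "AB": 0.5, "BB": 1.0}


def encode(calls):
    """Convert genotype calls to 0 / 0.5 / 1; unknown or missing calls become None."""
    return [CODES.get(call) for call in calls]


def keep_marker(encoded, max_missing=0.15):
    """Return False for markers with >= 15% missing data or a single observed class."""
    missing = encoded.count(None)
    if missing / len(encoded) >= max_missing:
        return False
    observed = {g for g in encoded if g is not None}
    return len(observed) > 1  # a monomorphic marker cannot show an association


def is_reported(p_value, threshold=3.0):
    """An association is reported when -log10(p) reaches the threshold."""
    return -math.log10(p_value) >= threshold
```

With a threshold of 3.00, an association is reported only when its p value is at or below 0.001.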

## RESULTS AND DISCUSSION

#### Phenotypic Variability

As previously reported by Stegmeir et al. (2010), significant variation was observed among the progeny for their inflorescence number, inflorescence height, crown production, flower number, fruit size, yield, internal color, soluble solids, fruit firmness, and plant vigor (**Table 1**). Coefficients of variation (CVs) ranged from 10% for fruit firmness to 68% for yield. When progeny means were compared with the parental means, many traits exhibited transgressive segregation, most notably yield and fruit weight. Of the two parents, F. virginiana had the highest values for vigor, crown production, inflorescence number, inflorescence height, flower number, yield, and depth of fruit color, while F. chiloensis had the highest values for fruit weight and firmness.
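For readers reproducing the variability summary on their own phenotypic data, the coefficient of variation is the standard deviation expressed as a percentage of the mean. The short Python sketch below assumes the sample standard deviation is used (the text does not state which estimator was applied), and the firmness values are invented for illustration only.

```python
import statistics


def coefficient_of_variation(values):
    """CV (%) = sample standard deviation / mean * 100."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical firmness readings for a handful of progeny, not data from the study.
example_firmness = [8, 10, 12]
cv_percent = coefficient_of_variation(example_firmness)
```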

#### Genotypic Variability and Association Analysis

The FVC 11 segregating population proved to be highly polymorphic. Out of the 6,594 SNPs that had been placed on the genetic linkage map of 'Holiday' × 'Korona' by Bassil et al. (2015), we found 37.42% to be polymorphic in our FVC-11 population. In the subsequent QTL analysis, we found a number of SNP markers that were closely linked to important horticultural traits, with highly significant p values [−log<sub>10</sub>(p) ≥ 3.0]. We discovered a SNP on LG 6 (AX-89796183) that was associated with both plant vigor and the production of daughter plants.

TABLE 1 | The mean, range, and coefficient of variation (CV) for 15 traits evaluated in the strawberry FVC 11 family at Benton Harbor, MI in 2008 (Stegmeir et al., 2010).


TABLE 2 | Significant marker-trait associations found in the FVC 11 population.



SNPs in two regions of LG 2 were associated with fruit weight (AX-89904664, AX-89880679, and AX-89823518). A SNP on LG 5 (AX-89849271) was associated with yield per plant, while two other SNPs on LG 1 (AX-898760359) and LG 6 (AX-89898803) were significantly associated with yield per plot (**Table 2**).

## CONCLUSION

Based on the results described above, it is clear that the FVC-11 population is highly polymorphic for numerous horticulturally important traits and could provide useful genetic variability to other strawberry breeding programs. It is likely that the population also segregates for a number of other important horticultural traits that we did not evaluate. The original F. virginiana parent Frederick 9 is resistant to powdery mildew and leaf scorch, and very highly resistant to the northern root knot nematode. The other original F. virginiana parent LH-50-4 is unusually cold-hardy and is also very highly resistant to the northern root knot nematode. The F. chiloensis parent Scotts Creek has high salt and drought tolerance, and low nutrient needs. The other F. chiloensis parent MAR 1A also has high salt and drought tolerance, and low nutrient requirements. The environmental adaptations of the parents ranged from the Mediterranean climates where Scotts Creek and MAR 1A grew, to the high elevation Rocky Mountain habitat of LH 50-4 and the mid-south Canadian, continental climate of Frederick 9.

We have demonstrated how an existing, soon to be published linkage map can be used to find QTL cheaply, using publicly available software (e.g., 'TASSEL') (Bradbury et al., 2007; Khan, 2011) and a commercially available SNP array. We will maintain the FVC 11 family at Michigan State University and will make plants available to other interested strawberry breeders until at least December 31, 2017, so that other global programs can search for breeder-friendly markers and use these informative markers themselves. We ask only that the recipients pay for any required phytosanitary analysis and shipping costs.

#### AUTHOR CONTRIBUTIONS

JH – Designed study and was primary source of funding, wrote manuscript. SS – Collected leaf samples for SNP analysis, conducted QTL analysis, generated tables for manuscript. NB – Supervised DNA extraction and development of SNPs. TS – Collected phenotypic data. LC – Organized SNP data for analysis. CF – Generated FVC family that was analyzed. EW – Provided unpublished linkage map for QTL analysis. CW – Was secondary source of funding, played integral role in planning and interpretation of data.

## FUNDING

Partially funded by USDA's National Institute of Food and Agriculture "Specialty Crop Research Initiative project", "RosBREED: Enabling marker-assisted breeding in Rosaceae" (2009-51181-05808) and AgBioResearch, Michigan State University, East Lansing.

#### REFERENCES

Hancock, J. F. (1999). Strawberries. Wallingford: CABI Publishing.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Hancock, Sooriyapathirana, Bassil, Stegmeir, Cai, Finn, Van de Weg and Weebadde. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Utilization of Molecular, Phenotypic, and Geographical Diversity to Develop Compact Composite Core Collection in the Oilseed Crop, Safflower (Carthamus tinctorius L.) through Maximization Strategy

#### Edited by:

*Rodomiro Ortiz, Swedish University of Agricultural Sciences, Sweden*

#### Reviewed by:

*Craig Wood, Plant Industry (CSIRO), Australia; Fred Stoddard, University of Helsinki, Finland*

#### \*Correspondence:

*Shailendra Goel shailendragoel@gmail.com Arun Jagannath jagannatharun@yahoo.co.in*

#### †Present Address:

*Murali T. Variath, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India*

*‡These authors have contributed equally to this work.*

#### Specialty section:

*This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science*

Received: *30 June 2016* Accepted: *03 October 2016* Published: *19 October 2016*

#### Citation:

*Kumar S, Ambreen H, Variath MT, Rao AR, Agarwal M, Kumar A, Goel S and Jagannath A (2016) Utilization of Molecular, Phenotypic, and Geographical Diversity to Develop Compact Composite Core Collection in the Oilseed Crop, Safflower (Carthamus tinctorius L.) through Maximization Strategy. Front. Plant Sci. 7:1554. doi: 10.3389/fpls.2016.01554*

Shivendra Kumar<sup>1</sup>‡, Heena Ambreen<sup>1</sup>‡, Murali T. Variath<sup>1</sup>†, Atmakuri R. Rao<sup>2</sup>, Manu Agarwal<sup>1</sup>, Amar Kumar<sup>1</sup>, Shailendra Goel<sup>1</sup> \* and Arun Jagannath<sup>1</sup> \*

*<sup>1</sup> Department of Botany, University of Delhi, New Delhi, India, <sup>2</sup> Centre for Agricultural Bioinformatics, Indian Council of Agricultural Research-Indian Agricultural Statistics Research Institute, New Delhi, India*

Safflower (*Carthamus tinctorius* L.) is a dryland oilseed crop yielding high quality edible oil. Previous studies have described significant phenotypic variability in the crop and used geographical distribution and phenotypic trait values to develop core collections. However, the molecular diversity component was lacking in the earlier collections thereby limiting their utility in breeding programs. The present study evaluated the phenotypic variability for 12 agronomically important traits during two growing seasons (2011–12 and 2012–13) in a global reference collection of 531 safflower accessions, assessed earlier by our group for genetic diversity and population structure using AFLP markers. Significant phenotypic variation was observed for all the agronomic traits in the representative collection. Cluster analysis of phenotypic data grouped the accessions into five major clusters. Accessions from the Indian Subcontinent and America harbored maximal phenotypic variability with unique characters for a few traits. MANOVA analysis indicated significant interaction between genotypes and environment for both the seasons. Initially, six independent core collections (CC1–CC6) were developed using molecular marker and phenotypic data for two seasons through POWERCORE and MSTRAT. These collections captured the entire range of trait variability but failed to include complete genetic diversity represented in 19 clusters reported earlier through Bayesian analysis of population structure (BAPS). Therefore, we merged the three POWERCORE core collections (CC1–CC3) to generate a composite core collection, CartC1 and three MSTRAT core collections (CC4–CC6) to generate another composite core collection, CartC2. 
The mean difference percentage, variance difference percentage, variable rate of coefficient of variance percentage, coincidence rate of range percentage, Shannon's diversity index, and Nei's gene diversity for CartC1 were 11.2, 43.7, 132.4, 93.4, 0.47, and 0.306, respectively while the corresponding values for CartC2 were 9.3, 58.8, 124.6, 95.8, 0.46, and 0.301. Each composite core collection represented the complete range of phenotypic and genetic variability of the crop including 19 BAPS clusters. This is the first report describing development of core collections in safflower using molecular marker data with phenotypic values and geographical distribution. These core collections will facilitate identification of genetic determinants of trait variability and effective utilization of the prevalent diversity in crop improvement programs.

Keywords: safflower, phenotypic data, AFLP, regional gene pools, Maximization (M) strategy, MSTRAT, POWERCORE, core collection

#### INTRODUCTION

Safflower (Carthamus tinctorius L.) is a dryland oilseed crop widely adapted to grow over a broad range of geographical locations extending from the Far East to the American region (Dajue and Mündel, 1996). It was initially cultivated for extraction of dyes and subsequently gained importance as a source of edible oil due to its nutritionally desirable composition of plant-based unsaturated fatty acids, namely oleic and linoleic acid (Ashri et al., 1977; Dajue and Mündel, 1996; Khan et al., 2009). In addition, the medicinal properties of safflower and its use as a system for production of pharmaceutical products are well documented (Weiss, 1983; McPherson et al., 2009; Carlsson et al., 2014). Safflower is severely affected by several biotic and abiotic stresses and is characterized by low yield and a spiny nature, which have discouraged farmers from adopting its cultivation in several countries including India (Nimbkar, 2008). Moreover, the breeding lines and cultivars of safflower harbor low genetic diversity (Kumar et al., 2015), which restricts their utility in breeding programs. Therefore, an extensive characterization of the prevalent genetic and phenotypic diversity among the global germplasm of the crop is required to facilitate development of effective crop improvement strategies.

Germplasm resources act as a reservoir for trait variability and are of prime importance for crop improvement. However, their large size and heterogeneous structure restrict their accessibility and application (Brown, 1989a,b; Noirot et al., 1996; van Hintum, 2000). For effective management and utilization of these resources, Frankel (1984) introduced the concept of "core collection." A core collection is a representative subset of a minimum number of non-redundant individuals capturing the maximum variability prevalent in the entire germplasm collection. Characterization and evaluation of a core collection is an easier task compared to the entire germplasm collection. Initially, core collections were developed using morphological parameters and/or geographical distribution (Huaman et al., 1999; Tai and Miller, 2001; Upadhyaya and Ortiz, 2001; Upadhyaya et al., 2003, 2009; Li et al., 2005; Bhattacharjee et al., 2007; Mahalakshmi et al., 2007). Subsequently, availability of molecular markers and their greater efficacy in elucidating genetic diversity have facilitated the development of more robust core collections using molecular markers either alone (Zhang et al., 2009) or in conjunction with phenotypic data in various crop species (Wang et al., 2006; Ebana et al., 2008; Shehzad et al., 2009; Belaj et al., 2012; Díez et al., 2012; Liu et al., 2015).

Until now, efforts to consolidate safflower genetic resources into core collections were based on assessment of morphological traits and geographical distribution. Johnson et al. (1993) developed the first core collection in safflower consisting of 210 accessions by evaluating a germplasm collection of 2042 accessions from ∼50 countries. Dwivedi et al. (2005) developed another core collection comprising 570 accessions from a total collection of 5522 safflower accessions from 38 countries. However, since most agronomically important traits are quantitative in nature, they are significantly influenced by genotype × environment (GE) interactions. Therefore, the data types (morphological and geographical information) used for development of the initial core collections in safflower would have under-represented the genetic diversity present in the crop due to lack of allelic information. Efforts are required to include genetic diversity based on molecular markers for development of a more effective and robust core collection in safflower.

The present study describes the phenotypic evaluation of a global representative collection of 531 safflower accessions and development of a robust core collection in safflower using maximization strategy. To the best of our knowledge, this is the first report of a composite core collection in safflower utilizing molecular variability along with geographical distribution and phenotypic data. This collection will be useful in designing crop improvement programs in a more effective manner and in dissecting the molecular determinants of trait variability.

#### MATERIALS AND METHODS

#### Germplasm Resources

The safflower germplasm used in the present study comprised 531 accessions. The details of the accessions, including their PI numbers, country of origin, and regional pool, along with the strategy used for their selection, have been described by Kumar et al. (2015).

#### Measurement of Phenotypic Data

The accessions were grown and characterized in two consecutive seasons (2011–12 and 2012–13) at the Agricultural Research Station, University of Delhi, Bawana Road, New Delhi, India (latitude 28°38′ N, longitude 77°12′ E, altitude 252 m). Ten seeds of each accession were sown in a single 2 m row with an average distance of 0.2 m between plants and a gap of 0.6 m between rows. Locally adopted agronomic practices were followed for raising a healthy crop.

Phenotypic characterization was done following the guidelines of International Plant Genetic Resources Institute (IPGRI) for safflower. Each accession was characterized for 12 traits which included 8 pre-harvest and 4 post-harvest traits. The pre-harvest traits were growth habit (GH), plant height (PH), spininess (SP), number of primary branches (PB), branch location (BL), number of heads per plant (HD), flower color (FC), and days to 50% flowering (DTF). The post-harvest traits were 100-seed weight (SW), seed oil content (OC), oleic acid content (OA), and linoleic acid content (LA). The data was recorded for three healthy plants of each accession.

Growth habit of the plant was recorded as "erect" or "sprawling" on ground. For plant height, main shoot length was measured from the soil surface to the highest inflorescence of the plant. Spininess of the accessions was recorded at the onset of flowering and reported as "present" or "absent." The number of branches originating from the main axis was counted as the number of primary branches. Distribution of primary branches on the main shoot determined branch location and was categorized as basal, upper one-third, upper two-thirds, or from base to apex of the plant. The total number of inflorescences (primary, secondary, and tertiary) per plant was recorded as the number of heads per plant. Flower color was documented as yellow, orange, red, or off-white at full bloom stage. For each accession, the number of days from planting to onset of flowering in 50% of plants was considered as days to 50% flowering. The weight of 100 achenes from each plant was measured in grams and recorded as 100-seed weight. Oil content was measured by Near-Infrared Reflectance Spectroscopy (NIRS) (Foss, Germany). Oil content in seed samples of 300 safflower accessions was estimated by the Soxhlet method and used for the development and calibration of NIRS equations for oil content measurement in safflower (manuscript under preparation). Fatty acid composition (oleic and linoleic acid content) was determined by methyl esterification followed by gas chromatography using Clarus 580 (Perkin Elmer, USA) as per the manufacturer's instructions.

#### Statistical Analysis of Phenotypic Data

Phenotypic correlations between different quantitative traits (computed as Pearson correlation coefficient, r), cluster analysis based on Euclidean distance and two-dimensional Principal coordinate analysis (PCoA) were performed using PAST version 3.10 (Hammer et al., 2001). Frequency distribution of accessions for different classes of traits was calculated. Evaluation of seasonal variation for the traits under consideration was conducted through Multivariate Analysis of Variance (MANOVA) using SPSS version 18 (Statistical Package for the Social Sciences; SPSS Inc. Released, 2009. PASW Statistics for Windows, Version 18.0. Chicago: SPSS Inc.).

#### Development of Core Collections

MSTRAT (Gouesnard et al., 2001) and POWERCORE (Kim et al., 2007) were used for development of independent core collections using phenotypic data of seasons 2011–12, 2012–13 and genotypic data reported by Kumar et al. (2015). In MSTRAT, 20 replicates and 100 iterations were tested at a fixed sample size of 10%. The core collection with highest Shannon's diversity index was selected. POWERCORE was used as described in the user's manual (Kim et al., 2007).
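The idea behind the maximization (M) strategy can be illustrated with a deliberately simplified greedy selection: at each step, add the accession that contributes the most marker alleles or trait classes not yet present in the core. This is only a sketch of the principle, not the algorithm implemented in MSTRAT or POWERCORE (which use iterative and heuristic searches, respectively); the function name `greedy_m_strategy` and the accession profiles are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Each accession is described by a set of (variable, class) pairs,
# e.g. ("m1", "1") for a marker band or ("FC", "yellow") for flower color.
Profile = Set[Tuple[str, str]]


def greedy_m_strategy(profiles: Dict[str, Profile], core_size: int) -> List[str]:
    """Greedy illustration of the M strategy (not the MSTRAT/POWERCORE code).

    Repeatedly adds the accession contributing the largest number of
    (variable, class) pairs that are not yet represented in the core.
    """
    remaining = dict(profiles)
    captured: Profile = set()
    core: List[str] = []
    while remaining and len(core) < core_size:
        # Iterate over sorted names so that ties are broken deterministically.
        best = max(sorted(remaining), key=lambda a: len(remaining[a] - captured))
        if not remaining[best] - captured:
            break  # every class is already captured
        core.append(best)
        captured |= remaining.pop(best)
    return core


# Hypothetical accessions scored for two markers and flower color.
profiles = {
    "acc1": {("m1", "1"), ("m2", "0"), ("FC", "yellow")},
    "acc2": {("m1", "1"), ("m2", "1"), ("FC", "yellow")},
    "acc3": {("m1", "0"), ("m2", "0"), ("FC", "orange")},
    "acc4": {("m1", "0"), ("m2", "1"), ("FC", "red")},
}
core = greedy_m_strategy(profiles, core_size=3)
print(core)
```

In this toy data set, three of the four accessions are enough to capture all seven (variable, class) pairs, which is the kind of redundancy reduction a core collection aims for.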

#### Evaluation of Core Collections

Core collections were evaluated by estimating Shannon's diversity index (I) and Nei's gene diversity (H) using POPGENE version 1.32 (Yeh et al., 1999). Additionally, mean difference percentage (MD%), variance difference percentage (VD%), variable rate of coefficient of variation (VR%), and coincidence rate of range (CR%) were calculated to assess the level of diversity captured in each core collection with respect to the entire collection (Hu et al., 2000). A t-test and an F-test were performed to assess differences in the means and variances of traits between the entire collection and the composite core collections. The "coverage" criterion described by Kim et al. (2007) was used to evaluate the percentage of diversity captured for each variable in the composite core collections.
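The four percentage indices can be sketched as follows. The formulas below follow a formulation that is widely used in the core-collection literature (averaging, over traits, the relative difference in means and variances and the ratios of coefficients of variation and ranges); the text does not restate the exact definitions applied, so treat them as an assumption. The function name and the plant height values are hypothetical.

```python
from statistics import mean, variance
from typing import Dict, List


def evaluation_indices(entire: Dict[str, List[float]],
                       core: Dict[str, List[float]]) -> Dict[str, float]:
    """Return MD%, VD%, VR% and CR% averaged over the traits in `entire`.

    Assumed definitions (per trait, then averaged and multiplied by 100):
      MD = |mean_e - mean_c| / mean_c
      VD = |var_e - var_c| / var_c
      VR = CV_c / CV_e
      CR = range_c / range_e
    """
    md = vd = vr = cr = 0.0
    for trait, e_vals in entire.items():
        c_vals = core[trait]
        m_e, m_c = mean(e_vals), mean(c_vals)
        v_e, v_c = variance(e_vals), variance(c_vals)  # sample variances
        cv_e = v_e ** 0.5 / m_e
        cv_c = v_c ** 0.5 / m_c
        md += abs(m_e - m_c) / m_c
        vd += abs(v_e - v_c) / v_c
        vr += cv_c / cv_e
        cr += (max(c_vals) - min(c_vals)) / (max(e_vals) - min(e_vals))
    n_traits = len(entire)
    totals = {"MD%": md, "VD%": vd, "VR%": vr, "CR%": cr}
    return {name: 100 * total / n_traits for name, total in totals.items()}


# Hypothetical plant heights (cm) for an entire collection and a core subset.
entire = {"PH": [100, 120, 140, 160, 180]}
core = {"PH": [100, 140, 180]}
indices = evaluation_indices(entire, core)
print(indices)
```

Here the core keeps both extremes and the same mean, so MD% is 0 and CR% is 100, while the variance and coefficient of variation are inflated because intermediate values were dropped.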

## RESULTS

## Analysis of Pre-harvest Traits

Analysis of pre-harvest traits revealed significant phenotypic variability among the safflower accessions used in the current study. Erect growth was observed in 529 accessions while two accessions (PI-305204 and PI-306912) showed sprawling growth in both seasons (2011–12 and 2012–13). Plant height of the studied accessions ranged from 94 to 226 cm in the 2011–12 and from 73 to 211 cm in the 2012–13 growing season (Supplementary Figures 1A,B). Although these values suggest a minor shift in the overall range between the two seasons, plant height of individual accessions did not show a marked difference. In our study, around 21% of the accessions (111) were non-spiny while 79% of the accessions (420) were spiny. The number of primary branches in the studied accessions ranged from 4 to 34 in the 2011–12 season and from 5 to 33 in the 2012–13 season. The position of branch emergence is associated with the bushy nature of safflower. A large number of accessions (38%) had branches located in the upper one-third of the plant, followed by 31% of accessions with branches in the upper two-thirds. The remaining 31% of accessions had branches originating from the base to the apex, giving the plant a bushier appearance.

The number of heads per plant varied from 11 to 203 and from 9 to 189 for the 2011–12 and 2012–13 growing seasons, respectively. Safflower shows different shades of corolla color, varying from yellow, orange, and red to off-white. In our study, yellow was the most common color (76% of accessions) followed by orange (11% of accessions). Days to 50% flowering was recorded for each accession as described above. The trait distribution was observed to be approximately normal in both seasons (Supplementary Figures 1C,D). Based on these observations, accessions were categorized as early flowering (tail of the distribution curve; 119–128 days and 137–146 days for the 2011–12 and 2012–13 seasons, respectively), mid flowering (129–151 days for 2011–12 and 147–174 days for 2012–13) and late flowering (tail of the distribution curve; 152–160 days and 175–182 days for the 2011–12 and 2012–13 seasons, respectively; Supplementary Figures 1C,D). Although days to 50% flowering shifted between the two seasons, no change was observed in the associated categories of accessions between the seasons. Based on the above analysis, we identified 14 early-flowering, 490 mid-flowering, and 27 late-flowering accessions.

## Analysis of Post-harvest Traits

The 100-seed weight ranged from 1 to 8 g in the 2011–12 season and from 2 to 8 g in the 2012–13 season. No significant difference was observed in the phenotypic range between the two seasons. Estimation of oil content was performed using NIRS. The oil content among the analyzed accessions ranged from 16 to 50% in 2011–12 while for the 2012–13 season it ranged from 15 to 47% (Supplementary Figures 1E,F). Accessions with oil content <22% were considered as "low oil content" while those with >40% oil content were categorized as "high oil content" (Supplementary Figures 1E,F). Accessions with low and high oil content remained consistent in both seasons. The oleic acid content ranged from 9 to 82% with most accessions (93%) falling in the lower range of oleic acid content (below 25%) and a few (7%) having medium or high (>75%) oleic acid content. Linoleic acid content varied from 13 to 87% with most accessions (90%) showing high linoleic acid content (65–80%) and a few accessions having very high (3%) or medium to low (6.6%) linoleic acid content. **Table 1** lists the accessions with high oil content (>40%), high oleic acid (>75%), and very high linoleic acid (≥80%) observed in the current study.
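The oil content classes and the check for consistency across seasons can be written down in a few lines. The thresholds (<22% and >40%) are taken from the text; the label "intermediate" for the remaining accessions, the function names, and the accession values are assumptions made for illustration.

```python
from typing import Dict

LOW_MAX = 22.0   # oil content below this value is "low" (threshold from the text)
HIGH_MIN = 40.0  # oil content above this value is "high" (threshold from the text)


def oil_class(oil_pct: float) -> str:
    """Assign an oil content class; 'intermediate' is a label assumed here."""
    if oil_pct < LOW_MAX:
        return "low"
    if oil_pct > HIGH_MIN:
        return "high"
    return "intermediate"


def stable_classes(season_1: Dict[str, float],
                   season_2: Dict[str, float]) -> Dict[str, str]:
    """Return accessions whose class is identical in both seasons."""
    return {
        acc: oil_class(season_1[acc])
        for acc in season_1
        if acc in season_2 and oil_class(season_1[acc]) == oil_class(season_2[acc])
    }


# Hypothetical oil content values (%) for four accessions in two seasons.
season_1 = {"acc_A": 45.2, "acc_B": 18.9, "acc_C": 30.1, "acc_D": 41.0}
season_2 = {"acc_A": 42.7, "acc_B": 20.3, "acc_C": 28.4, "acc_D": 38.5}
stable = stable_classes(season_1, season_2)
print(stable)
```

In this made-up example, `acc_D` crosses the 40% threshold between seasons and is therefore not reported as stable.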

#### Correlation Analysis between Traits

Correlation analysis indicated a significant negative correlation (r = −0.99) between oleic and linoleic acid content of safflower. The correlation coefficients for all other trait pairs were below the cutoff of 0.50 used to denote a significant correlation. The highest positive correlation was observed between number of heads per plant and number of primary branches (r = 0.45). The correlation coefficients for the analyzed traits are listed in **Table 2**.
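For reference, the Pearson correlation coefficient reported above can be computed as shown in this minimal sketch. The oleic and linoleic acid values are hypothetical and were chosen only to mimic the strongly inverse relationship described in the text; the analysis itself was run in PAST.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / sqrt(s_xx * s_yy)


# Hypothetical fatty acid contents (%) for five accessions.
oleic = [10, 15, 20, 78, 80]
linoleic = [80, 76, 70, 14, 12]
r = pearson_r(oleic, linoleic)
print(round(r, 3))
```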

## Distribution of Traits within and between Regional Gene Pools

The 531 accessions used in this study represented all the 10 regional gene pools defined by Ashri (1975) based on morphological parameters. The distribution of different phenotypic classes among the safflower regional gene pools is given in Supplementary File 1. Although morphological delineation was not prominently observed between different regional gene pools for most traits, a few character states were more pronounced in some gene pools. Accessions with increased plant height (>155 cm) were limited to Iran-Afghanistan, Turkey, Far East, and Europe. The majority of accessions with low head count per plant were from the Far East. A higher number of primary branches (25–33) was found only among accessions from the Indian subcontinent, Far East, America, and Iran-Afghanistan. Early flowering accessions were found only among genotypes from Far East, Indian subcontinent, Egypt, and America. On the other hand, all other pre-harvest traits namely growth habit, spines, location of branches on the main axis of plant and flower color did not show any preferential distribution to any regional gene pool.

Among post-harvest traits, high oil content was observed only in accessions from the American region while some accessions from the Indian subcontinent had up to 40% oil content (Supplementary File 1). High oleic acid content (>75%) was found only in accessions from America and the Indian subcontinent. All accessions from the Near East, Turkey, Egypt, Sudan, Europe, and Iran-Afghanistan had low oleic acid content. Higher ranges of 100-seed weight (6–8 g) were found predominantly among Indian accessions and to a limited extent among accessions from the American region.

## Cluster Analysis and Principal Coordinate Analysis (PCoA)

The inter-relationships and genetic distance between safflower accessions based on phenotypic data were assessed through unweighted pair group method with arithmetic mean (UPGMA) clustering using a Euclidean distance matrix (**Figure 1**). Safflower accessions were grouped in five major clusters designated as CL I–CL V. Information on distribution of accessions in different clusters is given in **Table 3**. CL V is the largest cluster with 215 accessions. All clusters, except Cluster III, were dominated by accessions from the Indian subcontinent and America. Cluster III had significant representation of accessions from Iran-Afghanistan, Far East, and Europe.

TABLE 1 | List of safflower accessions with high oil content (>40%), linoleic acid content (≥80%), and oleic acid content (>75%).

TABLE 2 | Correlation coefficient between eight quantitative traits studied in the entire safflower collection.

*OC, oil content; OA, oleic acid content; LA, linoleic acid content; SW, 100-seed weight; PH, plant height; NH, number of heads per plant; NB, number of primary branches; DTF, days to 50% flowering.*

\**Denotes correlation between same trait.*
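UPGMA clustering on a Euclidean distance matrix can be sketched in plain Python as below. This is an illustrative implementation of average-linkage agglomeration, not the routine used in PAST; the function name, the five accessions, and their two trait values are hypothetical, and the traits are assumed to be on comparable scales (the text does not state whether traits were standardized before computing distances).

```python
from itertools import combinations
from math import dist
from typing import Dict, List, Sequence


def upgma_clusters(points: Dict[str, Sequence[float]],
                   n_clusters: int) -> List[List[str]]:
    """Agglomerate accessions by average linkage (UPGMA) until
    `n_clusters` groups remain, using Euclidean distances."""
    clusters: List[List[str]] = [[name] for name in sorted(points)]

    def average_distance(a: List[str], b: List[str]) -> float:
        # UPGMA distance: arithmetic mean of all between-cluster pair distances.
        total = sum(dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest average distance.
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_distance(clusters[pair[0]], clusters[pair[1]]),
        )
        merged = clusters[i] + clusters[j]
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters


# Hypothetical accessions described by two scaled quantitative traits.
points = {"a": (0, 0), "b": (0, 1), "c": (5, 5), "d": (5, 6), "e": (10, 0)}
groups = upgma_clusters(points, n_clusters=3)
print(sorted(sorted(g) for g in groups))
```

Recomputing the average distances at every step keeps the sketch short; for several hundred accessions a library routine with an incremental distance update would be preferable.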

In principal coordinate analysis (PCoA), coordinate axes 1 and 2 captured 42.5 and 22.4%, respectively, of the total existing variation among the accessions (**Figure 2**). Accessions from the Indian subcontinent were mainly present in quadrants III and IV with minor representation in quadrants I and II. Accessions from the American region were homogeneously distributed among all the quadrants of the PCoA obtained using phenotypic data. Accessions from the Iran-Afghanistan region were mainly found in quadrants I and II with a few accessions in quadrants III and IV. Far East accessions were restricted to quadrants I and IV while accessions from the European region were limited to quadrants I and II. Accessions from the Near East region were found in quadrants I and II while Sudanese accessions were distributed in all four quadrants. Accessions from Turkey and Egypt were predominantly found in quadrants I and II.

### Analysis of Seasonal Variations and Development of Core Collections Using POWERCORE and MSTRAT

MANOVA indicated significant seasonal effects as well as a significant interaction between season and accession when all quantitative traits were considered together (**Table 4**). Therefore, phenotypic data for the two seasons (2011–12 and 2012–13) and molecular marker data were treated independently for development of core collections. Use of the two maximization (M) strategy-based programs resulted in six core collections (CC1–CC6).

In our earlier work, molecular profiling of the 531 accessions identified 157 polymorphic AFLP markers (Kumar et al., 2015). Core collections were developed with these AFLP markers using POWERCORE and MSTRAT and designated as CC1 and CC4, respectively. CC1 included 14 accessions (2.6% of the entire collection) belonging to six out of 10 regional gene pools while CC4 comprised 26 accessions (4.9% of the entire collection) belonging to seven regional gene pools (**Table 5**). Phenotypic data of seasons 2011–12 and 2012–13 were used to develop core collections CC2 and CC3, respectively, using POWERCORE. CC2 consisted of 26 accessions (4.9% of the entire collection) from six regional gene pools (**Table 5**) and regions of secondary introduction (Australia and America). CC3 consisted of 27 accessions (5.1% of the entire collection) from six regional gene pools of safflower. Core collections CC5 and CC6 were developed using phenotypic data of seasons 2011–12 and 2012–13, respectively, using MSTRAT. CC5 consisted of 47 accessions (8.8% of the entire collection) from seven regional gene pools and regions of secondary introduction (America and Australia). CC6, comprising 54 accessions (10% of the entire collection), had representation from eight regional gene pools along with regions of secondary introduction.

The ranges, means, and variances for all the quantitative traits were calculated for core collections developed using phenotypic data (CC2, CC3, CC5, and CC6) and compared with corresponding values for the entire collection (Supplementary Tables 1, 2). MD% displays the difference in averages between the core and the entire collection and should be <20% for a representative core collection. MD% ranged from 6.36 to 15.45% for the four core collections (**Table 6**). VD% indicates the variance captured by a core collection and ranged from 36.4 to 59% in the current analysis. The variable rate of coefficient of variation (VR%) captured in the core collection should have a value higher than 100%. CC5 and CC6 had high VR% above 105% while CC2 and CC3 showed a value of ∼96.1% (**Table 6**). The range distribution of traits in a core collection in comparison to the entire collection is measured by CR%, whose value should be greater than 80%. All the analyzed core collections displayed high CR% values ranging from 94.25 to 143.52%. Shannon-Weaver diversity index (I) was calculated for all the core collections and ranged from 0.44 to 0.53. The core collections CC1 and CC4, derived using molecular marker data, showed the highest Shannon-Weaver diversity index with values of 0.53 and 0.49, respectively, which were higher than the corresponding values obtained for core collections derived using phenotypic data (**Table 6**). Nei's genetic diversity (H) for the six core collections ranged from 0.273 to 0.346. Similar to Shannon's diversity index, the highest values of H were recorded for CC1 (0.346) and CC4 (0.318) (**Table 6**).

TABLE 3 | Distribution of accessions from different regional gene pools of safflower in clusters of UPGMA dendrogram constructed using phenotypic data.
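As a reminder of what the two diversity measures quantify, the sketch below computes Shannon's index, I = −Σ pᵢ ln pᵢ, and Nei's gene diversity, H = 1 − Σ pᵢ², per locus and averages them over loci. The allele frequencies are hypothetical and are taken as given; POPGENE estimates allele frequencies from dominant AFLP band data under its own assumptions, which this sketch does not reproduce.

```python
from math import log
from typing import List, Sequence


def nei_gene_diversity(freqs: Sequence[float]) -> float:
    """Nei's gene diversity at one locus: H = 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs)


def shannon_index(freqs: Sequence[float]) -> float:
    """Shannon's information index at one locus: I = -sum(p_i * ln p_i)."""
    return -sum(p * log(p) for p in freqs if p > 0)


def mean_over_loci(loci: List[Sequence[float]], index) -> float:
    """Average a per-locus diversity index over all loci."""
    return sum(index(freqs) for freqs in loci) / len(loci)


# Hypothetical allele frequencies at three biallelic loci.
loci = [(0.5, 0.5), (0.9, 0.1), (0.7, 0.3)]
mean_h = mean_over_loci(loci, nei_gene_diversity)
mean_i = mean_over_loci(loci, shannon_index)
print(round(mean_h, 3), round(mean_i, 3))
```

For a biallelic locus, H peaks at 0.5 and I at ln 2 ≈ 0.693 when both alleles are equally frequent, which gives a scale against which averaged values can be read.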

### Development and Evaluation of Composite Core Collections

Based on the various indices described above, all the core collections developed in our study appeared to represent the prevalent diversity of the entire collection. However, none of the core collections contained representation from all the 19 clusters derived by Bayesian Analysis of Population Structure (BAPS) (**Table 7**), which captured diverse combinations of alleles and resulted in meaningful genetic stratification of the collection (Kumar et al., 2015). In order to capture the maximum range of allelic diversity/trait state in a core collection and prevent tradeoff between two data types when used together, we attempted to combine phenotypic and molecular variability by merging core collections derived from each strategy separately (**Figure 3**). The core collections developed by POWERCORE, i.e., CC1 (14 accessions), CC2 (26 accessions), and CC3 (27 accessions) were combined to form a non-redundant composite core collection referred to as CartC1 (Supplementary File 2). CartC1 comprised 57 accessions (10.7% of initial collection) representing 19 BAPS clusters, eight regional gene pools and two regions of secondary introduction for safflower (**Tables 5**, **7**). Similarly, the core collections derived through MSTRAT, i.e., CC4 (26 accessions), CC5 (47 accessions), and CC6 (54 accessions) were merged, resulting in a non-redundant composite core collection referred to as CartC2 (Supplementary File 2). CartC2 consisted of 106 accessions (∼20% of initial collection) including representation from all 19 BAPS clusters, ten regional gene pools and two regions of secondary introduction for safflower (**Tables 5**, **7**). Forty-four accessions were common among the two composite core collections (**Figure 3**).

*Based on Wilks' Lambda.*
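Merging independent core collections into a non-redundant composite amounts to a set union, after which coverage of the genetic clusters can be checked. The sketch below uses hypothetical accession identifiers and cluster assignments; the function names are likewise illustrative.

```python
from typing import Dict, Iterable, Set


def composite_core(*cores: Iterable[str]) -> Set[str]:
    """Union of several core collections; duplicate accessions collapse."""
    return set().union(*cores)


def missing_clusters(accessions: Iterable[str],
                     cluster_of: Dict[str, str],
                     all_clusters: Set[str]) -> Set[str]:
    """Clusters that have no representative among `accessions`."""
    represented = {cluster_of[acc] for acc in accessions}
    return all_clusters - represented


# Hypothetical marker-based (cc1) and phenotype-based (cc2, cc3) cores.
cc1 = {"A01", "A02", "A03"}
cc2 = {"A02", "A04", "A05"}
cc3 = {"A04", "A06"}
cluster_of = {"A01": "K1", "A02": "K2", "A03": "K3",
              "A04": "K2", "A05": "K4", "A06": "K1"}
all_clusters = {"K1", "K2", "K3", "K4"}

composite = composite_core(cc1, cc2, cc3)
print(len(composite), missing_clusters(composite, cluster_of, all_clusters))
print(missing_clusters(cc2, cluster_of, all_clusters))
```

In this toy case no single core covers all four clusters, but the six-accession composite does, mirroring the rationale for CartC1 and CartC2.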

The ranges, means and variances for all the quantitative traits for CartC1 and CartC2 are provided in **Table 8**. Homogeneity tests were performed to evaluate the difference in means (t-test) and variances (F-test) of traits between the entire collection and composite core collections (α = 0.05; **Table 8**). For a core collection to be representative of the entire collection, it is expected that the difference in mean should not deviate by more than 20% for the traits (Hu et al., 2000). The difference between the mean of the entire collection and CartC1 was non-significant for oil content, 100-seed weight, plant height, number of heads per plant, number of primary branches per plant, and days to 50% flowering. We observed non-significant differences in variance for three traits (100-seed weight, plant height, number of primary branches per plant) between CartC1 and the entire collection (**Table 8**). Between CartC2 and the entire collection, the t-test showed non-significant differences for oil content, 100-seed weight, plant height, number of heads per plant, and days to 50% flowering, while the F-test revealed non-significant differences in variance for only two traits (100-seed weight and plant height). In the "coverage" analysis (Kim et al., 2007), CartC1 and CartC2 showed 100% coverage for the different phenotypic and genetic variables under consideration.
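A coverage value of 100% means that every class of every variable found in the entire collection is also present in the core. A minimal sketch is given below, assuming coverage is the mean, over variables, of the ratio of classes present in the core to classes present in the entire collection, and assuming continuous traits have already been binned into classes. Both are assumptions about the Kim et al. (2007) criterion rather than a restatement of it, and the data are hypothetical.

```python
from typing import Dict, List


def coverage_pct(entire: Dict[str, List[str]], core: Dict[str, List[str]]) -> float:
    """Mean percentage of classes per variable that the core retains."""
    ratios = []
    for variable, values in entire.items():
        classes_entire = set(values)
        # Only count core classes that also occur in the entire collection.
        classes_core = set(core[variable]) & classes_entire
        ratios.append(len(classes_core) / len(classes_entire))
    return 100.0 * sum(ratios) / len(ratios)


# Hypothetical class data: flower color (FC) and spininess (SP).
entire = {
    "FC": ["yellow", "orange", "red", "off-white", "yellow"],
    "SP": ["present", "absent", "present", "present", "absent"],
}
core = {
    "FC": ["yellow", "orange", "red"],
    "SP": ["present", "absent", "present"],
}
print(coverage_pct(entire, core))
```

The example core misses the off-white flower class, so it covers three of four color classes and both spininess classes.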

The composite core collections were validated for their representativeness of the entire collection through evaluation indices, which are given in **Table 6**. The Shannon's diversity index (I) and Nei's genetic diversity (H) were 0.47 and 0.306, respectively, for CartC1 and 0.46 and 0.301, respectively, for CartC2. We assessed the distribution of accessions of CartC1 and CartC2 in the dendrograms obtained through phenotypic and genetic analysis of the entire collection. CartC1 and CartC2 showed balanced distribution in all the clusters of Neighbor Joining (genetic analysis; **Figure 4**) and UPGMA (phenotypic analysis; **Figure 5**). Thus, CartC1 and CartC2 provided a more rational and exhaustive representation of all the phenotypic and genetic variability than the independent core collections (CC1–CC6) developed in the present study.

TABLE 6 | Evaluation indices for developed core collections.

*MD%, mean difference; VD%, variance difference; VR%, variable rate; CR%, coincidence rate of range; I, Shannon's diversity index, and H, Nei's genetic diversity.*

#### DISCUSSION

A vast collection consisting of 25,179 accessions of safflower is available in 22 gene banks of 15 countries around the world (Zhang and Johnson, 1999). Phenotypic characterization of safflower germplasm in earlier studies demonstrated significant variability for several agronomic traits (Knowles, 1969; Ashri, 1975; Johnson et al., 2001; Amini et al., 2008; Khan et al., 2009). In spite of substantial diversity in its germplasm, yield enhancement in the crop has achieved limited success. Breeding strategies often focus on a limited set of agronomic traits resulting in cultivars with a narrow genetic base. For example, Kumar et al. (2015) showed that the cultivars and breeding lines of safflower from the Indian subcontinent have a narrow genetic base although extensive genetic diversity was present in the regional germplasm. This makes the cultivars highly susceptible to environmental changes and vulnerable to yield penalties. One of the main limitations of earlier approaches has been the overdependence on morphological and geographical parameters due to lack of information on the genetic structure of safflower germplasm based on molecular markers. The present study attempted to address the above issue by generating two composite core collections in safflower that include data on molecular variability of the crop in addition to phenotypic and geographical parameters.

#### Phenotypic Diversity of the Crop and Identification of Accessions with Desirable Agronomic Traits

Significant variation was observed among the 531 accessions for 12 agronomic traits. More than 85% of accessions had plant height <155 cm, which is desirable due to ease of mechanical harvesting from shorter plants (Weiss, 1983). Most safflower varieties and genotypes grown around the world have spines on the leaves and bracts of the plant (Dajue and Mündel, 1996). The spiny nature of the crop is one of the factors responsible for the reluctance of farmers to grow safflower, especially in countries like India where harvesting is done manually. Spiny types were widely represented in our collection of 531 accessions. It was hypothesized that non-spiny varieties are generally low in yield and oil content (Dajue and Mündel, 1996). However, we did not observe a significant association between presence of spines and seed oil content. We identified 15 spiny accessions with high seed oil content (**Table 1**) and several spiny accessions with low seed oil content in the representative collection. The genetics of oil content and spines needs to be investigated further in order to design effective breeding strategies involving these traits.

TABLE 7 | Distribution of accessions of developed core collections in different BAPS clusters derived based on AFLP markers by Kumar et al. (2015).

Traits such as number of primary branches and heads per plant influence seed yield (Ashri et al., 1974; Patil et al., 1994; Dajue and Mündel, 1996). We found significant variation in these traits and identified accessions with a high number of primary branches and an increased number of heads. Analysis of seed yield for these accessions is required to identify promising genotypes. Days to 50% flowering varied between the two growing seasons and ranged from 119 to 160 days (in 2011–12) and from 137 to 182 days (in 2012–13). Delayed flowering in the second year was attributable to cooler temperatures in February and March than in the previous year. The average maximum temperature recorded for the months of February and March 2012 was ∼29 °C while the corresponding value was ∼23 °C in 2013 (http://www.weatherspark.com). Though temperature fluctuations did affect developmental stages as well as flowering-related events, the early flowering accessions were consistent between the two seasons.

FIGURE 3 | Flowchart describing the strategy and results of development of core collection for safflower. Numerical values in parenthesis indicate the number of accessions in respective cores. Values indicated above the double-headed arrows depict the number of accessions common between different core collections.


TABLE 8 | Ranges, means, and variances for the entire collection and composite core collections, CartC1 and CartC2.

\**Abbreviations for phenotypic traits provided in footnote of* Table 2*;* #*Data for season 2011–2012 presented.*

Identification and use of high oil yielding genotypes is important for increasing oil content in safflower cultivars. Breeding efforts in America led to the development of cultivars with increased seed oil content ranging from 45 to 55% (Bergman et al., 1985; Rubis et al., 2001). However, such improvements are lacking among Indian cultivars, which have oil content ranging from 27 to 35%. Evaluation of oil quantity in the 531 accessions by NIRS identified 15 accessions with high oil content (>40%). These would serve as important breeding material in safflower. All the high oil yielding accessions (>40%) had low 100-seed weight (3–4 g) in our study. This observation is in consonance with earlier reports, which suggest that increased hull thickness enhances seed weight but reduces oil content (Ranga Rao et al., 1977; Dajue and Mündel, 1996). Safflower oil has a desirable fatty acid composition. High linoleic lines of safflower are favored for animal feed and in the paint and varnish industry (Knowles, 1989; Bergman et al., 2001) while high oleic lines are nutritionally desirable because of their hypo-cholesterolemic effect and greater oxidative stability (Fuller et al., 1967). A high oleic line of Indian origin (Knowles and Bill, 1964) was effectively utilized in various safflower breeding programs in the USA (Mündel and Bergman, 2009). The safflower collection used in this study contained 17 accessions with high oleic acid content and 15 accessions with high linoleic acid content (**Table 1**). Accessions with desirable traits identified in the present study could be incorporated in breeding programs for crop improvement.

### Assessment of Regional Gene Pools Based on UPGMA Analysis of Phenotypic Data

Accessions from the Indian subcontinent and American region were distributed in all the five clusters (**Figure 1**) suggesting that they harbor maximum phenotypic diversity for the studied traits. Knowles (1969), based on morphological analysis of accessions from the Indian subcontinent, reported them as a uniform assemblage resulting from a single introduction. In contrast, our assessment indicates that accessions from the Indian subcontinent are phenotypically diverse. Morphological diversity among accessions from the Indian subcontinent was reported in earlier studies (Kupsow, 1932; Chavan, 1961; Hanelt, 1961). Indian accessions have also been shown to harbor significant genetic diversity (Kumar et al., 2015). The American germplasm was found to be phenotypically diverse in the current study but was genetically conserved (Kumar et al., 2015). Near East and Iran-Afghanistan accessions clustered together based on phenotypic data, supporting our earlier proposal of considering them as a single gene pool (Kumar et al., 2015). Accessions from the European region were distributed in several clusters based on phenotypic data, similar to the observation obtained through molecular data analysis. Interestingly, accessions from the Far East, Turkey and Egyptian region were present in all the clusters although they exhibited low genetic diversity (Kumar et al., 2015). These results indicate that UPGMA analysis based on phenotypic data alone is unable to accurately define the genetic relationships among safflower accessions.

#### Composite Core Collections Effectively Capture the Global Molecular, Phenotypic, and Geographical Variability of the Crop

In recent years, increased availability of molecular resources has enabled their utilization in development of core collections in crop species (Belaj et al., 2012; El Bakkali et al., 2013) but until now, no such attempts have been made in safflower. Use of molecular markers for development of core collections is advantageous as they reflect diversity at the DNA level, as opposed to morphological markers, wherein different genotypes might show similar phenotypic traits due to environmental effects. Additionally, molecular markers are more effective in identifying and minimizing redundancy. Several studies have emphasized the use of the maximization (M) strategy for development of highly robust core collections (Bataillon et al., 1996; McKhann et al., 2004). The M strategy retains the maximum number of alleles at each locus and is considered the most powerful approach for maintaining diverse alleles (Schoen and Brown, 1993). The MSTRAT and POWERCORE programs have been successfully used for construction of core collections in various plant species such as grapes, olive and sesame (Le Cunff et al., 2008; Belaj et al., 2012; Zhang et al., 2012). A combination of molecular markers and the maximization (M) strategy has been utilized for the first time in our study for construction of a core collection in safflower.

Earlier studies reported significant genotype × environment (GE) interactions in safflower and emphasized multi-location and multi-seasonal trials to evaluate heritability of characters for their effective utilization in breeding programs (Singh et al., 2004; Mahasi et al., 2006). In our study, MANOVA indicated prominent GE interactions (**Table 4**). Therefore, seasonal datasets were treated independently for developing core collections. The six core collections thus generated efficiently captured the entire range of trait variability but failed to include the complete genetic diversity represented in the 19 clusters derived earlier (Kumar et al., 2015) through Bayesian analysis. Additionally, many accessions were common between different core collections. For example, in core collections developed using POWERCORE, 10 accessions were common between CC1 (marker-based) and CC2/CC3 (phenotype-based). Only 4 accessions were unique to CC1 while 16 and 17 accessions were unique to CC2 and CC3, respectively. In MSTRAT-derived core collections, 19 accessions were common between CC4 (marker-based) and CC5/CC6 (phenotype-based). The numbers of accessions unique to CC4, CC5, and CC6 were 7, 28, and 35, respectively. The presence of common accessions between core collections derived using different types of data indicates an overlap in genetic and phenotypic components of the studied accessions. These accessions represent a subset of genotypes that are highly diverse at both the molecular and phenotypic level.

The core collections developed using each program were merged to derive a more robust and non-redundant composite core collection (CartC1 by POWERCORE and CartC2 by MSTRAT). The vast phenotypic diversity of the initial collection was retained in both collections. Accessions with desirable agronomic traits and extreme phenotypes, which were present in very low numbers in the entire collection and scattered in the initial core collections, were captured in the composite core collections (**Table 8**). Both the composite core collections provided comprehensive coverage of allelic diversity and had representation from all the 19 BAPS clusters identified earlier for safflower (Kumar et al., 2015; **Table 7**). Evaluation indices (MD%, VD%, VR%, CR%, I, H) for CartC1 and CartC2 were comparable and reflect their effectiveness in capturing diversity of the crop (**Table 6**). Our approach of deriving independent core collections from molecular and phenotypic data and their subsequent merger to create composite core collections avoided trade-off between the diversity captured using the molecular and phenotypic data sets.

Geographical distribution influences the extent of genetic variability of a species. The effect is more prominently seen in case of in-breeding species (Rao and Hodgkin, 2002). Geographical patterning is evident in safflower which is highly self-pollinating in nature and is grown in different agro-climatic regions across the world (Knowles, 1969; Ashri, 1975; Chapman et al., 2010). The two composite core collections showed minor variations in representation of the 10 regional gene pools. CartC1 included 8 regional pools excluding Sudan and Kenya while CartC2 contained representation from all the 10 regional gene pools (**Table 5**). Similar to the entire collection, both CartC1 and CartC2 showed predominance of accessions from Indian subcontinent and America accounting for ∼50% of the total entries. In contrast, the earlier core collection developed by Johnson et al. (1993) had a major proportion of accessions (∼46%) from the Mediterranean region and South-West Asia while the core collection derived by Dwivedi et al. (2005) consisted of ∼78% accessions from South and South-East Asia.

The number of accessions in a core collection is an important factor determining its effective utilization (Brown and Spillane, 1999). The core collections developed earlier for safflower consisted of 210 accessions (Johnson et al., 1993) and 570 accessions (Dwivedi et al., 2005) while the composite core collections developed in the present study are comparatively smaller with 57 (CartC1) and 106 (CartC2) accessions. The larger size of the core collections developed in earlier studies could be due to the larger number of accessions in their initial germplasm collection. However, the advantage of the present study is that the initial collection used for development of composite core collections has been characterized extensively for both molecular and phenotypic diversity and the generated core collections have therefore effectively captured the global genetic and phenotypic diversity of the crop. Additionally, CartC1 has better utility value in comparison to CartC2 due to its smaller size and comparable diversity.

The present study is the first attempt where molecular diversity data has been used in conjunction with phenotypic data and geographical distribution to develop core collections in safflower. The small size of the composite core collections would be advantageous for field studies and association mapping. These collections will provide access to genetically diverse and agronomically important germplasm that would be useful in widening the genetic base of the crop and facilitate characterization of genetic determinants of trait variability. This information can be used to design more effective breeding programs to increase the global utility of safflower as an oilseed crop.

#### AUTHOR CONTRIBUTIONS

AJ, SG conceived and designed the experiments. SK, HA performed the experiments, analyzed the data. MV helped in collection of phenotypic data. AR provided necessary support in the statistical analysis. SK, HA, SG, AJ wrote the manuscript. MA, AK, AJ, SG provided facilities for completion of experiments and reviewed the manuscript.

#### ACKNOWLEDGMENTS

This work was supported by the DST-PURSE grant of Department of Science and Technology, Government of India (grant no. Dean(R)/2009/868) provided to the University of Delhi. HA was aided by a research fellowship from University Grants Commission, India. We sincerely thank the reviewers for their critical comments and suggestions to improve the manuscript.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.01554




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Kumar, Ambreen, Variath, Rao, Agarwal, Kumar, Goel and Jagannath. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Trichome-Related Mutants Provide a New Perspective on Multicellular Trichome Initiation and Development in Cucumber (*Cucumis sativus* L)

#### Xingwang Liu<sup>1,2</sup>, Ezra Bartholomew<sup>1,2</sup>, Yanling Cai<sup>2</sup> and Huazhong Ren<sup>1,2</sup>\*

<sup>1</sup> College of Horticulture, China Agricultural University, Beijing, China, <sup>2</sup> Beijing Key Laboratory of Growth and Developmental Regulation for Protected Vegetable Crops, China Agricultural University, Beijing, China

Trichomes are specialized epidermal cells located in aerial parts of plants that function in plant defense against biotic and abiotic stresses. The simple unicellular trichomes of Arabidopsis serve as an excellent model to study the molecular mechanism of cell differentiation and pattern formation in plants. Loss-of-function mutations in Arabidopsis thaliana have suggested that the core genes GL1 (which encodes a MYB transcription factor) and TTG1 (which encodes a WD40 repeat-containing protein) are important for the initiation and spacing of leaf trichomes, while the genes GL3 and EGL3 (which encode bHLH proteins) are needed for normal trichome initiation. However, the positive regulatory genes involved in multicellular trichome development in cucumber remain unclear. This review focuses on the phenotype of mutants (csgl3, tril, tbh, mict, and csgl1) with disturbed trichomes in cucumber and then infers which gene(s) play key roles in trichome initiation and development in those mutants. Evidence indicates that MICT, TBH, and CsGL1 are allelic with alternative splicing. CsGL3 and TRIL are allelic, override the effect of TBH, MICT, and CsGL1 on the regulation of multicellular trichome development, and affect trichome initiation. CsGL3, TRIL, MICT, TBH, and CsGL1 encode HD-Zip proteins of different subfamilies. Genetic and molecular analyses have revealed that CsGL3, TRIL, MICT, TBH, and CsGL1 are responsible for the differentiation of epidermal cells and the development of trichomes. Based on current knowledge, a positive regulator pathway model for trichome development in cucumber was proposed and compared to a model in Arabidopsis. These data suggest that trichome development in cucumber may differ from that in Arabidopsis.

Keywords: unicellular, *Arabidopsis*, multicellular, cucumber, mutants, trichome-related genes, regulator pathway

#### *Edited by:*

John Doonan, Aberystwyth University, UK

#### *Reviewed by:*

Henrik Buschmann, University of Osnabrück, Germany

Amy T. Hark, Muhlenberg College, USA

> *\*Correspondence:* Huazhong Ren renhuazhong@cau.edu.cn

#### *Specialty section:*

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

> *Received:* 20 May 2016 *Accepted:* 22 July 2016 *Published:* 10 August 2016

#### *Citation:*

Liu X, Bartholomew E, Cai Y and Ren H (2016) Trichome-Related Mutants Provide a New Perspective on Multicellular Trichome Initiation and Development in Cucumber (Cucumis sativus L). Front. Plant Sci. 7:1187. doi: 10.3389/fpls.2016.01187

#### INTRODUCTION

Plant trichomes are highly specialized epidermal protrusions that are located on the surfaces of leaves, stems, petioles, sepals, seed coats, and other aerial organs. Their diversity is almost as great as the number of species on which they are found. Morphologically, they can be unicellular or multicellular as well as secretory glandular or non-glandular (Hülskamp et al., 1998; Hülskamp, 2004; Tissier, 2012; Chen et al., 2014). In the model plant Arabidopsis thaliana, extensive studies have been performed on unicellular trichome development, especially on leaves (Hülskamp, 2004; Ishida et al., 2008; Pesch and Hülskamp, 2009). Classical molecular genetic approaches have identified several regulators that work in distinct developmental processes, such as trichome initiation/formation, endoreduplication, branch construction, and growth orientation (Schwab et al., 2000; Szymanski et al., 2000; Chen et al., 2014). The regulatory pathways of unicellular trichome development comprise both positive (mutants develop fewer trichomes) and negative (mutants develop more and/or clusters of trichomes) transcription factors (Ishida et al., 2008; Balkunde et al., 2010; Grebe, 2012). The crucial positive transcription factors belong to three protein classes: one WD40-repeat protein, TRANSPARENT TESTA GLABRA1 (TTG1) (Galway et al., 1994; Walker et al., 1999); four basic helix-loop-helix (bHLH) proteins: GLABRA3 (GL3), ENHANCER OF GLABRA3 (EGL3) (Payne et al., 2000; Zhang et al., 2003), TRANSPARENT TESTA8 (TT8) (Zhang et al., 2003), and MYC-1 (Zhao et al., 2012); and three R2R3-type MYB transcription factors: GLABRA1 (GL1), MYB23, and MYB5 (Oppenheimer et al., 1991; Li et al., 2009; Tominaga-Wada et al., 2012). They bind together to form an MYB-bHLH-WD40 (MBW) trimeric complex that activates the downstream gene GL2 (GLABRA2), which initiates trichome differentiation (Pesch and Hülskamp, 2009).

Moreover, several small, single-repeat MYB-negative regulatory proteins, such as TRIPTYCHON (TRY), CAPRICE (CPC), ENHANCER OF TRY AND CPC1 (ETC1), ETC2, ETC3, CAPRICE-LIKE MYB3, TRICHOMELESS1 (TCL1), and TRICHOMELESS2 (TCL2), have been shown to act in a non-cell-autonomous manner (Wada et al., 1997, 2002; Esch et al., 2004; Kirik et al., 2004; Pesch and Hülskamp, 2009; Wester et al., 2009; Edgar et al., 2014; Hauser, 2014; Wang and Chen, 2014). They compete with the R2R3 MYB protein GL1 and bind to bHLH proteins, including GL3/EGL3, to suppress trichome initiation in adjacent cells (Wang et al., 2014).

Cucumber (Cucumis sativus L.), one of the most important vegetable crops, is also covered with trichomes on its stems, leaves, flowers, branches, fruits, and tendrils. During early fruit development, deep ridges along the length of the fruit cover the fruit surface, and densely spaced fruit trichomes are randomly scattered relative to the ridges (Liu et al., 2011; Chen et al., 2014). Trichomes on cucumber fruit are called spines. In cucumber, the fruit spine combines with tubercles to form the warty fruit trait, which is a very important fruit quality trait (Chen et al., 2014; Li et al., 2015). Compared to warty fruit, smooth fruit, with no spines or tubercles, is very important for the breeding of freshly eaten cucumber types, as it is easier to clean, package, transport and store (Zhang et al., 2010; Yang et al., 2014). Moreover, smooth fruit is becoming increasingly popular due to its attractive and distinctly shiny appearance (Li et al., 2015; Pan et al., 2015; Cui et al., 2016). Despite the importance of fruit spines in cucumber breeding for external quality, there are limited reports of the regulation of fruit spine development and few detailed characterizations of cucumber mutants with disturbed trichome development. To understand trichome development in cucumber, we must determine the crucial regulators for its initiation and formation based on cucumber trichome-related mutants.

We begin this review by focusing on cucumber trichome-related mutants and the role that they play in trichome formation in plants with multicellular trichomes. Several mutants, such as trichome-less (tril) (Wang et al., 2016), glabrous 3 (csgl3) (Pan et al., 2015; Cui et al., 2016), tiny branched hair (tbh) (Chen et al., 2014), micro-trichome (mict) (Zhao et al., 2015a), and glabrous 1 (csgl1) (Li et al., 2015), inhibit trichome development via different mechanisms, but all of these mutants cause a reduction in one type of spine. The possible mechanisms involved in modulating trichome development in cucumber mutants are summarized herein.

## MUTANTS WITH DISTURBED TRICHOME IN CUCUMBER

Five trichome-related mutants have been reported in cucumber (Chen et al., 2014; Li et al., 2015; Pan et al., 2015; Zhao et al., 2015a; Cui et al., 2016; Wang et al., 2016). Wild-type cucumber fruits have two types of trichomes, both of which are multicellular (**Figures 1**, **2**) (Chen et al., 2014). Type I trichomes are small, glandular trichomes, with a 3- to 5-cell base topped with a 4- to 8-cell head. Type II trichomes, which predominate, are much larger, non-glandular trichomes that are composed of a base and stalk (**Figure 2**). Compared to wild-type cucumber plants, all mutants appeared to be glabrous with no noticeable trichomes on the leaves, stems, tendrils, floral organs, or fruits (**Table 1**, **Figures 1**, **2**). Each mutant has a trichome type except for the mutants tril and csgl3 (**Figure 2**). Trichomes have been reclassified into three morphologies according to their shape on leaves and fruit surfaces. Type I trichomes, which are found in the tbh, mict, and csgl1 mutants, have a small papillar-shaped head (**Figures 2A,D,F**). Type II trichomes, which exist in the tbh and mict mutants, consist of one to five rounded cells without the pyramid-shaped head or pie-shaped base (**Figures 2B,E**). However, type III trichomes occur only in tbh mutants; this type of trichome was not previously described. Here, we classified these trichomes into the new type III according to the branching at the top of the trichome (**Figure 2C**). Trichome phenotypes from tbh, mict and csgl1 indicated that these three mutants are all involved in trichome development and not in trichome initiation. Interestingly, the tril and csgl3 mutants showed a completely glabrous phenotype on the shoot and fruit epidermis (**Figures 2G,H**), suggesting that TRIL and CsGL3 may be upstream positive regulators of TBH, MICT, and CsGL1 for the regulation of multicellular trichome development and may affect epidermal cell initiation.

**Abbreviations:** TRIL, trichome-less; TBH, tiny branched hair; MICT, micro-trichome; TTG1, transparent testa glabra1; bHLH, basic helix-loop-helix; GL3, glabra3; EGL3, enhancer of glabra3; MBW, MYB-bHLH-WD40; TRY, triptychon; CPC, caprice; ETC1, enhancer of try and cpc1; HD-Zip, homeodomain-leucine zipper; PDF2, protodermal factor2; ATML1, Arabidopsis thaliana meristem layer1; MADS, MCM1-AGAMOUS-DEFICIENS-SRF4.

Compared to the multicellular trichomes found on aerial organs, such as leaves and fruits, the trichomes found on underground organs, such as root hairs, are characterized as single-celled, unbranched, elongated, and soft-structured with small tumors attached. Based on the published results, root hairs did not differ between the wild-type and the tbh, mict, and csgl1 mutants (Chen et al., 2014; Li et al., 2015; Zhao et al., 2015a), indicating that the TBH, MICT, and CsGL1 genes may not be involved in root hair formation. In contrast, the root length and number of branches of the tril mutant increased, suggesting that root hair formation in cucumber might be regulated by TRIL (Wang et al., 2016).

Apart from the common phenotype, other remarkable differences may exist among the mutants, such as dwarfism, branching, leaf curvature, and petal opening rates. In mict and tbh, rounded-head trichomes were found on the hypocotyls, but none were found in csgl1, suggesting that trichome distribution is specific (Chen et al., 2014; Li et al., 2015; Zhao et al., 2015a).

… (Li et al., 2015), and Picture (G) is from (Wang et al., 2016).

TABLE 1 | Previous morphological studies of trichome in *tbh*, *csgl1*, *mict*, *tril*, and *csgl3*.

#### *TBH*, *MICT*, AND *CSGL1* ARE ALLELIC WITH ALTERNATIVE SPLICING

To decipher the molecular defects in the mict, csgl1, and tbh mutants, a map-based cloning approach was undertaken by three independent groups to isolate these genes. The results showed that TBH, MICT, and CsGL1 are allelic and that the mict, csgl1, and tbh mutants carry a 2649 bp deletion spanning positions -189 to 2460 bp relative to the start codon of Csa3M748220 (Li et al., 2015; Zhao et al., 2015a) (**Figure 3A**). Using the gene ID provided in the references, we extracted the corresponding proteins from the cucumber genome database (http://www.icugi.org/cgi-bin/ICuGI/genome/search.cgi) and analyzed their protein domains online (https://blast.ncbi.nlm.nih.gov/Blast.cgi). These genes were predicted to encode class I homeodomain-leucine zipper (HD-Zip) proteins of different lengths. For example, CsGL1 and TBH contain 240 amino acids with the conserved HD (65AA-120AA) and Zip domains (121AA-164AA), but the MICT protein consists of 242 amino acid residues with a HALZ motif. Notably, there were two isoforms of this gene in the cucumber genome database (Csa3M748220.1 and Csa3M748220.2); we therefore infer that the different protein lengths encoded by the same gene arise from alternative splicing of Csa3M748220 in cucumber. Subcellular localization assays, in which the Csa3M748220 coding sequence was fused to GFP under the 35S promoter, showed that the protein localized to the nuclei of tobacco and onion epidermal cells (Zhao et al., 2015a,b). Moreover, a transcriptional activation activity assay in yeast found that Csa3M748220 had weak activity as a transcriptional activator (Zhao et al., 2015a). Therefore, based on the above results, Csa3M748220 has the typical features of a transcription factor. HD-Zip proteins are unique to the plant kingdom and can be classified into four groups, I-IV, based on their distinctive traits of DNA-binding specificities, gene structures, and common motifs (Abe et al., 2003; Hülskamp et al., 2005).
HD-Zip I genes have been demonstrated to be involved largely in biological processes, such as abiotic stress responses, meristem regulation, and trichome development (Hanson et al., 2001; Himmelbach et al., 2002; Hjellström et al., 2003; Saddic et al., 2006; Zhao et al., 2015a,b).
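The coordinates quoted in this section can be cross-checked with simple interval arithmetic. The sketch below is illustrative only: it assumes that the deletion endpoints are offsets relative to the start codon, so that the deletion length is the difference between them, and that the domain boundaries are inclusive residue positions. The variable names are hypothetical and not taken from the cited studies.

```python
# Illustrative arithmetic only; names are hypothetical, values are those quoted in the text.

# Deletion in Csa3M748220, given as offsets relative to the start codon.
# Assumption: the length is the plain difference between the two offsets.
deletion_start, deletion_end = -189, 2460
deletion_length = deletion_end - deletion_start

# Domain boundaries reported for CsGL1/TBH (240 residues).
# Assumption: both boundaries are inclusive residue positions.
protein_length = 240
domains = {"HD": (65, 120), "Zip": (121, 164)}
domain_lengths = {name: end - start + 1 for name, (start, end) in domains.items()}

print(deletion_length)  # 2649, matching the reported fragment size
print(domain_lengths)   # {'HD': 56, 'Zip': 44}
```

Under these assumptions the reported deletion size is consistent with its endpoints, and the HD and Zip domains are contiguous and fit within the 240-residue protein.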

#### *TRIL* AND *CSGL3* ARE ALLELIC AND OVERRIDE THE EFFECT OF *TBH/MICT/CSGL1* ON THE REGULATION OF MULTICELLULAR TRICHOME INITIATION AND DEVELOPMENT

There are few reports on the regulatory genes that control multicellular trichome development in cucumber. These mutants therefore offer a good opportunity to analyze the regulatory mechanisms of trichome development in cucumber. As mentioned above, tril and csgl3 show a completely different glabrous morphology from that of the other three trichome-development-related mutants, and only the epidermal cells, including stomata and encircling guard cells, were visible (**Figure 2G**). This indicates that TRIL and CsGL3 function in trichome cell fate determination. A transcriptome profiling analysis among wild-type, tril, tbh, mict, and csgl1 revealed that TBH, MICT, and CsGL1 were not expressed in the tril mutant (baseMean 0) but were highly expressed in the wild-type (baseMean 518.25) (Zhao et al., 2015a). TRIL was mapped to Csa6M514870, a member of the class IV HD-Zip family that shares 66.7% identity with PROTODERMAL FACTOR2 (PDF2, At4g04890.1), a shoot epidermal cell differentiation-related gene, and 35% identity with GL2 (At1g79840.2), a gene that initiates trichome differentiation in Arabidopsis. In the tril mutant, there is a 5008 bp insertion fragment after the first exon (**Figure 3B**). CsGL3 was also mapped to Csa6M514870 by Pan and Cui (Pan et al., 2015; Cui et al., 2016). In Pan's study, the loss of function of CsGL3 was due to the insertion of a 5-kb long terminal repeat (LTR) retrotransposon in the 4th exon of CsGL3 (**Figure 3B**). The insertion location is different from that of tril. Cui used three markers (InDel-19, dCAPs-2, and dCAPs-11) designed from the sequence of Csa6M514870, which co-segregated with the trait. In addition, Csa6M514870 was found to harbor 3 single-base substitutions, T→C (611 bp), G→A (820 bp), and G→A (865 bp), in the fourth exon, resulting in a change in the amino acid sequence. It therefore remains unclear exactly how CsGL3 is altered in its corresponding mutant csgl3.

In the tbh, mict, and csgl1 mutants, TRIL/CsGL3 was overexpressed at different levels due to the different stages at which the samples were collected (Chen et al., 2014; Li et al., 2015; Zhao et al., 2015a). Moreover, the trichome phenotype analyzed in the F2 population between the tril and mict mutants suggests that the TRIL gene has a significant influence on trichome initiation and in determining the fate of epidermal cells, whereas the MICT/TBH/CsGL1 gene only influences trichome development at the shoot. Genetically, based on the double-mutant phenotype in csgl3csgl1, the TRIL/CsGL3 gene is assumed to act upstream of the MICT/TBH/CsGL1 gene and control the expression of MICT/TBH/CsGL1 (Zhao et al., 2015a; Wang et al., 2016).
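The epistatic reasoning in this section can be made concrete with a toy model. The following sketch is a deliberate simplification and not the analysis used in the cited studies: it assumes a strictly linear hierarchy in which TRIL/CsGL3 is required both for trichome initiation and for expression of MICT/TBH/CsGL1. The function name and phenotype labels are hypothetical.

```python
def predicted_phenotype(tril_functional: bool, mict_functional: bool) -> str:
    """Toy linear-pathway model: TRIL/CsGL3 -> MICT/TBH/CsGL1 -> trichome development."""
    # Without TRIL/CsGL3 no trichome is initiated, so the state of the
    # downstream gene cannot change the outcome (TRIL/CsGL3 is epistatic).
    if not tril_functional:
        return "glabrous (no trichome initiation)"
    # Trichomes initiate, but development stalls without MICT/TBH/CsGL1.
    if not mict_functional:
        return "initiated but underdeveloped trichomes"
    return "wild-type trichomes"


# Enumerate the four genotype combinations covered by the toy model.
for tril_ok in (True, False):
    for mict_ok in (True, False):
        print(tril_ok, mict_ok, "->", predicted_phenotype(tril_ok, mict_ok))
```

In this model the double mutant is indistinguishable from the tril/csgl3 single mutant, which is the pattern that would place TRIL/CsGL3 upstream of MICT/TBH/CsGL1.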

#### MODELS FOR TRICHOME PATTERNING IN CUCUMBER MAY DIFFER FROM MODELS IN *ARABIDOPSIS*

In the model plant Arabidopsis, the activator-inhibitor model has guided intuitive modeling and experimental design for a long time because it offers a reasonable explanation for the apparently paradoxical situation in which trichome-promoting and trichome-inhibiting genes are both expressed strongly in trichomes (Pesch and Hülskamp, 2009). Here, we focus on comparing the positive regulatory model for trichome development in Arabidopsis with that in cucumber. The main reason is that all of the cucumber mutants visibly have fewer or no trichomes, suggesting that the affected genes encode positive regulators of trichome development. Solid evidence of the genetic basis of trichome initiation has identified genes that (a) control the entry into the trichome pathway and (b) control the spacing of initiation events. Loss-of-function mutations in Arabidopsis thaliana have suggested that the genes GL1 (which encodes an MYB transcription factor) and TTG1 (which encodes a WD40 repeat-containing protein) are important for the initiation and spacing of leaf trichomes (Galway et al., 1994; Walker et al., 1999), while the genes GL3 and EGL3 (Payne et al., 2000; Zhang et al., 2003), which encode basic helix-loop-helix (bHLH) proteins, are needed for normal trichome initiation (**Figure 4A**).

In cucumber, the TRIL/CsGL3 gene encodes a HD-Zip IV protein. The loss-of-function mutant tril/csgl3 has a trichome-less phenotype, indicating that this gene is a positive regulator of trichome development. Its homologous genes PDF2 and ARABIDOPSIS THALIANA MERISTEM LAYER1 (ATML1) in Arabidopsis also encode HD-Zip IV protein family members (Nakamura et al., 2006). The pdf2atml1 double mutant displays striking defects in shoot epidermal cells (Abe et al., 2003). In tomato, another typical multicellular trichome plant, Wo RNAi transgenic plants showed shoot and root epidermal cell formation similar to that of the tril mutant. Wo shares amino acid sequence identity with TRIL/CsGL3 and belongs to the same HD-Zip IV protein family. These results indicate a role for TRIL/CsGL3 in the initiation and control of multicellular trichome pathways. Based on previous results, a model explaining trichome patterning in the positive regulator pathway in cucumber is proposed (**Figure 4B**). In this model, HD-Zip transcription factors may bind to other types of transcription factors to generate a specific complex to control cucumber multicellular trichome formation.

Based on transcriptional data from all of the mutants, several candidate genes deserve closer attention (Chen et al., 2014; Zhao et al., 2015a,b). For example, the CsMYB6, CsWIN1, and CsGL2 genes were down-regulated not only in tril/csgl3 but also in tbh, mict, and csgl1, suggesting that these genes are involved in multicellular trichome development in cucumber.

#### DISCUSSIONS

In the past decade, substantial progress has been made in delineating the genes that control trichome development in cucumber. Several groups provide evidence to suggest that a number of transcriptional activators, such as HD-Zip I and HD-Zip IV, play a role in fine tuning the spatial and temporal distribution of trichomes. Researchers have tried to demonstrate the possible mechanisms of these transcriptional activators.

FIGURE 4 | … a complex to positively regulate trichome development. The core complex in MBW (A) may differ from the HD-Zip complex (B). The bold arrows in the boxes indicate developmental progression.

However, much remains unknown and needs to be elucidated in future research. The functions of the HD-Zip IV and HD-Zip I genes require further investigation through genetic transformation in cucumber plants. Our current knowledge about the gene regulatory networks is largely limited to the unicellular trichomes in the model plant Arabidopsis. However, little is known about the regulatory network that controls the development of multicellular trichomes in cucumber. Evidence indicates that the common regulatory mechanisms of unicellular trichomes in Arabidopsis or of multicellular trichomes in cotton involve plant-specific genes that function distinctively. An analysis of the differential expression data generated by RNA-seq can offer new information for identifying putative key multicellular development transcription factors in cucumber. This is why we focused on the critical transcription factor genes. A new set of more than 42 transcription factor genes, including Homeodomain, MCM1-AGAMOUS-DEFICIENS-SRF4 (MADS), and WRKY domains, has been identified in a transcriptome analysis of all cucumber trichome-related mutants (Zhao et al., 2015a). These transcription factor genes are unique to plants and are involved in a range of activities; many are associated with multicellular trichome development and other species-specific development processes. The study of the interaction among those transcription factors is an emerging area of research because these factors probably share many biological functions for trichome development. All of this information warrants further investigation.

The aim of this review is to provide readers with a summary of the progress of cucumber trichome development and to encourage plant scientists to further investigate the mechanism of trichome initiation and development and their regulatory network. We have summarized the mutants related to fruit trichomes, the key genes that control trichome initiation and development, gene relationships and a possible model for fruit trichome positive regulatory mechanisms that is different from Arabidopsis in core transcriptional factor numbers. All of this may help us to better understand the advances in the study of cucumber trichomes. At the same time, more investigations must be conducted.

Because cucumber is a horticultural crop of worldwide importance and fruit spines directly affect its commercial quality, an extensive characterization of cucumber trichomes will not only help us to understand the underlying molecular mechanisms involved in multicellular trichome development but will also pave the way for creating new cucumber varieties with desired trichome growth and density. Moreover, cucumber trichomes may serve as a model system for studying the development of multicellular trichomes.

#### AUTHOR CONTRIBUTIONS

HR and XL organized the review, wrote the first draft and generated **Figures 1**–**4**. EB and YC contributed to a second draft. All of the authors revised the manuscript multiple times. HR and XL performed the final revision of the manuscript, which was read and approved by all authors.

#### ACKNOWLEDGMENTS

We thank the members of the Ren lab for helpful discussions for this review. This work was supported by National Research and Development Program (2016YFD0101705), Beijing Agricultural Innovation Consortium (BAIC01-2016) and Beijing Agricultural Scientific and Technological Program (20160415) to HR.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Liu, Bartholomew, Cai and Ren. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Interactive Effects of Elevated [CO2] and Water Stress on Physiological Traits and Gene Expression during Vegetative Growth in Four Durum Wheat Genotypes

#### Susan Medina<sup>1,2</sup>, Rubén Vicente<sup>1</sup>\*, Amaya Amador<sup>3</sup> and José Luis Araus<sup>1</sup>

<sup>1</sup> Integrative Crop Ecophysiology Group, Plant Physiology Section, Faculty of Biology, University of Barcelona, Barcelona, Spain, <sup>2</sup> Crop Physiology Laboratory, International Crops Research Institute for Semi-Arid Tropics, Patancheru, India, <sup>3</sup> Unitat de Genòmica, Centres Científics i Tecnològics, Universitat de Barcelona, Barcelona, Spain

The interaction of elevated [CO2] and water stress will have an effect on the adaptation of durum wheat to future climate scenarios. For the Mediterranean basin these scenarios include the rising occurrence of water stress during the first part of the crop cycle. In this study, we evaluated the interactive effects of elevated [CO2] and moderate to severe water stress during the first part of the growth cycle on physiological traits and gene expression in four modern durum wheat genotypes. Physiological data showed that elevated [CO2] promoted plant growth but reduced N content. This was related to a down-regulation of Rubisco and N assimilation genes and up-regulation of genes that take part in C-N remobilization, which might suggest a higher N efficiency. Water restriction limited the stimulation of plant biomass under elevated [CO2], especially at severe water stress, while stomatal conductance and carbon isotope signature revealed a water saving strategy. Transcript profiles under water stress suggested an inhibition of primary C fixation and N assimilation. Nevertheless, the interactive effects of elevated [CO2] and water stress depended on the genotype and the severity of the water stress, especially for the expression of drought stress-responsive genes such as dehydrins, catalase, and superoxide dismutase. The network analysis of physiological traits and transcript levels showed coordinated shifts between both categories of parameters and between C and N metabolism at the transcript level, indicating potential genes and traits that could be used as markers for early vigor in durum wheat under future climate change scenarios. Overall the results showed that greater plant growth was linked to an increase in N content and expression of N metabolism-related genes and down-regulation of genes related to the antioxidant system. The combination of elevated [CO2] and severe water stress was highly dependent on the genotypic variability, suggesting specific genotypic adaptation strategies to environmental conditions.

#### Edited by:

Paul Christiaan Struik, Wageningen University and Research Centre, Netherlands

#### Reviewed by:

Iker Aranjuelo, Instituto de Agrobiotecnología (CSIC-UPNA), Spain

Fulai Liu, University of Copenhagen, Denmark

Salvatore Ceccarelli, Rete Semi Rurali, Italy

\*Correspondence:

Rubén Vicente vicenteperez.ruben@gmail.com

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

Received: 06 June 2016 Accepted: 04 November 2016 Published: 22 November 2016

#### Citation:

Medina S, Vicente R, Amador A and Araus JL (2016) Interactive Effects of Elevated [CO2] and Water Stress on Physiological Traits and Gene Expression during Vegetative Growth in Four Durum Wheat Genotypes. Front. Plant Sci. 7:1738. doi: 10.3389/fpls.2016.01738

Keywords: climate change, durum wheat, elevated [CO2], genotypic variability, stable isotopes, transcript levels, vegetative growth, water stress

## INTRODUCTION

Food security is facing new challenges nowadays due to the increase in the world population and the impacts of climate change on agriculture and food supply. Wheat is a very important crop for the human diet, ranking in fourth position in terms of the world's most important crops by production quantity after sugarcane, maize, and rice (FAO, 2013). Although bread wheat dominates global wheat production, durum wheat is an economically and culturally important staple crop in the Mediterranean region, used for the production of pasta, bread, burghul, couscous, and freekeh (Habash et al., 2009). In the second half of the twentieth century, local durum wheat landraces were replaced by improved semi-dwarf cultivars, which showed higher yield and harvest index (Soriano et al., 2016). In the early 1970s, introduction of germplasm from CIMMYT (International Maize and Wheat Improvement Centre) increased grain yield (Sanchez-Garcia et al., 2013). Improvement in wheat yield per unit area constitutes one of the largest challenges to be addressed by breeding programs, covering numerous research areas (McKersie, 2015). Projections of wheat production assume that the growth rate will be lower than the historical growth rates reported in the second half of the twentieth century (Bort et al., 2014; Nakhforoosh et al., 2015), with insignificantly higher yields in modern wheat genotypes released in recent years (Sanchez-Garcia et al., 2013). It is unlikely that any improvements will support the increase in world population or mitigate against future extreme weather events (Araus et al., 2002; Alexandratos and Bruinsma, 2012; Trnka et al., 2014).

Observations of the climate system confirm that Earth's mean surface temperature is increasing rapidly as a consequence of the anthropogenic emissions of CO<sub>2</sub> and other greenhouse gases (IPCC, 2013). The atmospheric concentration of CO<sub>2</sub> ([CO2]) has increased by more than 40% since the beginning of the industrial revolution and is expected to double by the end of this century (IPCC, 2013). As atmospheric [CO2] is currently a limiting factor for C<sub>3</sub> photosynthesis, the primary effect of a short-term exposure to elevated [CO2] includes an initial stimulation of photosynthesis due to both enrichment of substrate for ribulose bisphosphate carboxylase oxygenase (Rubisco) carboxylation and inhibition of competitive Rubisco oxygenation which may eventually contribute to a higher biomass (Stitt and Krapp, 1999; Long et al., 2006). High [CO2] also induces a stomatal closure leading to a better leaf water status. However, growth over the long-term under elevated [CO2] leads to a down-regulation of photosynthetic capacity, which has been related to a decline in Rubisco protein content and activity, together with a higher carbohydrate accumulation and a decline in N concentration and protein content in wheat (Aranjuelo et al., 2011, 2013; Vicente et al., 2015a,b). This phenomenon suggests that regulatory mechanisms may occur in the plant, e.g., end-product inhibition, carbon sink limitation, biomass dilution effects, or a decline in nutrient uptake and/or assimilation (Stitt and Krapp, 1999; Vicente et al., 2015a). Moreover, elevated [CO2] leads to an altered expression pattern of genes involved in the photosynthetic apparatus, the distribution of C, respiration, and N metabolism in durum wheat (Vicente et al., 2015b).

Increasing greenhouse gas emissions may cause further warming together with rainfall reduction in the next decades, which will increase the frequency and intensity of drought in the Mediterranean basin (Habash et al., 2009; IPCC, 2013; McKersie, 2015). For the Iberian Peninsula it is predicted that drought stress can occur at any growth stage of wheat (Russo et al., 2015), with the grain-filling phase being the most studied. However, the number of studies focusing on drought stress during early growth is limited. Although rainfall has been traditionally most abundant and evapotranspiration the lowest during winter, the occurrence of drought in winter months during the early stages of the crop cycle has been reported in recent times (Russo et al., 2015). This can further constrain wheat growth and thus final grain yield, mostly through a decrease in the ear density and number of kernels per unit crop area (Araus et al., 2008; Rebolledo et al., 2013). In addition, a constitutive (i.e., in absence of water stress) rapid development of wheat plants (early vigor) could be a positive trait and relevant for further avoiding drought stress-related consequences at both early and late growth stages. Early vigor could benefit plant growth and yield by increasing resource acquisition, shading the soil, preventing evaporation from it, and suppressing weeds (Maydup et al., 2012; Bort et al., 2014; Pang et al., 2014). As a consequence, differences in early growth (tillering and further stem elongation) will affect the number of fertile stems (and thus the ear density) and the size of the ears (and thus the potential number of grains per ear), which are the main contributors determining grain yield (Guo et al., 2016).

Plant responses to water stress define a complex and sophisticated regulatory network comprising physiological, biochemical, and molecular mechanisms. In wheat, some of these responses include inhibition of plant growth and photosynthetic capacity, together with a wide range of physiological responses, including changes in stomatal closure and decreases in transpiration, Rubisco efficiency, and chlorophyll content as well as an increase in oxidative stress among other responses (Budak et al., 2013; Nezhadahmadi et al., 2013). Such responses are modulated by stress severity. Cessation of watering showed a progressive reduction in leaf relative water content, water potential and photosynthesis in durum wheat (Habash et al., 2014). Liu et al. (2016) reported a progressive inhibition of photosynthetic activity as water stress becomes more severe in field-grown bread wheat, probably due to non-stomatal limitations, which led to lower grain yields even at moderate water stress. Furthermore, water stress in wheat leads to complex changes in the expression of some genes, including those involved in photosynthesis, respiration, N metabolism, lipid metabolism, transcription factors, signal transducers, and synthesis of protective proteins (Habash et al., 2009, 2014; Budak et al., 2013; Yousfi et al., 2016). These changes in gene expression occurred mainly in the early phases of the stress (Habash et al., 2014).

Plant responses to elevated [CO2] or water stress are influenced by the duration and level of the environmental factor, the growth stage, and the genetic variability. Studies carried out with different durum wheat genotypes demonstrated that the responsiveness to elevated [CO2] (Aranjuelo et al., 2013), water stress (De Leonardis et al., 2007; Aprile et al., 2013; Habash et al., 2014), and the combination of both (Erice et al., 2014) is genotype specific. Moreover, the growth stage greatly influences the response of durum wheat to elevated [CO2] (Aranjuelo et al., 2011; Vicente et al., 2015a) and drought (Liu et al., 2016). In addition, the interactive effects of environmental conditions and genotypic variability cannot be anticipated from the individual effects of these treatments (Ceccarelli et al., 1991). Some studies have shown positive effects of elevated [CO2] on water stress tolerance of different bread wheat varieties (Harnos et al., 2002; Wall et al., 2006; Robredo et al., 2011; Bencze et al., 2014). A positive synergistic effect of elevated [CO2] and water stress has been reported to decrease stomatal conductance (g<sub>s</sub>), and thus leads to an improvement in water use efficiency at the stomatal and whole plant level (Bencze et al., 2014; Pazzagli et al., 2016). The decrease in photosynthesis under water stress is often mitigated by elevated [CO2] (Bencze et al., 2014), resulting in increased levels of carbohydrates for the development of new tissues or filling grain (Wall et al., 2006). However, such positive effects of elevated [CO2] in improving stress tolerance are not always achieved (Hudak et al., 1999; Pleijel et al., 2000). Bencze et al. (2014) reported that drought at elevated [CO2] led to a stimulation of the antioxidant enzyme system in bread wheat, which suggests a high level of oxidative stress. Erice et al. (2014) showed that the stimulation of plant growth by elevated [CO2] was only found in durum wheat genotypes with high harvest indices and optimal water supply. Therefore, additional efforts are still necessary to deepen our understanding of the interactive effect of [CO2] and water regime in durum wheat.

The aim of this work was to determine the physiological and molecular mechanisms involved in the adaptive response of four semi-dwarf (i.e., post-Green Revolution) durum wheat cultivars to different [CO2] and water regimes. Durum wheat genotypes were grown under controlled conditions at ambient and elevated [CO2] and two different water regimes (fully irrigated and moderate/severe water stress). We assessed plant growth, physiological traits, stable C and N isotopic signatures, and transcript levels for stress-responsive genes that could be good indicators of durum wheat's adaptation to future climate conditions at vegetative growth stages. The genes selected corresponded to key enzymes in the metabolism of C (the Rubisco large and small subunits, RBCL and RBCS, respectively, and phosphoenolpyruvate carboxylase, PEPC) and N (the cytosolic and plastidial glutamine synthetases, GS1 and GS2, respectively), as well as proteins involved in stress responses (dehydrins 11, DHN11, and 16, DHN16, catalase, CAT, and superoxide dismutase, SOD). Rubisco is the key enzyme for photosynthetic CO<sub>2</sub> assimilation, and its activity is highly responsive to atmospheric [CO2] (Vicente et al., 2011; Carmo-Silva et al., 2015). PEPC is a cytosolic enzyme that catalyzes the β-carboxylation of phosphoenolpyruvate to produce oxaloacetate, which is involved in anaplerotic functions. GS1 and GS2 play a central role in N metabolism: the plastidial GS2 is thought to be involved in the primary assimilation of ammonium from nitrate reduction and photorespiration, while the cytosolic GS1 is mainly involved in the transport of N through the plant and N recycling from catabolic processes. The function of the dehydrin family is not completely understood, but these proteins are involved in conferring stress tolerance (Kosová et al., 2014). Catalases and superoxide dismutases are primary antioxidant enzymes involved in the elimination of reactive oxygen species (ROS) such as the cytotoxic H<sub>2</sub>O<sub>2</sub> produced by photorespiration (Luna et al., 2005) and the superoxide generated during photosynthetic electron transport (Xu et al., 2010; Huseynova et al., 2014). Thus, our study combines the effects of genotypic variability and future environmental conditions, integrating plant performance with gene expression, and aims to identify traits associated with better performance during vegetative growth.

#### MATERIALS AND METHODS

#### Plant Material and Growth Conditions

The experiment was conducted with four semi-dwarf durum wheat [Triticum turgidum L. ssp. durum (Desf.)] genotypes: Mexa (year of commercial release: 1977), Regallo (1988), Burgos (1997) and Ramirez (2006). These cultivars represent high-yield genotypes released in the last forty years that are (or were) widely cultivated in the Mediterranean regions of Spain. The study of these genotypes could provide information about the adaptation of modern cultivars to climate change and whether there are differences between them associated with the year they were released. The experiment was conducted from May to July 2015 in two controlled environment chambers (Conviron E15; Controlled Environments, Winnipeg, MB, Canada) in the Experimental Facilities of the Faculty of Biology at the University of Barcelona. A total of 96 durum wheat plants (24 for each genotype) were sown in 2 L pots containing a mixture of standard substrate:perlite (1:1, v/v) and were grown with a long light period of 16 h, a photosynthetic photon flux density (PPFD) of 350 µmol m<sup>−2</sup> s<sup>−1</sup>, a day/night temperature of 23/17 °C and a relative humidity of 60%. During the entire experiment, half of the pots were cultivated under atmospheric [CO2] (400 µmol mol<sup>−1</sup>) in one chamber, while the other half grew under elevated [CO2] (790 µmol mol<sup>−1</sup>) in the other chamber with injection of CO<sub>2</sub> from an external bottle (Carburos Metálicos S.A., Barcelona, Spain). The temperature, relative humidity and [CO2] within each chamber were continuously monitored by Conviron series controllers (CMP3243; Controlled Environments Ltd., Winnipeg, MB, Canada). The technical staff of the Experimental Facilities of the Faculty of Biology tested the growth conditions of each chamber periodically with external sensors: an HMP75 humidity and temperature probe and a GMP222 CO<sub>2</sub> probe for use with an MI70 series hand-held indicator (Vaisala, Vantaa, Finland). Similarly, the PPFD was periodically verified with an LI-188B quantum/radiometer/photometer (LI-COR Inc., Lincoln, NE, USA).

The plants were uniformly irrigated every 2 days with 50% Hoagland's nutrient solution over a 25-day period. After that (Zadoks 21), the water stress was imposed; one half of the plants of each genotype and [CO2] were maintained under well-watered conditions (100% pot capacity, PC) until the end of the experiment, while the other half were subjected to water stress conditions. The maximum soil volumetric water content of each pot was evaluated at the beginning of the experiment as the difference between pot weight after watering with the excess water drained and the pot dry weight. Thus, pots were watered by direct measurements of the pot weight and the water supply was adjusted to the pot water conditions established for each water regime. In the water-stressed plants the watering was progressively restricted by 10% PC every 2 days. First, after 8 days the water-stressed plants received 60% PC (moderate water stress) and this irrigation regime was strictly maintained for 10 days (see **Figure 1** for a schematic representation of the experimental design). At the end of this period (Zadoks 26), equal numbers (48) of well-watered and water-stressed plants were sampled. The youngest fully expanded leaf was collected, immediately frozen in liquid nitrogen and stored at −80 °C for gene expression, C and N content and stable isotope analyses. After that, the whole plant was harvested and dried in an oven at 60 °C for 72 h for biomass analysis. Second, in the remaining half of the plants (48), the progressive water limitation continued for 8 more days until water-stressed plants received 30% PC (severe water stress). As in the moderate water stress, the irrigation conditions in well-watered and water-stressed plants were maintained for 10 days. Later, these 51-day-old plants (Zadoks 28–32) were collected following the procedure described above. The moderate and severe water stresses were defined in this experiment based on similar reductions in irrigation and stomatal conductance used in other studies (Galmes et al., 2007; Liu et al., 2016). The pots were rotated three times a week to avoid edge effects in the growth chambers over the course of the experiment. We used a rotatory randomized complete block design with three replicates (one plant per pot) per factor combination ([CO2], water level and genotype) at each sampling.
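
The gravimetric watering procedure described above can be illustrated with a minimal Python sketch. It assumes that pot capacity (PC) is applied as a fraction of the maximum water held by the pot (drained weight after watering minus dry weight). The function names and the pot weights used in the usage note are hypothetical, not values from the experiment.

```python
def target_pot_weight(dry_weight_g, drained_weight_g, pc_fraction):
    """Pot weight (g) corresponding to a fraction of pot capacity (0 to 1)."""
    if not 0.0 <= pc_fraction <= 1.0:
        raise ValueError("pc_fraction must be between 0 and 1")
    # Maximum water content: drained weight after watering minus dry weight.
    max_water_g = drained_weight_g - dry_weight_g
    return dry_weight_g + pc_fraction * max_water_g


def water_to_add(current_weight_g, dry_weight_g, drained_weight_g, pc_fraction):
    """Water (g) needed to bring a weighed pot back to the target PC."""
    target = target_pot_weight(dry_weight_g, drained_weight_g, pc_fraction)
    return max(0.0, target - current_weight_g)


def stepdown_schedule(start_pc=1.0, end_pc=0.6, step=0.1, days_per_step=2):
    """(day, PC fraction) pairs for a progressive restriction of watering."""
    n_steps = round((start_pc - end_pc) / step)
    return [(i * days_per_step, round(start_pc - i * step, 2))
            for i in range(1, n_steps + 1)]
```

With the default arguments, `stepdown_schedule()` steps from 100% PC down to 60% PC in 10% decrements every 2 days, reaching 60% PC on day 8, which matches the moderate water stress described in the text. A hypothetical pot weighing 1000 g dry and 1800 g drained would have a target weight of 1480 g at 60% PC.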

#### Physiological Traits

Prior to harvest a hand-held portable spectroradiometer (GreenSeeker, NTech Industries, Ukiah, CA, USA) was used to estimate the normalized difference vegetation index (NDVI) of each plant (only at the second sampling date). Relative chlorophyll content was measured with a Minolta SPAD-502 chlorophyll meter (Spectrum Technologies, Plainfield, IL, USA). Stomatal conductance (gs) was measured using a Decagon SC-1 Leaf Porometer (Decagon Device, Inc., Pullman, WA, USA). Both chlorophyll content and stomatal conductance of the adaxial surface were recorded in the central segment of the same youngest fully expanded leaf between 3 and 5 h after the start of the photoperiod. In addition, plants were collected to determine the leaf, shoot, root, and plant dry weights as indicated above, while the roots were washed in tap water until all substrate was removed. The number of tillers and the root to shoot dry weight ratio (root/shoot) were then determined.

#### C and N Content and Stable Isotope Signatures

A fraction of the youngest fully expanded leaf was finely powdered and then 1 mg of this leaf material was used for the measurements of total C and N content (as a percentage of leaf dry weight) and the stable C (<sup>13</sup>C/<sup>12</sup>C) and N (<sup>15</sup>N/<sup>14</sup>N) isotope ratios. Measurements were carried out using an elemental analyzer (Flash 1112 EA; ThermoFinnigan, Bremen, Germany) coupled with an isotope ratio mass spectrometer (Delta C IRMS; ThermoFinnigan), operating in continuous flow mode, at the Scientific Facilities of the University of Barcelona. As has been described previously (Bort et al., 2014; Yousfi et al., 2016), the <sup>13</sup>C/<sup>12</sup>C ratio was expressed in δ notation:

$$\delta^{13}\mathrm{C}\ (\text{‰}) = \left[\frac{(^{13}\mathrm{C}/^{12}\mathrm{C})_{\mathrm{sample}}}{(^{13}\mathrm{C}/^{12}\mathrm{C})_{\mathrm{standard}}} - 1\right] \times 1000$$

The standard refers to international secondary standards of known <sup>13</sup>C/<sup>12</sup>C ratios (IAEA CH7 polyethylene foil, IAEA CH6 sucrose, and USGS 40 L-glutamic acid) calibrated against Vienna Pee Dee Belemnite calcium carbonate. The same δ notation was used for the <sup>15</sup>N/<sup>14</sup>N ratio (δ<sup>15</sup>N) using N<sub>2</sub> in air as standard.
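
As a minimal illustration of the δ notation defined above, the following Python sketch converts an isotope ratio into a δ value in per mil and back. The function names are hypothetical, and the standard ratio in the usage note is an arbitrary placeholder, not a certified reference value.

```python
def delta_permil(ratio_sample, ratio_standard):
    """Delta value (per mil) of a sample isotope ratio relative to a standard."""
    return (ratio_sample / ratio_standard - 1.0) * 1000.0


def ratio_from_delta(delta, ratio_standard):
    """Inverse operation: isotope ratio recovered from a delta value (per mil)."""
    return ratio_standard * (1.0 + delta / 1000.0)
```

For example, with a placeholder standard ratio of 0.0112, a sample whose ratio is 1% lower than the standard gives `delta_permil(0.0112 * 0.99, 0.0112)` of approximately −10‰. The same functions apply to <sup>15</sup>N/<sup>14</sup>N when the ratio of the N standard is supplied.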

#### Quantitative Reverse Transcriptase PCR Amplification

Frozen leaf samples were ground with liquid nitrogen and subsequently RNA was isolated from 100 mg of this material with Ribozol RNA Extraction Reagents (Amresco, Solon, OH, USA) according to the manufacturer's instructions. RNA quantity and quality were measured using a NanoDrop ND-1000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). RNA integrity was checked by 1.5% (w/v) agarose gel electrophoresis. Total RNA (1 µg) was treated with PerfeCTa DNase I RNase-free (Quanta Biosciences, Gaithersburg, MD, USA) to eliminate residual genomic DNA. cDNA was synthesized using a qScript cDNA Synthesis Kit (Quanta Biosciences) following the manufacturer's instructions. The qRT-PCR assays were performed in optical 384-well plates with the LightCycler 480 System (Roche Applied Science, Penzberg, Germany) in the Centres Científics i Tecnològics de la Universitat de Barcelona (CCiTUB), in a reaction volume of 10 µL: 5 µL of PerfeCTa SYBR Green FastMix (Quanta Biosciences), 200 nM of each gene-specific primer and 1 µL of diluted cDNA (1:10). The thermal profile was as follows: initial denaturation for 30 s at 95 °C, PCR cycling (45 cycles) for 5 s at 95 °C, 15 s at 60 °C, and 10 s at 72 °C, and a final step of 95 °C for 5 s and 60 °C for 60 s to obtain the dissociation curve. Two technical replicates were analyzed per biological replicate. Specific primers for genes encoding the Rubisco large subunit (NC\_021762), phosphoenolpyruvate carboxylase (Y15897), plastidial glutamine synthetase (DQ124212), dehydrin 11 (AJ890140), and superoxide dismutase (KP696754) were designed in Primer-BLAST (http://www.ncbi.nlm.nih.gov/tools/primer-blast/) using the following criteria: Tm = 60 ± 1 °C, primer length of 18–25 bases, GC content of 30–70% and product size of 60–150 bases. The specificity of PCR amplification was confirmed by the presence of unique amplicons of the expected length on 3.5% (w/v) agarose gels. 
The genes encoding the ADP-ribosylation factor and the RNase L inhibitor-like protein, previously identified as potential reference genes (Vicente et al., 2015b), were used to normalize qRT-PCR data after the evaluation of their expression stability in this study. All primers used for gene expression analysis and their symbols are listed in Supplementary Table S1. The values of the cycle threshold (C<sub>t</sub>) were calculated using the LightCycler 1.5 software (Roche Applied Science). The quantification of the relative gene expression was analyzed using the comparative C<sub>t</sub> method, $2^{-\Delta\Delta C_t}$ (Schmittgen and Livak, 2008), and the data were presented as the log<sub>2</sub> fold change.
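
The comparative C<sub>t</sub> calculation can be sketched in Python as follows. This is a minimal illustration, not the authors' pipeline: the function names are hypothetical, and averaging the C<sub>t</sub> values of the two reference genes (equivalent to a geometric mean of their quantities) is an assumption, since the text does not state how the two references were combined.

```python
from statistics import mean


def delta_ct(ct_target, ct_references):
    """Target Ct normalized to the mean Ct of the reference genes (assumption)."""
    return ct_target - mean(ct_references)


def log2_fold_change(ct_target_treat, ct_refs_treat, ct_target_ctrl, ct_refs_ctrl):
    """log2 of 2^(-ddCt) for a treatment relative to the control condition."""
    ddct = (delta_ct(ct_target_treat, ct_refs_treat)
            - delta_ct(ct_target_ctrl, ct_refs_ctrl))
    return -ddct


def fold_change(ct_target_treat, ct_refs_treat, ct_target_ctrl, ct_refs_ctrl):
    """Relative expression 2^(-ddCt)."""
    return 2.0 ** log2_fold_change(ct_target_treat, ct_refs_treat,
                                   ct_target_ctrl, ct_refs_ctrl)
```

With made-up C<sub>t</sub> values, a target at 24 cycles in the treatment and 25 cycles in the control, with both reference genes unchanged at 20 and 22 cycles, gives ΔΔC<sub>t</sub> = −1, that is, a log<sub>2</sub> fold change of 1 (a twofold increase). Reporting the log<sub>2</sub> value keeps up- and down-regulation symmetric around zero.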

#### Data Analysis

The effects of [CO2] (ambient and elevated), water regime (well-watered and water stressed), genotype (Mexa, Regallo, Burgos, and Ramirez), and their interaction on plant growth, chlorophyll content, g<sub>s</sub>, and C and N contents and isotope composition were determined through a three-factor (2 CO<sub>2</sub> × 2 water regimes × 4 genotypes) analysis of variance (ANOVA) for each sampling date (moderate and severe water stress; see Supplementary Table S2) with GenStat 6.2 (VSN International Ltd, Hemel Hempstead, UK). Further, and given the implicit complexity of the design, each genotype was analyzed through a two-factor ANOVA (2 CO<sub>2</sub> × 2 water regimes) for both sampling dates. All factors were treated as fixed independent variables. When the F-ratio was significant (P < 0.05), the least significant difference (LSD) test was used to assess differences between treatment means. Clustered heat maps of relative gene expression were built in the R statistics environment (R Development Core Team, 2008) to study the effects of elevated [CO2] and water stress on transcript levels. A correlation matrix was generated in R for evaluating the relationships between all parameters analyzed. Visualization of significant correlations was performed using Cytoscape software (Shannon et al., 2003).
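
To make the per-genotype two-factor ANOVA concrete, the sketch below computes the F-ratios of a balanced, fixed-effects two-way design in plain Python. It is an illustration under stated assumptions, not the GenStat analysis used in the study: the function name, the cell keys and the numbers in the usage note are hypothetical, and P values and the LSD test are left out because they require the F and t distributions.

```python
from statistics import mean


def two_way_anova_f(cells):
    """F-ratios for a balanced two-way fixed-effects ANOVA.

    cells maps (level_a, level_b) to a list of replicate values.
    """
    levels_a = sorted({a for a, _ in cells})
    levels_b = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))
    if (len(cells) != len(levels_a) * len(levels_b)
            or any(len(v) != n for v in cells.values())):
        raise ValueError("design must be complete and balanced")

    grand = mean(x for v in cells.values() for x in v)
    mean_a = {a: mean(x for b in levels_b for x in cells[(a, b)]) for a in levels_a}
    mean_b = {b: mean(x for a in levels_a for x in cells[(a, b)]) for b in levels_b}

    # Sums of squares for main effects, interaction and residual error.
    ss_a = len(levels_b) * n * sum((m - grand) ** 2 for m in mean_a.values())
    ss_b = len(levels_a) * n * sum((m - grand) ** 2 for m in mean_b.values())
    ss_cells = n * sum((mean(v) - grand) ** 2 for v in cells.values())
    ss = {"A": ss_a, "B": ss_b, "AxB": ss_cells - ss_a - ss_b}
    ss_error = sum((x - mean(v)) ** 2 for v in cells.values() for x in v)

    df = {"A": len(levels_a) - 1, "B": len(levels_b) - 1}
    df["AxB"] = df["A"] * df["B"]
    df_error = len(levels_a) * len(levels_b) * (n - 1)

    ms_error = ss_error / df_error
    return {k: (ss[k] / df[k]) / ms_error for k in ("A", "B", "AxB")}
```

For a single genotype, factor A could be [CO2] (ambient, elevated) and factor B the water regime (well-watered, stressed), with three replicates per cell as in the design above. With the made-up cells `{("amb", "ww"): [1, 2, 3], ("amb", "ws"): [2, 3, 4], ("elev", "ww"): [3, 4, 5], ("elev", "ws"): [4, 5, 6]}`, the two main effects are purely additive, so the interaction F-ratio is zero.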

#### RESULTS

#### Effect of [CO2], Water Regime, and Genotype on Plant Growth

Total biomass of the plant and its different fractions (leaves, shoot and root), the root/shoot ratio, and the number of tillers were analyzed through two-factor ANOVA ([CO2] × water regime) for each genotype (**Tables 1**, **2**). Moderate and severe water stress were established with reductions of 40 and 70% in the water supplied to the pots and average decreases of 34 and 57% in g<sub>s</sub>, respectively, compared to well-watered plants (data not shown). Growth under elevated [CO2] led to significant increases in biomass compared to ambient [CO2] (**Tables 1, 2**). At the first sampling date, elevated [CO2] increased root biomass in Mexa, Regallo and Ramirez (and also in Burgos, P = 0.074), but only increased plant biomass in Regallo. The root/shoot ratio also increased in Regallo and Burgos under elevated [CO2]. At the second sampling date, elevated [CO2] increased plant biomass due to higher shoot and root biomass compared to ambient [CO2], with larger increases in plant biomass in Mexa and Regallo under well-watered conditions in comparison to water stressed conditions. As a consequence of the increases in both shoot and root dry weights by elevated [CO2], the root/shoot ratio was not altered, except in Regallo. Moreover, in this genotype an increase in the tillers per plant was also observed under elevated [CO2] but only in well-watered conditions.

Moderate water stress did not lead to statistical differences in biomass, the root/shoot ratio, or the number of tillers between well-watered and water-stressed plants (**Table 1**). However, severe water stress led to significant changes in these parameters, while NDVI was also affected (**Table 2**). Plant biomass generally decreased under severe water stress compared to well-watered conditions and was associated with decreases in leaf, shoot, and root dry weights. Water restriction decreased the number of tillers per plant in Burgos under severe water stress, while this reduction was not significant in the other genotypes. Additionally, the NDVI values were lower in water-stressed plants compared to well-watered plants, irrespective of the [CO2] and the genotype (**Table 2**). In general, at severe water stress the interaction [CO2] × water regime × genotype showed that the root/shoot ratio strongly increased in Regallo, especially under ambient [CO2] and well-watered conditions (Supplementary Table S2). The Burgos and Mexa cultivars had higher shoot dry weight than Ramirez and Regallo, while root dry weight was higher in Regallo (Supplementary Table S2). Furthermore, significant [CO2] × genotype interaction showed that Burgos and Regallo under elevated [CO2] increased tiller production, whereas Ramirez and Regallo plants under ambient [CO2] had lower tillering (Supplementary Table S2).

TABLE 1 | Total leaf (LDW), shoot (SDW), root (RDW) and plant (PDW) dry weight, root/shoot ratio, and number of tillers per plant in four durum wheat genotypes grown under ambient or elevated [CO2] and well-watered or moderate water stress conditions (100 vs. 60% pot capacity).

Significant effects for elevated [CO2] (C), water stress (W) and their interaction (C × W) were determined by two-factor ANOVA (P). Values with the same letter are not significantly different for the interaction [CO2] × water level. Significant P values are marked in bold (P < 0.05).

#### Effect of [CO2], Water Regime, and Genotype on Chlorophyll Content, gs, C and N Content and C and N Isotope Composition

The interactive effects of [CO2] and water regime on chlorophyll content, g<sub>s</sub>, and C and N contents and isotope composition were analyzed in the youngest fully expanded leaf through two-factor ANOVA for each genotype during vegetative growth under moderate (**Table 3**) and severe water stress (**Table 4**). At moderate water stress, elevated [CO2] compared to ambient [CO2] decreased N content in Mexa and Regallo, and δ<sup>13</sup>C regardless of the genotype, while it increased chlorophyll content in Mexa, g<sub>s</sub> and C content in Regallo, and δ<sup>15</sup>N in all genotypes except Regallo. Water stress reduced g<sub>s</sub> and increased δ<sup>15</sup>N in Regallo, and increased δ<sup>13</sup>C in Burgos and Ramirez. Three-factor ANOVA showed significant interactions for δ<sup>15</sup>N (Supplementary Table S2). The [CO2] × genotype interaction mainly showed that δ<sup>15</sup>N was higher in Ramirez and Burgos at elevated [CO2] and in Regallo at both [CO2], whereas the lowest values were observed in Ramirez at ambient [CO2]. The water regime × genotype interaction indicated that δ<sup>15</sup>N was higher in Mexa and Regallo under water stress than in the other genotypes.

TABLE 2 | Total leaf (LDW), shoot (SDW), root (RDW) and plant (PDW) dry weight, root/shoot ratio, number of tillers per plant, and normalized difference vegetation index (NDVI) in four durum wheat genotypes grown under ambient or elevated [CO2] and well-watered or severe water stress conditions (100 vs. 30% pot capacity).

Significant effects for elevated [CO2] (C), water stress (W) and their interaction (C × W) were determined by two-factor ANOVA (P). Values with the same letter are not significantly different for the interaction [CO2] × water level. Significant P values are marked in bold (P < 0.05).

At severe water stress, elevated [CO2] relative to ambient [CO2] decreased the N content in Regallo and Burgos and δ<sup>13</sup>C regardless of the genotype, while it increased δ<sup>15</sup>N in Burgos (**Table 4**). Furthermore, g<sub>s</sub> in Burgos and N content in Regallo decreased under severe water stress compared to well-watered conditions. In addition, under well-watered conditions g<sub>s</sub> was higher in Burgos than in other genotypes (Supplementary Table S2). Chlorophyll content was lower in Regallo, whereas δ<sup>13</sup>C was higher in Burgos, as compared to other genotypes (Supplementary Table S2).

#### Effect of [CO2] and Water Regime on Gene Expression for Each Durum Wheat Genotype

Treatment effects on transcript levels were evaluated for each genotype using nine genes that encode enzymes of primary C and N metabolism and stress-responsive proteins (Supplementary Table S1). Elevated [CO2] and water stress led to changes in gene expression depending on genotype and the level of water restriction (**Table 5**; Supplementary Figure S1). At moderate water stress, elevated [CO2] decreased transcript levels of RBCL, RBCS, and GS2 relative to control conditions (ambient [CO2] and well-watered conditions), particularly in the Mexa and Regallo genotypes. Under water stress the transcript levels for these enzymes markedly increased in Ramirez. Elevated [CO2] caused a generalized increase in the transcript levels of PEPC and GS1, particularly when it was combined with moderate water stress. Transcript abundances of the dehydrins, DHN11 and DHN16, were generally higher under elevated [CO2] and well-watered conditions, but lower under ambient [CO2] and water stress relative to control conditions. However, DHN11 and DHN16 showed opposite expression patterns under elevated [CO2] and water stress. [CO2] enrichment and moderate water stress decreased transcript levels of CAT and SOD in Mexa, Regallo, and Burgos compared with control conditions, whereas in Ramirez they did not change significantly.

TABLE 3 | Chlorophyll content, stomatal conductance (g<sub>s</sub>), N and C content, and N and C isotope composition (δ<sup>15</sup>N and δ<sup>13</sup>C, respectively) in four durum wheat genotypes grown under ambient or elevated [CO2] and well-watered or moderate water stress conditions (100 vs. 60% pot capacity).

Significant effects for elevated [CO2] (C), water stress (W) and their interaction (C × W) were determined by two-factor ANOVA (P). Significant P values are marked in bold (P < 0.05).

Gene expression analysis indicated greater genotype-specific differences under severe water stress than under moderate water stress (**Table 5**; Supplementary Figure S1). In Mexa under elevated [CO2] and well-watered conditions there were higher transcript levels of RBCL, RBCS, PEPC, GS1, GS2, and CAT and lower levels of DHN16, relative to control conditions. Severe water stress did not substantially alter gene expression. In Regallo most of the transcripts studied were lower in all treatment combinations than in control conditions. However, DHN16 and SOD transcripts increased under ambient [CO2] and water stress, and these together with CAT and GS1 also increased under elevated [CO2] and water stress. In the case of Burgos, elevated [CO2], water stress and their combination strongly reduced transcript levels in comparison to control conditions, especially for GS1 and DHN16, while SOD transcripts increased under water stress and elevated [CO2] × water stress as observed in Regallo. In Ramirez elevated [CO2] led to a reduction in the transcript levels of RBCL, RBCS, and SOD and an increase in PEPC and GS1 compared to control conditions. Water stress increased PEPC and DHN16 transcript levels relative to control conditions, while under the combination of elevated [CO2] and water stress greater transcript abundances were observed for most of the genes.

TABLE 4 | Chlorophyll content, stomatal conductance (g<sub>s</sub>), N and C content, and N and C isotope composition (δ<sup>15</sup>N and δ<sup>13</sup>C, respectively) in four durum wheat genotypes grown under ambient or elevated [CO2] and well-watered or severe water stress conditions (100 vs. 30% pot capacity).

Significant effects for elevated [CO2] (C), water stress (W) and their interaction (C × W) were determined by two-factor ANOVA (P). Significant P values are marked in bold (P < 0.05).

#### Correlation Network of Physiological Traits and Gene Expression

A Pearson correlation matrix was generated using the mean values for each treatment combination, genotype and sampling date (n = 32) of the physiological traits and transcript levels (Supplementary Table S3), excluding NDVI, which was only measured at severe water stress, and δ<sup>13</sup>C, which was influenced by C composition of the CO<sub>2</sub> bottles used in the elevated [CO2] chamber (Aljazairi et al., 2015). Of the 190 correlations between parameters, there were 28 positive and 19 negative significant correlations (P < 0.05) that are represented in an association network (**Figure 2**). Most of the significant correlations were observed within the physiological traits or within the transcript levels, rather than between the two groups. Positive correlations were found among leaf, shoot, and plant dry weights, between the leaf and shoot dry weights with the number of tillers, and between root and plant dry weights. The root/shoot ratio was positively correlated with root dry weight and negatively correlated with leaf and shoot dry weights, the number of tillers and N content. Furthermore, δ<sup>15</sup>N was also negatively correlated with N content and the number of tillers. Chlorophyll content was correlated positively with leaf, shoot, root, and plant dry weights, and negatively with g<sub>s</sub>. On the other hand, positive correlations were found between N content with leaf and shoot dry weights and the number of tillers, and negative correlations between N content with root dry weight, and between g<sub>s</sub> with root and plant dry weights. In the case of transcript levels, RBCL was correlated with RBCS, GS1, GS2, and PEPC, whereas RBCS correlated with GS2 and DHN11, GS2 with DHN11, and PEPC with GS1. Furthermore, some relationships were found between physiological traits and gene expression (**Figure 2**). Positive correlations appeared between DHN16 with plant biomass (leaf, shoot, root, and plant dry weights), CAT with g<sub>s</sub> and SOD with C content. Moreover, negative correlations were found between chlorophyll content with RBCL, RBCS, GS2, and CAT, also between root dry weight with RBCL, RBCS, GS2, and DHN11, and finally plant dry weight with RBCL.
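
The pairwise structure of such a correlation matrix can be sketched in Python as below. This is only an illustration with hypothetical function names, not the R code used in the study. The figure of 190 pairs is consistent with 20 parameters (20 × 19 / 2), which is an inference from the count rather than a number stated in the text. Significance testing is omitted because it requires the t distribution.

```python
from itertools import combinations
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def correlation_pairs(table):
    """Map each unordered pair of parameter names to its Pearson r.

    table maps a parameter name to its list of treatment means.
    """
    return {(a, b): pearson_r(table[a], table[b])
            for a, b in combinations(sorted(table), 2)}
```

Each key of the returned dictionary corresponds to one potential edge of the association network; only the pairs passing the significance threshold would be drawn.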


TABLE 5 | Transcript changes in four durum wheat genotypes grown under ambient or elevated [CO2] and well-watered or water stressed conditions: (A) moderate and (B) severe water stress.

White indicates no change, blue up-regulation, and red down-regulation in each treatment relative to the treatment under ambient [CO2] and optimal water supply for each genotype, as shown in the color bar for a log<sub>2</sub> scale (range −3 to 3). RBCL, Rubisco large subunit; RBCS, Rubisco small subunit; PEPC, phosphoenolpyruvate carboxylase; GS1, cytosolic glutamine synthetase; GS2, plastidial glutamine synthetase; DHN11, dehydrin 11; DHN16, dehydrin 16; CAT, catalase; SOD, superoxide dismutase.

#### DISCUSSION

Although substantial efforts have been made in recent years to identify traits associated with wheat performance during early growth (Maydup et al., 2012; Rebolledo et al., 2013; Bort et al., 2014; Pang et al., 2014; Wilson et al., 2015), little attention has been paid to the effect of interactions between elevated [CO2] and water stress in durum wheat. The effects of water restriction on crop growth have been mostly studied with a view to mitigating drought impacts at late growth stages in Mediterranean environments. However, projections of future climate change in the Iberian Peninsula predict major rainfall limitations and higher evapotranspiration during winter months (Russo et al., 2015) and therefore early-season drought is a matter of concern. In this context, we describe the effects of elevated [CO2] and water stress during the first part of the growth cycle in four durum wheat genotypes on physiological traits and expression of nine genes that respond to changes in [CO2] and water levels (Ali-Benali et al., 2005; Budak et al., 2013; Vicente et al., 2015b; Yousfi et al., 2016). The coordination of these parameters under the different combinations of factors is discussed.

## Changes in Physiological Traits of Durum Wheat Genotypes under Different Water Regimes and [CO2] Levels

A moderate water stress in 43-day-old plants did not significantly alter plant growth (**Table 1**). Long-term exposure to elevated [CO2] led to higher root biomass relative to ambient [CO2] independently of genotypic variability, in concordance with reports from other crop species (Madhu and Hatfield, 2013). This increment was associated with higher plant growth in Regallo and higher root/shoot ratios in Regallo, Burgos, and Mexa (**Table 1**). In fact, under elevated [CO2] root growth is often stimulated more than that of the aerial part of the plant, although this depends on genotype × environment variation (Stitt and Krapp, 1999; Madhu and Hatfield, 2013). A severe water stress in 51-day-old plants showed greater effects on plant growth than moderate water stress (**Table 2**). [CO2] enrichment generally led to an increase in plant biomass by increasing root and shoot biomass and tillering, particularly under optimal water supply. This could be due to the effects of [CO2] fertilization on the net photosynthetic rate (Long et al., 2006; Vicente et al., 2015b), especially in genotypes with large harvest indices such as post-Green Revolution cultivars (Aranjuelo et al., 2013). It could also be caused by carbohydrate accumulation, which may lead to increases in the number of tillers (Stitt and Krapp, 1999). On the other hand, severe water stress constrained plant growth (dry matter and NDVI), in agreement with earlier studies in durum wheat (Erice et al., 2014; Nakhforoosh et al., 2015; Yousfi et al., 2016), with Ramirez and Burgos being the genotypes most affected. In line with Marti et al. (2007), we suggest that progressive water restriction during the vegetative stage constrained the photosynthetic area, which may cause negative effects on final biomass and yield.

Chlorophyll content, g<sub>s</sub>, and N and C contents and isotope compositions at moderate and severe water stress did not reveal statistical significance for the interactions [CO2] × water regime and [CO2] × water regime × genotype (**Tables 3, 4**; Supplementary Table S2). Stomatal conductance (g<sub>s</sub>) generally decreases under elevated [CO2] and drought stress due to an increase in internal [CO2] and as a water-saving strategy, respectively (Long et al., 2006; Nakhforoosh et al., 2015; Vicente et al., 2015b; Pazzagli et al., 2016). The average g<sub>s</sub> values decreased under water restriction at moderate and severe water stress, but the decrease was significant only in some genotypes (**Tables 3, 4**). On the other hand, elevated [CO2] did not alter g<sub>s</sub> at this growth stage, except for an increase in g<sub>s</sub> under moderate water stress in Regallo, which could favor CO<sub>2</sub> assimilation and consequently biomass accumulation under this water regime (**Tables 1, 3**). Earlier studies have shown a decrease in g<sub>s</sub> under water stress (Peremarti et al., 2014; Pazzagli et al., 2016), while negligible changes have been reported under elevated [CO2] in tomato and durum wheat, and increases have even been recorded for Regallo (Vicente et al., 2015a; Pazzagli et al., 2016). Therefore, the growth stage and the severity of the water stress influenced stomatal closure, while elevated [CO2] had minor effects on g<sub>s</sub> during vegetative growth.

Elevated [CO2] generally decreased N content in the present study (**Tables 3, 4**), which has been observed in C<sub>3</sub> plants through shifts in N uptake and/or assimilation (which agrees with the changes in transcript levels of N-metabolism enzymes; see below) together with other uncertain mechanisms, e.g., the biomass dilution effect, increased N loss, and sink limitation (Stitt and Krapp, 1999; Aranjuelo et al., 2011; Vicente et al., 2015a,b). N content was also diminished by severe water stress in Regallo, in agreement with previous studies in durum wheat (Yousfi et al., 2012, 2016). Chlorophyll content only increased under elevated [CO2] in Mexa at the first sampling date, but the effect disappeared at the second sampling (**Table 3**). [CO2] enrichment and water stress did not modify C content in leaves, suggesting that the decrease in N content was not simply due to N dilution caused by rapid growth (Taub and Wang, 2008). Overall, our data showed that the decrease in N content in plants grown under elevated [CO2] and water stress during vegetative growth is genotypically dependent.

The δ<sup>13</sup>C and δ<sup>15</sup>N have been used as potential physiological tracers in plants under elevated [CO2] and water limitation (Aranjuelo et al., 2011; Yousfi et al., 2012, 2016; Araus et al., 2013; Bort et al., 2014). Elevated [CO2] and water stress caused an increase in δ<sup>15</sup>N, although these effects depended on the genotype and were attenuated or disappeared in severe water stress relative to the moderate stress treatment (**Tables 3, 4**; Supplementary Table S2). Variations in δ<sup>15</sup>N in response to the growth conditions, together with N content, could indicate shifts in N metabolism (Bort et al., 2014), although δ<sup>15</sup>N is determined by many processes that are not completely understood (Ariz et al., 2015). Nevertheless, the higher δ<sup>15</sup>N could suggest lower N availability, because N absorption and assimilation cannot fractionate between the <sup>14</sup>N and <sup>15</sup>N isotopologues under such environmental factors (Lopes and Araus, 2006; Tcherkez, 2011). Additionally, this could reflect a decrease in N translocation from the root to the shoot (Lopes and Araus, 2006). Moreover, δ<sup>13</sup>C increased in some genotypes under moderate water stress, regardless of the [CO2] considered, but this increment, also observed under severe water stress, did not reach statistical significance (**Tables 3, 4**). Elazab et al. (2012) and Bort et al. (2014) also showed a δ<sup>13</sup>C increase in flag leaves of different durum wheat genotypes under water stress at later growth stages, which could be associated with higher water-use efficiency (Araus et al., 2008, 2013; Tardieu, 2013; Bort et al., 2014). A stronger water stress does not always lead to larger changes in δ<sup>13</sup>C, particularly when analyzed in dry matter, as noted in previous studies in rice (Kano-Nakata et al., 2014) and Pinus tabuliformis (Ma et al., 2014). In addition, δ<sup>13</sup>C was strongly reduced at high [CO2] because of the very negative δ<sup>13</sup>C of the CO<sub>2</sub> used to increase the [CO2] within the growth chamber (Aljazairi et al., 2015).

## Expression of Stress-Responsive Genes in Durum Wheat Genotypes under Different Water Regimes and [CO2] Levels

Strong differences in gene expression were observed between treatments and among the different genotypes studied (**Table 5**; Supplementary Figure S1). In our study, RBCL and RBCS showed a common expression pattern (**Table 5**), confirming the coordinated expression of both subunits necessary for the assembly of the Rubisco holoenzyme (Suzuki and Makino, 2012). At the first sampling date, gene expression of RBCL and RBCS was down-regulated in response to elevated [CO2] no matter which water regime was considered, in agreement with other wheat studies (Aranjuelo et al., 2013; Habash et al., 2014; Vicente et al., 2015b). This down-regulation was associated with lower N content and higher δ<sup>15</sup>N in a genotype-dependent manner. The former could be explained by non-selective decreases in N or reallocation of N within the plant under elevated [CO2] (Aranjuelo et al., 2011; Vicente et al., 2015a). The latter was probably associated with changes in N uptake, assimilation or redistribution within the plant (Araus et al., 2013). At the second sampling date, elevated [CO2] decreased the N content in Regallo and Burgos, which was related to down-regulation of transcript levels of Rubisco subunits and N-assimilation enzymes (GS1 and GS2), and higher root and plant biomass. These shifts could indicate that plant biomass might increase under elevated [CO2] in a genotype-dependent manner even when transcript levels of Rubisco subunits decrease during vegetative growth. This could be due to the remobilization of an N over-investment in Rubisco to reuse it in developing new tissues (Richards, 2000; Vicente et al., 2011; Carmo-Silva et al., 2015). However, the decrease in Rubisco transcript levels under water stress did not indicate the greater photosynthetic efficiency that was hypothesized under elevated [CO2]. Instead it was associated with lower plant biomass, which might suggest an inhibition of CO<sub>2</sub> assimilation and plant growth in concordance with previous studies (Hayano-Kanashiro et al., 2009; Peremarti et al., 2014).

PEPC is a multifaceted key enzyme that in C<sub>3</sub> plants is linked to the provision of Krebs cycle intermediates, and its overexpression in transgenic wheat improved drought tolerance and grain yield (Qin et al., 2015). PEPC expression has not been widely studied during early growth in durum wheat plants. In the current work it was induced under the combination of elevated [CO2] and moderate water stress in most genotypes, whereas at severe water stress genotypic variation determined its expression pattern (**Table 5**). The induction could be related to its major role in providing C skeletons for amino acid and lipid biosynthesis (González et al., 2003). This may be due to an increase in the enzyme's substrates, such as carbohydrates, typically found under elevated [CO2] and water stress (Khoshro et al., 2013; Vicente et al., 2015b). These results indicate that further work is necessary to broaden our understanding of the biological role of PEPC and its implication in plant growth, especially in genotypes (i.e., Ramirez) with an up-regulation of gene expression under stress conditions.

At moderate and severe water stress, GS1 and PEPC expression was significantly coordinated, as were the expressions of the GS2 and Rubisco genes (**Table 5**; Supplementary Table S3). Under severe water stress, GS1 and GS2 expression was more influenced by genotypic variability than environmental conditions. Yousfi et al. (2016) also reported genotypic differences in the expression of these genes under drought stress, with a general down-regulation under stress conditions. Lower N contents and transcript abundances for RBCL and RBCS under water stress and especially under elevated [CO2] were associated with higher repression of the GS2 gene, indicating a coregulation of primary C and N metabolism (Stitt and Krapp, 1999; Vicente et al., 2015b, 2016). In some treatments, mainly at the first sampling date, opposing gene expression patterns were observed between GS1 and GS2. This fact, together with the coordination of GS1 with PEPC, might indicate a predominant remobilization of C and N compounds and an inhibition of primary N assimilation under water stress and elevated [CO2]. Thus, the results support a significant coordination between C and N metabolism at the transcript level under conditions of elevated [CO2] and water stress. In addition, the pattern of gene expression for GS1 and GS2 supports the use of these genes as indicators of N metabolism under water stress conditions, as reported previously (Nagy et al., 2013).

DHN11 and DHN16 encode two dehydrins that belong to group 2 of late embryogenesis abundant (LEA) proteins (Ali-Benali et al., 2005). The up-regulation of dehydrin genes under water restriction is often associated with stress tolerance, although their specific role as osmotically active compounds is still unknown (Kosová et al., 2014). Moderate and severe water stress reduced DHN11 gene expression regardless of the [CO2] level compared with control conditions. In the case of the DHN16 gene, moderate water stress mostly up-regulated its expression, whereas under severe water stress the opposite occurred (**Table 5**). Elevated [CO2] at the first sampling date mostly enhanced DHN11 and DHN16 gene expression, while at the second sampling date its combination with severe water stress led to a wide range of changes in transcript levels in a genotype-dependent manner. Our results showed that the pattern of gene expression could differ between dehydrins, in concordance with previous studies (Ali-Benali et al., 2005; Melloul et al., 2013; Kosová et al., 2014). Additionally, the severity of the water stress, [CO2] enrichment and the genotype influenced dehydrin transcript levels.

CAT and SOD enzymes form part of the system responsible for lowering ROS and avoiding oxidative stress. In general, gene expression of CAT and SOD was repressed under moderate water stress regardless of [CO2] (**Table 5**). Such repression was only maintained for CAT at severe water stress in Burgos, while their expression was up-regulated in the other genotypes under elevated [CO2] × severe water stress. This could suggest a higher demand for ROS control, which would indicate a limitation to the transfer of electrons through photosystems to drive C assimilation (Martins et al., 2016). Enzyme activity and CAT gene expression have been reported to decrease under elevated [CO2] in wheat, possibly due to the inhibition of photorespiration, while they increased only in response to severe drought (Luna et al., 2005; Xu et al., 2010; Vicente et al., 2015b). The available studies on changes in SOD gene expression and protein content under such conditions are contradictory, reporting different patterns of change (Kim et al., 2006; Li et al., 2008; Caruso et al., 2009; Xu et al., 2010). Our results highlighted that water regime and genotype were key factors influencing the expression of genes involved in the antioxidant system, indicating a greater need for protection against oxidative damage under severe water stress.

#### Coordination between Physiological Traits and Transcript Levels in Durum Wheat Grown under Different Environmental Conditions during Vegetative Growth

The different changes in plant growth parameters indicate that the responsiveness to elevated [CO2] and water stress during early growth depends on (i) the duration of the treatment, because [CO2] enrichment results in greater increases in plant biomass in older plants; (ii) the severity of the water stress, with effects being more pronounced under severe water stress; and (iii) the genotypic variability. In general, elevated [CO2] stimulated plant growth and reduced N content, which at the transcript level was related to a down-regulation of Rubisco and N assimilation genes and up-regulation of genes that take part in C-N remobilization. Moderate water stress did not lead to gross changes in physiological traits, but severe water stress restricted plant growth and N content, while changes in g<sub>s</sub> and δ<sup>13</sup>C suggested a water-saving strategy relative to well-watered conditions. The transcript profile suggested an inhibition of primary C fixation and N assimilation, differences between dehydrins and a genotypic variation in gene expression under severe water stress, with an induction of genes involved in antioxidant machinery. The stimulation of plant biomass under elevated [CO2] did not compensate for plant growth limitation under water restriction. Lastly, we observed different genotypic responses to environmental factors, as also reported in barley (Ceccarelli et al., 1991). Regallo showed the lowest plant biomass and chlorophyll and N contents, which was related to a repression of genes for N assimilation and induction for dehydrins, SOD and CAT, while the opposite results were recorded for Burgos (data not shown). Therefore, increased plant growth was linked to up-regulation of N assimilation and down-regulation of stress-responsive genes, suggesting lower oxidative damage.

Considering different environmental conditions predicted for the future climate scenario and genotypic variations, network analysis was used to identify physiological traits and transcript levels that are correlated during vegetative growth in durum wheat (**Figure 2**). Early growth is a positive trait for improving plant tolerance in water-limited environments that has the potential for larger final plant biomass and yield (Wilson et al., 2015). Plant growth parameters were positively correlated with each other in most cases, suggesting that early plant growth is driven by all plant fractions and tiller production, as reported in other studies (Rebolledo et al., 2013; Wilson et al., 2015). Regardless of genotype, the positive correlation between root and plant biomass was mainly due to the stimulation of root biomass under elevated [CO2], in agreement with previous reports (Madhu and Hatfield, 2013; and citations therein). In contrast, water restriction (mainly severe water stress) limited both root and shoot biomass, which are often diminished under severe drought conditions (Nezhadahmadi et al., 2013). Positive effects of elevated [CO2] on root biomass could mitigate drought effects on plant growth by allowing better exploitation of water and nutrients from deep soil layers (Madhu and Hatfield, 2013).
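A correlation network of this kind can be sketched as a list of edges between variables whose pairwise correlation exceeds a cutoff. The example below is a minimal illustration that assumes Pearson correlation and an arbitrary threshold; the function name, the threshold, and the toy values are hypothetical and do not reproduce the analysis behind **Figure 2**.

```python
import numpy as np

def correlation_edges(data, names, threshold=0.5):
    """Return (name_i, name_j, r) for variable pairs with |r| >= threshold.

    data: 2-D array with one row per sample and one column per variable.
    names: variable names, in column order.
    """
    r = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    edges = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) >= threshold:
                edges.append((names[i], names[j], float(r[i, j])))
    return edges

# Toy values for illustration only; these are not data from the study.
toy = np.array([
    [1.0, 2.0, 9.0],
    [2.0, 4.1, 7.0],
    [3.0, 6.2, 6.5],
    [4.0, 7.9, 3.0],
])
names = ["root_dw", "plant_dw", "RBCL"]
edges = correlation_edges(toy, names, threshold=0.9)
```

Positive edges link variables that rise together and negative edges link variables that move in opposite directions; a significance filter on each correlation would normally be applied before drawing the network.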

N content was correlated negatively with the root/shoot ratio and positively with the tillers per plant and shoot biomass, and this was probably due to the typically higher N content observed in shoots relative to roots (Vicente et al., 2015a). Hence, greater vegetative growth in durum wheat requires high amounts of N, which in turn will be conditioned by N availability. δ<sup>15</sup>N has been proposed as an indicator of responses to stress, such as water stress, N starvation and salinity (Yousfi et al., 2012, 2016; Bort et al., 2014), although it has received little attention in studies of elevated [CO2] (Ariz et al., 2015). Here we observed a negative correlation of δ<sup>15</sup>N with N content and tillers per plant, with elevated [CO2] being the main factor that increased δ<sup>15</sup>N in our experiment. Nevertheless, the fractionating processes of N metabolism affecting δ<sup>15</sup>N under elevated [CO2] and water stress are not fully understood (Tcherkez, 2011).

Leaf chlorophyll content has been extensively used as an indicator of different physiological and agronomical components, particularly at later growth stages (Araus et al., 2008). The network analysis confirmed that chlorophyll content is a positive trait for vegetative growth in durum wheat, and this can be easily implemented in most studies because this measurement is simple, quick, and non-destructive with modern portable devices. Effects of elevated [CO2] and water stress on g<sub>s</sub> have been widely studied (Long et al., 2006; Pazzagli et al., 2016), including the proposal of g<sub>s</sub> as a trait indicator of drought stress tolerance (Nagy et al., 2013). In our study g<sub>s</sub> was negatively correlated with chlorophyll content and root and plant biomass. This could highlight that increased vegetative growth was related to stomatal closure, maybe as a water-saving strategy or as a direct response to elevated [CO2].

The positive correlations among the transcript levels of the genes encoding RBCL, RBCS, GS1, GS2, and PEPC supported a balanced coordination between C and N metabolism under elevated [CO2] and water stress. On the other hand, our results underlined the key role of Rubisco and GS in plant responses to environmental conditions (Nagy et al., 2013; Carmo-Silva et al., 2015; Vicente et al., 2015b; Yousfi et al., 2016). We showed negative associations between transcript levels of Rubisco subunits and GS2 with chlorophyll content and plant biomass. This fact could indicate that a stimulation of plant growth may be associated with a lower investment of resources (mainly N) in Rubisco protein, especially under elevated [CO2], thus leading to a higher nitrogen efficiency (Pang et al., 2014; Carmo-Silva et al., 2015). The negative correlation between transcript levels of CAT and chlorophyll content highlighted that the up-regulation of CAT expression was a response to the high H<sub>2</sub>O<sub>2</sub> levels generated under stress conditions (Luna et al., 2005), which could promote chlorophyll degradation (Upadhyaya et al., 2007). Interestingly, transcript levels of CAT were positively correlated with g<sub>s</sub>, although a negative correlation should be expected since greater g<sub>s</sub> leads to lower photorespiration rates and consequently lower H<sub>2</sub>O<sub>2</sub> generation (Luna et al., 2005). We found a positive relationship between C content and transcript accumulation for SOD, not previously reported to our knowledge. Higher SOD expression might suggest a better ROS control that triggers an efficient electron transfer and C fixation. In our study, DHN11 transcript accumulation was negatively associated with root biomass, while transcripts for DHN16 were positively linked with plant biomass. These results suggest promising functions for DHN16 in stress tolerance during vegetative growth, as Kosová et al. (2014) proposed in a study examining wheat seed development.

In summary, parameters such as chlorophyll and N content, g<sub>s</sub> and δ<sup>15</sup>N, and the expression of RBCL, RBCS, GS2, DHN11, and DHN16 genes were identified as good indicators for the selection of genotypes with better performance during early plant growth under elevated CO<sub>2</sub> and water stress. Additionally, network analysis underlined the relevance of N metabolism traits, such as N content, δ<sup>15</sup>N, GS1, and GS2, in the genotypic response of durum wheat to future environmental scenarios in the Mediterranean basin.

## CONCLUSION

We conclude that [CO2] effects on plant growth had greater impacts than moderate or severe water stress during vegetative growth of durum wheat. Whereas elevated [CO2] generally led to increases in plant growth, water stress had a negative effect, particularly as the water stress developed over time. In addition, the interactive effects of [CO2] and water regime depend on genotypic variability. Gene expression profiles at moderate water stress were mainly affected by environmental conditions among the different genotypes. However, with further water restriction, genotype-specific differences were found to affect gene expression more than environmental conditions. These facts reflect a wide range of adaptation mechanisms in durum wheat under elevated [CO2] and water stress during vegetative growth, probably due to the complex regulatory network that takes place with both factors. Moreover, our study did not show a clear trend concerning the genetic advance in response to future climate change scenarios. For durum wheat, our results highlight the need to take genotypic variability into account for a greater understanding of plant adaptation to climate change. Moreover, the correlation network demonstrated that the combination of phenotyping and gene expression analysis is a useful approach to identify phenotype-genotype relationships and their behavior in response to different environments during vegetative stages.

## AUTHOR CONTRIBUTIONS

SM and JLA conceived and designed the experiments. SM, RV, and AA contributed to the experimental work. SM, RV, and JLA analyzed the data and interpreted the results. RV wrote the paper under the supervision of JLA, and SM and AA revised the manuscript. All authors have read and approved the final manuscript.

## FUNDING

This study was supported by the Spanish National Programme for Research Aimed at the Challenges of Society of the Ministry of Economy and Competitiveness (grants No. AGL2013-44147-R and AGL2016-76527-R). SM was the recipient of a fellowship "Presidente de la República PRONABEC-III" from the Peruvian Government.

## ACKNOWLEDGMENTS

We thank the Unitat de Genòmica of the CCiTUB, Josep Matas (Servei de Camps Experimentals), Adrián Gracia of the University of Barcelona, and Marco Betti of the University of Seville for technical assistance.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.01738/full#supplementary-material

### REFERENCES


on drought variability in the Iberian Peninsula. Front. Environ. Sci. 3:1. doi: 10.3389/fenvs.2015.00001


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Medina, Vicente, Amador and Araus. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# SNP-Based QTL Mapping of 15 Complex Traits in Barley under Rain-Fed and Well-Watered Conditions by a Mixed Modeling Approach

Freddy Mora<sup>1</sup>, Yerko A. Quitral<sup>2</sup>, Ivan Matus<sup>3</sup>, Joanne Russell<sup>4</sup>, Robbie Waugh<sup>4</sup> and Alejandro del Pozo<sup>2</sup>\*

#### Edited by:

Soren K. Rasmussen, University of Copenhagen, Denmark

#### Reviewed by:

Xiaoquan Qi, Institute of Botany-The Chinese Academy of Sciences, China Evelyne Costes, Institut National de la Recherche Agronomique, France

> \*Correspondence: Alejandro del Pozo adelpozo@utalca.cl

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

Received: 24 January 2016 Accepted: 08 June 2016 Published: 27 June 2016

#### Citation:

Mora F, Quitral YA, Matus I, Russell J, Waugh R and del Pozo A (2016) SNP-Based QTL Mapping of 15 Complex Traits in Barley under Rain-Fed and Well-Watered Conditions by a Mixed Modeling Approach. Front. Plant Sci. 7:909. doi: 10.3389/fpls.2016.00909

<sup>1</sup> Instituto de Ciencias Biológicas, Área de Biología Molecular y Biotecnología, Universidad de Talca, Talca, Chile, <sup>2</sup> Centro de Mejoramiento Genético y Fenómica Vegetal, Facultad de Ciencias Agrarias, PIEI Adaptación de la Agricultura al Cambio Climático (A2C2), Universidad de Talca, Talca, Chile, <sup>3</sup> Centro Regional de Investigación Quilamapu, Instituto de Investigaciones Agropecuarias, Chillán, Chile, <sup>4</sup> The James Hutton Institute, Dundee, Scotland

This study identified single nucleotide polymorphism (SNP) markers associated with 15 complex traits in a breeding population of barley (Hordeum vulgare L.) consisting of 137 recombinant chromosome substitution lines (RCSL), evaluated under contrasting water availability conditions in the Mediterranean climatic region of central Chile. Given that markers showed a very strong segregation distortion, a quantitative trait locus/loci (QTL) mapping mixed model was used to account for the heterogeneity in genetic relatedness between genotypes. Fifty-seven QTL were detected under rain-fed conditions, which accounted for 5–22% of the phenotypic variation. In full irrigation conditions, 84 SNPs were significantly associated with the traits studied, explaining 5–35% of phenotypic variation. Most of the QTL were co-localized on chromosomes 2H and 3H. Environment-specific genomic regions were detected for 12 of the 15 traits scored. Although most QTL-trait associations were environment and trait specific, some important and stable associations were also detected. In full irrigation conditions, a relatively major genomic region was found underlying hectoliter weight (HW), on chromosome 1H, which explained between 27% (SNP 2711-234) and 35% (SNP 1923-265) of the phenotypic variation. Interestingly, the locus 1923-265 was also detected for grain yield at both environmental conditions, accounting for 9 and 18%, in the rain-fed and irrigation conditions, respectively. Analysis of QTL in this breeding population identified significant genomic regions that can be used for marker-assisted selection (MAS) of barley in areas where drought is a significant constraint.

Keywords: drought, marker segregation distortion, RCSL, physiological trait, kinship

## INTRODUCTION

Abiotic stresses can significantly reduce crop yields and restrict the latitudes and soils on which commercially important species can be cultivated (Lobell and Field, 2007; Jacobsen et al., 2012). The development of drought-tolerant genotypes as well as genotypes with higher water-use efficiency is of global interest as populations continue to increase and water availability decreases (Araus et al., 2002, 2008; Cattivelli et al., 2008). In barley (Hordeum vulgare L.), a large number of morphological and physiological traits are linked to drought tolerance (Chen et al., 2010; del Pozo et al., 2012) which exhibit strong environmental interactions (Tondelli et al., 2006). Increasing tolerance to drought stress has become a major goal for barley breeding programs particularly in light of prolonged drought periods as a result of climate change (Wehner et al., 2015).

To determine the genetic basis of complex traits, important genetic and genomic resources have been developed in a wide range of species (Kota et al., 2003; Mora et al., 2015), including barley (Kota et al., 2003; Close et al., 2009; Zhou et al., 2015). Single nucleotide polymorphism (SNP) markers are the most abundant sequence variations encountered in eukaryotic genomes (Griffin and Smith, 2000), and in barley offer the potential for generating very high-density genetic maps (Close et al., 2009), providing a useful tool for quantitative trait locus/loci (QTL) mapping for marker-assisted selection (MAS; Sato and Takeda, 2009; Szűcs et al., 2009).

Conventionally, quantitative trait locus/loci (QTL) mapping is carried out using markers that follow a Mendelian segregation ratio, which depends on the population under investigation (Xu, 2008). The phenomenon that alleles at a locus deviate from the Mendelian expectation has been defined as segregation distortion (Zhang et al., 2010), which has been encountered in many commercially important species, such as maize (Lu et al., 2002), rice (Xu et al., 1997), tomato (Paterson et al., 1988), and barley (Malosetti et al., 2011). Biologically, segregation distortion can be due to chromosome loss, genetic isolation mechanisms, and the presence of viability genes. Non-biological factors such as scoring errors and sampling errors also can contribute to segregation distortion (Alheit et al., 2011).
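Deviation from an expected segregation ratio at a single marker is commonly checked with a chi-square goodness-of-fit test. The sketch below assumes a marker with two genotype classes and takes the expected ratio as a parameter, since that ratio depends on the population type; the function name and the example counts are hypothetical, and this is not necessarily the test used by the studies cited above.

```python
import math

def segregation_chi2(n_a, n_b, expected_ratio=(1, 1)):
    """Chi-square goodness-of-fit test (1 df) for a two-class marker.

    n_a, n_b: observed counts of the two genotype classes.
    expected_ratio: expected class ratio (assumed 1:1 here; it depends on
    the population under investigation).
    Returns (chi2, p_value).
    """
    total = n_a + n_b
    if total <= 0:
        raise ValueError("at least one observation is required")
    exp_a = total * expected_ratio[0] / sum(expected_ratio)
    exp_b = total - exp_a
    chi2 = (n_a - exp_a) ** 2 / exp_a + (n_b - exp_b) ** 2 / exp_b
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts: 60 vs. 40 lines against a 1:1 expectation.
chi2, p = segregation_chi2(60, 40)
```

A small p-value flags the marker as distorted relative to the chosen expectation; with many markers, a multiple-testing correction would normally be applied.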

Construction of genetic maps and QTL analysis using distorted markers is risky because the basic assumption of Mendelian segregation is violated and, consequently, most of these markers are removed from subsequent QTL analyses (Luo et al., 2005; Xu, 2008). Moreover, several authors found that distorted markers influence the estimation of genetic intervals and the marker order on the same chromosome (Lorieux et al., 1995a,b; Zhu et al., 2007). Because these markers are routinely removed, valuable information around these regions is lost (Luo et al., 2005). However, if distorted markers are handled properly, genetic map construction and QTL identification and mapping can be significantly improved (Xu, 2008; Alheit et al., 2011; Hashemi et al., 2015).

From an analytical standpoint, many QTL mapping procedures have been developed for Mendelian populations, but few are available for markers that do not segregate in a typical Mendelian fashion (Zhu et al., 2007; Xu, 2008). Recently, Malosetti et al. (2011) found severe allele frequency distortions in many chromosomal regions in a RIL population of barley. The authors stated that violation of the basic assumptions implies that the genetic covariance between genotypes (i.e., genetic relatedness) in the population is not homogeneous and that, analogous to association mapping studies, a QTL mapping mixed model that accounts for this heterogeneity should be used to avoid false QTL detection. In addition, mixed models have been used to investigate QTL-by-environment interaction (Malosetti et al., 2004; Boer et al., 2007) and to map QTL for several traits simultaneously (Malosetti et al., 2008).
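Heterogeneous genetic relatedness is typically summarized in a kinship matrix that enters the mixed model as the covariance structure of the genotype effect. The sketch below computes one simple estimator, the proportion of markers at which two lines carry the same allele (identity by state), and assumes homozygous lines coded 0/1 with no missing data. The text above does not state which estimator was used, so the choice, the function name, and the toy matrix are assumptions for illustration.

```python
import numpy as np

def ibs_kinship(markers):
    """Identity-by-state kinship for homozygous lines coded 0/1.

    markers: 2-D array with one row per line and one column per SNP.
    Returns a lines x lines matrix holding the fraction of SNPs at which
    each pair of lines carries the same allele.
    """
    m = np.asarray(markers, dtype=float)
    n_snps = m.shape[1]
    # Matches are positions where both lines are 1 plus positions where both are 0.
    shared = m @ m.T + (1.0 - m) @ (1.0 - m).T
    return shared / n_snps

# Toy marker matrix (3 lines x 4 SNPs), for illustration only.
toy_markers = [
    [0, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 0, 0],
]
K = ibs_kinship(toy_markers)
```

In this toy case lines 1 and 2 share three of four alleles while line 3 shares none with line 1, which is the kind of uneven relatedness that the mixed model is meant to absorb.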

In barley, a large number of mapping populations have been developed to map QTL. Further, advanced mapping populations, including near-isogenic lines (NILs), chromosome segment substitution lines (CSSLs), and recombinant chromosome substitution lines (RCSLs, Schmalenbach et al., 2008, 2011; Sato and Takeda, 2009; Naz et al., 2012), have also been developed to facilitate the genetic dissection of complex traits. As a consequence, many QTL controlling complex traits, including agronomic and morphological traits, yield components, disease resistance, tolerance to abiotic stress, and malting quality, have been identified (Pillen et al., 2004; Talame et al., 2004; Li et al., 2005; Gyenis et al., 2007). However, studies that have identified QTL for drought-related morphological and physiological traits are still scarce in barley (Chen et al., 2010; Mir et al., 2012; Sayed et al., 2012; Kalladan et al., 2013; Li et al., 2013; Wójcik-Jagła et al., 2013; Honsdorf et al., 2014; Mansour et al., 2014; Naz et al., 2014). Thus, a breeding program has been developed in the Mediterranean area of central Chile, in areas where drought is a significant constraint to yield. This study identifies SNP markers associated with 15 complex traits (including physiological and morphological traits) in a breeding population consisting of 137 RCSLs of barley (Matus et al., 2003), evaluated under contrasting water availability. Given that the markers showed very strong segregation distortion, a QTL mapping mixed model was employed to account for the heterogeneity in genetic relatedness between genotypes.

## MATERIALS AND METHODS

#### Plant Material and Field Evaluation

A set of 137 RCSLs was evaluated under field conditions in Santa Rosa (35° 78′ S, 72° 17′ W), in the Mediterranean climatic region of central Chile, under rain-fed and fully irrigated conditions in 2008–2009. The average annual temperature in this region is 13°C, the minimum average is 3°C (July), and the maximum is 28.6°C (January). Monthly maximum and minimum temperatures and precipitation during 2008 are given in Table S1. The annual precipitation in 2008 was 992 mm, but the amount during the growing season was 120 mm. The soil is a sandy loam, Humic Haploxerand, Andisol. For fully irrigated conditions, four furrow irrigations of 50 mm of water each were applied from heading to maturity.

The RCSL population was developed using the advanced backcross strategy of Tanksley and Nelson (1996). The accession of Hordeum spontaneum (Caeserea 26-24 from Israel) was the donor parent and Hordeum vulgare subsp. vulgare "Harrington" (a North American malting quality standard) was the recurrent parent (Matus et al., 2003). The recurrent parent was used as the female and the donor as the male to obtain the F<sub>1</sub> generation. Finally, the lines were obtained using two backcrosses with the recurrent parent and six generations of self-pollination (BC2F6).

The field trial was arranged in a 14 × 10 alpha-lattice design with two replications ("Harrington" cultivar was replicated more times for arrangement v = 14 × 10 = 140). Fertilizer and field management practices recommended for optimum barley production were used (Inostroza et al., 2009; del Pozo et al., 2012). The 15 different morphological, agronomic, and physiological traits measured during this experiment are described in **Table 1**.

#### Phenotypic Data Analysis

A mixed linear modeling approach was employed for phenotypic data analysis using the MIXED procedure in SAS. Field data were analyzed on the basis of the statistical model (Stich et al., 2008):

$$y_{ijno} = \mu + g_i + l_j + (gl)_{ij} + r_{nj} + b_{onj} + \varepsilon_{ijno}$$

where y<sub>ijno</sub> is the observed phenotype for the ith RCSL at the jth location in the oth incomplete block of the nth replicate, µ is an intercept term (overall population mean), g<sub>i</sub> is the genotypic effect of the ith entry, l<sub>j</sub> is the effect of the jth location, (gl)<sub>ij</sub> is the interaction between the ith entry and the jth location, r<sub>nj</sub> is the effect of the nth replicate of the jth location, b<sub>onj</sub> is the effect of the oth incomplete block of the nth replication of the jth location, and ε<sub>ijno</sub> is the residual or within-plot error. To analyze the effect of environment (rain-fed and fully irrigated) on the RCSL population, environment (contrasting environmental condition), genotype, and genotype-environment interaction were considered as fixed effects, while replication and block were analyzed as random effects. Error variances were assumed to be heterogeneous among locations according to Stich et al. (2008).

TABLE 1 | List of 15 morphologic, agronomic, and physiological traits measured in a RCSL population of barley under two contrasting environmental conditions in southern Chile.

Adjusted entry means M<sub>i</sub> were calculated for each RCSL as:

$$M_i = \hat{\mu} + \hat{g}_i$$

where $\hat{\mu}$ and $\hat{g}_i$ are the generalized least-squares estimates of µ and g<sub>i</sub>, respectively. Adjusted entry means, calculated by environment and in a combined analysis, were later used in the QTL analysis. The CORR procedure of SAS was used to estimate Pearson correlations (r, n = 137) between pairs of traits.

#### DNA Extraction and Genotyping

Leaf tissues were harvested from each plant and crushed with liquid nitrogen. Genomic DNA was extracted from 200 to 300 mg of leaf sample using the Qiagen DNeasy Plant mini kit (QIAGEN Co.). DNA concentration was determined by nanodrop (Thermo Fisher Scientific) and adjusted to 100 ng/µl. DNA samples were sent to the Southern California Genotyping Consortium, Illumina BeadLab at the University of California, Los Angeles (UCLA) for the OPA-SNP assay with the 1536-plex detection platform of barley OPA 1 (BOPA1) developed by Close et al. (2009). SNP markers are distributed across the seven barley chromosomes. The SNP loci were designated by HarvEST:Barley unigene assembly #32 numbers (http://harvest.ucr.edu/). Genotyping of the 137 RCSLs and their parents was performed by Illumina GoldenGate assay. Polymorphic markers from BOPA1 were ordered using MEGA5 software (Tamura et al., 2011). Moreover, the chromosome segments introgressed into H. vulgare cv. Harrington from Hordeum spontaneum were estimated from the graphical haplotypes (Van Berloo et al., 2008) in each of the recombinant lines selected.

#### QTL Mapping by a Mixed Modeling Approach

For the segregation data of each SNP marker, deviations from the Mendelian ratios (1:1 ratio for RCSL populations) were tested using the Chi-square test. Given that SNP markers did not segregate in an expected Mendelian ratio (details in the Results Section), a mixed model (Stich et al., 2008) was employed for QTL analysis of a designed cross using a structured variance–covariance matrix, where the structure was induced by selection (Malosetti et al., 2011). This QTL mapping mixed model accounts for the heterogeneity in genetic relatedness between genotypes. The hypothesis of association of SNP markers with the 15 target traits was tested using the following mixed model implemented in the program TASSEL 3.0 (Bradbury et al., 2007):

$$M_i = \mu_1 + x_i \alpha + u_i + e_i \tag{1}$$

where M<sub>i</sub> is the adjusted entry mean of the ith RCSL, µ<sub>1</sub> is an intercept term, x<sub>i</sub> is the SNP genotype of the ith RCSL, α is the additive allele substitution effect (SNP effect), u<sub>i</sub> is the residual genetic background effect of the ith entry, and e<sub>i</sub> is the random residual effect. u<sub>i</sub> is assumed to follow a normal distribution with variance-covariance matrix G = 2Kσ<sup>2</sup><sub>u</sub>, where K is the matrix of co-ancestry coefficients between entries. The threshold used for declaring an association significant was P < 0.01. According to Malosetti et al. (2011), this mixed modeling approach for QTL detection can accommodate the extra genetic covariance by embedding kinship information in the model, leading to appropriate tests and minimizing the rate of false QTL or gene detection. Variance components were estimated using the Restricted Maximum Likelihood (REML) method. Additionally, false discovery rate (FDR)-adjusted p-values were calculated using PROC MULTTEST in SAS software. The Bayesian information criterion (BIC) was used to compare the simple model that ignores kinship information and the model with a structured variance–covariance matrix (kinship information).
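The marker-level tests described in this section can be illustrated with a minimal Python sketch. It assumes a two-class chi-square goodness-of-fit test with one degree of freedom against a 1:1 ratio, and it uses the Benjamini-Hochberg procedure as the FDR adjustment; whether this matches the exact PROC MULTTEST option used in the study is an assumption. The function names and the allele counts are hypothetical and serve only as an illustration.

```python
import math


def chi2_pvalue_1df(observed_a, observed_b, expected_ratio=(1.0, 1.0)):
    """Chi-square goodness-of-fit test (1 df) for two genotype classes.

    Returns (chi2, p). For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    """
    total = observed_a + observed_b
    ratio_sum = expected_ratio[0] + expected_ratio[1]
    expected_a = total * expected_ratio[0] / ratio_sum
    expected_b = total * expected_ratio[1] / ratio_sum
    chi2 = ((observed_a - expected_a) ** 2 / expected_a
            + (observed_b - expected_b) ** 2 / expected_b)
    return chi2, math.erfc(math.sqrt(chi2 / 2.0))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg FDR-adjusted p-values in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical allele counts (recurrent, donor) for three SNPs in 137 lines.
counts = {"snp_a": (70, 67), "snp_b": (120, 17), "snp_c": (100, 37)}
pvals = {name: chi2_pvalue_1df(a, b)[1] for name, (a, b) in counts.items()}
adj = benjamini_hochberg(list(pvals.values()))
for (name, p), q in zip(pvals.items(), adj):
    print(f"{name}: P = {p:.3g}, -log10(P) = {-math.log10(p):.2f}, FDR = {q:.3g}")
```

Markers with a small P-value (large −log10 value) depart from the expected ratio; the adjusted values control the expected proportion of false discoveries across all tests.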

## RESULTS

#### Phenotypic Data Analysis

The statistical analysis of fixed effects for the 15 complex traits under study is summarized in **Table 2**. According to the F-values (type III tests of fixed effects), the 137 RCSLs differed significantly (P < 0.01) for most traits under study. Significant differences were observed between the two contrasting water regimes for most traits, including biological yield (BY) and grain yield (GY). In fact, the average GY under rain-fed conditions was reduced by 81% relative to the fully irrigated condition, indicating that the barley RCSLs were exposed to severe water stress. There were also strong reductions in BY (−59.9%), dry weight at tillering (DWT; −39.9%), tiller number (TN; −48.1%), and thousand-kernel weight (TKW; −28.1%) (**Table 2**). Peduncle length (PL), peduncle extrusion (PE), plant height (PH), and the physiological trait IPAR also showed a significant environmental effect. There were significant interactions between RCSL and environment for PE, PH, TN, hectoliter weight (HW), kernels per spike (KS), and GY.

Estimates of phenotypic correlations among traits are shown in **Table 3**. Grain yield was positively and significantly correlated with TN, BY, HW, HI, and KS, and negatively correlated with PL and PH (P < 0.01, **Table 3**). The physiological trait IPAR was significantly and positively correlated with biological yield (r = 0.6) and grain yield (r = 0.41). Relative water content (RWC), PE, and chlorophyll fluorescence (Fv/Fm) were not significantly correlated (P > 0.05) with most of the variables studied.
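As a rough check on how such correlations reach significance with n = 137, the sketch below computes a Pearson coefficient and an approximate two-sided P-value using Fisher's z-transform. This is a normal approximation chosen so that the example needs only the Python standard library; it is not the t-based test reported by the SAS CORR procedure, and the function names are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z_pvalue(r, n):
    """Approximate two-sided p-value for H0: rho = 0 via Fisher's z-transform."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2.0))


# Correlations reported in the text for IPAR with biological and grain yield.
for r in (0.60, 0.41):
    print(f"r = {r:.2f}, n = 137: approximate P = {fisher_z_pvalue(r, 137):.2g}")
```

With 137 lines, even moderate coefficients are far below the 0.01 threshold under this approximation, which is consistent with the significance levels reported above.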

TABLE 2 | Summary of statistical analysis of fixed effects for 15 complex traits measured in a RCSL population of barley under rain-fed and well-watered conditions.

Data are presented as phenotypic means with minimum and maximum values in parentheses. GEI, genotype-environment interactions; NS, not significant.

\*Significant at the 0.05 probability level.

\*\*Significant at the 0.01 probability level.

TABLE 3 | Pearson correlation coefficients among mean variables (adjusted entry means) measured in a RCSL population of barley under rain-fed and well-watered conditions.

\*Significant at the 0.05 probability level.

\*\*Significant at the 0.01 probability level.

#### Kinship Matrix to Avoid False-Positives

A large number of spurious QTLs were detected when the genetic covariance matrix was ignored in the mixed model. In fact, 444, 460, and 516 false-positives were detected across all traits in the rain-fed condition, the fully irrigated condition, and the combined analysis, respectively, compared with a mixed model using a structured variance–covariance matrix, where the structure was induced by selection. This QTL mapping mixed model accounts for the heterogeneity in genetic relatedness between genotypes. **Figure 1** shows the results of the segregation distortion analysis based on the P-values obtained by the chi-square test (P-values were plotted on a −Log10 scale). According to the Bayesian information criterion (BIC), there was significant evidence against the simple models that ignore kinship information in most of the traits studied (Table S2); between the two competing models, the best model is the one that has the smallest BIC-value. In the combined analysis, BIC was smaller for the mixed model that includes the kinship information (K model) for most of the traits. Therefore, the use of a QTL detection model that accounts for the heterogeneous genetic relatedness between RCSL lines, caused by the uneven sharing of genetic background, is clearly necessary.
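The BIC-based model choice described in this section can be sketched as follows. The sketch assumes the common definition BIC = −2 log L + k ln(n), where k is the number of estimated parameters and n the number of observations; how k is counted under REML differs between software packages. The log-likelihood values and parameter counts below are purely illustrative and are not taken from Table S2, and the function names are hypothetical.

```python
import math


def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion: -2 log L + k ln(n); smaller is better."""
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def preferred_model(fits, n_obs):
    """Return (name, BIC) of the model with the smallest BIC.

    `fits` maps a model name to (log-likelihood, number of parameters).
    """
    scores = {name: bic(ll, k, n_obs) for name, (ll, k) in fits.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Hypothetical fits for one trait across 137 lines (values are illustrative).
fits = {"simple": (-412.7, 3), "kinship": (-398.2, 4)}
name, score = preferred_model(fits, n_obs=137)
print(f"preferred model: {name} (BIC = {score:.1f})")
```

In this illustration the kinship model is preferred because its gain in log-likelihood outweighs the penalty for the additional variance parameter.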

#### SNP-Based QTL Mapping of Complex Traits

The number of QTLs detected for all the traits under study, including the chromosome number and the percentage of the total phenotypic variation explained by the QTLs, are summarized in **Table 4** (details of all significant SNP–trait associations are given in Table S3). The Manhattan plots of the genome-wide QTL study showed a lower number of significant QTLs under rain-fed (**Figure 2**) in comparison with fully irrigated conditions (**Figure 3**). In fact, 57 QTLs were detected in rain-fed, which accounted for 5–22% of the phenotypic variation (KS and PH, respectively). In the fully irrigated condition, a total of 84 SNPs were significantly associated with the traits studied, which explained from 5 to 35% of phenotypic variation (TN and HW, respectively). In the combined analysis, 92 QTLs were detected to be highly significant, which accounted for 5–21% of the phenotypic variation.

Most of the QTLs were co-localized on chromosomes 2H and 3H; i.e., about 53% (30/57) and 62% (52/84) in the rain-fed and irrigated conditions, respectively. The chromosomes with the lowest number of QTLs were 4H and 6H, with two QTLs each, which accounted for 5.1–8.3% of the phenotypic variation. Only two QTLs were detected for grain yield in the rain-fed condition, SNPs 2711-234 (1H) and 1923-265 (1H), which explained 7 and 8.6% of phenotypic variation, respectively; importantly, they were stable across the two contrasting water regimes. Under full irrigation, six QTLs were detected for grain yield, SNPs 2711-234 (1H), 1923-265 (1H), ConsensusGBS0598-3 (3H), 9018-522 (3H), 4105-1417 (3H), and 7045-950 (3H), which explained from 5.2 to 18% of the phenotypic variation. In contrast to grain yield, the putative QTLs underlying biological yield were detected on chromosomes 5H and 7H in the rain-fed condition, and on chromosomes 2H, 3H, and 5H in the fully irrigated condition.



PV (%), percentage of the phenotypic variation explained by SNP markers; NQ, number of significant QTLs; ChN, chromosome number.

All QTLs detected for BY, Fv/Fm, DWT, HI, IPAR, KS, PE, PH, SL, RWC, TKW, and TN were environment-specific. Although most QTL-trait associations were environment specific, some stable associations were also detected for HW, SNPs 2711-234 (1H) and 1923-265 (1H), and PL, SNPs 9282-205 (3H), 3965-353 (3H), 2335-1614 (3H), and 5212-1409 (7H).

Fifteen and ten SNPs, respectively, were associated with more than one trait in the fully irrigated and rain-fed conditions. Importantly, the SNPs 4105-1417, 7045-950, 9018-522, and ConsensusGBS0598-3 (all on chromosome 3H) were concomitantly associated with grain yield, harvest index, and kernels per spike in the fully irrigated condition, accounting for 5.2–20% of the phenotypic variation. The SNPs ConsensusGBS0598-3, 7045-950, and 4105-1417 were also associated with biological yield, and the SNP 9018-522 with spike length. In both environmental conditions, grain yield and HW shared QTLs linked to the SNPs 1923-265 and 2711-234 (on chromosome 1H); both QTLs correspond to stable associations across environments.

QTLs with >15% of the phenotypic variation explained by SNP markers are shown in **Table 5**; these are relatively moderate (to major) QTLs detected in this RCSL population of barley. The results confirmed the presence of more QTLs with >15% in the fully irrigated (9) than in the rain-fed condition (2), and the majority of these QTLs were environment-specific. In the fully irrigated condition, a relatively major QTL was detected underlying HW, i.e., SNP marker 1923-265 on chromosome 1H at 140 cM, which accounted for 35.3% of the total phenotypic variation. In the same genomic region, the SNP marker 2711-234 (on 1H at 139 cM) explained 27% of the phenotypic variation for HW. These QTLs were stable across the contrasting environmental conditions. Interestingly, the major locus 1923-265 (1H) associated with HW was also detected for grain yield in both environmental conditions, explaining 8.6 and 18.1% of the phenotypic variation in the rain-fed and fully irrigated conditions, respectively. Finally, one genomic region on chromosome 4H was moderately associated with plant height (i.e., SNPs ABC08009-1-2-304 and 954-1377) and was environment-specific in the rain-fed condition.

## DISCUSSION

In this study, a RCSL population consisting of 137 lines was evaluated for 15 complex traits, including plant height, grain yield, and yield-related traits, in two contrasting environmental conditions. Severe allele frequency distortions were evident along the seven linkage groups in the RCSL population. By incorporating kinship into the model, as proposed by Malosetti et al. (2011) for populations that have undergone some selection resulting in a departure from Mendelian segregation ratios, we identified QTLs associated with key traits in two contrasting field conditions.

Environment-specific genomic regions were detected for the majority of the traits (12/15). These findings are consistent with the study conducted by Wang et al. (2014), in which most of the QTL for different traits varied between environments in a doubled haploid population of barley.

Most of the economically important traits in barley are inherited quantitatively. Plant height (PH), for instance, is under polygenic control, and represents one of the most important agronomic traits for barley (Wang et al., 2014; Zhou et al., 2015). In this study, the highest number of QTL was detected for plant height (20 QTL in both conditions), which were observed on chromosomes 2H, 3H, 4H, and 6H. Similarly, Honsdorf et al. (2014) found the highest number of associations for this trait, which were located on all chromosomes except 5H. Inostroza et al. (2009) found SSR-trait associations for PH on chromosomes 1H, 2H, 4H, 5H, 6H, and 7H, evidencing a genome-wide distribution. Interestingly, in the present study, a significant genomic region (explaining 22% of phenotypic variance) on chromosome 4H, which comprises the SNPs ABC08009-1-2-304 and 954-1377, controls PH in barley under rain-fed conditions. The majority of the QTLs were detected on chromosomes 2H (4/7) under fully irrigated and 4H (5/7) under rain-fed conditions. This result is partially consistent with Pasam et al. (2012), who found 32 associations with plant height, with the majority located on chromosomes 2H and 3H. Malosetti et al. (2011) found significant SNP associations with PH on chromosomes 2H, 3H, 5H, and 7H, with two important QTLs on chromosomes 3H (127.1 cM) and 5H (69.3 cM), which are known to carry semi-dwarfing genes in barley (Malosetti et al., 2011; Wang et al., 2014). The use of semi-dwarf genes has greatly improved barley yields, with controlled plant height being used to reduce yield loss arising from lodging and to increase the harvest index (Wang et al., 2014). Zhou et al. (2015) found a major QTL for plant height mapped at 105.5 cM on chromosome 3H, which had a LOD score of 13.01 and explained 44.5% of phenotypic variation. In this study, one QTL was detected at 197 cM on chromosome 3H (SNP 9610-1195) in the rain-fed condition, which explained 7.3% of phenotypic variation.

TABLE 5 | Relatively moderate (or major) QTLs detected in a RCSL population of barley (only QTLs with >15% of the phenotypic variation explained by SNP markers), evaluated under well-watered and rain-fed conditions by a mixed modeling approach.

PV(%), percentage of the phenotypic variation explained by SNP markers; ChN, chromosome number; Pos, SNP position in cM.

In the fully irrigated condition, a relatively major QTL was found for HW on chromosome 1H at 140 cM (SNP 1923-265), which explained 35.3% of phenotypic variation. This result is consistent with the study of Rode et al. (2012), who found three QTLs for HW on chromosome 1H, including the SNP 1923-265. In the rain-fed condition, this SNP explained 8.6% of phenotypic variation for grain yield, and under the fully irrigated condition, it explained 5.7 and 18.1% for RWC and GY, respectively. In contrast, Rode et al. (2012) did not find any SNP controlling HW that was associated with another related trait. As expected, the SNP marker 2711-234 (in the same genomic region as SNP 1923-265 on chromosome 1H, at 139 cM), which explained 27% of the phenotypic variation for HW, was associated with grain yield in both environmental conditions, explaining 10 and 7% of phenotypic variation. The two traits were positively correlated (r = 0.48; P < 0.01; **Table 3**).

According to Naz et al. (2014), tiller number per plant is a major determinant of yield in crops like barley. In this study, the nine QTL detected for TN were located on all chromosomes except 2H and 7H, and all were environment-specific. There have also been conflicting reports on QTL detection for TN and their chromosomal location in barley, although chromosomes 3H and 4H appear to be consistent (Elberse et al., 2004; Wang et al., 2010; Honsdorf et al., 2014). Naz et al. (2014) identified five QTL for TN on chromosomes 1H, 2H, 4H, and 5H, of which one QTL (located on 5H between 203.85 and 231.75 cM) accounted for a 70.5% increase in TN. Honsdorf et al. (2014), on the other hand, found two QTL for this trait on chromosomes 3H and 4H.

Thousand kernel weight (TKW) is one of the major yield components having direct effect on the final yield (Pasam et al., 2012). In this study, five significant QTLs associated with TKW were found on chromosomes 2H (three QTLs), 5H (one QTL), and 7H (one QTL), under rain-fed condition, which explained between 5.2 and 6.3% of the phenotypic variation (**Table 3**). In contrast, Pasam et al. (2012) found 21 QTL associated with thousand grain weight, which were present on all chromosomes. Comadran et al. (2011) detected three QTL associated with TKW on chromosome 2H, and, similarly with our study, none of these associations accounted for more than 10% of the phenotypic variation; the largest effect was over 5% of the trait mean. Kalladan et al. (2013) found consistent QTL for TKW across the environments (stable QTL), which were mapped to all seven linkage groups except chromosomes 4H and 5H.

Two genomic regions on chromosomes 1H and 3H were associated with grain yield under favorable conditions. These correspond to the SNP markers 2711-234 and 1923-265 on chromosome 1H, and ConsensusGBS0598-3 (60 cM), 9018-522 (61 cM), 4105-1417 (64 cM), and 7045-950 (67 cM) on chromosome 3H. Similarly, Rode et al. (2012) found two QTLs associated with grain yield on chromosome 3H, but at 42.1 cM (SNP 15141-288) and 169.3 cM (SNP ConsensusGBS0632-3). The two QTLs detected for grain yield in rain-fed conditions, SNPs 2711-234 (1H) and 1923-265 (1H), were stable across the two contrasting water regimes. This result is in accordance with the findings of Kalladan et al. (2013), who found altogether five stable QTL for yield, three of which were mapped to chromosome 1H. Moreover, the QTL explaining most of the phenotypic variation for yield were found on chromosomes 1H and 2H. In contrast, seven QTLs were found on chromosome 5H in the study of Rode et al. (2012). Comadran et al. (2011) found three main QTL for grain yield located on chromosomes 2H and 7H. Inostroza et al. (2009), although using simple sequence repeats (SSRs), found that the yield QTLs are distributed throughout the genome, on chromosomes 1H, 2H, 3H, 5H, 6H, and 7H, similar to the findings of Mansour et al. (2014).

Importantly, Comadran et al. (2011) mentioned that co-localization of several QTL related to yield component traits suggests that major developmental loci may be linked to most of the associations in barley. In our study, some markers on chromosome 3H were concomitantly associated with biological and grain yield, harvest index, spike length, and kernels per spike in the irrigated environment. Most of the correlation coefficients among these traits were positive and statistically different from zero (P < 0.01), varying from r = 0.24 to 0.64. In both environmental conditions, grain yield and HW (correlation coefficient r = 0.48, P < 0.01) shared two QTLs located on chromosome 1H (SNPs 1923-265 and 2711-234); both QTLs correspond to stable associations across environments.

In this study, eight putative QTLs underlying harvest index were identified on chromosomes 2H, 3H, and 5H. Most of them were localized on chromosome 3H (6/8) under irrigated conditions. Only one significant QTL (SNP 5880-2547) controlling HI was found on 2H under rain-fed conditions, which explained 6% of phenotypic variation. This QTL was also concomitantly associated with thousand kernel weight under rain-fed conditions (explaining 6%); the correlation coefficient between these traits was statistically different from zero (r = 0.2, P < 0.05). Comadran et al. (2011) detected three QTL for HI on chromosomes 1H, 2H, and 3H, with the most significant being located on chromosome 2H, in the same region as QTL for heading date and yield.

Fan et al. (2015) found that the physiological trait RWC had a very close correlation (r = 0.73, P < 0.01) with drought tolerance in barley; in fact, one QTL for RWC was identified on chromosome 2H and it explained 44.3% of phenotypic variation. In the current study, no QTL controlling RWC was identified under rain-fed conditions, and only small effects were identified on 1H, 3H, and 7H, accounting for around 5% of the phenotypic variation. This was expected considering that there was no significant difference in RWC between rain-fed and irrigated environments. Li et al. (2013) carried out a meta-analysis of QTL associated with tolerance to abiotic stresses in barley, identifying MetaQTL for RWC under abiotic stress on H1, H2, H5, H5, and H7. Wójcik-Jagła et al. (2013), in a comparative QTL analysis of early short-time drought tolerance in Polish fodder and malting spring barleys, found 18 QTLs for nine physiological traits on all chromosomes except 1H in malting barley and 15 QTLs for five physiological traits on chromosomes 2H, 4H, 5H, and 6H in fodder barley.

Fluorescence parameters for dark-adapted flag leaves (Fo, Fm, Fv, Fv/Fm) of 194 recombinant inbred lines (RILs), developed from the cross between the cultivar "Arta" and H. spontaneum 41-1 and measured under well-watered and drought stress conditions, showed significant differences among RILs but no differences between water regimes (Guo et al., 2008). Using SSR and AFLP markers, they were able to identify nine and five QTLs under well-watered and drought stress conditions, respectively; a QTL for Fv/Fm [i.e., (Fm – Fo)/Fm], which explained 15% of the phenotypic variance, was identified on chromosome 2H at 116 cM in the linkage map under drought stress. In our study, two QTLs for quantum yield of PSII on chromosome 4H were identified under the rain-fed condition and five in the combined analysis (**Table 4**). In the study of Wójcik-Jagła et al. (2013), one major QTL related to photochemical quenching of chlorophyll fluorescence was located on chromosome 4H in fodder barley. In an advanced backcross quantitative trait locus (AB-QTL) analysis performed by Sayed et al. (2012) to elucidate genetic mechanisms controlling proline content (PC) and leaf wilting (WS) in barley under drought stress conditions, QTL for WS were localized on chromosomes 1H, 2H, 3H, and 4H. Among these, QWS.S42.1H and QWS.S42.4H were associated with a decrease in WS due to the introgression of exotic alleles. QTL for PC were localized on chromosomes 3H, 4H, 5H, and 6H. The QTL effects on 3H, 4H, and 6H increased PC owing to the preeminence of elite alleles over the exotic alleles, with effects ranging from 26 to 43%.

The results are in agreement with the findings of Malosetti et al. (2011), who screened a population of 161 inbred lines of barley with 1536 SNPs, which were used for gene and QTL detection. The model incorporating kinship (co-ancestry) information was consistently superior to the one without kinship (according to the Akaike information criterion), as in this study. Importantly, Malosetti et al. (2011) showed that ignoring this type of information results in an unrealistically high number of marker–trait associations, without providing clear conclusions about QTL locations. In this work, a large number of spurious QTLs were detected when the genetic covariance matrix was ignored in the mixed model (444 and 460 false-positives for all traits studied in the rain-fed and full irrigation conditions, respectively), confirming that ignoring this type of genetic relatedness increases the rate of false-positives. As with the study carried out by Malosetti et al. (2011), we highlight the importance of including kinship information when detecting QTL in populations that have undergone some process of selection. This research provides useful information for MAS programs in areas where drought is a significant constraint. As drought stress tolerance has become an important goal, this QTL analysis identified significant genomic regions that can be used for breeding purposes.

## AUTHOR CONTRIBUTIONS

FM and YQ performed the QTL analysis and FM wrote the first manuscript; IM was responsible for the field trial and agronomic trait determination; JR and RW were responsible for the research grants CGIAR-GCP Challenge and performed the SNP analysis; AD performed the physiological evaluations and is the leader of the drought tolerance studies of barley in Chile. All the authors contributed to the final manuscript.

## ACKNOWLEDGMENTS

This work was supported by the research grants CGIAR-GCP Challenge (Genomic dissection of tolerance to drought stress in wild barley) and FONDECYT N° 1150353. We thank Professor Patrick Hayes for scientific support and Alejandro Castro for technical assistance in field experiments.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00909

## REFERENCES


barley introgression lines. PLoS ONE 9:e97047. doi: 10.1371/journal.pone.0097047


isotope discrimination in a worldwide germplasm collection of spring wheat using SNP markers. Mol. Breeding 35, 1–12. doi: 10.1007/s11032-015-0264-y


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Mora, Quitral, Matus, Russell, Waugh and del Pozo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Glutamine synthetase in Durum Wheat: Genotypic Variation and Relationship with Grain Protein Content

Domenica Nigro<sup>1</sup> , Stefania Fortunato1,2, Stefania L. Giove<sup>1</sup> , Annalisa Paradiso<sup>3</sup> , Yong Q. Gu<sup>4</sup> , Antonio Blanco<sup>1</sup> , Maria C. de Pinto<sup>3</sup> and Agata Gadaleta<sup>2</sup> \*

<sup>1</sup> Department of Soil, Plant and Food Sciences, University of Bari Aldo Moro, Bari, Italy, <sup>2</sup> Department of Agricultural and Environmental Sciences, Research Unity of Genetic and Plant Biotechnology, University of Bari Aldo Moro, Bari, Italy, <sup>3</sup> Department of Biology, University of Bari Aldo Moro, Bari, Italy, <sup>4</sup> Crop Improvement and Genetics Research, Western Regional Research Center, United States Department of Agriculture – Agricultural Research Service, Albany, CA, USA

Grain protein content (GPC) is one of the most important traits in wheat and is characterized by very complex genetic control. The identification of wheat varieties with high GPC (HGPC), as well as the characterization of central enzymes involved in these processes, is important for more sustainable agricultural practices. In this study, we focused on glutamine synthetase (GS) as a candidate to study GPC in wheat. We analyzed GS expression and its enzymatic activity in different tissues and phenological stages in 10 durum wheat genotypes with different GPC. Although each genotype performed quite differently from the others, both because of their genetic variability and their adaptability to specific environmental conditions, the highest GS activity and expression were found in genotypes with HGPC and, vice versa, the lowest in genotypes with low GPC (LGPC). Moreover, in genotypes contrasting in GPC grown under different nitrogen regimes (0, 60, 140 N Unit/ha), GS behaved differently in diverse organs. Nitrogen supplementation increased GS expression and activity in roots of all genotypes, highlighting the key role of this enzyme in nitrogen assimilation and ammonium detoxification in roots. In contrast, nitrogen treatments decreased GS expression and activity in the leaves of HGPC genotypes and did not affect GS in the leaves of LGPC genotypes. Finally, no changes in GS and soluble protein content occurred at the filling stage in the caryopses of all analyzed genotypes.

Keywords: wheat, grain protein content, GS (Glutamine synthetase), qRT-PCR, enzyme activity, western blot

## INTRODUCTION

Global agriculture urgently requires a modification of standard breeding practices and management policies. A recent report by the United Nations (The Millennium Development Goals Report, 2014) highlighted that the world's population reached 7.2 billion in 2014 and is expected to increase by more than 2 billion by 2050. This means that in the very near future even higher production will be needed to maintain food supplies. Accordingly, breeders and scientists have focused their efforts on the identification of agricultural practices and the development of new genetic technologies.

#### Edited by:

José Luis Araus, Universitat de Barcelona, Spain

#### Reviewed by:

Fernando Martinez, University of Seville, Spain Jingjuan Zhang, Murdoch University, Australia

> \*Correspondence: Agata Gadaleta agata.gadaleta@uniba.it

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

> Received: 16 March 2016 Accepted: 20 June 2016 Published: 13 July 2016

#### Citation:

Nigro D, Fortunato S, Giove SL, Paradiso A, Gu YQ, Blanco A, de Pinto MC and Gadaleta A (2016) Glutamine synthetase in Durum Wheat: Genotypic Variation and Relationship with Grain Protein Content. Front. Plant Sci. 7:971. doi: 10.3389/fpls.2016.00971

Agricultural productivity has increased in recent decades through the diffusion of modern crop production practices, such as the spread of high-yielding crop varieties and a heavier use of mineral fertilizers. Nitrogen is the most important nutrient and, second only to water, a limiting factor for plant growth and development (Kraiser et al., 2011). In the last 40 years, the amount of nitrogen fertilizers supplied to crops has risen dramatically from 12 to 104 Tg/year (Mulvaney et al., 2009). This increase in synthetic N supply contributed significantly to yield increases. However, as reported in the statistics from the Food and Agriculture Organization of the United Nations, the yield growth of crops, especially wheat, soybean and maize, has slowed to about 1% annually, and in some specific cases, as in developed countries, the growth rate is quite close to zero (Fischer et al., 2009). Much of this nitrogen is also wasted: of the total amount of N supplied, only 30–50% is actually taken up by the plant (depending on the species and cultivar) and used in different biochemical pathways. Most is lost to the environment in several ways, such as surface run-off, leaching of nitrates, ammonia (NH<sub>3</sub>) volatilization or bacterial competition (Garnett et al., 2009). This represents a considerable expense in terms of both cost and environmental impact. Control directives and best management practices were implemented several years ago to minimize environmental damage from nitrogen run-off (The Nitrates Directive, EC91/676/EEC; The EU Water Framework Directive, 2000/60/EC). Several studies and international projects have since highlighted the importance of defining the optimum timing and rate of nitrogen application during plant growth to maximize yield.

One of the most valuable agronomic and physiological indicators of how plants respond to and use available N is nitrogen use efficiency (NUE), at first defined as the yield of grain per unit of available nitrogen in the soil (Moll et al., 1982, 1987). Currently, NUE can be defined as the ratio between plant grain yield and plant-available N in the soil, including soil-native N and N applied as fertilizer, and is composed of N-uptake efficiency and physiological N-use efficiency (De Macale and Velk, 2004). The meaning of NUE needs to be qualified case by case, as there are several interpretations of this agronomic trait, depending on the species and the parameters of interest to be evaluated (Pathak et al., 2011; Hawkesford et al., 2013). Barraclough et al. (2014) quantified genetic variation in the uptake, partitioning and remobilization of nitrogen in individual plant organs at extreme rates of N supply, and how this variation can influence grain protein content (GPC). They found that the biggest contributor to variation in plant and crop performance was N rate, followed by growth stage and finally genotype.
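The two-component definition of NUE given above is commonly written in the form introduced by Moll et al. (1982). The symbols used here are notation assumed for illustration and do not appear in this article: $G_w$ is grain yield, $N_s$ is the plant-available N supply (soil-native plus fertilizer N), and $N_t$ is the total N in the plant at maturity.

$$
\mathrm{NUE} \;=\; \frac{G_w}{N_s} \;=\; \underbrace{\frac{N_t}{N_s}}_{\text{N-uptake efficiency}} \times \underbrace{\frac{G_w}{N_t}}_{\text{N-utilization efficiency}}
$$

Written this way, two genotypes with the same uptake ratio $N_t/N_s$ can still differ in NUE if they convert absorbed N into grain with different efficiency.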

Glutamine synthetase (GS), an enzyme with an essential role in the assimilation of inorganic N, has been proposed as a candidate for improving NUE in wheat (Habash et al., 2007; Gadaleta et al., 2011, 2014; Thomsen et al., 2014). GS is present in most species, with three to five isoforms localized in the cytosol (GS1) and a single isoform (GS2) in plastids (Swarbreck et al., 2011). On the basis of phylogenetic studies and mapping data in wheat, 10 GS cDNA sequences were classified into four subfamilies denominated GS1 (a, b, and c), GS2 (a, b, and c), GSr (1 and 2), and GSe (1 and 2; Bernard et al., 2008; Thomsen et al., 2014). Bernard et al. (2008) reported that QTLs for flag leaf and total GS activity were positively co-localized with QTLs for grain and stem nitrogen amount, but smaller correlations were established with loci for grain yield components; they identified QTLs for GS activity co-localized with a GS2 gene mapped on chromosome 2A and with the GSr gene on 4A. Genetic studies in rice (Obara et al., 2004) and maize (Gallais and Hirel, 2004) demonstrated co-localizations of QTLs for GS protein or activity with QTLs relating to grain parameters at the mapped GS genes.

To date few studies are available on the role of genotypic variation of GS for GPC. In this work, we present data on total GS activity and expression in 10 wheat genotypes in relation to their final GPC. Moreover, the response to nitrogen supplies in terms of total GS expression and activity of four different wheat genotypes, differing in GPC, has been investigated.

## MATERIALS AND METHODS

## Plant Material and Field Experiment Design

Ten different durum wheat genotypes (the breeding lines PI191145 and PC32, and genotypes Svevo, Cannizzo, Gianni, Ciccio, Appio, Lucanica, Canyon, and Vesuvio) were chosen from a collection of tetraploid wheat genotypes described by Laidò et al. (2013) and Marcotuli et al. (2015). The genotypes were grown for 5 years (2009–2013) without any external nitrogen supply at Valenzano (Bari, Italy; geographical coordinates: 41°2′0″ N, 16°53′0″ E). A randomized complete block design was used, with three replications and plots of 2.0 m × 1.5 m sown at a density of 350 germinated seeds/m<sup>2</sup>. According to the standard agronomic practices of the study area, fertilizer was applied at pre-sowing (90 kg/ha P<sub>2</sub>O<sub>5</sub>). During the growing season, standard cultivation practices were adopted without irrigation. The plants were harvested after physiological maturity, on July 10 of each year.
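The field layout described above can be illustrated with a minimal sketch of a randomized complete block design: every genotype appears once per block, in an independently shuffled order. The function names and the random seed are hypothetical, and the actual randomization procedure used in the trial is not given in the text. The sketch also works out the number of seeds per plot implied by the stated plot size and sowing density.

```python
import random

# Genotypes, replications, plot size and density as stated in the text.
GENOTYPES = ["PI191145", "PC32", "Svevo", "Cannizzo", "Gianni",
             "Ciccio", "Appio", "Lucanica", "Canyon", "Vesuvio"]
N_BLOCKS = 3                # three replications
PLOT_AREA_M2 = 2.0 * 1.5    # plot dimensions in metres
SEEDS_PER_M2 = 350          # germinated seeds per square metre


def rcbd_layout(genotypes, n_blocks, seed=None):
    """Return one independently shuffled genotype order per block.

    Hypothetical helper: each block contains every genotype exactly once.
    """
    rng = random.Random(seed)
    layout = []
    for _ in range(n_blocks):
        block = list(genotypes)
        rng.shuffle(block)
        layout.append(block)
    return layout


def seeds_per_plot(area_m2, density_per_m2):
    """Number of germinated seeds needed for one plot."""
    return round(area_m2 * density_per_m2)


if __name__ == "__main__":
    for i, block in enumerate(rcbd_layout(GENOTYPES, N_BLOCKS, seed=1), start=1):
        print(f"Block {i}: {', '.join(block)}")
    print("Seeds per plot:", seeds_per_plot(PLOT_AREA_M2, SEEDS_PER_M2))
```

Blocking in this way keeps each replicate of all 10 genotypes within a compact area, so that soil heterogeneity across the field is absorbed by the block term rather than confounded with genotype.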

The genotypes were selected according to a previous evaluation of yield and quality component traits (unpublished data). In order to evaluate the involvement of candidate enzymes and genes specifically in the accumulation of grain protein, genotypes with similar values of grain yield per spike (GYS) and thousand kernel weight (TKW) were chosen, so as to avoid the negative correlation between GPC and GYS and the dilution effect due to TKW (Supplementary Table S1).

In 2014, four genotypes (PC32, Cannizzo, Ciccio, and Vesuvio) were grown in Valenzano (BA) under three different nitrogen regimes: 0, 60, and 140 N units/ha in randomized blocks with replicates (indicated as N0, N60, and N140). Each genotype was sown in 1-m rows spaced 20 cm apart. Nitrogen was supplied, in the form of ammonium nitrate, in three equal rates, 10 days before collecting samples at the first leaf, flowering, and grain filling stages. Roots were collected from plants at the seedling stage, immediately washed, freed of excess water, frozen in liquid nitrogen, and stored at −80°C. Leaf tissues of each sample were collected at each stage 10 days after nitrogen application, immediately frozen in liquid nitrogen, and stored at −80°C until used in further assays.

Total GPC was assessed on 3 g of whole meal flour using a dual beam near infrared reflectance spectrophotometer (Zeutec Spectra Alyzer Premium, Zeutec Büchi, Rendsburg, Germany). Soluble proteins were assayed according to Bradford (1976), using bovine albumin as a standard.

#### GS Activity Determination

Plant tissues were frozen in liquid N and ground in a mortar with 1:10 (w/v) extraction buffer (100 mM triethanolamine, 1 mM EDTA, 10 mM MgSO<sub>4</sub>, 5 mM glutamate, 10% v/v ethylene glycol, 10 µM leupeptin, and 6 mM DTT; pH 7.6). Crude extracts were centrifuged at 21,000 × g for 30 min at 4°C and the supernatant was used for GS activity determination. GS activity was measured according to Bernard et al. (2008).

## GS Immunoblotting

The soluble proteins in each extract were separated by SDS-PAGE (Laemmli, 1970). Equal amounts of denatured proteins (5 µg) were loaded in each lane of a 12% polyacrylamide gel. The proteins were electrophoretically transferred to an ImmunoBlot PVDF membrane (Bio-Rad, München, Germany) with a Trans-Blot Semi-Dry apparatus (Bio-Rad, München, Germany) using a transfer buffer containing 25 mM Tris, 190 mM glycine, and 20% methanol. The electrophoretic transfer was conducted at 15 V for 60 min. After the transfer, the PVDF membrane was soaked in blocking solution (20 mM Tris-HCl pH 7.5, 150 mM NaCl, 0.05% Tween 20, and 1% BSA) for 30 min, incubated overnight with the primary antibody, and incubated with the secondary antibody for 60 min. GS proteins were detected with the GS1/GS2 glutamine synthetase global primary polyclonal antibody (Agrisera, Vännäs, Sweden), which recognizes both cytoplasmic and chloroplastic forms of the GS enzyme. The secondary antibody was Anti-Rabbit IgG (H+L) HRP conjugate (Promega, Madison, WI, USA). The antibody-protein complex was detected with enhanced chemiluminescence (ECL) detection reagents (Amersham Buchler Ltd); the ImmunoBlot PVDF membrane was incubated for 2 min in ECL, then exposed in an X-ray cassette with Amersham Hyperfilm ECL (Amersham Buchler Ltd) for 2 min. The Hyperfilm was soaked in developing and fixing solutions (Kodak Inc.).

## Quantitative RT-PCR

Total RNA was extracted with the RNeasy Plant Mini Kit (QIAGEN®), and checked on 1.5% denaturing agarose gels. The total amount of RNA and its purity were determined using a NanoDrop ND-1000 spectrophotometer (Thermo Scientific, Waltham, MA, USA). All RNA samples were adjusted to the same amount (1 µg) for subsequent treatment with recombinant DNase I (Roche Applied Science, Mannheim, Germany) to remove genomic DNA, and then reverse-transcribed into first-strand cDNA with the Transcriptor First Strand cDNA Synthesis Kit (Roche Applied Science, Mannheim, Germany).

Data were normalized using three reference genes: Cell Division Control AAA-Superfamily of ATPases (CDC), ADP-Ribosylation Factor (ADP-RF), and RNase L Inhibitor-like protein (RLI; Paolacci et al., 2009; Giménez et al., 2011). These genes were previously used as references in other wheat gene expression studies (Nigro et al., 2013); all three have a stability value around 0.035 when evaluated with NormFinder software (Andersen et al., 2004). To select a primer combination able to detect total GS expression, sequences of known GS genes were aligned in order to find conserved regions. Specifically, cDNA sequences of the plastidic GS2 (DQ124212, DQ124213, and DQ124214) and of the cytosolic isoforms GS1 (DQ124209, DQ124210, and DQ124211), GSe (AY491970 and AY491971), and GSr (AY491968 and AY491969), reported by Bernard et al. (2008), were aligned and compared (**Supplementary Figure S1**). The primer combination was chosen in the region with the highest homology among them, amplifying a fragment of 149 bp (F 5′–3′: CCCTGGCCCCCAGGGTCCATACTACTG; R 5′–3′: GTCATGCCTGGTCAGTGGGAGT).
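The search for a conserved region across aligned GS sequences can be sketched as a sliding-window scan over per-column identity. This is only an illustration of the general idea, not the procedure used by the authors, which is not described beyond alignment and visual comparison. The function names are hypothetical and the three short sequences are invented toy data, not the GenBank accessions listed above.

```python
def column_identity(aligned):
    """Per-column fraction of sequences sharing the most common base.

    `aligned` is a list of equal-length strings; '-' marks a gap and is
    never counted as a match.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("aligned sequences must have equal length")
    n = len(aligned)
    scores = []
    for column in zip(*aligned):
        bases = [b for b in column if b != "-"]
        top = max((bases.count(b) for b in set(bases)), default=0)
        scores.append(top / n)
    return scores


def best_window(aligned, width):
    """Return (start, mean identity) of the first most-conserved window."""
    scores = column_identity(aligned)
    best_start, best_score = 0, -1.0
    for start in range(len(scores) - width + 1):
        mean = sum(scores[start:start + width]) / width
        if mean > best_score:
            best_start, best_score = start, mean
    return best_start, best_score


# Invented toy alignment (NOT real GS sequences).
TOY_ALIGNMENT = [
    "ATGGCCCTGACGTTAGC",
    "ATGACCCTGACGTTCGC",
    "TTGGCCCTGACGTAAGC",
]

if __name__ == "__main__":
    start, score = best_window(TOY_ALIGNMENT, width=8)
    print(f"Most conserved 8-column window starts at column {start} "
          f"(mean identity {score:.2f})")
```

In practice a primer pair would be placed in two such windows flanking an amplicon of the desired length, and then checked for melting temperature and specificity with dedicated primer-design software.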

Quantitative real-time PCR analyses to determine GS gene expression levels were carried out using EvaGreen® in the CFX96™ Real-Time PCR System (Bio-Rad). The PCR cycle was 95°C for 3 min, followed by 40 cycles of 95°C for 10 s and 60°C for 30 s. Amplification efficiency (98–100%) for the primer set was determined by amplification of cDNA with a series of six scalar dilutions (1:5) per reaction. Each 10 µl PCR reaction contained 1 µl of a 1:5 dilution of cDNA, 5 µl of EvaGreen Mix 10X (Bio-Rad), and 500 nM of each primer. All experiments were performed in Hard-Shell 96-well skirted PCR plates (HSP9601) with Microseal® 'B' Adhesive Seals (MSB-1001) from Bio-Rad. Fluorescence signals were recorded at each cycle. The specificity of each amplicon was confirmed by the presence of a single band of the expected size during agarose gel electrophoresis (2% w/v), single-peak melting curves of the PCR products, and sequencing of the amplified fragment. qRT-PCR data for both GS and the endogenous control genes are derived from the mean values of three independent amplification reactions carried out on five different plants harvested at the same phenological stage (biological replicates). All calculations and analyses were performed using CFX Manager 2.1 software (Bio-Rad Laboratories) with the ΔCt method, in which normalized expression is calculated as the ratio of the relative quantity (RQ) of the target gene to the relative quantity of the reference genes (including the three reference targets in each sample). Standard deviations were used to normalize values for the highest or lowest individual expression levels (CFX Manager 2.1 software user manual, Bio-Rad Laboratories).
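The two calculations mentioned above, amplification efficiency from a dilution series and normalization of a target gene against several reference genes, can be sketched as follows. This is a minimal illustration under stated assumptions: efficiency is taken from the slope of the standard curve as E = 10^(−1/slope) − 1, relative quantity as (1 + E)^(Ct_control − Ct_sample), and the three reference genes are combined through their geometric mean. The exact computation performed by the software is not given in the text, and all function names and Ct values are hypothetical.

```python
import math


def efficiency_from_dilutions(log10_input, ct_values):
    """Amplification efficiency from a standard curve.

    Fits Ct against log10(template amount) by least squares and returns
    E = 10**(-1/slope) - 1, so 1.0 corresponds to 100% efficiency.
    """
    n = len(log10_input)
    mean_x = sum(log10_input) / n
    mean_y = sum(ct_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_input)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(log10_input, ct_values))
    slope = sxy / sxx
    return 10 ** (-1.0 / slope) - 1.0


def relative_quantity(ct_sample, ct_control, efficiency=1.0):
    """RQ of a sample relative to a control: (1 + E)**(Ct_control - Ct_sample)."""
    return (1.0 + efficiency) ** (ct_control - ct_sample)


def normalized_expression(rq_target, rq_references):
    """Target RQ divided by the geometric mean of the reference-gene RQs."""
    log_mean = sum(math.log(rq) for rq in rq_references) / len(rq_references)
    return rq_target / math.exp(log_mean)


if __name__ == "__main__":
    # Hypothetical six-point 1:5 dilution series with made-up Ct values.
    log_inputs = [-i * math.log10(5) for i in range(6)]
    cts = [20.1, 22.4, 24.8, 27.0, 29.4, 31.7]
    eff = efficiency_from_dilutions(log_inputs, cts)
    print(f"Estimated efficiency: {eff:.1%}")

    # Hypothetical Ct values: target gene plus three reference genes.
    rq_gs = relative_quantity(ct_sample=22.0, ct_control=24.0, efficiency=eff)
    rq_refs = [relative_quantity(18.1, 18.0), relative_quantity(20.0, 20.2),
               relative_quantity(21.0, 21.0)]
    print(f"Normalized GS expression: {normalized_expression(rq_gs, rq_refs):.2f}")
```

Using the geometric rather than the arithmetic mean of the reference genes keeps a single reference with an outlying RQ from dominating the normalization factor.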

## Statistical Analysis

Values are expressed as mean ± SEM. The mean values reported for GPC, GS activity, and GS expression in the high (H) and low (L) GPC groups were obtained by averaging the values of all genotypes belonging to each group (Lucanica, PI191145, PC32, Cannizzo, and Svevo for HGPC; Ciccio, Vesuvio, Appio, Gianni, and Canyon for LGPC). One-way analysis of variance was conducted to test for differences within and among the groups and for each treatment.

Dunn's test was used for comparisons among the 10 untreated genotypes.

Tukey's test was used for comparisons among the four treated genotypes. Correlations were calculated using the Spearman test. Differences were considered significant at P-values <0.05 (two-tailed). Analyses were performed using SigmaPlot 12.0 software (Systat Software, Inc., San Jose, CA, USA).
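Two of the summary statistics named above, the mean ± SEM and the Spearman rank correlation, can be reproduced with a few lines of standard-library code. The sketch below is an illustration only: the analyses in the study were run in commercial software, the post hoc tests and P-values are not reproduced here, the function names are hypothetical, and the numbers in the usage example are invented.

```python
import math
import statistics


def mean_sem(values):
    """Return (mean, standard error of the mean) for a list of replicates."""
    return statistics.mean(values), statistics.stdev(values) / math.sqrt(len(values))


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Invented example values (not data from the study).
    gpc = [12.1, 12.8, 13.5, 14.2, 15.0, 15.6]
    gs_activity = [0.8, 1.1, 1.0, 1.6, 1.9, 2.3]
    mean, sem = mean_sem(gs_activity)
    print(f"GS activity: {mean:.2f} ± {sem:.2f} (mean ± SEM)")
    print(f"Spearman rho (GPC vs GS activity): {spearman_rho(gpc, gs_activity):.3f}")
```

Because Spearman's coefficient depends only on ranks, it captures any monotonic relationship between GPC and GS measurements without assuming linearity or normally distributed values.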

## RESULTS

## Glutamine Synthetase Activity and Expression in 10 Durum Wheat Genotypes

The GPC, expressed as percentage of protein per dry weight, was analyzed for five consecutive years from 10 wheat genotypes. The genotypes were classified into either high GPC (HGPC: Lucanica, PI191145, PC32, Cannizzo, and Svevo), or low GPC (LGPC: Ciccio, Vesuvio, Appio, Gianni, and Canyon) groups. The average GPC values of the two groups were significantly different (**Figure 1**).

Both enzyme activity and gene expression of GS, a candidate gene for NUE and GPC, were analyzed in roots and leaves at different phenological stages in the 10 selected wheat genotypes grown in the field during the 2014 season.

Glutamine synthetase activity in roots differed significantly among genotypes. Nevertheless, the highest GS activities were found in two HGPC genotypes (Cannizzo and Svevo) and the lowest in the LGPC genotypes Vesuvio and Canyon (**Figure 2A**). As a consequence, the overall mean GS specific activity of the HGPC genotypes was significantly higher than the average value of the LGPC genotypes (**Figure 2B**). Expression data for GS in the roots of each cultivar were consistent with enzyme activity (**Figure 2C**), and again a higher mean value of GS expression was found in the HGPC group than in the LGPC group (**Figure 2D**).

Similar trends were observed for GS activity and expression in the leaves at the first leaf stage. Although GS activity and expression differed among cultivars, the highest values were found in the HGPC group (PI191145, PC32, and Cannizzo) and the lowest in the LGPC group (Vesuvio, Appio, and Canyon; **Figures 3A,C**). In this case too, the mean values of enzyme activity and expression for the HGPC genotypes were significantly higher than those observed in the LGPC genotypes (**Figures 3B,D**).

At the flowering stage, differences in GS activity and expression in the leaves of the different genotypes were less marked, although here too two HGPC genotypes (PI191145 and Svevo) showed the highest values and two LGPC genotypes (Vesuvio and Appio) the lowest (**Figures 4A,C**); the overall mean GS activity and expression of the HGPC genotypes were higher than those observed in the LGPC genotypes (**Figures 4B,D**).

In the caryopses at the filling stage, the activity and the expression of GS did not change significantly among the genotypes of the two groups, with the exception of PI191145 that showed the highest values, and Vesuvio that had the lowest ones (**Figures 5A,C**). In this case the average values of GS activity and expression in the HGPC and LGPC groups did not differ significantly (**Figures 5B,D**).

Regression analysis conducted between GPC, enzymatic activity, and gene expression revealed significant correlations, which are reported in **Table 1**.

#### Effect of Nitrogen Treatments on GS Activity and Expression in Different Wheat Genotypes

Two wheat genotypes from each group (PC32 and Cannizzo from the HGPC group and Ciccio and Vesuvio from the LGPC group) were grown in 2014 at Valenzano (BA) under three different rates of nitrogen application, N0, N60, and N140 units/ha (see Materials and Methods). GS activity and expression were followed in roots, leaves and caryopses of the four selected genotypes grown at the different N fertilization rates.

Root GS activity increased in all genotypes after the N treatment (**Supplementary Figure S2**). However, in the HGPC genotypes PC32 and Cannizzo, the maximum increase in GS was already evident after application of 60 N units/ha and no further increase occurred after application of 140 N units/ha. The LGPC genotypes, Ciccio and Vesuvio, behaved differently. Ciccio increased root GS activity proportionally to the N application, whereas only the application of 140 N units/ha increased GS activity in Vesuvio roots (**Supplementary Figure S2A**). GS expression in roots followed almost the same pattern as GS activity: a maximum increase in gene expression was observed after the application of 60 N units/ha in PC32 and Cannizzo; Ciccio showed an increase in root GS expression proportional to nitrogen supply; and Vesuvio showed no significant difference in gene expression between N0 and N60, but a significant increase when 140 N units/ha was supplied (**Supplementary Figure S2B**).

The western blot analysis showed the presence of a 40 kDa band in the four genotypes, indicating that only the cytosolic GS isoenzyme was present in the roots. The band intensity in the three different N treatments was consistent with GS transcript level and activity (**Supplementary Figure S2C**).

In leaf tissues at the first leaf stage, GS activity and expression significantly decreased with nitrogen application (N60 and N140 units/ha) in the two HGPC genotypes (PC32 and Cannizzo). On the other hand, nitrogen did not significantly change GS activity and expression in the Ciccio and Vesuvio genotypes (**Supplementary Figures S3A,B**). The western blot analysis highlighted the presence of two bands of 44 and 40 kDa, indicating that both plastidic and cytosolic isoenzymes, respectively, were present in the leaves. Moreover, GS activity in the leaves at this stage seemed to be principally due to the plastidic GS, which was more abundant than the cytosolic one. Consistent with GS activity and expression, the intensity of the GS protein bands after the application of 140 N units/ha decreased in PC32 and Cannizzo and did not change in Ciccio and Vesuvio (**Supplementary Figure S3C**).

Glutamine synthetase activity in the leaves at the flowering stage was similar to that observed in the first leaf stage. In PC32 and Cannizzo, GS activity was still highest when no nitrogen was supplied, and decreased significantly with nitrogen applications. On the other hand, Ciccio and Vesuvio genotypes did not show significant differences in GS activity in all N regimes (**Supplementary Figure S4A**). RT-PCR analysis and western blot showed a decrease in the transcript and protein levels only in the PC32 genotype after application of 140 N units/ha (**Supplementary Figures S4B,C**).

Glutamine synthetase activity in the caryopses at the filling stage significantly increased only in the Vesuvio cultivar, which has the lowest GPC (**Supplementary Figure S5A**). However, when soluble protein content was measured in the caryopses of the four analyzed genotypes under the different N treatments, no statistically significant differences were observed after N supply (**Supplementary Figure S5B**), indicating that final GPC was not affected by nitrogen application.

## DISCUSSION

Nitrogen uptake and utilization is a very complex process in plants, and deciphering all its components is a challenge for scientists and breeders (Hawkesford et al., 2013). The quantitative traits NUE and GPC are influenced both by the action of multiple genes and by the environment (Blanco et al., 2012). In the present work, the enzyme activity and expression of GS, a candidate gene for N-utilization efficiency, were studied in wheat in order to define its role in NUE and GPC. Genetic studies on NUE in maize and rice have shown that the activity of the cytosolic GS1 isoform co-localized with QTLs for N remobilization and grain size (Gallais and Hirel, 2004; Obara et al., 2004). In addition, rice mutants lacking the cytosolic GS gene OsGS1;1 were severely limited in growth and grain filling (Tabuchi et al., 2005). In Triticum aestivum, a QTL for leaf GS activity, mapped to the TaGSr locus, co-localized with a QTL for grain N concentration. In this case, increased GS activity was associated with higher grain N. Phenotypic and genotypic correlations between flag leaf weight, soluble protein content and GS activity suggest shared control of leaf size and metabolic capacity during grain filling in wheat (Habash et al., 2007). Two other wheat GS genes, the plastidic GS2 and the cytosolic GS1.3, have been associated with QTLs for GPC (Gadaleta et al., 2011, 2014). Moreover, in winter wheat GPC is positively correlated with amino acid and soluble protein content, and with GS activity (Fontaine et al., 2009). Our results show that clear genotypic variation in GS activity and expression occurs in roots and leaves of the 10 durum wheat genotypes analyzed. However, despite the genotypic variation, the highest GS activities and expression were found in genotypes of the HGPC group and, conversely, the lowest in genotypes of the LGPC group. As a consequence, GS activity and expression are on average higher in the HGPC group than in the LGPC group. Another study on five wheat cultivars exhibiting different NUE showed a good correlation between GS activity and the amount of N remobilized from the top section of the plant, or even from the flag leaf alone, to the grain (Kichey et al., 2006).

TABLE 1 | Coefficients of correlation (R<sup>2</sup>) and probability (P-value) between GPC, enzymatic activity, and gene expression.

GPC, grain protein content; REA, root enzymatic activity; RGE, root gene expression; LIEA, leaf I stage enzymatic activity; LIGE, leaf I stage gene expression; LIIEA, leaf II stage enzymatic activity; LIIGE, leaf II stage gene expression; CEA, caryopsis enzymatic activity; CGE, caryopsis gene expression. Significance at P ≤ 0.05, P ≤ 0.01, and P ≤ 0.001 levels, respectively.

The situation is different in the caryopses at the filling stage, where no significant differences in GS activity and expression between the LGPC and HGPC genotypes were observed. This suggests that GS could be related to the maintenance of critical N flows and sensing during crucial developmental stages, as proposed by Thomsen et al. (2014).

To assess the effect of GS on NUE and GPC, four wheat genotypes were grown under different nitrogen regimes in field conditions. The results differed between roots and leaves. In the roots of all selected wheat genotypes, only cytosolic GS was present. Moreover, after N supply, an increase in GS expression and activity occurred in both the HGPC and LGPC genotypes. These data are consistent with results obtained in Arabidopsis, where cytosolic GS in roots is essential for ammonium detoxification and nitrogen assimilation under ample nitrate supply (Lothier et al., 2011). In rice, most of the ammonium taken up by the roots can be assimilated within the organ, as shown by the rapid up-regulation of OsGS1;2 in the cell layers of the root surface following the supply of ammonium ions (Tabuchi et al., 2007).

The results are quite different in leaves at both of the phenological stages considered. In accordance with Bernard et al. (2008), both plastidic and cytosolic enzymes were detected by western blot, as two proteins of 44 and 40 kDa, respectively. After nitrogen supply, total GS activity and expression in leaves did not change in the LGPC genotypes (Ciccio and Vesuvio) and significantly decreased in the HGPC genotypes (Cannizzo and PC32). These results are in accordance with data reported by Tian et al. (2015) showing that the expression of GS genes was higher in the N-efficient wheat genotype than in the N-inefficient one regardless of N treatment. Soluble protein content in the caryopses did not differ significantly in our genotypes after N treatments, implying that the differences in GPC were caused by genetic differences between cultivars. This is also consistent with the results of Gaju et al. (2011), who, analyzing 14 UK and French wheat cultivars and two French advanced breeding lines, showed that genetic variability in NUE related mainly to differences in N-utilization efficiency, rather than N-uptake efficiency.

Previous studies have reported that when NUE is calculated as a function of grain yield per estimated N input, it decreases with increasing N input (Grant et al., 1991; Muurinen et al., 2007; Sylvester-Bradley and Kindred, 2009; Anbessa and Juskiw, 2012). The total N uptake of each cultivar in these studies was quite similar, implying that the differences observed in grain yield and NUE in response to different N regimes were best explained by differences in the efficiency of utilization. This suggests that the rate of nitrogen fertilizer application might be adjusted according to the individual cultivar to improve NUE, while maintaining potential grain yield.

Nitrogen use efficiency is a complex trait that cannot be explained by the action of a single gene. In a recent study on 24 Australian spring wheat genotypes, Mahjourimajd et al. (2016) analyzed how nitrogen supply can affect NUE and yield in different environmental conditions. They demonstrated that there was significant genetic variation for NUE-related traits among wheat genotypes, allowing them to define a ranking of genotypes for NUE stability. Elucidating the genetic mechanisms underlying traits associated with NUE is essential to support wheat breeding efforts aimed at developing high-NUE genotypes. In this context, our data help to highlight that NUE is a genotype-dependent parameter, and that GS plays a very important role in N utilization. Taken together, these studies confirm that the efficient management of N through the use of appropriate germplasm is essential for the sustainability of agricultural production and that the use of genotypes optimized for traits relating to N-use efficiency rather than yield alone is of primary importance (Hawkesford, 2014). In this view, a more "precision farming" approach could help to guarantee high grain yield while wasting little fertilizer, leading to both economic and environmental benefits.

#### AUTHOR CONTRIBUTIONS

DN, AG, and MP: Conceived and designed the experiments. DN, SF, SG, and AP: Performed the experiments. AG, AB, and MP: Contributed reagents/materials/analysis tools. DN, AG, MP, YG, and AB: Wrote the paper. DN, AG, MP, YG, and AB: Analyzed the data.

### ACKNOWLEDGMENTS

This research was supported by a grant from Ministero dell' Università e della Ricerca, Italy, projects: PON-ISCOCEM and PRIN 2010-11; and from Regione Puglia, project: Future in Research.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00971

FIGURE S1 | Alignment of glutamine synthetase (GS) genes region chosen for RT-PCR primer design.

FIGURE S2 | Glutamine synthetase (GS) in roots of four selected wheat genotypes grown at three different nitrogen rates. (A) GS specific activity and (B) GS normalized fold expression; data are the means ± SE of five experiments; different letters indicate significant differences after N treatment in each cultivar (one-way ANOVA test; P < 0.05). (C) Representative image of western blotting analysis with the GS1/GS2 antibody; each well was loaded with 5 µg soluble proteins.

FIGURE S3 | Glutamine synthetase in leaves at the first leaf stage, of four selected wheat genotypes bred at three different nitrogen rates. (A) GS specific activity and (B) GS expression; data are the means ± SE of five experiments; different letters indicate significant differences after N treatment in each cultivar (one-way ANOVA test; P < 0.05). (C) Representative image of western blotting analysis with the GS1/GS2 antibody; each well was loaded with 5 µg soluble proteins.

FIGURE S4 | Glutamine synthetase in leaves at the flowering stage of four selected wheat genotypes grown at three different nitrogen rates. (A) GS specific activity and (B) GS expression; data are the means ± SE of five experiments; different letters indicate significant differences after N treatment in each cultivar (one-way ANOVA test; P < 0.05). (C) Representative image of western blotting analysis with the GS1/GS2 antibody; each well was loaded with 5 µg soluble proteins.

FIGURE S5 | Glutamine synthetase and soluble protein content in caryopses at the filling stage of four selected wheat genotypes bred at three different nitrogen rates. (A) GS activity and (B) soluble protein content. Data are the means ± SE of five experiments; different letters indicate significant differences after N treatment in each cultivar (one-way ANOVA test; P < 0.05).

## REFERENCES


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Nigro, Fortunato, Giove, Paradiso, Gu, Blanco, de Pinto and Gadaleta. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Nitrogen Contribution of Different Plant Parts to Wheat Grains: Exploring Genotype, Water, and Nitrogen Effects

Rut Sanchez-Bragado, M. Dolors Serret and José L. Araus \*

Plant Physiology Department, University of Barcelona, Barcelona, Spain

The flag leaf has traditionally been considered the main contributor to grain nitrogen. However, during the reproductive stage, other organs besides the flag leaf may supply nitrogen to developing grains, and the contribution of the ear and other organs to the nitrogen supplied to the growing grains remains unclear. It is important to develop phenotypic tools to assess the relative contribution of different plant parts to the N accumulated in the grains of wheat, which may help to develop genotypes that use N more efficiently. We studied the effect of growing conditions (different levels of water and nitrogen in the field) on the nitrogen contribution of the spike and different vegetative organs of the plant to the grains. The natural abundance δ<sup>15</sup>N and total N content in the flag blade, peduncle, whole spike, glumes and awns were compared to the δ<sup>15</sup>N and total N in mature grains to trace the origin of nitrogen redistribution to the grains. The δ<sup>15</sup>N and total N content of the different plant parts correlated positively with the δ<sup>15</sup>N and total N content of mature grains, suggesting that all organs may contribute a portion of their N content to the grains. The potential contribution of the flag blade to grain N increased (by 46%) as the growing conditions improved, whereas the potential contribution of the glumes plus awns and the peduncle increased (46 and 31%, respectively) as water and nitrogen stress increased. In general, the potential contribution of the ear to the N of growing grains (42%) was similar to that of the vegetative parts of the plant (30–40%), regardless of the growing conditions. Thus, the potential ear N content could be a positive trait for plant phenotyping, especially under water- and nitrogen-limiting conditions. In that sense, genotypic variability existed at least between old (tall) and modern (semidwarf) cultivars, with the ear from modern genotypes exhibiting a lower relative contribution to the total grain N. The combined use of δ<sup>15</sup>N and N content may be an affordable tool to assess the relative contribution of different plant parts to the grain N in wheat.

Keywords: nitrogen content, nitrogen isotope composition, ear, grains, wheat

## INTRODUCTION

In terrestrial ecosystems, nitrogen is often the most limiting element in plant growth (Vitousek, 1994) and alongside drought stress, a lack of nitrogen can limit crop quality and productivity (Passioura, 2002). In particular, water availability can affect crop growth, soil nitrogen dynamics and the utilization of plant nitrogen from soil and fertilizers (Raimanová and Haberle, 2010).

*Edited by:*

Marcello Mastrorilli, CREA, Italy

#### *Reviewed by:*

Marta Silva Lopes, International Maize and Wheat Improvement Center, Turkey Gemma Molero, International Maize and Wheat Improvement Center, Mexico

> *\*Correspondence:* José L. Araus jaraus@ub.edu

#### *Specialty section:*

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

*Received:* 20 June 2016 *Accepted:* 14 December 2016 *Published:* 09 January 2017

#### *Citation:*

Sanchez-Bragado R, Serret MD and Araus JL (2017) The Nitrogen Contribution of Different Plant Parts to Wheat Grains: Exploring Genotype, Water, and Nitrogen Effects. Front. Plant Sci. 7:1986. doi: 10.3389/fpls.2016.01986

In addition, grain filling and by extension grain yield is dependent on carbon and nitrogen metabolism (Zhang et al., 2010). The nitrogen requirements of developing grains in wheat are mainly supplied by degradation of proteins derived from different plant organs (Dalling et al., 1976; Simpson et al., 1983; Bancal, 2009). Cereals may accumulate most of their nitrogen in vegetative organs before ear emergence and then redistribute it during grain development (Dalling et al., 1976; Hortensteiner, 2002). Nevertheless, even though an additional source of N may be taken up by roots directly from the soil and assimilated between anthesis and physiological maturity (Dupont and Altenbach, 2003), the greatest proportion of N in the grain is redistributed from the vegetative parts (Perez et al., 1989; Jukanti et al., 2008). In fact, N uptake after anthesis is considered in most cases to be minimal (Perez et al., 1989). Moreover, grain filling is the period when the nitrogen content of different plant parts is substantially reduced (Lopes et al., 2006) as a consequence of protein hydrolysis, which remobilizes amino acids for export to developing grains (Feller and Fischer, 1994).

Traditionally, the flag leaf has been considered the main contributor to grain nitrogen due to its large protein content (Millard and Grelet, 2010). Indeed, during the last decade, most of the studies dealing with N accumulation in grains have focused only on leaf lamina nitrogen (Hortensteiner, 2002; Bahrani and Joo, 2010). However, during the reproductive stage, as sink strength (grain N) increases after anthesis, other organs besides the flag leaf may become organs of nitrogen supply to developing grains (Waters et al., 1980). Thus, apart from the flag leaf blade, the contribution of the ear as well as the lower parts of the plant may be relevant. Ear photosynthesis is considered an important source of assimilates for grain filling in wheat and other cereals (Araus et al., 1993; Bort et al., 1994; Tambussi et al., 2007; Sanchez-Bragado et al., 2014a,b), especially under drought conditions (Tambussi et al., 2005). To date, several studies have analyzed the photosynthetic contribution of the ear to grain filling (Tambussi et al., 2007; Maydup et al., 2010, 2012, 2014). Nevertheless, the ear's contribution in terms of nitrogen supply to growing grains remains unclear, despite studies that have emphasized its importance (Simpson et al., 1983; Lopes et al., 2006). The contribution of other parts of the spike such as the glumes and awns to grain nitrogen might be relevant because these tissues have the longest period of metabolic activity during grain filling (Simpson et al., 1983; Jukanti et al., 2008; Bahrani and Joo, 2010). In fact, measurements of enzyme activities related to nitrogen metabolism (i.e., glutamine synthetase and glutamate dehydrogenase) in floral parts of wheat (awns, glumes) and the flag leaf reveal the existence of ammonia turnover activity in all these organs (Maheswari et al., 1992), suggesting that glumes and awns, in addition to the flag leaf, can play an important role in nitrogen metabolism during grain filling.
For instance, the relative daily contribution of different organs to N accumulation in wheat grains during mid grain filling has been reported as 40% from leaves (leaf lamina and sheath), 23% from glumes, 23% from stems, and 16% from roots (Simpson et al., 1983). However, in advanced stages of grain development, the contribution of the glumes to grain nitrogen has been observed to be greater (38%) than that of the flag leaf (19%) (Lopes et al., 2006). Protein content in the flag leaf blade seems to be constant until anthesis and to decline during grain filling, whereas glumes can accumulate proteins until 5 days after anthesis (Waters et al., 1980). Thus, glumes can act as a temporary sink for nitrogen in the absence of alternative sinks prior to rapid grain filling (Dalling et al., 1976). Then, during rapid grain filling, as the sink strength of the grain increases, glumes are converted into a nitrogen source, presumably remobilizing their accumulated nitrogen to the grains (Waters et al., 1980; Lopes et al., 2006). In contrast, the nitrogen contribution of the stem to grain nitrogen has been observed to be minor due to its low protein content available for mobilization (and low N content loss) compared to the glumes and the flag leaf (Waters et al., 1980). However, it may be relevant to study the potential contribution of this plant part, since all the N that leaves remobilize to the grains has to pass through it, especially through the upper part of the stem (peduncle). This is particularly important for the leaves below the flag leaf, since they start to senesce earlier than the flag leaf during grain filling. Thus, it is important to have a better understanding of the role of the spike as a source of N to growing grains, in order to develop phenotypic tools to assess the relative contribution of different plant parts to the N accumulated in the grains of wheat. In the long term, the objective is to develop genotypes that use N more efficiently.

The natural variation of the stable nitrogen isotopes (<sup>15</sup>N/<sup>14</sup>N) has been considered as a tool to study plant nitrogen dynamics and as a tracer of the nitrogen sources used by the plant (Evans, 2001; Rossato, 2002; Malagoli et al., 2005). Although it is known that plant nitrogen isotope abundance (δ<sup>15</sup>N) is linked to nitrogen metabolism (Robinson, 2000; Ellis, 2002; Pritchard and Guy, 2004), the underlying biochemical mechanisms that affect nitrogen isotope composition (δ<sup>15</sup>N) are not yet completely understood (Cernusak et al., 2009). One of the reasons for this might be the fractionation that nitrogen isotopes undergo during enzymatic assimilation of ammonium or nitrate into other forms (Yousfi et al., 2012). Besides, other processes such as volatilization, translocation, or nitrogen recycling in the plant can discriminate positively or negatively against <sup>15</sup>N (Robinson et al., 1998; Evans, 2001). Nevertheless, in spite of the discrimination processes affecting δ<sup>15</sup>N (Evans, 2001), vegetative organs in wheat have been reported to exhibit different δ<sup>15</sup>N (Lopes et al., 2006; Yousfi et al., 2009, 2013). Furthermore, the natural abundance of the stable N isotopes has been observed to be affected by water availability (Handley et al., 1994; Robinson, 2000; Lopes et al., 2004, 2006; Araus et al., 2013) and the nitrogen source (Cliquet et al., 1990). Thus, provided that nitrogen isotope fractionation from the vegetative organs to the growing grains is negligible (or at least constant; Dawson et al., 2002; Serret et al., 2008), an alternative method to study N uptake and remobilization by the plant could be to compare the natural abundance δ<sup>15</sup>N of the different plant organs during grain filling.

The aim of this work is to study the contribution of different plant parts to the nitrogen in grains under different growing conditions (using different water and nitrogen levels) in a set of old (i.e., tall) and modern (i.e., semidwarf) durum wheat genotypes. To this end, the natural abundance of δ <sup>15</sup>N and the total N content in the flag leaf blade, peduncle, roots, the whole spike and different tissues (glumes and awns) of the plant were compared to values of δ <sup>15</sup>N and total N content in mature grains in order to trace the origin of the nitrogen redistributed into the grains. The final objective is to assess the performance of the total N content together with the δ <sup>15</sup>N in its natural abundance as an affordable tool to assess the relative contribution of different plant parts to the N accumulated in the grains of wheat.

## MATERIALS AND METHODS

## Germplasm Used and Experimental Conditions

Ten durum wheat [Triticum turgidum L. ssp. durum (Desf.) Husn.] genotypes were studied: five old Spanish cultivars (Blanqueta, Griego de Baleares, Negro, Jerez 37, and Forment de Artes) and five modern (i.e., semidwarf) Spanish cultivars released after 1990 (Anton, Bolo, Don Pedro, Regallo, and Sula). Old cultivars were chosen based on the similarity of their phenology to that of modern cultivars. Field experiments were conducted during two growing seasons, one in 2011 and the other in 2012 (Sanchez-Bragado et al., 2014a), at the experimental station of the Instituto Nacional de Investigación y Tecnología Agraria y Alimentaria (INIA) in Aranjuez (40° 03′N, 3° 31′E, 500 m asl). In the experimental field, the soil is Entisol Fluvent Xerofluvent, with the upper 0.4 m having an organic matter content of 4.9 g/kg, total nitrogen content of 0.37 g/kg, carbonate content of 233 g/kg, pH of 8.1 and electrical conductivity of 0.164 dS/m (Araus et al., 2013). During 2012 the five old Spanish genotypes grown under support irrigation conditions were also discarded due to lodging. Two water treatments (support irrigation, SI, and rain-fed, RF) combined with two nitrogen regimes (fertilized, HN, and non-fertilized, LN) were assayed. The trials were planted on 30 December 2010 and 18 November 2011 (from now on designated by their harvest year 2011 and 2012, respectively) in plots with six rows 0.20 m apart, covering a total area of 7.1 m<sup>2</sup> (5 m length and 1.42 m width) per plot. Total accumulated precipitation during the 2011 and 2012 seasons was 275.4 and 126.1 mm, respectively. For both years sprinkler irrigation was applied to irrigated plots around initiation of booting (beginning of April) and grain filling (around May 15th and 30th) with ∼60 mm of water on each date. Prior to sowing, all trials received 60 kg ha<sup>−1</sup> of phosphorus as superphosphate (18%) and 60 kg ha<sup>−1</sup> of potassium as potassium chloride (60%).
Further, the HN plants were dressed with nitrogen applied at the beginning of tillering (January 27th in 2011 and December 29th in 2012) and jointing (March 20th in 2011 and February 20th in 2012) using doses of 45 and 105 kg ha<sup>−1</sup> of urea (46%), respectively. The LN plants were not N fertilized, relying exclusively on the N availability in the soil before sowing. Water and nitrogen treatments were arranged according to a split-split plot design with three replicates. Experimental plots were kept free of weeds, insect pests, and diseases by recommended chemical measures (Sanchez-Bragado et al., 2014a).

Phenology was recorded throughout the growth cycle (Zadoks et al., 1974). Sampling was performed around 7 days after anthesis (7th May) in 2011 and 10 days after anthesis (18th April) in 2012. In 2011 stomatal conductance (g<sub>s</sub>) was measured with a leaf porometer (Decagon; Pullman, USA) in one leaf per plot at the midpoint of grain filling. Similarly, chlorophyll content was measured with a SPAD-502 Minolta chlorophyll meter (Spectrum Technologies, Plainfield, IL, USA). In 2011, roots were collected from the upper layer (0–10 cm) with a split tube (Eijkelkamp Soil & Water, The Netherlands), rinsed with distilled water and then placed inside a paper envelope. Based on similar values of GY obtained from previous studies, two old Spanish cultivars (Blanqueta and Negro) and two modern cultivars (Anton and Bolo) with three replicates for each growing condition (4 treatments) were selected for root extraction (48 plots). Thereafter, five representative flag leaves, peduncles, and ears were collected per plot, and oven dried together with the collected roots at 70 °C for 48 h. Once dried, in 2011 the entire spike and flag leaf blade were weighed and ground, whereas in 2012 the glumes, awns, flag leaf blade, and peduncle were separated after drying, weighed, and finely ground for total nitrogen content and nitrogen isotope signature analyses as described below. In addition, based on similar grain nitrogen contents obtained in previous studies, developing grains of a modern cultivar (Regallo) and an old Spanish cultivar (Jerez 37) from the same set of samples (with four growing conditions and three replicates per genotype) were also separated, weighed and finely ground for total nitrogen content and nitrogen isotope analyses as explained below. At maturity, the central four rows of each plot were harvested and grain yield (GY) recorded. Harvesting was performed manually in 2011 and by machine in 2012.
Subsequently mature grains were processed for N content and isotope analysis. Total nitrogen GY was then calculated as the product of GY by N content on a dry matter basis of mature grains.
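The nitrogen grain yield described above is a simple product, which can be sketched as follows. This is a minimal illustration, assuming GY is expressed in Mg ha<sup>−1</sup> and grain N content as a percentage of dry matter; the function name and the 2.5% N value are hypothetical and are not taken from the study.

```python
def nitrogen_grain_yield(gy_mg_ha: float, grain_n_pct: float) -> float:
    """Nitrogen grain yield (kg N per ha) = GY x grain N content.

    gy_mg_ha    -- grain yield in Mg per ha (1 Mg = 1000 kg)
    grain_n_pct -- grain N content as % of dry matter (assumed unit)
    """
    if gy_mg_ha < 0 or not 0 <= grain_n_pct <= 100:
        raise ValueError("inputs out of range")
    return gy_mg_ha * 1000.0 * grain_n_pct / 100.0


# 1.7 Mg/ha is the 2011 mean GY reported in the Results;
# the 2.5% grain N content is a hypothetical value for illustration only.
example_ngy = nitrogen_grain_yield(1.7, 2.5)
```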

#### Nitrogen Concentration and Stable Isotope Composition

The total N content and stable nitrogen isotope signature in the dry matter of the entire spike, glumes, awns, flag leaf, peduncle, roots, and developing and mature grains were analyzed. Approximately 1 mg of each sample was weighed into tin capsules and measured with an elemental analyser (Flash 1112 EA; ThermoFinnigan, Bremen, Germany) coupled with an Isotope Ratio Mass Spectrometer (Delta C IRMS, ThermoFinnigan, Bremen, Germany) operating in continuous flow mode in order to determine the total N content and the stable nitrogen (<sup>15</sup>N/<sup>14</sup>N) isotope ratios. The <sup>15</sup>N/<sup>14</sup>N ratios of plant material were expressed in δ notation (Coplen, 2008):

$$\delta^{15}\mathrm{N} = \frac{\left(^{15}\mathrm{N}/^{14}\mathrm{N}\right)_{\mathrm{sample}}}{\left(^{15}\mathrm{N}/^{14}\mathrm{N}\right)_{\mathrm{standard}}} - 1$$

where "sample" refers to plant material and "standard" to N<sub>2</sub> in air.
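As a worked illustration of the δ notation, the sketch below converts an isotope ratio into a δ<sup>15</sup>N value in per mil (‰). The value used for the <sup>15</sup>N/<sup>14</sup>N ratio of atmospheric N<sub>2</sub> is a commonly cited figure and is an assumption here, since the article does not state it; the function name is likewise hypothetical.

```python
# Assumed 15N/14N ratio of atmospheric N2 (commonly cited value; not given in the article).
R_AIR = 0.0036765


def delta15n_permil(r_sample: float, r_standard: float = R_AIR) -> float:
    """Return delta-15N in per mil: (R_sample / R_standard - 1) * 1000."""
    if r_standard <= 0:
        raise ValueError("standard ratio must be positive")
    return (r_sample / r_standard - 1.0) * 1000.0


# A sample whose ratio is 0.46% above the standard has a delta-15N of about 4.6 per mil.
example_delta = delta15n_permil(R_AIR * 1.0046)
```

Positive values indicate a sample enriched in <sup>15</sup>N relative to air, and negative values a depleted one.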

#### Water-Soluble Fraction

The protein-free water-soluble fractions (WSFs) of the flag leaf and spike (entire spike, awns, and glumes) were extracted from the same dry samples tested for nitrogen isotopes, as described previously (Cabrera-Bosquet et al., 2011; Yousfi et al., 2013). In summary, 50 mg of either fine leaf or ear powder were suspended with 1 ml of MilliQ water in an Eppendorf tube (Eppendorf Scientific, Hamburg, Germany) for 20 min at about 5 °C. After centrifugation (12,000 g for 5 min at 5 °C), the pellet was discarded and the supernatant containing the WSF was heated at 100 °C for 3 min, during which the heat-denatured proteins precipitated. Subsequently, samples were centrifuged again (12,000 g for 5 min at 5 °C) to separate the denatured proteins from the soluble fraction. Aliquots of 40 µl of supernatant containing protein-free WSF were transferred into tin capsules for nitrogen analysis. The capsules containing the aliquots were oven dried at 60 °C.

## Total Organ N and Potential, Relative Organ N Contribution to Grain N

Total organ nitrogen content of the flag leaf blade, peduncle, whole spike, glumes, and awns was calculated as the product of nitrogen content on a dry matter basis in the different organs multiplied by their respective dry weight. For the whole spike N calculation, the total grain N of developing grains was subtracted from the calculation (taking into account the treatment and genotype). The potential organ nitrogen contributions of the flag leaf blade, peduncle, spike, glumes, and awns to the nitrogen accumulated in the grains were calculated as the product of N content of each organ multiplied by its respective dry weight and standardized (i.e., divided) by the total N content of mature grains per spike. The relative contribution of the different organs was calculated as the potential organ N contribution for the specific organ divided by the sum of the potential organ N of all organs studied. In addition the ratio between mean values of dry weight (g) under rainfed (RF) vs. supplemental irrigation (SI) was calculated for the spike, flag leaf, peduncle, glumes, and awns.
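The calculations in this section can be sketched as below. This is a minimal illustration under stated assumptions: N content is taken as a percentage of dry matter, weights are in mg, and all function names and numerical values are hypothetical rather than data from the study. For the whole spike, the N of developing grains would first be subtracted from the spike total, as described above.

```python
from typing import Dict


def total_organ_n(n_content_pct: float, dry_weight_mg: float) -> float:
    """Total organ N (mg) = N content (% of dry matter) x organ dry weight (mg)."""
    return n_content_pct / 100.0 * dry_weight_mg


def potential_contribution(organ_n_mg: float, grain_n_per_spike_mg: float) -> float:
    """Potential organ N contribution (%): organ N divided by mature-grain N per spike."""
    if grain_n_per_spike_mg <= 0:
        raise ValueError("grain N per spike must be positive")
    return organ_n_mg / grain_n_per_spike_mg * 100.0


def relative_contributions(potential: Dict[str, float]) -> Dict[str, float]:
    """Relative contribution (%): each potential value over the sum across organs."""
    total = sum(potential.values())
    if total <= 0:
        raise ValueError("sum of potential contributions must be positive")
    return {organ: value / total * 100.0 for organ, value in potential.items()}


# Hypothetical organs: (N content in % of dry matter, dry weight in mg).
organs = {
    "flag leaf blade": (4.0, 100.0),
    "peduncle": (1.0, 300.0),
    "glumes": (1.5, 200.0),
}
grain_n_per_spike_mg = 20.0  # hypothetical total N of mature grains per spike

potential = {
    name: potential_contribution(total_organ_n(n_pct, dw), grain_n_per_spike_mg)
    for name, (n_pct, dw) in organs.items()
}
relative = relative_contributions(potential)
```

With these illustrative inputs, the potential contributions are 20, 15, and 15% of grain N, and the relative contributions are 40, 30, and 30%. The potential values need not sum to 100%, whereas the relative values always do.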

#### Statistical Analysis

Treatment, organ, and genotype effects were assessed by means of Analysis of Variance (ANOVA). Water regime, nitrogen supply, organ and their interactions were included as fixed factors. Means were compared by Tukey's HSD test. A bivariate correlation procedure was constructed to analyse the relationships between the measured traits. Statistical analyses were performed using the SPSS 18.0 statistical package (SPSS Inc., Chicago, IL, USA). Figures were created using Microsoft Excel 2010 (Microsoft Corporation).
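For readers who want to reproduce the bivariate correlations outside SPSS, the following sketch computes a Pearson correlation coefficient in plain Python. It assumes the reported correlations are Pearson coefficients, which the article does not state explicitly, and the data shown are hypothetical.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired observations (e.g., organ N vs. grain N per spike).
organ_n = [1.0, 2.0, 3.0, 4.0]
grain_n = [2.1, 3.9, 6.2, 7.8]
r = pearson_r(organ_n, grain_n)
r_squared = r ** 2  # coefficient of determination of the simple linear fit
```

For a simple linear regression with one predictor, the coefficient of determination equals the square of the Pearson coefficient, so the sign of the relationship has to be read from r or from the slope.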

## RESULTS

#### Grain Yield and Organ Nitrogen Content

Average grain yield (GY) and nitrogen grain yield (nitrogen GY) across treatments and genotypes were lower in 2011 (1.7 Mg·ha<sup>−1</sup>) than in 2012 (3.1 Mg·ha<sup>−1</sup>; **Tables 1, 2**, respectively). However, the interaction of nitrogen and water showed an effect on GY in the two-way ANOVA (Table S1). Considering the different treatments, the highest average GY and nitrogen GY were observed under supplemental irrigation, regardless of the N fertilization (fertilized, SI+HN or non-fertilized, SI-LN) conditions for both 2011 (**Table 1**) and 2012 (**Table 2**). Modern cultivars showed higher GY and nitrogen GY than the old cultivars during both growing seasons (Table S1).

In 2011 the average leaf chlorophyll content (SPAD) values across water and nitrogen conditions at the time of organ collection were 52.9 and 52.6 for supplemental irrigation and rain-fed conditions, and 55.8 and 49.8 for the fertilized and non-fertilized treatments, respectively (being only significant in fertilized conditions, data not shown). Regardless of the growing conditions, the total grain N content per spike (Grain N·spike<sup>−1</sup>) was much higher than the total N content of any of the different organs analyzed in 2011 and 2012 (**Tables 1, 2**). Moreover, in 2011, irrespective of the organ analyzed and the water regime, the N content on a dry matter basis and the total organ N were higher in fertilized than in non-fertilized conditions (**Table 1**). Conversely, in 2012 and regardless of water regime, even though the N content on a dry matter basis was lower in non-fertilized (LN) than in fertilized (HN) conditions, the total organ N of all organs was higher in LN conditions with the exception of the peduncle (**Table 2**). At the same time, the flag leaf blade, peduncle, spike, roots, and awns together with mature grains exhibited higher total organ N content under support irrigation (SI) than rain-fed (RF) conditions (irrespective of nitrogen level) with the exception of the glumes (**Tables 1, 2**). In fact, organs that developed under SI conditions also showed larger dry weight than under RF conditions (Table S2). The organ with the highest N content on a dry matter basis in 2011 was the spike (**Table 1**), whereas in 2012 it was the flag leaf blade. Conversely, the glumes in 2012 showed the lowest N content on a dry matter basis (**Table 2**).
With regards to genotypic differences, modern cultivars showed higher N content on a dry matter basis in the flag leaf, spike, peduncle, and awns than old cultivars, whereas total organ N was only higher in mature grains, the peduncle, and awns for modern cultivars compared to old cultivars (Table S3). Similarly, only N content showed a genotypic effect in all studied organs, whereas water supply, fertilization and organ showed an effect on almost all studied parameters (**Tables 1, 2**). Besides, the interaction of water supply, fertilization and organ was only significant for N content and for δ <sup>15</sup>N in WSF (see Table S1).

In 2011, the average nitrogen isotope composition in the dry matter (δ<sup>15</sup>N DM) within the four growing conditions was enriched in the grains (4.6‰) and depleted in the roots (2.2‰), whereas the spike and the flag leaf blade were between these values (3.8 and 3.5‰ for the flag leaf blade and the spike, respectively; **Table 1**). Similarly, in 2012 the δ<sup>15</sup>N DM of all organs studied (flag leaf blade, peduncle, glumes, and awns) was depleted compared to mature grains (4.5‰) (**Table 2**). However, during both seasons the δ<sup>15</sup>N was enriched in all organs under non-fertilized conditions (SI-LN and RF-LN) compared to fertilized conditions (SI+HN and RF+HN). In particular, the δ<sup>15</sup>N in all organs under LN conditions was enriched when associated with SI plots (SI-LN) compared with RF plots (RF-LN) for both growing seasons (**Tables 1, 2**). Similarly, under fertilized conditions, the δ<sup>15</sup>N in all organs was enriched when associated with SI plots (SI+HN) compared to RF plots (RF+HN). With regard to the WSF, δ<sup>15</sup>N was generally lower (depleted) than in the DM but the trends across water and fertilization regimes and genotypes were similar (**Tables 1, 2**). However, modern cultivars showed enriched δ<sup>15</sup>N compared to old cultivars only in mature grains (Table S3).

TABLE 1 | Genotype (G), water supply (W), nitrogen (N), and organ (O) effects (ANOVA) and mean values of nitrogen isotope composition (δ<sup>15</sup>N) in the water-soluble fraction (WSF) and dry matter (DM) of the flag leaf blade and entire spike as well as in mature grains and roots, nitrogen content (N content), total nitrogen content per organ (Total organ N), grain yield (GY) and total nitrogen content of mature grains per spike (Grain N·spike<sup>−1</sup>).


Nine durum wheat genotypes (genotype Forment de Artes was discarded due to late phenology) and three replicates per genotype (five modern, i.e., semidwarf, cultivars for SI and RF conditions and four old, i.e., tall, cultivars only under RF conditions; 84 plots) were considered under rainfed N fertilized (RF+HN) and non-fertilized conditions (RF−LN) and supplemental irrigation N fertilized (SI+HN) and non-fertilized conditions (SI−LN). Sampling was performed 7 days after anthesis and the experiment was performed under field conditions during the 2011 crop season at the INIA's Experimental Station, Aranjuez, Spain. Mean values across plant tissues with different letters are significantly different according to Tukey's honestly significant difference test (P < 0.05).

Level of significance: ns, not significant; \*P < 0.05; \*\*P < 0.01; \*\*\*P < 0.001.

### Potential and Relative Organ N Contribution to Grain N

The potential contribution of the entire ear as a source of N to the grain nitrogen (grain N) was on average (across all growing conditions) more than double (42%) that of the flag leaf blade (17%) in 2011 (**Figure 1**, upper panel), whereas in 2012 the potential N contribution of the flag leaf blade (33%) to grain N was higher than that of the main individual tissues of the spike such as the glumes (13%) or awns (13%), and higher than that of the peduncle (28%) (**Figure 1**, lower panel). Further, the potential N contribution (i.e., compared with the other plant parts studied) of the flag leaf as a source of N to grain N increased to 46% with better growing conditions (SI+HN) in 2012 (**Figure 1**, lower panel), whereas the opposite trend occurred for the entire spike (50% in 2011), peduncle (31% in 2012) and glumes and awns (23% in 2012). Indeed, the sum of the potential N contribution of awns and glumes to the total N accumulated in the grains (46%) under RF was comparable to that of the flag leaf blade (46%) under SI conditions (**Figure 1**, lower panel). Moreover, the ratio between the weight of each analyzed organ under RF and its weight under SI conditions (Table S2) was lower in the flag leaf (0.42) than in spike organs such as glumes (0.88) and awns (0.77). Furthermore, regardless of the growing conditions, old cultivars showed higher potential N content in the whole spike, its specific parts (glumes and awns) and the organs below the spike (peduncle and flag leaf) than the modern cultivars (Figure S1). However, no genotypic effect was observed on the potential N content of the different organs (Table S4).

Correlations across growing conditions between nitrogen grain yield and the relative N content of glumes (R<sup>2</sup> = 0.69, P < 0.001), awns (R<sup>2</sup> = 0.46, P < 0.001), and peduncle (R<sup>2</sup> = 0.42, P < 0.001) were negative (**Figure 2**), whereas the same category of relationship in the flag leaf was positive (R<sup>2</sup> = 0.78, P < 0.001). Thus, whereas the relative N content of the flag leaf with regard to the nitrogen GY increased with better growing conditions (and thus higher nitrogen grain yield), the relative N contribution in the glumes, awns, and peduncle increased under more stressed conditions (low nitrogen grain yield; **Figure 2**).

The correlations of total N content per organ against the total grain N content per spike across growing conditions supported a greater role for the spike as a whole than the flag leaf blade, at least under the low yielding conditions of 2011. Thus the total N in the spike was better related to total grain N content per spike than the total N in the flag leaf (**Figure 3**, left panel). However, for the 2012 season, correlations between total organ N content and total grain N content per spike were positive and significant for all the plant parts studied, but those of the flag leaf blade and also the awns were slightly higher (P < 0.001) than those of the peduncle and the glumes (**Figure 3**, right panel). These results suggest an increase in the role of the flag leaf blade as a source of N to the grains as growing conditions improved (**Figure 3**, right panel).

TABLE 2 | Genotype (G), water supply (W), nitrogen (N), and organ (O) effects (ANOVA) and mean values of nitrogen isotope composition (δ<sup>15</sup>N) in the water-soluble fraction (WSF) and dry matter (DM) of the flag leaf blade, peduncle, glumes, awns as well as in mature grains, nitrogen content (N content DM), total nitrogen content per organ (Total organ N), grain yield (GY), and total nitrogen of mature grains per spike (Grain N·spike<sup>−1</sup>).

Ten durum wheat genotypes and three replicates per genotype (5 modern cultivars for SI and RF conditions and five old cultivars under RF conditions alone) were considered under rainfed N fertilized (RF+HN) and non-fertilized conditions (RF-LN) and supplemental irrigation N fertilized (SI+HN) and non-fertilized conditions (SI-LN). Sampling was performed 10 days after anthesis and the experiment was performed under field conditions during the 2012 crop season at the INIA's Experimental Station, Aranjuez, Spain. Mean values across plant tissues with different letters are significantly different according to Tukey's honestly significant difference test (P < 0.05). Level of significance: ns, not significant; \*P < 0.05; \*\*P < 0.01; \*\*\*P < 0.001.

The relationships between harvest index (HI) and the potential N contribution of the different organs across all genotypes and growing conditions were studied (**Table 4**). Except for the flag leaf in 2012, a negative correlation between the potential N contribution of a given organ and HI was observed, with old cultivars exhibiting, in general, higher ear contributions and lower HIs than the modern cultivars.

## Fractionation Nitrogen Isotope Composition

Strong linear correlations (including all growing conditions) were observed between the δ<sup>15</sup>N in the mature grains and the δ<sup>15</sup>N in the dry matter of all studied plant organs in both growing seasons (P < 0.001; **Table 3**, **Figure 4**). The strongest correlation with the δ<sup>15</sup>N of the grains was shown by the flag leaf blade (r = 0.96, P < 0.001; **Table 3**, **Figure 4**, upper-left panel), followed by the entire spike (r = 0.92, P < 0.001) and the glumes (r = 0.91, P < 0.001). In addition, the higher the δ<sup>15</sup>N in the grains, the closer the δ<sup>15</sup>N values of the different organs were to the δ<sup>15</sup>N of the grains (**Figure 4**).

FIGURE 1 | Potential N contribution of the flag leaf blade, peduncle, spike, glumes, and awns to the nitrogen accumulated in the grains. Values were calculated as the product of nitrogen content (N content) of the different organs multiplied by their respective dry weight and standardized by the total nitrogen content of mature grains per spike (Grain N·spike<sup>−1</sup>). For the total spike N calculation, total grain N of developing grains was subtracted from the calculation (see Section Materials and Methods). Ten durum wheat genotypes (genotype Forment de Artes was discarded due to late phenology in 2011) and three replicates per genotype (totalling 84 plots in 2011 and 90 plots in 2012) were considered under rainfed N fertilized (RF+HN) and non-fertilized conditions (RF−LN) and supplemental irrigation N fertilized (SI+HN) and non-fertilized conditions (SI−LN). Sampling was performed 7 and 10 days after anthesis (2011 and 2012, respectively) and the experiment was performed under field conditions at the INIA's Experimental Station, Aranjuez, Spain during the 2011 (upper panel) and 2012 (lower panel) growing seasons. Mean values across organs and different growing conditions with different letters are significantly different according to Tukey's honestly significant difference test (P < 0.05).

Conversely, the more depleted the δ<sup>15</sup>N in the grains, the further the δ<sup>15</sup>N values of the different organs fell below the δ<sup>15</sup>N of the grains in 2011 (Figure S2) and 2012 (**Figure 4**). Specifically, the δ<sup>15</sup>N in the flag leaf blade deviated most (furthest below) from the δ<sup>15</sup>N of the grains, especially under RF conditions. Conversely, the δ<sup>15</sup>N in the glumes and awns remained at a constant distance from the δ<sup>15</sup>N in the grains, regardless of the δ<sup>15</sup>N of the grains (**Figure 4**, left-upper and lower panel). Nevertheless, nitrogen fractionation from source organs to the sink (assumed to be the difference δ<sup>15</sup>N<sub>organs</sub> − δ<sup>15</sup>N<sub>grains</sub>) increased with improvements in growing conditions, as shown by the increase in this difference associated with an increase in GY (**Figure 4**, see figure inset). In particular, the flag leaf showed the highest nitrogen fractionation associated with the increase in GY, whereas the awns showed the lowest nitrogen fractionation associated with GY (**Figure 4**, right lower panel, see figure inset). In addition, in 2011 the δ<sup>15</sup>N in the flag leaf was positively and strongly related to stomatal conductance (r = 0.75; P < 0.001) under RF conditions (Figure S3). Moreover, δ<sup>15</sup>N in the different organs was positively related to GY and nitrogen grain yield in 2011 and 2012 (**Figure 5**). The organ where δ<sup>15</sup>N was best related to GY was the spike (P < 0.001) in 2011 and the flag leaf blade (P < 0.001) in 2012 (**Figure 5**, right and left panels, respectively).
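The organ-minus-grain difference used above as a proxy for fractionation can be illustrated with the 2011 mean δ<sup>15</sup>N values of the dry matter reported earlier in the Results (grains 4.6, flag leaf blade 3.8, spike 3.5, roots 2.2). The sketch below is only an arithmetic illustration on those means; the function name is hypothetical, and the per-plot differences plotted in the figure inset are not reproduced here.

```python
from typing import Dict


def organ_minus_grain(delta_organs: Dict[str, float], delta_grain: float) -> Dict[str, float]:
    """Difference delta15N(organ) - delta15N(grain), rounded to 2 decimals."""
    return {organ: round(value - delta_grain, 2) for organ, value in delta_organs.items()}


# 2011 mean delta-15N of the dry matter, as reported in the Results.
delta_organs_2011 = {"flag leaf blade": 3.8, "spike": 3.5, "roots": 2.2}
delta_grain_2011 = 4.6

differences_2011 = organ_minus_grain(delta_organs_2011, delta_grain_2011)
```

All three differences are negative (about −0.8, −1.1, and −2.4), which matches the statement that every organ was depleted in <sup>15</sup>N relative to the mature grains.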

## DISCUSSION

Grain yield (GY) during 2011 was lower than in 2012, and was within the range of GY previously reported under very severe drought stress conditions in the Mediterranean basin (Araus et al., 1998; Oweis et al., 2000). In addition, the interaction of nitrogen and water showed an effect on GY (Table S1). During both years support irrigation had a positive effect on GY, whereas nitrogen fertilization did not have any positive effect on GY except for SI in 2011, and otherwise tended to have the opposite effect (**Tables 1, 2**). N fertilization could have caused haying off, thus decreasing GY, which might have been triggered by a terminal stress during the reproductive stage (Araus et al., 2013).

### Effect of Water and N Fertilization in Nitrogen Isotope Composition

RF conditions caused a decrease in organ δ <sup>15</sup>N (flag leaf, peduncle, glumes, awns, and grains) compared with the irrigated trial. Likewise, a decrease in δ <sup>15</sup>N under stress conditions has been previously reported in shoots (Yousfi et al., 2012) and mature grains of durum wheat (Araus et al., 2013) and bread wheat (Robinson, 2000). The increase in δ <sup>15</sup>N in response to SI may be the consequence of labile nitrate derived from chemical fertilizers (urea in our case), which has a depleted (near to zero) δ <sup>15</sup>N (Bateman and Kelly, 2007), leaching out of the root zone (Hawkesford, 2014) to lower subsoil layers (Raimanová and Haberle, 2010). Alternatively, the increase in plant biomass due to irrigation may lead to exhaustion of the nitrogen fertilizer, therefore causing the crop to rely on natural sources of soil N, which are characterized by higher δ <sup>15</sup>N values (Serret et al., 2008). Therefore, apart from chemical fertilizers, nitrogen pools derived from the mineralization of soil organic matter and with a δ <sup>15</sup>N-enriched signature (Raimanová and Haberle, 2010) may become available to the plant (Evans and Belnap, 1999).

### Nitrogen Isotope Composition and Grain Yield

Linear correlations of organ δ <sup>15</sup>N with GY and nitrogen GY were strong and positive for both growing seasons (**Figure 5**), as has also been observed in the past with maize (Coque et al., 2006) and durum wheat (Yousfi et al., 2009). Such positive relationships between δ <sup>15</sup>N and GY could be related to some extent to stomatal conductance (Farquhar et al., 1980; Araus et al., 2013). Accordingly, stomatal conductance under RF conditions was positively related to δ <sup>15</sup>N in the flag leaf blade in 2011 (r = 0.75, P < 0.001, Figure S3). Thus, low stomatal conductance may reduce losses of ammonia and nitrous oxides, which would reduce the enrichment of δ <sup>15</sup>N (Farquhar et al., 1980; Smart and Bloom, 2001). In fact, the linear regression between δ <sup>15</sup>N in the mature grains and δ <sup>15</sup>N in the dry matter of the different organs (**Figure 4**) supports this finding: the more enriched the δ <sup>15</sup>N in the grains, the more similar the δ <sup>15</sup>N values of the different organs were to the δ <sup>15</sup>N in the grains (approaching the 1:1 slope of the relationships between δ <sup>15</sup>N of grains and organs). Thus, taking into account that for a given nitrogen fertilization value higher values of δ <sup>15</sup>N represented better growing conditions (observed by the positive correlation


TABLE 3 | Linear regression of the relationship between the nitrogen isotope composition (δ <sup>15</sup>N) in the mature grains (δ <sup>15</sup>N grain DM) against the δ <sup>15</sup>N in the dry matter (DM) and water-soluble fraction (WSF) of the flag leaf blade, spike and roots (2011) and flag leaf blade, peduncle, glumes and awns (2012).

Ten durum wheat genotypes and three replicates per genotype (five modern cultivars for SI and RF conditions and five old cultivars only under RF conditions, 84 plots in 2011, and 90 in 2012 as genotype Forment de Artes was discarded due to late phenology in 2011) were considered under rainfed N fertilized (RF+HN) and non-fertilized conditions (RF-LN) and supplemental irrigation N fertilized (SI+HN) and non-fertilized conditions (SI-LN). Sampling was performed 7 and 10 days after anthesis in 2011 and 2012, respectively and the experiment was performed under field conditions at the INIA's Experimental Station, Aranjuez, Spain, during two consecutive seasons (2011 and 2012). The significant correlations are marked in bold. Levels of significance: ns, not significant; \*P < 0.05; \*\*P < 0.01; \*\*\*P < 0.001.

between GY and δ <sup>15</sup>N), enrichment of δ <sup>15</sup>N due to either N volatilization or leakage and/or exhaustion of chemical fertilizer (with a lower δ <sup>15</sup>N) probably occurred; as a consequence, values of organ δ <sup>15</sup>N became closer to the δ <sup>15</sup>N in the grains (Raimanová and Haberle, 2010). Conversely, different values of δ <sup>15</sup>N in the grains in comparison to the δ <sup>15</sup>N of the different organs suggest that reduction in stomatal conductance in response to RF conditions prevented enrichment in the δ <sup>15</sup>N of the grains. In addition, the flag leaf blade showed the highest nitrogen fractionation associated with the increase in GY compared with the awns, glumes, and peduncle (**Figure 4**, see figure inset). This finding suggests that losses of <sup>14</sup>N due to volatilization were higher in the flag leaf due in part to higher stomata density (and thus conductance and transpiration) in the flag leaf (which is an amphistomatic organ) than in the awns (Tambussi et al., 2005; Li et al., 2006). In addition, the δ <sup>15</sup>N in the roots also showed a similar trend, suggesting that N losses through volatilization or exudation might also be occurring in the roots (Johansson et al., 2009; Figure S1). However, the δ <sup>15</sup>N in field experiments should be interpreted with caution as miscellaneous biotic and abiotic factors can affect the natural abundance and discrimination of δ <sup>15</sup>N in the soil-plant system (Hogberg, 1997; Robinson et al., 1998; Evans, 2001; Robinson, 2001; Cernusak et al., 2009; Yousfi et al., 2012). In fact, in our study fractionation of δ <sup>15</sup>N was present (**Figure 4**) because the δ <sup>15</sup>N of the grains was higher than the values in the individual photosynthetic organs (**Figure 4**).
Linear correlations of GY and nitrogen GY against the organ δ <sup>15</sup>N in the WSF were positive for both growing seasons but weaker (data not shown) compared with the correlation against organ δ <sup>15</sup>N in the dry matter. Such weaker correlations in the organ δ <sup>15</sup>N WSF might be related to the fact that the WSF is protein-free because enzymatic N is removed from the WSF and it only contains free amines (Cabrera-Bosquet et al., 2011).
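The regressions discussed above (grain δ <sup>15</sup>N against organ δ <sup>15</sup>N, as in Table 3 and Figure 4) can be illustrated with a minimal ordinary least-squares sketch. This is not the authors' analysis pipeline: the function name `linear_fit`, the choice of grain δ <sup>15</sup>N as the x variable, and the numbers in the usage example are all hypothetical and serve only to show how a slope, intercept, and Pearson r would be obtained.

```python
from math import sqrt


def linear_fit(x, y):
    """Ordinary least-squares fit of y on x.

    Returns (slope, intercept, pearson_r). Here x would hold grain d15N
    values and y the d15N of one organ (an assumption of this sketch).
    """
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("x and y must have equal length and at least 3 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of squares and cross-products around the means
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    pearson_r = sxy / sqrt(sxx * syy)
    return slope, intercept, pearson_r


if __name__ == "__main__":
    # Hypothetical plot-level values in per mil, not data from the study
    grain_d15n = [1.0, 2.0, 3.0, 4.0]
    flag_leaf_d15n = [0.2, 1.4, 2.6, 3.8]
    slope, intercept, r = linear_fit(grain_d15n, flag_leaf_d15n)
    print(f"slope={slope:.2f} intercept={intercept:.2f} r={r:.2f}")
```

A slope close to 1 with an intercept close to 0 would correspond to organ values tracking the grain values along the 1:1 line; a negative intercept corresponds to organ values lying below the grains.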

#### Relative and Potential Organ Contribution to Grain Nitrogen

The relative (i.e., compared with the other plant parts studied) and potential (i.e., relative value with regard to the total N accumulated in the grains of a spike) contributions of the flag leaf as a source of N for grain nitrogen (grain N) increased as growing conditions improved at least during the second year, whereas the opposite occurred for the peduncle, glumes, awns, and entire spike (**Figure 1**). The flag leaf blade has been reported as the main nitrogen exporter to the grains in bread wheat (Simpson et al., 1983). In spite of the high contribution traditionally assigned to the flag leaf blade as a source of N to the growing grains (Evans, 1983; Araus and Tapia, 1987), in our study, the potential contribution of the flag leaf blade shortly after anthesis was <50% (regardless of the growing conditions; **Figure 1**). Leaf N remobilization to grain N in rice, wheat or maize has been observed to vary from 50 to 90% (Masclaux et al., 2001). In contrast, Simpson et al. (1983) reported that less than half of the nitrogen retranslocated from the leaves arrives directly in the grain, whereas the rest is mostly translocated to the roots. However, most of the nitrogen translocated to the roots is further retranslocated to the shoots via xylem sap, where it supplies transpirative organs such as the glumes, leaves, and stem (Simpson et al., 1983). N directly exported from the roots to nontranspirative organs such as the grains may be a minor player because grains may receive only 1% of the nitrogen exported from the roots (Simpson et al., 1983). Additionally, the relatively low potential N contribution of the flag leaf blade to grain N may also indicate that aside from other parts of this leaf (such as the sheath), other organs of the plant contribute to the N accumulated in the grains. Leaves below the flag leaf were not considered in this study as their potential contribution to grain N is reported to be minor (Del Pozo et al., 2007). Therefore we only considered the upper part of the plant, including the peduncle and the spike (**Figure 1**, upper panel and lower panel). This view is supported by the strong linear correlations (including all growing conditions) between the δ <sup>15</sup>N in the mature grains and the δ <sup>15</sup>N of the dry matter of these plant organs (P < 0.001; **Table 3**). In spite of this, the strongest correlation against δ <sup>15</sup>N in the mature grains was achieved by the flag leaf blade followed by the glumes and the entire spike (**Table 3**).
Additionally, the correlations of the total N content per organ against the total grain N content per spike also support a slightly greater role for the flag leaf than the other plant parts (**Figure 3**, right panel). Besides, the experiment conducted under more severe stress in 2011 (growth cycle with low GY) supports the important role of the spike for grain N, since N content in the entire spike was better related to Grain N·spike−<sup>1</sup> in comparison to the flag leaf (**Figure 3**, left panel). Furthermore, the role of the flag leaf sheath in supplying N to the growing grains should not be neglected as a potential source of N (Araus and Tapia, 1987).
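The potential and relative contributions discussed in this section follow from simple arithmetic on organ N content, organ dry weight, and grain N per spike, as described in the caption of Figure 1. The sketch below is a hedged illustration of that arithmetic: the function names, the units (N content as a percentage of dry weight, weights in mg), the reading of "relative" as the share of the summed N across the organs studied, and the example numbers are all assumptions made here, not values or code from the study.

```python
def organ_n_mg(n_content_pct, dry_weight_mg):
    """Total N held in an organ (mg): N concentration (% of DM) x dry weight (mg)."""
    return n_content_pct / 100.0 * dry_weight_mg


def potential_n_contribution(organ_n, grain_n_per_spike):
    """Organ N as a percentage of the N in the mature grains of one spike."""
    if grain_n_per_spike <= 0:
        raise ValueError("grain N per spike must be positive")
    return 100.0 * organ_n / grain_n_per_spike


def relative_n_contribution(organ_n_by_name):
    """Share (%) of each organ in the summed N of all organs considered."""
    total = sum(organ_n_by_name.values())
    if total <= 0:
        raise ValueError("total organ N must be positive")
    return {name: 100.0 * n / total for name, n in organ_n_by_name.items()}


if __name__ == "__main__":
    # Hypothetical values for a single plot, for illustration only
    organs = {
        "flag_leaf_blade": organ_n_mg(3.0, 100.0),
        "glumes": organ_n_mg(1.0, 100.0),
    }
    grain_n = 10.0  # mg N in mature grains per spike (hypothetical)
    for name, n in organs.items():
        print(name, potential_n_contribution(n, grain_n))
    print(relative_n_contribution(organs))
```

Because the potential contribution is normalized by grain N rather than by the sum of organ N, the values for the individual organs need not add up to 100%.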

Therefore, apart from the potential contribution of the flag leaf (and eventually other vegetative organs), the role of the ear as a supplier of N should be taken into account, as total N allocation in the ear has been observed to be higher than in the flag leaf blade in durum wheat in field chambers (Vicente et al., 2015a). Indeed, accumulation of nitrogen in the grains is closely dependent on N mobilization originating from the glumes (23% of N contribution to the grain) in the absence of an exogenous supply of nitrogen (Simpson et al., 1983). Similarly, in our study the potential and relative N contribution of the awns and glumes as well as the peduncle with respect to the flag leaf blade increased under water stress and non-fertilized conditions (**Figures 1**, **2**, respectively). In fact, the sum of the potential N contribution of awns and glumes under RF conditions was comparable to that of the flag leaf blade under SI conditions (**Figure 1**). In a recent study performed in wheat grown hydroponically, the total N accumulated in the flag leaf blade was comparatively lower than the amount accumulated in the remaining upper parts of the plant (Vicente et al., 2015b). The lower potential N content of the flag leaf blade under RF conditions could be related to organ size but was not concomitant with leaf senescence (as chlorophyll content was similar under RF and SI conditions at the moment of sampling, data not shown). Thus, the ratio between the weight of all analyzed organs under RF divided by the SI conditions (Table S2) was lower in the flag leaf (0.42) than other spike organs such as glumes (0.88) and awns (0.77). These results suggest that the leaf lamina was smaller under RF, whereas the spike size (entire spike, awns, and glumes) was not reduced under RF conditions (Table S2). On the other hand, growing conditions may also affect the efficiency of N transfer from source organs to the grains (Masclaux-Daubresse et al., 2010). 
For example, part of the N accumulated in the flag leaf and other leaves may be exported back to the roots, particularly under stress (drought, low fertility) to promote root development mainly in vegetative stages (Palta and Gregory, 1997) and to a lesser extent during grain filling (Jensen, 1994; Swinnen et al., 1994). Conversely, during rapid grain filling glumes may only retranslocate N to the grains, thus increasing the role of N under water stress conditions (Waters et al., 1980). In addition, the potential N contribution of the ear, peduncle and glumes was higher in old than modern cultivars (Figure S2). Such differences might be related, at least in part, to the larger harvest index (HI) in modern cultivars than old cultivars (Reynolds et al., 2009; Foulkes et al., 2011; Sanchez-Bragado et al., 2014a) due to the introduction of dwarfism alleles (Maydup et al., 2012). The negative correlations between the potential organ nitrogen contribution and HI (**Table 4**) suggest that old cultivars may provide a greater contribution of N to the grains in relative terms, especially from the ear due to a sink-driven phenomenon. In a semidwarf (i.e., modern) genotype the absolute amount of N accumulated in the peduncle may be lower than in a tall (i.e., old) cultivar, although no significant differences were observed (Table S1). Conversely, larger spikes in the modern cultivars decrease the surface (where the photosynthetic tissues accumulating N are placed) to volume (where grain N is accumulated) ratio in comparison to spikes in the old cultivars. That is, the spike surface (assumed to be the photosynthetic tissues of the spike such as the glumes and awns) decreases relative to the spike volume (assumed to be the grains). Thus, in modern cultivars a lower relation between spike surface and volume may have

TABLE 4 | Linear regression of the relationship between the HI and Potential N contribution of the flag leaf blade, peduncle, spike, glumes, and awns to the nitrogen accumulated in the grains.


Values were calculated as the product of nitrogen content (N content) of the different organs multiplied by their respective dry weight and standardized by the total nitrogen content of mature grains per spike (Grain N·spike−<sup>1</sup>). For the total spike N calculation, total grain N of developing grains was subtracted from the calculation (see Section Materials and Methods). Ten durum wheat genotypes and three replicates per genotype (five modern cultivars for SI and RF conditions and five old cultivars under RF conditions alone, 84 plots in 2011 and 90 in 2012 as the genotype Forment de Artes was discarded due to late phenology in 2011) were considered combining all growing conditions under rainfed N fertilized (RF+HN) and non-fertilized conditions (RF-LN) and supplemental irrigation N fertilized (SI+HN) and non-fertilized conditions (SI-LN). Sampling was performed 7 and 10 days after anthesis (2011 and 2012, respectively) and the experiment was performed under field conditions at the INIA's Experimental Station, Aranjuez, Spain during the 2011 and 2012 growing seasons.

decreased the potential of the ear to provide nitrogen to the grains compared to old cultivars. Consequently, these results suggest that among the set of genotypes studied, potential spike N content is not only dependent on growing conditions but also on genotype (sink strength and plant height).

In summary, the significant correlations between the total N content of the different plant organs studied (flag leaf blade, peduncle, glumes, and awns) and the grain N per spike suggest that all these organs can potentially export a proportion of their N to the grains. This view was supported by the strong linear correlations (including all growing conditions) between the δ <sup>15</sup>N in the mature grains and the δ <sup>15</sup>N in the dry matter of all studied plant organs. Moreover, the large amount of N accumulated in the whole grains of the spike, together with the relatively low amount of N available in the different organs, supports the concept that N imported into the grains cannot be sustained by one organ alone; rather, different organs may simultaneously export nitrogen to the grains. In spite of that, the role of the flag leaf blade as a potential supplier of N to grains increased, in comparison to other upper parts of the plant, under improved growing conditions (and thus higher GY) as well as in the modern (semidwarf) cultivars compared to the old cultivars. In contrast, the relative importance of the ear and peduncle increased under water stress conditions (low GY) or in the old genotypes compared to the new genotypes. Such findings indicate that other than the flag leaf (and eventually other vegetative organs), the role of the ear as a supplier of N should be taken into account even though growing conditions may affect the relative potential contribution of the different plant parts. Thus, the potential ear N content could be a positive trait for plant phenotyping, especially under water limiting and/or low fertility conditions. The total N content of the spike at early grain filling should be considered a trait amenable to crop management (e.g., precision agriculture) as well as to breeding (phenotyping). Nevertheless, the challenge is to find high throughput monitoring techniques for this trait. Besides the specific results achieved, the objective of this study was to test methodologies to assess the potential N contribution of different organs to developing grains. It does not attempt to estimate in detail the time-integrated contributions of different organs to grain N throughout grain filling, but to provide a comparative view across organs that is potentially useful as an affordable phenotyping tool.

#### AUTHOR CONTRIBUTIONS

RS and JA conceived and designed the study; RS and MS carried out the field measurements; RS conducted laboratory work; RS and JA analyzed the data; RS and JA interpreted the results; RS took the principal role in writing the manuscript under supervision of JL. All authors have contributed to the revision of the manuscript.

#### ACKNOWLEDGMENTS

This work was supported by the projects AGL2013-44147-R and AGL2016-76527-R from the Secretaría de Estado de Investigación, Desarrollo e Innovación. Dirección General de Investigación Científica y Técnica. Ministerio de Economía y Competitividad, Spain. ICREA-Academia bourse for research quality. 2014-2018. Institut Català de Recerca Avançada (ICREA) Generalitat de Catalunya.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.01986/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Sanchez-Bragado, Serret and Araus. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Transcriptomic Analysis for Different Sex Types of Ricinus communis L. during Development from Apical Buds to Inflorescences by Digital Gene Expression Profiling

Meilian Tan<sup>1</sup> , Jianfeng Xue<sup>1</sup> , Lei Wang<sup>1</sup> , Jiaxiang Huang<sup>2</sup> , Chunling Fu<sup>1</sup> and Xingchu Yan<sup>1</sup> \*

*<sup>1</sup> Key Laboratory of Biology and Genetic Improvement of Oil Crops, Ministry of Agriculture, Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, Wuhan, China, <sup>2</sup> Castor Oil Research Institute of Jiaxiang, Zibo, China*

#### Edited by:

*John Doonan, Aberystwyth University, UK*

#### Reviewed by:

*Lijun Chai, Huazhong Agricultural University, China Konstantinos Vlachonasios, Aristotle University of Thessaloniki, Greece*

> \*Correspondence: *Xingchu Yan yanxc@oilcrops.cn*

#### Specialty section:

*This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science*

Received: *07 July 2015* Accepted: *15 December 2015* Published: *12 February 2016*

#### Citation:

*Tan M, Xue J, Wang L, Huang J, Fu C and Yan X (2016) Transcriptomic Analysis for Different Sex Types of Ricinus communis L. during Development from Apical Buds to Inflorescences by Digital Gene Expression Profiling. Front. Plant Sci. 6:1208. doi: 10.3389/fpls.2015.01208*

The castor plant (*Ricinus communis* L.) is a versatile industrial oilseed crop with a diversity of sex patterns. Its hybrid breeding for improved yield and high purity is still hampered by the genetic instability of female lines and poor knowledge of sex expression mechanisms. To obtain hints about sex expression and provide a basis for further insight into the molecular mechanisms of castor plant sex determination, we performed digital gene expression (DGE) analysis to investigate differences between the transcriptomes of apices and racemes derived from female (JXBM0705P) and monoecious (JXBM0705M) lines. A total of 18 DGE libraries were constructed from the apices and racemes of a wild monoecious line and its isogenic female derivative at three stages of apex development, in triplicate. Approximately 5.7 million clean tags per library were generated and mapped to the reference castor genome. Transcriptomic analysis showed that monoecious and female apical buds underwent identical dynamic changes in gene expression during development from the vegetative to the reproductive phase, with more genes expressed at the raceme formation and infant raceme stages compared to the early leaf bud stage. More than 3000 differentially expressed genes (DEGs) were detected in *Ricinus* apices at three developmental stages between the two sex types. A number of DEGs involved in hormone response and biosynthesis (such as auxin response and transport), transcription factors, signal transduction, histone demethylation/methylation, programmed cell death, and pollination, and putatively associated with sex expression and reproduction, were discovered, and the selected DEGs showed consistent expression between qRT-PCR validation and the DGE patterns. Most of those DEGs were suppressed at the early leaf stage in buds of the mutant but activated at the following transition stage (5-7-leaf stage), and ultimately the number of up-regulated DEGs was equal to the number of down-regulated DEGs in the small raceme of the mutant. In this study, a large number of DEGs and some hints about sex expression and reproduction were discovered using DGE analysis, providing a substantial resource for further insight into the molecular mechanism of sex determination and for further studies in *Ricinus*.

Keywords: castor bean, digital expression profile, pistillate, sex determination, transcriptome analysis

## INTRODUCTION

Castor (Ricinus communis L.), an important industrial oilseed crop belonging to the Euphorbiaceae family, grows as an indeterminate annual or perennial depending on climate and soil types in tropical, sub-tropical, and warm temperate regions (Anjani, 2012). Because of the high fatty acid content in its seeds (more than 45%) and the rich ricinoleic acid content in its oil (80– 90%), castor is a versatile raw material in industrial chemistry: for example, castor oil can be used for the production of lubricants, nylon, hydraulic and brake fluids, paints, dyes, coatings, inks, cold resistant plastics, waxes and polishes, pharmaceuticals, perfumes, and biodiesel (Jeong and Park, 2009; Halilu et al., 2013). To date, several high-yield varieties and hybrids have been developed (Baldanzi and Pugliesi, 1998; Amaral, 2003; Savy Filho, 2005; Pranavi et al., 2011), but to meet the tremendous global demand for castor oil, cultivars with even higher yield and oil content are needed (Anjani, 2012). However, hybrid breeding for improved yield and high purity in Ricinus is still constrained by the genetic instability of females and the unknown mechanism of sex expression.

In the castor plant, the standard type of inflorescence is a gradient monoecious raceme (female flowers at the apex and staminate flowers on the lower portion) (Shifriss, 1960). However, a wide variation of inflorescence patterns occurs in natural cultivation, including other kinds of racemes such as strictly pistillate (bearing only female flowers), male (only staminate flowers), apically interspersed (monoecism with interspersed male flowers in the apical pistillate region), and entirely interspersed (female and male flowers uniformly interspersed) (George and Shifriss, 1967). In addition, inflorescences with one or a few hermaphrodite flowers occur occasionally (Jacob, 1963). Accordingly, for castor individuals, there are several sex models, including normal monoecism, sex reversal, interspersed sexuality, and strictly female (Shifriss, 1956; Jacob and Atsmon, 1965; George and Shifriss, 1967).

In the past, scientists focused on the sexuality of Ricinus to interpret the inheritance and instability of sex variation. Shifriss (1956) described the sex tendency, sex patterns, inheritance and reversion of Ricinus, and proposed a hypothesis: Gene F controls a genetically stable series of sex variants ranging from female (f) to strongly male inbreds, and an unknown factor could affect male tendency, sex reversion, and sex instability by suppressing gene F or mutating itself. Later, he described sex variations of Ricinus in two genetic systems, which he tentatively named "conventional" and "unconventional" (Shifriss, 1960). Monoecious variants and rare recessive female mutants were ascribed to the conventional form, whereas sex reversals and non-reverted females belonged to the unconventional form. Monoecism is governed by two major groups of genes: qualitative genes that determine flower type, and polygenes regulating gradient differentiation and racial differences in sex tendency. In addition, gene modifiers such as id and th were also considered to affect the pattern of sex differentiation. He postulated that genetic instability of spontaneous mutants, so-called position-effect variegation, may result from a rearrangement undergoing two basic kinds of genetic changes: mutation into new hereditary potentialities, and transformation between an "active" and "inactive" state. Moreover, further evidence regarding the former findings (Shifriss, 1956, 1960) revealed that femaleness transmits more effectively to progeny through female inflorescences of sex-reversal plants than through reverted monoecious inflorescences of the same plant (Jacob and Atsmon, 1965). With regard to interspersed sexuality, Shifriss (1956, 1960) believed that the interspersed pattern of sex differentiation is determined by hereditary factors for femaleness and genes for interspersed staminate flowers (id).
George and Shifriss (1967) investigated more deeply into the inheritance of interspersed inflorescence patterns, and concluded that the level of expressivity of interspersed staminate flowers depends on the dosage of two independent genes (id1 and id2), their loci, and the environment.

In addition to genetic regulation, sex expression or variation of Ricinus is simultaneously affected to some degree by fluctuations in the environment, including temperature, vegetative activity, nutrition level, pruning, and seasonal variations (Shifriss, 1956), as well as by plant hormones (Shifriss, 1961; Philipos and Narayanaswamy, 1976; Kumar and Rao, 1980; Mohan Ram and Sett, 1980; Varkey and Nigam, 1982; Tan et al., 2011). Despite a great deal of progress, the hypotheses described above are vulnerable in the absence of cytological information, and the molecular mechanisms of sex variation and genes determining sex expression in Ricinus remain poorly understood.

Genome-wide analyses have dramatically improved the efficiency of gene discovery. Next-generation sequencing (NGS) technologies provide new approaches for global measurements of gene expression. Due to its high efficiency and low cost, NGS has become an attractive alternative method for more efficient study of the genome, epigenome, and transcriptome (Oshlack et al., 2010; McIntyre et al., 2011). Many plant species have benefited from this technology, and large-scale genome sequences and transcriptome data are available in both model and non-model species (Goff et al., 2002; Yu et al., 2002; Kaplan et al., 2006; Ramsey et al., 2008; Huang et al., 2009; Wang et al., 2011; Xu et al., 2013). In cucumber, a model plant for the study of floral sex expression, the sex-controlling genes, sex-modifying plant hormones, and interactions with environmental conditions are clearly understood (Galun, 1962; Trebitsh et al., 1987, 1997; Malepszy and Niemirowicz-Szczytt, 1991; Yin and Quinn, 1995; Yamasaki et al., 2000; Kater et al., 2001; Mibus and Tatlioglu, 2004; Li et al., 2009b), and a number of candidate genes required for sex determination have been identified by transcriptome profile analysis (Guo et al., 2010; Wu et al., 2010). In Ricinus, however, a gap still exists between interpreting the molecular mechanism of sex expression and isolating the sex-determining genes. The complete sequencing of the castor bean genome (Chan et al., 2010) provides tremendous opportunities for genomic analysis and facilitates identification of genes of biological interest. Digital gene expression (DGE) analysis, a powerful and recently developed tool, is based on ultra-high-throughput sequencing of millions of signatures in the genome, allowing identification of specific genes and direct quantitation of transcript abundance (Bentley, 2006; Hong et al., 2011).
DGE analysis detects gene expression more quantitatively than microarray assays ('t Hoen et al., 2008) while providing similar assessments of relative transcript abundance. In comparison to high-throughput mRNA sequencing (RNA-seq), DGE more accurately detects expression differences in poorly expressed genes and exhibits much less transcript length bias (Hong et al., 2011). Thus, DGE has been widely utilized to discriminate differences in transcriptional responses among different tissues and organs in plants in the contexts of biotic and abiotic stress, metabolite biosynthesis, and developmental biology (Tian et al., 2013; Wei et al., 2013; Zhang et al., 2013; Guo et al., 2014; Yu et al., 2014; Zhao et al., 2014).

In this study, we performed DGE using the Illumina HiSeq 2000 to investigate the differences between the transcriptomes of apices and racemes from female (JXBM0705P) and monoecious (JXBM0705M) lines of Ricinus. To obtain hints about sex expression in Ricinus, we conducted (after high-throughput tag sequencing) an integrated bioinformatic analysis to identify expression patterns of genes and critical pathways in the female and monoecious lines at three stages of apex development. Assessment of the changes in gene expression between female and monoecious apices yielded sets of up-regulated and down-regulated genes associated with sex expression. From these differentially expressed genes (DEGs), some putatively related to sex expression were selected for confirmation by quantitative RT-PCR analysis. Thus, comparison of gene expression patterns at three developmental stages of monoecious and female apices provides important hints for further insight into the molecular mechanisms underlying Ricinus sex variation.

#### MATERIALS AND METHODS

#### Plant Material

Seeds of nearly isogenic monoecious (JXBM0705M) and female (JXBM0705P) castor lines were kindly provided by Professor Huang (Castor Oil Research Institute of Jiaxiang, Zibo, Shandong Province). JXBM0705M is a wild monoecious line that bears male and female flowers (**Figure 1A**), and JXBM0705P, a pistillate line that bears only female flowers (**Figure 1B**), was developed by consecutively selecting spontaneous mutants of JXBM0705M. Seeds were sown and cultivated in the Yangluo experimental field of the Oil Crops Research Institute of the Chinese Academy of Agricultural Sciences, under standard field conditions, from spring to autumn of 2012. A previous study revealed that apices are leaf buds prior to the 5-leaf stage (i.e., five fully expanded leaves) and change to floral buds during the 6–9-leaf stage (Wang et al., 2012). Based on this criterion and on anatomical observation of a series of apical buds of the cultivar, we investigated global changes in the transcriptome during the development of buds from leaf apices to racemes by collecting three sample types from each line: apical buds at the 3–4-leaf stage (ABML1 for the monoecious line and ABPL1 for the female line), apical buds at the 5–7-leaf stage (ABML2 for the monoecious line and ABPL2 for the female line) (**Figure 1C**), and small racemes 2–3 cm in size (RML for the monoecious line and RPL for the female line, **Figure 1D**). All samples were collected in triplicate to construct 18 DGE libraries. Excised apices and infant racemes were immediately frozen in liquid nitrogen and stored at −80°C until use.

### RNA Preparation, Library Construction, and Sequencing

FIGURE 1 | … stage (left) and 5–7-leaf stage (right); (D) infant female (left) and monoecious (right) inflorescences, 2 cm in length. Scale bars, 0.5 cm.

Total RNA was extracted from buds and racemes using the GeneJET Plant RNA Purification Mini Kit (Thermo Fisher Scientific, Waltham, USA). RNA was dissolved in diethylpyrocarbonate (DEPC)-treated water and stored at −70°C. The quality of total RNA was checked on 1% agarose gels, and purity was evaluated by the OD260/280 ratio using a NanoDrop ND 1000 spectrophotometer (NanoDrop, Wilmington, DE, USA). Concentration and integrity of RNA were verified using an Agilent 2100 Bioanalyzer. The extracted RNA was used for subsequent sequencing and quantitative RT-PCR validation.

A total of 8 µg of total RNA was subjected to oligo(dT) magnetic bead adsorption to purify mRNA. Oligo(dT) was then used as a primer for the synthesis of first- and second-strand cDNA. The 5′-ends of tags could be generated using two types of endonucleases, namely, NlaIII or DpnII. Bead-bound cDNA was subsequently digested with the restriction enzyme NlaIII, which recognizes and cleaves CATG sites. Fragments other than 3′-cDNA fragments connected to oligo(dT) beads were washed away, and Illumina adaptor 1 was ligated to the sticky 5′-end of the digested bead-bound cDNA fragments. The junction of the Illumina adaptor 1 and the CATG site constitutes the recognition site for MmeI, an endonuclease with separate recognition and cleavage sites. MmeI cleaves 17 bp downstream of the CATG site, yielding tags with adaptor 1. After removing 3′ fragments by magnetic bead precipitation, adaptor 2 was ligated to the 3′ ends of tags, yielding tags with different adaptors at both ends to form a tag library. After linear PCR amplification, fragments were purified by PAGE gel electrophoresis. During the quality control steps, the Agilent 2100 Bioanalyzer and ABI Step One Plus Real-Time PCR System were used for quantitation and quality verification of the sample library. Finally, the library was sequenced on an Illumina HiSeq 2000.
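The tag-generation chemistry described above can be mimicked in silico to predict which tag a transcript should yield. The following is a minimal sketch, not the pipeline used in this study: it assumes the cDNA is supplied in sense orientation, that the tag derives from the CATG site closest to the 3′ end (the fragment retained on the oligo(dT) bead), and that the tag consists of CATG plus the 17 downstream nucleotides. The function name `virtual_tag` is hypothetical.

```python
from typing import Optional

NLAIII_SITE = "CATG"  # NlaIII recognition site, as stated in the text
TAG_EXTENSION = 17    # MmeI cleaves 17 bp downstream of the CATG site


def virtual_tag(cdna: str) -> Optional[str]:
    """Return the expected 21-nt tag (CATG + 17 nt) for a sense-strand cDNA.

    Assumption: only the fragment bound to the oligo(dT) bead survives the
    NlaIII digestion, so the tag starts at the 3'-most CATG site. Returns
    None if there is no CATG site or fewer than 17 nt follow that site.
    """
    seq = cdna.upper()
    pos = seq.rfind(NLAIII_SITE)  # 3'-most CATG site
    if pos == -1:
        return None
    tag_length = len(NLAIII_SITE) + TAG_EXTENSION
    tag = seq[pos:pos + tag_length]
    return tag if len(tag) == tag_length else None
```

Applied to every reference sequence, a function of this kind would produce a table of virtual tags against which sequenced tags can be compared.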

#### Data Analysis and Mapping of DGE Tags

Millions of raw 49 bp sequences were generated. Image analysis, base calling, generation of raw tags, and tag counting were performed using the Illumina pipeline ('t Hoen et al., 2008). Prior to mapping of tags to the reference database, empty tags (i.e., those with no tag sequence between the adaptors), adaptors, low-quality tags (tags containing one or more unknown nucleotides "N"), and tags with a copy number of 1 were removed from raw sequences to obtain clean tags.
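The filtering rules above can be expressed compactly. The sketch below is illustrative only and assumes that adaptor sequences have already been trimmed, so that an empty tag is represented by an empty string; the function name `clean_tags` is hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable


def clean_tags(raw_tags: Iterable[str]) -> Dict[str, int]:
    """Count raw tags and keep only those that pass the filters in the text."""
    counts = Counter(tag.upper() for tag in raw_tags)
    return {
        tag: n
        for tag, n in counts.items()
        if tag                # drop empty tags (no sequence between the adaptors)
        and "N" not in tag    # drop low-quality tags with unknown nucleotides
        and n > 1             # drop tags with a copy number of 1
    }
```

The returned dictionary maps each distinct clean tag to its copy number, so its length is the number of distinct clean tags and the sum of its values is the number of total clean tags.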

To evaluate the normality of the entire DGE dataset, the distribution of clean tag expression was analyzed. Saturation analysis of the 18 libraries was also performed to estimate whether sequencing depth was sufficient for transcriptome coverage. To identify gene expression patterns in apical buds and small racemes of the two Ricinus lines, all clean tags were annotated by mapping to the castor genome (Chan et al., 2010) using the SOAP2 software (Li et al., 2009a), with a maximum of one nucleotide mismatch allowed. All tags mapped to reference sequences were filtered, and the remaining tags were designated as ambiguous tags. Mapping events on both sense and antisense sequences were included in data processing. To identify genes that were differentially expressed between monoecious and female apices at the same developmental stage, as well as genes that exhibited distinctive expression between different developmental stages in the same line, we converted the raw clean tag counts in each library to TPM (number of transcripts per million tags) ('t Hoen et al., 2008; Morrissy et al., 2009) to obtain normalized gene expression levels. Then, nine pairs of DGE profiles of different sample libraries (ABML1 vs. ABPL1, ABML2 vs. ABPL2, RML vs. RPL, ABML1 vs. ABML2, ABML2 vs. RML, ABML1 vs. RML, ABPL1 vs. ABPL2, ABPL2 vs. RPL, and ABPL1 vs. RPL, where a is the control and b is the experimental group in "a vs. b") were compared to assess the diversity of gene expression. DEGs (differentially expressed genes) between two samples were identified with a rigorous algorithm, and the false discovery rate (FDR) was used to determine the threshold P-value in multiple testing. The criteria for DEGs were FDR ≤ 0.001 and |log2Ratio| ≥ 1 (Audic and Claverie, 1997; Benjamini and Yekutieli, 2001).
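The normalization and DEG thresholds described above can be sketched as follows. This is a simplified illustration rather than the pipeline used here: the FDR value is taken as an input instead of being computed, and `ZERO_FLOOR` is a hypothetical floor introduced only so that the log ratio is defined when a gene has no tags in one library.

```python
import math

ZERO_FLOOR = 0.01  # hypothetical floor so that log2 is defined at 0 TPM


def tpm(gene_tag_count: int, total_clean_tags: int) -> float:
    """Transcripts per million clean tags for one gene in one library."""
    return gene_tag_count * 1_000_000 / total_clean_tags


def log2_ratio(tpm_a: float, tpm_b: float) -> float:
    """log2(b / a) for a comparison written "a vs. b" (a is the control)."""
    a = max(tpm_a, ZERO_FLOOR)
    b = max(tpm_b, ZERO_FLOOR)
    return math.log2(b / a)


def is_deg(log2r: float, fdr: float) -> bool:
    """Thresholds stated in the text: FDR <= 0.001 and |log2Ratio| >= 1."""
    return fdr <= 0.001 and abs(log2r) >= 1
```

For example, a gene with 57 tags in a library of 5.7 million clean tags has a TPM of 10, and a gene at 10 TPM in the control and 40 TPM in the experimental library has a log2Ratio of 2.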
Next, we performed functional annotation to assign the unambiguously mapped genes identified in all libraries to Gene Ontology (GO) terms. Gene annotation was conducted using Blast2GO (Conesa et al., 2005). GO was used to determine the possible functions of all DEGs by searching the GO database (http://www.geneontology.org/), and Web Gene Ontology Annotation Plot (WEGO) was also used for GO classification of genes identified in each DGE library (Ye et al., 2006). Moreover, the GO distribution of DEGs in comparisons of each pair of libraries was determined and compared. To further characterize gene function, pathway enrichment analysis of the DGE results was conducted by BLAST search of the KEGG database (http://www.kegg.jp/kegg/). Clustering analysis of differential gene expression patterns was also performed using MultiExperiment Viewer (MeV) (Chu et al., 2008; Howe et al., 2011). A P-value of 0.05 was selected as the threshold for a gene set to be considered significantly enriched.
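The text gives only the significance threshold for enrichment, not the test itself. A common choice for this kind of analysis is the hypergeometric test, and the sketch below assumes that test purely for illustration; it should not be read as the method actually applied here. All parameter names are hypothetical.

```python
from math import comb


def enrichment_p(n_background: int, n_background_in_term: int,
                 n_degs: int, n_degs_in_term: int) -> float:
    """P(X >= n_degs_in_term) under the hypergeometric distribution.

    n_background:          annotated genes in the background set
    n_background_in_term:  background genes annotated to the term or pathway
    n_degs:                annotated DEGs
    n_degs_in_term:        DEGs annotated to the term or pathway
    """
    upper = min(n_background_in_term, n_degs)
    favourable = sum(
        comb(n_background_in_term, k)
        * comb(n_background - n_background_in_term, n_degs - k)
        for k in range(n_degs_in_term, upper + 1)
    )
    return favourable / comb(n_background, n_degs)
```

Under this assumption, a term would be reported as significantly enriched when the returned value is at most 0.05.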

## Quantitative RT-PCR Analysis

Quantitative RT-PCR analysis was used to verify the DGE results. The RNA samples used for the qRT-PCR assays were identical to those used for the DGE experiments. Three biological replicates and two technical replicates were performed for each sample. First-strand cDNA was synthesized from the total RNA (1 µg) of each sample using the PrimeScript RT reagent kit with gDNA Eraser (Perfect Real Time) (TaKaRa, RR047A). The primers designed for qPCR analysis are listed in Additional File S1. The actin gene (forward primer: 5′-TGCTGACAGAATGAGCAAGG-3′; reverse primer: 5′-AATCCACATCTGCTGGAAGG-3′) was used as an internal control gene for normalization. Quantitative PCR was performed on an ABI 7500 quantitative PCR system using the SYBR® Premix Ex Taq™ II (Tli RNaseH Plus) kit (TaKaRa, RR820A) according to the manufacturer's instructions. Reactions began with 95°C for 5 s to activate the enzyme, followed by 40 two-step cycles of 95°C for 5 s and 60°C for 30 s; melting curve conditions were 95°C for 15 s, 60°C for 1 min, 95°C for 30 s, and 60°C for 15 s. The specificity of the SYBR green PCR signal was further confirmed by melting curve analysis and agarose gel electrophoresis, and the relative expression levels of genes were calculated with the 2<sup>−ΔΔCt</sup> method.
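The 2<sup>−ΔΔCt</sup> calculation mentioned above can be written out as a short function. This is a generic sketch of the method; which sample serves as the calibrator is not stated in the text, so the "control" arguments below are an assumption, and all names are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_control: float, ct_ref_control: float) -> float:
    """Relative expression of a target gene by the 2^(-ddCt) method.

    The reference gene (actin in this study) normalizes each sample; the
    control sample acts as the calibrator (an assumption of this sketch).
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_ct_sample - delta_ct_control
    return 2 ** (-delta_delta_ct)
```

With Ct values of 24 (target) and 18 (actin) in the sample and 26 and 18 in the calibrator, ΔΔCt is −2 and the relative expression is 4-fold.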

#### Hormone Measurements

The contents of several phytohormones, including auxin or indole-3-acetic acid (IAA, the major form of auxin in plants), abscisic acid (ABA), jasmonic acid (JA), and gibberellins (GAs), were analyzed to compare and verify the physiological differences between the monoecious and pistillate lines. The samples used for the hormone measurements were the same as those used for the DGE experiments (three developmental stages of apices from leaf bud to raceme for each line: ABML1, ABML2, and RML for the monoecious line, and ABPL1, ABPL2, and RPL for the female line). Three biological replicates were used for each sample. Quantification of endogenous IAA, ABA, JA, and GAs (GA1, GA3, GA4, and GA7) was performed as described previously (Chen et al., 2012). The data were analyzed using Microsoft Excel 2010 and SAS V8 software.

## RESULTS

#### Analysis of DGE Libraries and Tag Mapping

Sequencing of the 18 DGE libraries generated approximately 6 million raw tags per library, and more than 96% of the raw tags were clean (**Table 1**). This yielded approximately 5.7 million clean tags per library, of which 148,526–174,340 were distinct. The number of unique distinct clean tag sequences ranged from 134,541 to 159,229 (**Table 1**).



#### TABLE 1 | Categorization and abundance of tags.

The RPL libraries and ABPL2 libraries contained the two highest numbers of distinct clean tags; the other libraries had similar numbers.

Our results showed that mRNAs transcribed from the majority of genes were present at fewer than 10 copies, and only a small proportion of genes were expressed at higher levels (**Figure 2**; Additional Files S2, S3). The distributions of total and distinct clean tag copy numbers had highly similar properties in libraries from all six sample types, with most tags coming from highly expressed genes (**Figure 2**). Among the distinct clean tags, more than 56% were present at 2–5 copies, 33.81–36.84% were present at 5–100 copies, and fewer than 6.27% had copy numbers higher than 100. However, more than 70% of total clean tags had counts above 100 in each library, indicating that the overall DGE data among the 18 libraries were normally distributed.

A reference gene database that included 120,799 R. communis Unigene sequences was preprocessed for tag mapping. Among the sequences, genes with a CATG site accounted for 90.03% (Additional File S3). Distinct clean tags (4.94–5.34%) were mapped unambiguously in the unigene database, whereas 7.48–8.99% of distinct clean tags were not mapped to the unigene virtual tag database, in libraries from the six sample types (**Table 1**). Around 95% of total clean tags were mapped onto the R. communis genome with a perfect match or 1 bp mismatch to sense or antisense genes, and approximately 90% of distinct clean tags were successfully mapped. Ultimately, tag mapping onto the R. communis genome generated 50,722 tag-mapped genes for ABML1, 38,867 for ABPL1, 45,392 for ABML2, 54,090 for ABPL2, 44,709 for RML, and 41,658 for RPL (**Table 1**; Additional File S4).

In this research, the number of detected genes increased with sequencing depth until the number of tags reached 3 million or higher (Additional File S5). Of the distinct clean tags, 20.61–25.73% perfectly matched antisense transcripts and 43.14–53.23% perfectly matched sense strand-specific transcripts (Additional Files S3, S4). In total, more than 88% of genes were transcribed from both strands, indicating the importance of RNA-mediated gene regulation in bud transformation from leaf to inflorescence.
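The saturation behaviour reported above (gene detection levelling off at about 3 million tags) is typically assessed by counting the genes detected at increasing sequencing depths. The text does not describe the exact procedure, so the following is only a generic sketch in which each mapped tag is represented by the identifier of its gene; the function name and the fixed random seed are assumptions.

```python
import random
from typing import List, Sequence, Tuple


def saturation_curve(tag_genes: Sequence[str], depths: Sequence[int],
                     seed: int = 0) -> List[Tuple[int, int]]:
    """Return (depth, number of distinct genes detected) for each depth.

    The library is shuffled once, and each depth uses the first `depth`
    tags of that shuffled order, so larger samples contain smaller ones.
    """
    rng = random.Random(seed)
    shuffled = list(tag_genes)
    rng.shuffle(shuffled)
    return [(depth, len(set(shuffled[:depth]))) for depth in depths]
```

A library is considered saturated when the curve flattens, that is, when additional tags no longer add newly detected genes.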

## Global Gene Expression of Buds at Different Stages and Analysis of Differential Gene Expression

In this study, we identified 5526 and 5203 genes in the transcriptomes of monoecious and pistillate apical buds, respectively, at the 3–4-leaf stage; 5779 and 6118 genes at the 5–7-leaf stage; and 5870 and 5988 genes in racemes (**Figure 3A**).

Our results showed that the transcriptome of the terminal buds was dynamic across the three developmental stages in both the monoecious and female lines. In total, 7495 and 7610 genes were expressed in monoecious and pistillate apical buds over the three stages, respectively. Of these, 4066 and 4013 were constitutive, 1881 and 1924 were specific to a single stage, and 1548 and 1637 were expressed at two stages in the monoecious and female lines, respectively (**Figures 3B,C**; Additional File S6). These complex changes in gene expression indicate that the transition from leaf apical bud to raceme is an intricate process in castor bean. More genes were expressed in the buds closer to raceme formation and incipient racemes than in early apical buds (**Figure 3**), and more stage-specific transcripts were detected in the 5–7-leaf stage bud and the small raceme than in the early leaf bud (**Figure 3D**; Additional File S6). These observations suggest that identical molecular pathways are involved in apical bud development from vegetative growth to reproduction in the monoecious and female lines, and that this developmental process may require expression of a larger number of unique genes involved in regulatory processes and related pathways.
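The constitutive, two-stage, and stage-specific classes reported above correspond to the regions of a three-set Venn diagram. A minimal sketch of that partition, using hypothetical gene identifiers and a hypothetical function name, is:

```python
from typing import Dict, Set


def partition_by_stage(stage_genes: Dict[str, Set[str]]) -> Dict[str, Set[str]]:
    """Split expressed genes by the number of stages in which they occur."""
    all_genes: Set[str] = set().union(*stage_genes.values())
    n_stages = len(stage_genes)
    result: Dict[str, Set[str]] = {
        "constitutive": set(),    # detected in every stage
        "two_stages": set(),      # detected in more than one but not all stages
        "stage_specific": set(),  # detected in exactly one stage
    }
    for gene in all_genes:
        hits = sum(gene in genes for genes in stage_genes.values())
        if hits == n_stages:
            result["constitutive"].add(gene)
        elif hits == 1:
            result["stage_specific"].add(gene)
        else:
            result["two_stages"].add(gene)
    return result
```

With three stages per line, the three classes are disjoint and together cover every gene expressed in at least one stage.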

Comparison of the monoecious line with the female line revealed that 1386 DEGs were up-regulated and 1946 DEGs were down-regulated at the 3–4-leaf stage, but the majority of genes were up-regulated during the subsequent two stages: 1952 up-regulated and 1408 down-regulated genes at the 5–7-leaf stage, and 1667 up-regulated and 1451 down-regulated genes at the initial raceme stage (**Table 2**; **Figures 4A–C**; Additional File S7). The number of up-regulated DEGs in the female was greater than the number of down-regulated DEGs during the developmental transition from leaf bud to infant raceme, which might indicate that females require a number of genes involved in silencing of male flower expression. Similarly, up-regulated genes outnumbered down-regulated genes in the comparisons between different developmental stages, except for RPL vs. ABPL2 (**Table 2**; **Figures 4D–I**; Additional File S7), which also revealed that the expression of many genes tended to increase; a few genes were activated, whereas others were suppressed, during bud development from vegetative growth to reproduction.

### Functional Annotation and DEG Clustering Analysis

GO analysis revealed that these well-annotated sequences belonged to three main categories (cellular component, molecular function, and biological process) and were mainly distributed into 43 categories, including predominant terms such as "Intracellular membrane-bounded organelle," "Adenyl ribonucleotide binding," "Transition metal ion binding," "Intrinsic to membrane," "Cellular protein modification process," "Plastid," "Membrane and cell part," "Binding and protein binding," "Metabolic process," and "Transcription, DNA-dependent." The distributions among the pairwise comparisons were similar, except for small differences in the numbers of genes in each category and the total numbers in each main category (Additional Files S8, S9). In addition, all of the annotated genes were mapped to terms in the KEGG database to search for significantly enriched genes involved in spliceosome, cell cycle, homologous recombination, or signaling pathways (Additional File S10).

libraries from the monoecious and pistillate lines.

FIGURE 3 | Transcriptome analysis of apical buds and racemes of the monoecious line (ML) and pistillate line (PL). (A) Transcriptome sizes of the monoecious and pistillate lines at three developmental stages. (B,C) Venn diagram showing the overlaps between three stages (two apical bud stages and raceme stage) of ML and PL. The number in parentheses after each stage designation is the total transcripts detected in that stage(s). (D) Analysis of transcriptome changes from apical buds to raceme development of the ML and PL. Transcripts shared by three stages of ML and PL are not shown; numbers above the x-axis represent transcripts present in the indicated stage that are stage-specific (dark red), not shared with the previous stage (orange), or shared with the prior stage but missing in at least one other stage (yellow-green). Numbers below the x-axis represent transcripts present in the prior stage that were not detected in the current stage (green and blue).

#### TABLE 2 | Gene expression levels across different sample libraries.

*Significantly up-regulated: log2Ratio ≥ 1 and probability ≥ 0.8; Up-regulated: log2Ratio ≥ 1 and probability < 0.8; Not DEGs: genes not differentially expressed (0 ≤ |log2Ratio| < 1); Down-regulated: log2Ratio ≤ −1 and probability < 0.8; Significantly down-regulated: log2Ratio ≤ −1 and probability ≥ 0.8.*
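The five expression classes defined in the note to Table 2 amount to a simple decision rule on the log2Ratio and the probability. A minimal sketch (the function name is hypothetical) is:

```python
def classify_gene(log2_ratio: float, probability: float) -> str:
    """Assign a gene to one of the five classes defined for Table 2."""
    if log2_ratio >= 1:
        if probability >= 0.8:
            return "significantly up-regulated"
        return "up-regulated"
    if log2_ratio <= -1:
        if probability >= 0.8:
            return "significantly down-regulated"
        return "down-regulated"
    return "not DEG"  # |log2Ratio| < 1
```

Counting the genes in each class for a given comparison would reproduce one row of a table laid out in this way.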

FIGURE 4 | Comparison of gene expression levels across all libraries. All genes mapped to the reference sequence were examined for differential expression across the libraries. ABML1 and ABPL1: apical buds at the 3–4-leaf stage of the monoecious and female lines, respectively; ABML2 and ABPL2: apices at the 5–7-leaf stage of the monoecious and pistillate lines; RML and RPL: small racemes (2–3 cm long) of the monoecious and female lines. Red prisms represent significantly up-regulated genes (log2Ratio ≥ 1; probability ≥ 0.8), yellow represents up-regulated genes (log2Ratio ≥ 1; probability < 0.8), black shows genes that are not DEGs (0 ≤ |log2Ratio| < 1), light green indicates down-regulated genes (log2Ratio ≤ −1; probability < 0.8), and green prisms represent significantly down-regulated genes (log2Ratio ≤ −1; probability ≥ 0.8).

In general, the 7687 DEGs in all comparisons were clustered as the union of DEGs. A total of 617 transcripts occurring simultaneously in the three comparisons of the three developmental stages between the monoecious and female lines were identified, and the ratio values of these genes were used for clustering as the intersections of DEGs (**Figure 5**). Among the nine major clusters, the 62 genes grouped into Cluster A were down-regulated in all three comparisons of developmental stages between the monoecious and female lines, whereas the 97 genes in Cluster D were up-regulated in all three comparisons. GO analysis of these clustered genes indicated that the known genes were mainly involved in membrane or vesicle structures of intracellular or cytoplasmic components, activity and binding functions, response to light or hormone stimulus, transport, signal transduction, metabolic processes, and developmental processes involved in differentiation and reproduction (Additional File S11).
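The intersection of DEGs and the direction of change in each comparison can be derived as sketched below. This is not the MeV clustering used in this study; it only groups the shared DEGs by the sign of their log2Ratio in each comparison, so that genes down-regulated in all three comparisons (as in Cluster A) receive the pattern "---" and genes up-regulated in all three (as in Cluster D) receive "+++". The input layout and function name are assumptions.

```python
from typing import Dict


def shared_deg_patterns(comparisons: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Map each DEG present in every comparison to its sign pattern.

    `comparisons` maps a comparison name to {gene: log2Ratio} for the DEGs
    of that comparison; at least one comparison must be supplied.
    """
    names = list(comparisons)
    shared = set.intersection(*(set(comparisons[name]) for name in names))
    return {
        gene: "".join("+" if comparisons[name][gene] > 0 else "-" for name in names)
        for gene in sorted(shared)
    }
```

The union of DEGs is obtained analogously by replacing the set intersection with a set union over the same gene sets.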

A heat map of DEGs between different growth stages was generated for both the monoecious and female buds (**Figure 6**). There were 621 DEGs in the four comparisons that clustered as the intersections of DEGs (Additional File S12). As shown in **Figure 6**, the DEGs in "ABML1 vs. ABML2" and "ABPL1 vs. ABPL2" were closely correlated with the DEGs in "ABML2 vs. RML" and "ABPL2 vs. RPL," respectively, indicating that gene expression differences between monoecious and female lines were not obvious during the same developmental processes. However, this analysis revealed a greater number of DEGs between different developmental processes, with a relatively distant relationship, which suggests that the transformation of the apical bud from leaf to flower is a complex process.

## Genes Associated with Sex Expression and Reproduction in Castor Bean

Based on prior knowledge of putative involvement in sex determination and on the expression levels and functions of DEGs between the monoecious and female buds at the three developmental stages, several subgroups were assumed to be putatively related to sex expression and reproduction (**Table 3**). We identified seven genes involved in response to hormone stimulus, whose expression changed significantly in mutant females over the three developmental stages (**Table 3**). These genes included those that encode dynamin-2A, auxin response factor, PCI domain-containing protein, ATP-binding protein, spermidine synthase, Xaa-Pro amino peptidase, and a conserved hypothetical protein. These genes are associated with tissue or organ developmental processes, and several of them take part in signal transduction, auxin transport, and phytohormone biosynthesis and metabolism (such as that of polyamine and abscisic acid).

In this study, we detected eight transcription factors that exhibited significantly different expression patterns between the pistillate and monoecious lines during the three developmental stages. These transcriptional regulators were identified as DNA-dependent transcription factors, including MADS box proteins, DNA-binding proteins, the Axial regulator YABBY5, transcription initiation factors, auxin response factor, and the RNA polymerase sigma factor rpoDI. Auxin response factor responds to hormone stimulus, participates in hormone-mediated signaling pathways, and also plays a role in tissue and organ developmental processes. Furthermore, one subgroup is composed of four genes related to signal transduction, including Gcn4-complementing protein, histidine-containing phosphotransfer protein, an unknown protein, and cyclic nucleotide-gated ion channel, which exhibited consistent expression changes: down-regulation at the early leaf-bud stage, up-regulation at the 5–7-leaf stage, and down-regulation at the young raceme stage. Another two genes listed above (PCI domain-containing protein and ATP-binding protein) play roles in signal transduction in addition to response to hormone stimulus.

#### TABLE 3 | Some selected differentially expressed genes detected by digital expression profiling in apical bud of castor bean.

One gene (arginine/serine-rich splicing factor) associated with sex differentiation and another gene (acid phosphatase) related to pollination were identified as differentially expressed between monoecious and pistillate apical buds during development. In female apical buds, both genes were distinctly down-regulated at the early leaf-bud stage, but up-regulated at the 5–7-leaf and initial raceme stages. In addition, expression of a PCD (programmed cell death)-related gene that encodes cysteine proteinase was altered in pistillate apices, with down-regulation at the early leaf-bud and initial raceme stages and up-regulation during the period of transition from leaf bud to floral bud. Several genes involved in reproduction exhibited the same expression pattern in pistillate terminal buds: sentrin/sumo-specific protease, the U4/U6 small nuclear ribonucleoprotein Prp4, and the sorting and assembly machinery (sam50) protein were down-regulated at the early leaf stage, but up-regulated at the 5–7-leaf and small raceme stages (**Table 3**); notable exceptions were 1,4-alpha-glucan branching enzyme and DNA replication helicase (dna2). Histone modification and DNA methylation or demethylation play an important role in the regulation of gene expression or silencing during developmental processes. We identified four genes involved in histone methylation and demethylation (**Table 3**), including eukaryotic translation initiation factor 2c, set domain protein, DNA (cytosine-5)-methyltransferase, and s-adenosyl-methyltransferase (mraw). These genes were also significantly differentially expressed between the monoecious and female lines during apical bud development.

#### DEGs Were Confirmed by Quantitative RT-PCR

To confirm the genes differentially expressed between monoecious and female apical buds, 14 genes were selected for quantitative RT-PCR analysis at the three apical bud developmental stages. Representative genes selected for qPCR validation were those involved in response to hormone stimulus, transcription factor activity, signal transduction, sex differentiation, pollination, reproduction, and histone demethylation/methylation, because these processes are putatively associated with plant sex determination (Guo et al., 2010; Wu et al., 2010). The expression of 13 genes (Dynamin-2A, Auxin response factors, ATP-binding protein, Spermidine synthase, auxin transport, conserved hypothetical protein, MADS box protein, two unknown proteins, arginine/serine-rich splicing factor, acid phosphatase, DNA replication helicase dna2, and eukaryotic translation initiation factor 2c) indicated by qRT-PCR was largely congruent with the DGE analysis patterns (**Figure 7**). Only one gene (s-adenosyl-methyltransferase mraw) did not show consistent expression between the qRT-PCR and DGE data sets.

#### Differential Hormone Levels in the Apices and Racemes Between the Monoecious and Female Lines

DGE analysis and qRT-PCR verification showed that some genes that respond to hormone stimulus were differentially expressed in apices and racemes between the monoecious and female lines and were possibly associated with castor sex expression. To determine whether phytohormone levels in the apices and inflorescences differ between the monoecious and pistillate lines, the IAA, ABA, JA, and GA contents of the two lines were quantified at the three developmental stages. The auxin level in the two castor lines followed a similar pattern: IAA was equivalent or increased from the 3–4-leaf stage to the 5–7-leaf stage, and was later significantly reduced (P < 0.01) (**Figure 8A**; Additional File S13). Notably, the auxin level in the pistillate line was remarkably higher than that in the monoecious line at each of the three developmental stages. Four GAs were assayed, but only GA4 was detected, at a relatively low level; the other three GAs were not detected. The GA4 level in the apices, but not in the small racemes, also differed significantly between the monoecious and female lines (**Figure 8A**; Additional File S13). The level of GA4 in the female was significantly lower (P < 0.01) than that in the monoecious line at the early leaf-bud stage, but markedly higher (P < 0.01) at the 5–7-leaf stage and similar at the young inflorescence stage.

The ABA levels in the monoecious and female lines showed an identical pattern of change from early leaf apices to infant racemes, with a notable consecutive reduction (**Figure 8B**; Additional File S13). The ABA level in the female was significantly higher (P < 0.01) than that in the monoecious line in early leaf apices, but notably lower (P < 0.01) at the following 5–7-leaf stage, and comparable at the small raceme stage. In addition, endogenous JA levels varied in a pattern similar to that of IAA in the two castor lines during the three developmental stages (**Figure 8B**; Additional File S13). However, in contrast to IAA, the JA level in the pistillate line was significantly lower (P < 0.01) than that in the monoecious line at the two apical development stages, and significantly higher (P < 0.01) at the small raceme stage.

#### DISCUSSION

Sex expression in plants is controlled by genetic factors and non-genetic fluctuations (Shifriss, 1956; Ainsworth, 1999), and phytohormones play important roles in sex determination in some species (Yin and Quinn, 1995; Ainsworth, 1999; Tanurdzic and Banks, 2004). In cucumber, ethylene, a true sex hormone, was proven to be a "female" hormone that exerts a strong feminizing effect (Dellaporta and Calderon-Urrea, 1993); auxin also promotes cucumber femaleness, potentially playing an indirect role in sex determination by promoting the action of ethylene (Yamasaki et al., 2000). The level of gibberellic acid (GA), which acts downstream of ethylene or through a parallel pathway (Ainsworth, 1999), is correlated with male tendency. Sex expression in castor bean is also regulated by plant hormones (Shifriss, 1961; Tan et al., 2011), but in contrast to cucumber, ethylene and ethylene-like substances (NIA 10637) result in masculinization and can transform female flowers into male ones in monoecious plants (Philipos and Narayanaswamy, 1976); moreover, GA causes feminization in castor, and spraying monoecious inbreds with GA can markedly increase female tendency (Shifriss, 1961). In addition, previous work showed that the activity of auxin-like substances causes feminization of castor bean, a process influenced by kinetin and morphactin (Kumar and Rao, 1981). Morphactin causes maleness in R. communis (Rkey, 1978; Varkey and Nigam, 1982), whereas daminozide promotes femaleness, lowering the position of the first female flower and the ratio of male to female flowers (Chauhan et al., 1987). To date, however, the key sex hormone in castor bean remains unknown. In this study, we identified several genes involved in the response to hormone stimulus (**Table 3**; **Figure 7**), including auxin transport and auxin response factor genes and other genes involved in plant hormone-mediated biosynthetic and metabolic processes, such as polyamine biosynthesis and abscisic acid and carotenoid metabolism. In the pistillate line, three genes (auxin response factor, spermidine synthase, and Xaa-Pro amino peptidase) related to auxin response and transport were down-regulated in early leaf apices and young racemes, but up-regulated during the transformation from leaf apices to floral buds. In addition, hormone measurements showed that the auxin level in the female was higher than that in the monoecious line at each of the three developmental stages, with a highly significant difference at the 5–7-leaf stage. The levels of JA, ABA, and GA in the apices also differed significantly between the pistillate and monoecious lines, except for the GA and ABA levels in the infant racemes (**Figure 8**). Our results suggest that DEGs involved in the pathways of plant hormone response and hormone-mediated biosynthetic and metabolic processes may participate in or affect the regulation of sex expression and reproduction by changing hormone levels and transport, which will be elucidated in further research. Our results are consistent with previous reports (Heslop-Harrison, 1957; Rkey, 1978) showing that variation in sex expression in plants is associated with variations in the hormone level (Heslop-Harrison, 1957), and that auxins promote femaleness (Wittwer and Hillyer, 1954). Interference in auxin transport, or quick degradation of the hormone, can decrease the auxin level, resulting in altered sex expression (Thomson and Leopold, 1974; Gaganias and Berg, 1977).
The fact that lateral branches of castor bean plants produce only male flowers after decapitation supports the view that inhibition of auxin transport can result in maleness (Rao, 1969). Furthermore, our previous research (Wang et al., 2012) demonstrated that IAA and ABA levels in the inflorescence and flower vary remarkably between the pistillate and monoecious lines, also suggesting that phytohormones such as IAA, and related genes, are important for sex determination in R. communis. In plants, phytohormones modulate various growth and developmental events through signal transduction pathways (Bleecker and Kende, 2000; Davies, 2004). Notably, in this study, three genes involved in signal transduction exhibited expression patterns similar to those of genes involved in auxin response and transport (i.e., down-regulated in early leaf apices and small racemes, but up-regulated during the transformation stage from leaf to floral bud) in the pistillate line. However, other genes involved in signal transduction were down-regulated throughout all three stages. Subsequent investigations of these genes and of crosstalk between other hormones and auxin may reveal their specific roles in sex expression or sex determination.

Castor is a monoecious species, with pistillate flowers borne on short pedicels along the upper portion of the raceme and staminate flowers borne similarly on the lower portion (Brigham, 1967). Castor flowers are unisexual and lack petals. Male and female flowers differ radically in general morphology and size: male flowers are devoid of rudiments of organs of the opposite sex, whereas pistillate flowers occasionally exhibit vestiges of stamen (Wu et al., 1998). The occasional appearance of rudimentary androecium in Ricinus female flowers suggests that like unisexual flowers of other species (Atsmon and Galun, 1960; Cheng et al., 1983; Bracale et al., 1991; Malepszy and Niemirowicz-Szczytt, 1991; Veit et al., 1993), castor flowers (especially pistillate flowers) pass through a "bisexual stage" during which development of all floral organs is initiated. Castor sex determination probably involves PCD of opposite sex organs in the flower and inflorescence, similar to what is observed in several other plant species such as maize, cucumber, and campion (Ye et al., 1991; Dellaporta and Calderon-Urrea, 1993; Kater et al., 2001); these plants follow a sex determination pathway involving arrest of preformed sexual organs in bisexual primordia in which some PCD-related genes participate (DeLong et al., 1993). In this study, we detected a PCD-related gene (cysteine protease) that was significantly up-regulated in female apices at the 5–7-leaf stage, but obviously down-regulated at the two other stages (early leaf apices and initial inflorescence). This result was consistent with recent findings that PCD-related genes, including cysteine protease, are more highly expressed at the peak of anther abortion in CMS and GMS lines of cotton (Lorrain et al., 2003; Wei et al., 2013). It is possible that the cysteine protease might play a role in Ricinus apical development, and thus be associated directly or indirectly with sex expression. 
Our study identified hundreds of genes potentially involved in sex determination, including one (arginine/serine-rich splicing factor) associated with sex differentiation; this gene was up-regulated during apical transformation and initial raceme formation in female apical buds.

Transcription factors such as the maize DELLA protein D8 and the melon zinc finger protein CmWIP1 have been functionally associated with the sex determination process (Peng et al., 1999; Martin et al., 2009). Moreover, several transcription factors were identified as differentially expressed during sex determination in cucumber (Guo et al., 2010; Wu et al., 2010), including zinc finger transcription factors, Aux/IAA transcription factor, auxin-induced protein, BRI1, BRI1-associated receptor kinase, MYC transcription factor, BEL1-like homeodomain protein, bHLH proteins, WRKY DNA-binding protein, NAC domain protein, and IF-2. Consistent with the results in cucumber, we identified eight DNA-dependent transcription genes in castor bean that were significantly differentially expressed among the three apical developmental stages in the pistillate line. Of these, auxin response factor, MADS box protein, IF-2, and two DNA-binding proteins were also confirmed to be differentially expressed during sex determination. The transcription factors detected in this study were mainly binding proteins, response factors, initiation and regulation factors, and MADS box proteins; the relationship between these transcription factors and plant sex determination remains to be determined and should be investigated in future work. In addition, we identified a few genes involved in histone demethylation/methylation and reproductive processes. DNA methylation and histone modification play essential roles in genome management and in the control of gene expression and silencing (Cheng and Blumenthal, 2008). DNA methyltransferase, the main enzyme involved in DNA methylation, was detected in female apices; it was markedly up-regulated at the stage of transformation from leaf to floral apices, and down-regulated at the other two developmental stages, relative to monoecious apices (**Table 3**).
DNA methyltransferase is not only associated with DNA methylation, but is also associated with many important biological activities, including cell proliferation and senescence (Berger, 2007). The increase in methyltransferase gene expression during the transitional stage may change the level of DNA methylation, leading to the subsequent repression of male flower gene expression and the appearance of only female flowers in female racemes. Genes associated with the reproduction process, identified as DEGs here and in future research, will contribute to elucidation of the molecular mechanisms underlying Ricinus sex determination and sex patterns.

## DATA DEPOSITION

The Illumina paired-end reads of Ricinus communis L. obtained in this study are available at NCBI under accession SRP064760.

## AUTHOR CONTRIBUTIONS

YX conceived and designed the study and amended the manuscript. TM participated in the design, performed the experiments, analyzed the data, and drafted the manuscript. XJ and WL prepared the samples, carried out RNA extraction and cDNA library construction, and helped with the sequencing. FC collected and prepared the samples. HJ provided the castor seeds and took part in field cultivation and bud collection. All authors read and approved the final manuscript.

## FUNDING

This work was jointly supported by the National Natural Science Foundation of China (Grant No. 31271765 and No. 31000726), the National Department (Agriculture) Public Benefit Scientific Research Foundation (Grant No. 201003057), Ministry of Science and Technology, Ministry of Finance and the National Science and Technology Infrastructure Platform (NICGR2015-014).

## ACKNOWLEDGMENTS

We thank Dr. Shunmou Huang of Databridge Technologies Corporation for his assistance with the Venn diagram, and the research group of Wei Hua for help with qRT-PCR validation. We also thank the anonymous reviewers for their constructive comments during the manuscript review.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.01208

## REFERENCES

Rao, P. G. (1969). Sex expression in Ricinus communis L. Sci. Cult. 35, 326–327.

Shifriss, O. (1956). Sex instability in ricinus. Genetics 41, 265–280.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Tan, Xue, Wang, Huang, Fu and Yan. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Physiological Mechanisms Underlying the High-Grain Yield and High-Nitrogen Use Efficiency of Elite Rice Varieties under a Low Rate of Nitrogen Application in China

Lilian Wu, Shen Yuan, Liying Huang, Fan Sun, Guanglong Zhu, Guohui Li, Shah Fahad, Shaobing Peng and Fei Wang\*

National Key Laboratory of Crop Improvement, Key Laboratory of Crop Ecophysiology and Farming System in the Middle Reaches of the Yangtze River, Ministry of Agriculture, College of Plant Science and Technology, Huazhong Agricultural University, Wuhan, China

Edited by: José Luis Araus, Universitat de Barcelona, Spain

#### Reviewed by:

Iker Aranjuelo, Agrobiotechnology Institute-CSIC-UPNA, Spain

Sushma Naithani, Oregon State University, USA

\*Correspondence: Fei Wang, fwang@mail.hzau.edu.cn

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

Received: 23 February 2016 Accepted: 28 June 2016 Published: 15 July 2016

#### Citation:

Wu L, Yuan S, Huang L, Sun F, Zhu G, Li G, Fahad S, Peng S and Wang F (2016) Physiological Mechanisms Underlying the High-Grain Yield and High-Nitrogen Use Efficiency of Elite Rice Varieties under a Low Rate of Nitrogen Application in China. Front. Plant Sci. 7:1024. doi: 10.3389/fpls.2016.01024

Selecting rice varieties with a high nitrogen (N) use efficiency (NUE) is the best approach to reduce N fertilizer application in rice production and is one of the objectives of the Green Super Rice (GSR) Project in China. However, the performance of elite candidate GSR varieties under low N supply remains unclear. In the present study, differences in the grain yield and NUE of 13 and 14 candidate varieties with two controls were determined at a N rate of 100 kg ha<sup>−1</sup> in field experiments in 2014 and 2015, respectively. The grain yield for all of the rice varieties ranged from 8.67 to 11.09 t ha<sup>−1</sup>, except for a japonica rice variety YG29, which had a grain yield of 6.42 t ha<sup>−1</sup>. HY549 and YY4949 produced the highest grain yield, reflecting a higher biomass production and harvest index in 2014 and 2015, respectively. Total N uptake at maturity (TNPM) ranged from 144 to 210 kg ha<sup>−1</sup>, while the nitrogen use efficiency for grain production (NUEg) ranged from 35.2 to 62.0 kg kg<sup>−1</sup>. Both TNPM and NUEg showed a significant quadratic correlation with grain yield, indicating that it is possible to obtain high grain yield and NUEg with the reduction of TNPM. The correlation between N-related parameters and yield-related traits suggests that promoting pre-heading growth could increase TNPM, while high biomass accumulation during the grain filling period and large panicles are important for a higher NUEg. In addition, there were significant and negative correlations between the NUEg and N concentrations in leaf, stem, and grain tissues at maturity. Further improvements in NUEg require a reduction in the stem N concentration but not the leaf N concentration. The daily grain yield was the only parameter that significantly and positively correlated with both TNPM and NUEg. This study determined variations in the grain yield and NUE of elite candidate GSR rice varieties and provided plant traits that could be used as selection criteria in breeding N-efficient rice varieties.

Keywords: daily grain yield, Green Super Rice, grain yield, nitrogen use efficiency

## INTRODUCTION

Rice is one of the staple food crops for approximately half of the global population (Godfray et al., 2010), and rice production must increase by 70% by 2050 to satisfy the requirements of the growing world population (Koning et al., 2008; Godfray et al., 2010). Moreover, increased rice production needs to be achieved under the pressures of decreased arable land area, global climate change (Peng et al., 2004), intensified natural disasters (Tao et al., 2013), and the frequent occurrence of diseases and pests (Sheng et al., 2003). Therefore, it is imperative to develop new varieties that have a higher yield potential and improved adaptation to the environment.

Yield potential is defined as the yield of a cultivar grown in environments to which it is adapted, with nutrients and water non-limiting and with pests, diseases, weeds, lodging, and other stresses effectively controlled (Evans and Fischer, 1999). Cassman (1999) provided a more functional definition of yield potential, suggesting that this parameter is the yield obtained when an adapted cultivar is grown with the minimal possible stress, which is achieved by using the best management practices. In rice, yield potential has been significantly increased through the utilization of semi-dwarf genes, heterosis, and the combination of intersubspecific heterosis and new plant types (Peng et al., 2008). In 2014, the elite super hybrid rice Y-Liang-You900 (YLY900) showed a record high yield of 15.4 t ha<sup>−1</sup> (Li et al., 2014). However, the main dilemma is that the high yield potential of new varieties was achieved with surplus nutrient application, implying that farmers would need to apply more fertilizer than the minimum required to produce the highest grain yield in rice production (Peng et al., 2002). The performance of these newly developed high-yielding varieties under low nutrient input remains unclear.

Nitrogen (N) is a vital element for plant development and growth, and the application of N fertilizer could significantly increase yield formation (Andrews et al., 2013). From 1960 to 2012, the global N fertilizer consumption increased by 800% and the annual N consumption in China increased from 8 to 35% of the world's N consumption (data from IFA). In China, the average rate of N application in rice production is ∼180 kg ha−<sup>1</sup> , which is 75% higher than the world average rate (Peng et al., 2002, 2006). High N fertilizer input leads to low nitrogen use efficiency (NUE) due to the rapid N losses from ammonia volatilization, denitrification, surface runoff, and leaching in the soil-floodwater system (Vlek and Byrnes, 1986; De Datta and Buresh, 1989). A low NUE results in significant environmental pollution, such as soil acidification (Guo et al., 2010), air pollution (Smil, 1999), and water eutrophication (Diaz and Rosenberg, 2008). To increase the NUE in rice production, scientists have developed a range of optimized crop management practices, such as Site-Specific N Management (SSNM, Dobermann et al., 2002), Real-Time N Management (RTNM, Peng et al., 2006), the San-Ding Cultivation Method (SDCM, Zou et al., 2006), and "Three Controls" Nutrient Management Technology (TCNM, Zhong et al., 2007).

One potential approach to reduce N fertilizer application in rice production is the development of varieties with an improved NUE (Sun et al., 2014; Hu et al., 2015). Variations in the NUE of different rice genotypes were determined, and NUE-related traits were evaluated for their accuracy in reflecting genotypic variation in rice NUE, in studies published from 1987 to 2003 (Broadbent et al., 1987; De Datta and Broadbent, 1988, 1993; Tirol-Padre et al., 1996; Singh et al., 1998; Inthapanya et al., 2000; Ntanos and Koutroubas, 2002; Koutroubas and Ntanos, 2003). These studies reported significant differences in N uptake capacity and N use efficiency for grain production (NUEg), and suggested candidate parameters reflecting NUE variation, such as WP/Nt (panicle weight/total N uptake) and NPI (the product of grain yield at zero N treatment and NUEg), among others. Since 2003, there have been only a few studies on NUE variation in rice. Recently, Ju et al. (2015) compared the grain yield of two N-efficient varieties and two N-inefficient varieties under low N input conditions, reporting that a high grain yield at a low N rate was associated with deeper roots, increased root oxidation activity, and a higher photosynthetic NUE. However, NUE differences among newly developed elite varieties under low N input conditions have not been studied.

Zhang (2007) proposed strategies for developing Green Super Rice (GSR) to meet the challenges in rice production. In 2010, the Ministry of Science and Technology of China launched a mega project to develop GSR as proposed by Zhang (2007). One main aspect in this project is to decrease N fertilizer application in rice production through the genetic development of N-efficient varieties. Thus, the objectives of the present study were to evaluate the grain yield and NUE of the newly developed candidate GSR varieties from different breeding institutes under low N supply and to examine the physiological mechanisms underlying the differences in NUE.

## MATERIALS AND METHODS

#### Plant Materials

In 2014, 13 candidate GSR varieties were grown in the middle season with YLY6 (a super hybrid variety) and HHZ (a potential GSR) as control varieties. In 2015, 14 new candidate GSR varieties were grown in the middle season with YLY6 and HHZ as control varieties. HY549 and HLY630 were used in both years. All of the candidate GSR varieties were developed in recent years, achieving a high grain yield in local variety tests, while the two control varieties, YLY6 and HHZ, were widely planted in South China in the last decade. Detailed information concerning these varieties is shown in **Table 1**.

#### Experimental Design

Field experiments were conducted in Zhougan Village (2014) and Zhangbang Village (2015) of Wuxue County, Hubei Province, China. Prior to the experiments, soil samples from the upper 20 cm layer were collected to analyze the soil chemical properties. In 2014, the soil had a clay loam texture with a pH of 5.60, organic matter of 27.18 g kg<sup>−1</sup>, total N of 1.83 g kg<sup>−1</sup>, available P of 4.91 mg kg<sup>−1</sup>, and available K of 105.8 mg kg<sup>−1</sup>, while in 2015, the soil had a clay loam texture with a pH of 5.20, organic matter of 26.69 g kg<sup>−1</sup>, total N of 1.19 g kg<sup>−1</sup>, available P of 22.56 mg kg<sup>−1</sup>, and available K of 159.2 mg kg<sup>−1</sup>. The data for daily rainfall, solar radiation, and minimum and maximum temperatures during the rice growing season were collected at a meteorological station (CR800, Campbell Scientific Inc., Logan, Utah, USA) near the fields, and are shown in **Figure 1**.

The experiment was arranged in a randomized complete block design with four replications. The seedlings were raised in a seedbed, with sowing dates of May 23, 2014 and May 25, 2015. Twenty-five-day-old seedlings were transplanted on June 17, 2014 and June 19, 2015 at a hill spacing of 20.0 × 20.0 cm with two seedlings per hill; the plot size was 30 m<sup>2</sup> in 2014 and 25 m<sup>2</sup> in 2015. In both years, the basal fertilizers were manually broadcast and incorporated 1 day before transplanting (40 kg N ha<sup>−1</sup> as urea, 40 kg P ha<sup>−1</sup> as calcium superphosphate, 50 kg K ha<sup>−1</sup> as potassium chloride, and 5 kg Zn ha<sup>−1</sup> as zinc sulfate heptahydrate). Nitrogen topdressings were applied at mid-tillering (20 kg ha<sup>−1</sup>) and panicle initiation (PI; 40 kg ha<sup>−1</sup>), and K was topdressed at PI at a rate of 50 kg ha<sup>−1</sup> in both years. To minimize seepage between the plots, all of the bunds were covered with plastic film, which was installed to a depth of 20 cm below the soil surface. A water depth of 5–10 cm was maintained until 7 days prior to maturity, when the fields were drained. The weeds were controlled manually and using herbicides. Pests and diseases were controlled using insecticides and fungicides; no obvious water, weed, pest, or disease stresses were observed during the experiment.
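The planting density and the N schedule described above can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function names are hypothetical, and the 46% N content of urea is an assumption (a common commercial grade), not a value reported in the study.

```python
# Illustrative arithmetic for the planting density and N schedule in the text.
# Function names are hypothetical; UREA_N_FRACTION is an assumed fertilizer grade.

N_SPLITS_KG_HA = {
    "basal": 40.0,               # applied 1 day before transplanting
    "mid-tillering": 20.0,
    "panicle initiation": 40.0,
}
UREA_N_FRACTION = 0.46           # assumption: urea containing 46% N by mass


def hills_per_m2(row_cm: float, col_cm: float) -> float:
    """Hills per square metre for a rectangular hill spacing given in cm."""
    return 10_000.0 / (row_cm * col_cm)


def total_n_rate(splits: dict) -> float:
    """Seasonal N rate (kg N ha^-1) as the sum of all split applications."""
    return sum(splits.values())


def urea_per_plot_kg(n_rate_kg_ha: float, plot_area_m2: float,
                     n_fraction: float = UREA_N_FRACTION) -> float:
    """Mass of urea (kg) needed to supply n_rate_kg_ha on a single plot."""
    n_per_plot_kg = n_rate_kg_ha * plot_area_m2 / 10_000.0
    return n_per_plot_kg / n_fraction


if __name__ == "__main__":
    print(hills_per_m2(20.0, 20.0))
    print(total_n_rate(N_SPLITS_KG_HA))
    print(round(urea_per_plot_kg(40.0, 30.0), 3))
```

With the 20.0 × 20.0 cm spacing this corresponds to 25 hills m<sup>−2</sup>, and the three splits sum to the 100 kg N ha<sup>−1</sup> rate stated in the abstract.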

#### Crop Measurements

Twelve hills were sampled from each plot at mid-tillering (MT), PI, and heading (HD). Plant height and stem (main stems plus tillers) numbers were recorded. A tiller with at least one leaf was counted as a stem. The plant samples were separated into leaf blade (leaf), culm plus sheath (stem), and panicle. The green leaf area was measured using a leaf area meter (LI-3100, LI-COR, Lincoln, NE, USA) and was expressed as the leaf area index (LAI). The specific leaf weight (SLW) was defined as the ratio of the leaf dry weight to leaf area. The dry weight of each component was determined after oven drying at 80°C to a constant weight. The plant dry weight was the sum of all of the aboveground components.
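As a rough illustration of how LAI and SLW follow from a 12-hill sample, the sketch below converts the measured leaf area to a ground-area basis. The function names and the chosen units (cm<sup>2</sup> for leaf area, mg cm<sup>−2</sup> for SLW) are assumptions made for this example; the study does not specify its calculation procedure beyond the definitions above.

```python
# Sketch of LAI and SLW from a 12-hill sample; names and units are assumptions.

def sampled_ground_area_m2(n_hills: int = 12, row_cm: float = 20.0,
                           col_cm: float = 20.0) -> float:
    """Ground area (m^2) occupied by the sampled hills at the given spacing."""
    return n_hills * row_cm * col_cm / 10_000.0


def leaf_area_index(green_leaf_area_cm2: float, ground_area_m2: float) -> float:
    """LAI: green leaf area per unit ground area (m^2 m^-2)."""
    return (green_leaf_area_cm2 / 10_000.0) / ground_area_m2


def specific_leaf_weight(leaf_dry_weight_g: float, leaf_area_cm2: float) -> float:
    """SLW in mg cm^-2: leaf dry weight divided by leaf area."""
    return 1000.0 * leaf_dry_weight_g / leaf_area_cm2
```

Twelve hills at 20.0 × 20.0 cm occupy 0.48 m<sup>2</sup> of ground, so the leaf area measured on the sample is divided by that area to obtain LAI.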

At physiological maturity (PM), 12 hills were obtained from each subplot to determine the aboveground total biomass and other yield components. Plant height and panicle number were obtained from 12 hills. The plant samples were separated into leaf, stem, and panicle. The dry weight of straw was determined after oven drying at 80°C to a constant weight. The panicles were hand-threshed, and the filled spikelets were separated from unfilled spikelets by submerging them in tap water. The empty spikelets were separated from the half-filled spikelets by winnowing. Three 30-g subsamples of filled spikelets, three 2-g subsamples of empty spikelets, and the total number of half-filled spikelets were obtained to quantify the number of spikelets per m<sup>2</sup>. The dry weights of rachis, filled, half-filled, and unfilled spikelets were determined after oven drying at 80°C to a constant weight. The aboveground total biomass was calculated as the total dry matter of straw, rachis, and filled, half-filled, and unfilled spikelets. The spikelets per panicle, grain filling percentage (100 × filled spikelet number/total spikelet number), and harvest index (HI) (100 × filled spikelet weight/aboveground total biomass) were calculated. The grain yield was determined from a 5-m<sup>2</sup> area in each subplot and was adjusted to a standard moisture content of 0.14 g H<sub>2</sub>O g<sup>−1</sup> fresh weight. The grain moisture content was measured with a digital moisture tester (DMC-700, Seedburo, Chicago, IL, USA).
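The yield-component formulas given in this paragraph can be written out as a short sketch. The grain filling percentage and harvest index follow the expressions in the text; the moisture adjustment uses the conventional dry-matter-conserving formula, which is an assumption here because the study does not state the formula it used. All function names are hypothetical.

```python
# Sketch of the yield-component calculations; function names are hypothetical.

def grain_filling_percentage(filled: float, half_filled: float,
                             unfilled: float) -> float:
    """100 x filled spikelet number / total spikelet number.

    Assumption: total = filled + half-filled + unfilled spikelets.
    """
    return 100.0 * filled / (filled + half_filled + unfilled)


def harvest_index(filled_spikelet_weight: float,
                  aboveground_biomass: float) -> float:
    """HI (%) = 100 x filled spikelet weight / aboveground total biomass."""
    return 100.0 * filled_spikelet_weight / aboveground_biomass


def adjust_to_standard_moisture(fresh_weight: float, moisture: float,
                                standard: float = 0.14) -> float:
    """Rescale grain weight to the standard moisture content.

    Assumption: moisture is expressed on a fresh-weight basis (g H2O per g
    fresh weight) and grain dry matter is conserved by the adjustment.
    """
    return fresh_weight * (1.0 - moisture) / (1.0 - standard)


def yield_t_per_ha(grain_weight_kg: float, harvested_area_m2: float = 5.0) -> float:
    """Convert grain weight from the harvested area to t ha^-1."""
    return grain_weight_kg / harvested_area_m2 * 10.0  # kg m^-2 -> t ha^-1
```

For instance, grain harvested at a moisture content above 0.14 is scaled down, and grain drier than 0.14 is scaled up, so that plots harvested at different moisture contents remain comparable.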

The tissue N concentration of each component at HD and PM was determined using an Elemental analyzer (Elementar vario MAX CNS/CN, Elementar Trading Co., Ltd, Germany). The plant N accumulation at HD and PM was calculated as the sum of N in each of the aboveground components. The nitrogen use efficiency for grain production (NUEg) was calculated as the grain yield per unit plant N accumulation. The nitrogen use efficiency in biomass production (NUEb) was determined as the ratio of biomass production to plant N accumulation. The nitrogen harvest index (NHI) was calculated as the percentage of accumulated N in grain to plant N accumulation (Peng et al., 1996).
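The three efficiency indices defined above reduce to simple ratios. The following sketch computes them and also checks the identity used later in the Discussion, namely that NUEg equals NHI (as a fraction) divided by the grain N concentration. The function names are hypothetical, and the identity holds only if grain yield and grain N are expressed on the same weight basis, which is assumed here.

```python
# Sketch of the N efficiency indices; function names are hypothetical.

def nue_indices(grain_yield: float, biomass: float,
                grain_n: float, plant_n: float):
    """Return (NUEg, NUEb, NHI) from per-hectare quantities in kg ha^-1.

    NUEg = grain yield / plant N accumulation        (kg kg^-1)
    NUEb = biomass / plant N accumulation            (kg kg^-1)
    NHI  = 100 x grain N / plant N accumulation      (%)
    """
    nueg = grain_yield / plant_n
    nueb = biomass / plant_n
    nhi = 100.0 * grain_n / plant_n
    return nueg, nueb, nhi


def nueg_from_nhi(nhi_percent: float, grain_n_concentration: float) -> float:
    """NUEg recovered as (NHI / 100) / grain N concentration (kg N per kg grain).

    Assumption: grain yield and grain N refer to the same weight basis.
    """
    return (nhi_percent / 100.0) / grain_n_concentration
```

The identity follows because grain N equals grain yield multiplied by the grain N concentration, so dividing NHI by that concentration leaves grain yield over plant N accumulation.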

#### Statistical Analysis

The data were analyzed using analysis of variance, and the mean values among the varieties were compared based on the least significant difference (LSD) test at the 0.05 probability level.
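A minimal sketch of this kind of analysis is given below, assuming a randomized block model in which each variety has one plot value per replication. This is not the authors' software or script, which the study does not describe; the function name is hypothetical, and the critical t value must be supplied by the user (for example, from a t table at the 0.05 level for the error degrees of freedom).

```python
# Sketch of a block-design ANOVA error term and the LSD; not the authors' code.

def rcbd_lsd(data: dict, t_crit: float):
    """Return (variety_means, mse, df_error, lsd).

    data maps each variety to a list of plot values, one per block, with the
    blocks listed in the same order for every variety.
    t_crit is the two-tailed critical t value for df_error (user supplied).
    """
    varieties = list(data)
    v = len(varieties)
    r = len(data[varieties[0]])
    grand = sum(sum(vals) for vals in data.values()) / (v * r)
    variety_means = {k: sum(vals) / r for k, vals in data.items()}
    block_means = [sum(data[k][j] for k in varieties) / v for j in range(r)]

    ss_total = sum((x - grand) ** 2 for vals in data.values() for x in vals)
    ss_variety = r * sum((m - grand) ** 2 for m in variety_means.values())
    ss_block = v * sum((m - grand) ** 2 for m in block_means)
    ss_error = max(ss_total - ss_variety - ss_block, 0.0)  # guard rounding

    df_error = (v - 1) * (r - 1)
    mse = ss_error / df_error
    lsd = t_crit * (2.0 * mse / r) ** 0.5
    return variety_means, mse, df_error, lsd
```

Two variety means would be declared different at the chosen probability level when the absolute difference between them exceeds the returned LSD.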

## RESULTS

#### Growth Duration

The growth duration ranged from 118 to 141 d in 2014 and from 119 to 144 d in 2015. For the majority of varieties, the growth duration ranged from 130 to 141 d in 2014 and from 137 to 144 d in 2015 (**Table 2**). The days from sowing to flowering for these varieties ranged from 78 to 99 d in 2014 and from 84 to 101 d in 2015, and the grain filling period was from 33 to 47 d in 2014 and from 33 to 53 d in 2015. Generally, the growth duration of HY549 and HLY630 was similar in 2014 and 2015. The growth duration of HHZ in 2014 was longer than that in 2015, while the growth duration of YLY6 in 2014 was shorter than that in 2015. Notably, YY4949 had the shortest growth period prior to flowering but had the longest growth period after flowering in 2015 (**Table 2**).

#### Grain Yield and Yield Components

The grain yield ranged from 6.42 to 10.41 t ha<sup>−1</sup> in 2014 (**Table 3**). HY549 produced the highest grain yield, and YG29 produced the lowest grain yield. No significant difference in grain yield was observed between YLY6 and HHZ. Compared with the two controls, HY549, HYL858, JKY651, 9Y6H, HLY630, and RFY41 produced a significantly superior grain yield. The grain yields of RY225, WYH1573, QY982, and GLY5 were similar to those of the two controls. HY73, ZZ14, and YG29 produced a significantly lower grain yield compared with the two controls (**Table 3**). In 2015, the grain yield ranged from 8.96 to 11.09 t ha<sup>−1</sup>. Notably, a higher grain yield was observed in YY4949, HY549, and CY5727 than in either HHZ or YLY6. WSSM generated a significantly lower grain yield than HHZ. The average grain yield of HHZ, YLY6, HY549, and HLY630 was 9.82 t ha<sup>−1</sup> in 2014 and 10.13 t ha<sup>−1</sup> in 2015 (**Table 3**), and analysis of variance indicated that the difference in the average grain yield of the four common varieties between 2014 and 2015 was not statistically significant.

#### TABLE 2 | Growth duration of the varieties in 2014 and 2015 at Wuxue County, Hubei Province, China.

In 2014, the higher grain yields of HY549, HYL858, JKY651, and 9Y6H primarily reflected the higher biomass production (**Table 3**). HY549 had a smaller number of large panicles, resulting in a significantly larger sink size (spikelets m<sup>−2</sup>) compared with the other varieties. In 2015, the higher grain yield of YY4949 resulted from a higher harvest index, while the higher biomass production of HY549 and CY5727 contributed to the yield advantage of these two varieties. YY4949 and HY549 had a larger sink size than the other varieties. The higher grain yields of HHZ, YLY6, HY549, and HLY630 in 2015 reflected the higher biomass production (**Table 3**).

#### Nitrogen Uptake and Use Efficiency

Significant differences were observed among the varieties for total N uptake at the heading stage (TNHD), total N uptake during the grain filling period (TNGF), total N uptake at maturity (TNPM), nitrogen use efficiency for grain production (NUEg), nitrogen use efficiency for biomass production (NUEb), and NHI in 2014 and 2015. The TNPM ranged from 144 to 172 kg ha<sup>−1</sup> in 2014 and from 158 to 210 kg ha<sup>−1</sup> in 2015. The NUEg of the varieties ranged from 35.2 to 62.0 kg kg<sup>−1</sup> in 2014 and from 43.1 to 58.4 kg kg<sup>−1</sup> in 2015. The TNPM of HHZ, YLY6, HY549, and HLY630 in 2015 was higher than that in 2014, resulting in a lower NUEg for these varieties in 2015 than that in 2014 (**Table 4**).

In both years, the leaf N concentration was significantly higher than that in the stem and panicle at the heading stage, while the grain N concentration was the highest at maturity only in 2014. At the maturity stage in 2015, the N concentration in the leaf was similar to that in the grain. However, HY549 had the lowest leaf N concentration at maturity in both 2014 and 2015, while YY4949 had the highest N concentration in the leaf and stem at the heading stage in 2015 (**Table 5**).

#### TABLE 4 | Nitrogen uptake at the heading stage (TNHD), nitrogen uptake during grain filling period (TNGF), nitrogen uptake at maturity (TNPM), nitrogen use efficiency for grain production (NUEg), nitrogen use efficiency for biomass production (NUEb), and nitrogen harvest index (NHI) of the varieties in 2014 and 2015 at Wuxue County, Hubei Province, China.

#### Relationship between NUE and Growth Traits

The data for the varieties in 2014 and 2015 were used for correlation analyses to examine the relationship between the NUE-related parameters and growth analyses (**Figures 2**, **3**; **Table 6**). A significant quadratic relationship was observed between the grain yield and TNPM, demonstrating that the grain yield was augmented with an increase in TNPM until 180 kg ha−<sup>1</sup> (**Figure 2A**). The significant quadratic relationship between the grain yield and NUEg revealed that improvements in NUEg had no influence on the grain yield when NUEg was higher than 45 kg kg−<sup>1</sup> (**Figure 2B**). Improvements in the N uptake capacity and NUEg were accomplished through breeding for high biomass production and HI, respectively (**Figures 2C,D**). No significant relationship was observed between NUEg and TNPM (**Figure 3A**). NUEg was significantly and positively correlated with NUEb and NHI, but was negatively correlated with the N concentration in the grain, leaf, and stem (**Figures 3B–F**). The quadratic relationship between the NUEg and N concentration in leaf and stem suggested that improvements in NUEg were dependent on a decreased leaf N concentration when the NUEg value was lower than 50 kg kg−<sup>1</sup> , while further improvements in NUEg were dependent on a decreased stem N concentration (**Figures 3E,F**).
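The quadratic relationships described above can be illustrated with an ordinary least-squares fit of a second-order polynomial, from which the TNPM value at the fitted maximum is obtained as the vertex of the parabola. The sketch below is a generic implementation under stated assumptions: it is not the authors' fitting procedure, the function names are hypothetical, and any data passed to it in an example would need to come from the study's tables rather than from this text.

```python
# Generic least-squares quadratic fit; not the authors' fitting procedure.

def _det3(m):
    """Determinant of a 3 x 3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_quadratic(xs, ys):
    """Fit y = a + b*u + c*u**2 with u = x - mean(x); return (a, b, c, x_mean).

    Centring x keeps the normal equations well conditioned when x values are
    large (for example, N uptake in kg ha^-1).
    """
    n = len(xs)
    x_mean = sum(xs) / n
    us = [x - x_mean for x in xs]
    s1 = sum(us)
    s2 = sum(u ** 2 for u in us)
    s3 = sum(u ** 3 for u in us)
    s4 = sum(u ** 4 for u in us)
    t0 = sum(ys)
    t1 = sum(u * y for u, y in zip(us, ys))
    t2 = sum(u * u * y for u, y in zip(us, ys))

    normal = [[n, s1, s2], [s1, s2, s3], [s2, s3, s4]]
    rhs = [t0, t1, t2]
    det = _det3(normal)
    coeffs = []
    for k in range(3):  # Cramer's rule: replace column k with the right-hand side
        mk = [row[:] for row in normal]
        for r in range(3):
            mk[r][k] = rhs[r]
        coeffs.append(_det3(mk) / det)
    a, b, c = coeffs
    return a, b, c, x_mean


def vertex_x(b: float, c: float, x_mean: float) -> float:
    """x value at which the fitted parabola reaches its maximum or minimum."""
    return x_mean - b / (2.0 * c)
```

A negative fitted coefficient `c` indicates a curve that rises and then levels off or declines, which is the pattern reported for grain yield against TNPM in **Figure 2A**.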

Many growth traits were significantly and positively correlated with TNPM, such as the total growth duration, daily grain yield, plant height at heading, leaf area index at heading, crop growth rate before heading, biomass at heading, and grain filling percentage. However, the biomass during the grain-filling period and the panicle N concentration at heading were negatively correlated with TNPM (**Table 6**). A significant and positive correlation was observed between the NUEg and biomass during the grain-filling period, spikelets per panicle, and daily grain yield. Most of the growth parameters affected the NUEb or NHI (**Table 6**).
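Correlations of the kind summarized in **Table 6** can be sketched with a plain correlation coefficient and its t statistic. The study does not state which correlation coefficient was used, so the Pearson coefficient below is an assumption, and the function names are hypothetical.

```python
# Sketch of a correlation coefficient and its t statistic; Pearson's r is an
# assumption, as the study does not name the coefficient it used.
import math


def pearson_r(xs, ys) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r: float, n: int) -> float:
    """t value for testing r against zero, with n - 2 degrees of freedom."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

The t value would be compared with the critical value at the 0.05 or 0.01 probability level to assign the significance marks used in the table.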

## DISCUSSION

#### Intervarietal Difference in NUE

Variations in rice NUE have been studied since the research by Broadbent et al. (1987), who reported significant differences in the NUE of 24 rice genotypes at IRRI. Thereafter, many studies were conducted to examine the rice NUE, showing that TNPM and NUEg ranged from 48 to 130 kg ha<sup>−1</sup> and 35 to 79 kg kg<sup>−1</sup>, respectively, under irrigated lowland conditions (Tirol-Padre et al., 1996; Singh et al., 1998). Under rainfed lowland conditions, Inthapanya et al. (2000) showed that TNPM ranged from 25.7 to 40.4 kg ha<sup>−1</sup> and NUEg from 55.1 to 83.8 kg kg<sup>−1</sup> for 16 genotypes under a N fertilizer rate of 60 kg ha<sup>−1</sup>. Under Mediterranean direct water-seeded conditions, Koutroubas and Ntanos (2003) observed that the NUEg ranged from 60.9 to 90.9 kg kg<sup>−1</sup> for two indica and three japonica rice varieties at an N fertilizer rate of 150 kg ha<sup>−1</sup>. In wheat, significant differences in NUE have been examined (Le Gouis et al., 2000). The TNPM and NUEg in wheat ranged from 31 to 264 kg ha<sup>−1</sup> and 27 to 77 kg kg<sup>−1</sup>, respectively, depending on the N rate, variety, and year (Barraclough et al., 2010; Gaju et al., 2011; Bingham et al., 2012).

#### TABLE 5 | Nitrogen concentration in various plant organs of the varieties at the heading stage and maturity in 2014 and 2015 at Wuxue County, Hubei Province, China.

In a previous study, we observed that TNPM ranged from 138 to 248 kg ha<sup>−1</sup>, and NUEg ranged from 28.8 to 58.4 kg kg<sup>−1</sup> for 14 rice mega varieties developed in different eras. In that study, both TNPM and NUEg were significantly enhanced through the advancements in genetic breeding (Zhu et al., 2016). In the present study, the TNPM of the elite varieties ranged from 144 to 210 kg ha<sup>−1</sup>, which was higher than the values observed for rice at a similar N rate, as previously discussed. NUEg ranged from 35.2 to 62.0 kg kg<sup>−1</sup>, which is consistent with the findings of Tirol-Padre et al. (1996) and Singh et al. (1998), but was lower than the findings of Koutroubas and Ntanos (2003). Koutroubas and Ntanos (2003) reported a grain yield ranging from 6.0 to 8.3 t ha<sup>−1</sup>; thus, the relatively high NUEg reflected a lower TNPM, which ranged from 76.2 to 124.2 kg ha<sup>−1</sup>. Notably, the grain yield in the present study was significantly higher than the values reported in all of the previous studies, indicating that it is feasible to simultaneously achieve high yield and high efficiency.

#### TABLE 6 | Correlation between NUE-related parameters and growth-related parameters.

\* and \*\* indicate significance at the 0.05 and 0.01 probability levels, respectively.

#### Relationship between Grain Yield and NUE

Broadbent et al. (1987) evaluated the stability of nine NUE-related parameters using the <sup>15</sup>N labeling method to rank the genotypes across different seasons, and De Datta and Broadbent (1988) further tested these methods without using isotopically labeled fertilizer to reflect genotypic variations in NUE. These studies showed that the yield and GW/Nt were the most stable parameters reflecting genotypic differences in the NUE of rice. In the present study, the grain yield and daily grain yield were significantly and positively correlated with both TNPM and NUEg (**Figure 2**, **Table 6**; Samonte et al., 2006). This finding is consistent with the evidence that genetic improvements in the yield potential improve both TNPM and NUEg (Fischer, 1981; Bingham et al., 2012; Zhu et al., 2016). However, the correlations between the grain yield and TNPM and between the grain yield and NUEg were quadratic (**Figures 2A,B**; Cassman et al., 1993; Singh et al., 1998). This finding indicated that the increase in grain yield through an increase in TNPM is marginal when TNPM is higher than 150 kg ha<sup>−1</sup>, so that limiting TNPM is likely to improve NUEg while maintaining a high grain yield.

#### Plant Traits Related to NUE

The N uptake efficiency accounted for 64% of the variation in the NUE at zero N rate, while the NUEg was more significant at a higher N rate (Le Gouis et al., 2000). Gaju et al. (2011) also demonstrated the association between the N uptake efficiency and showed that NUE increased with increasing N limitation. Thus, breeders should select varieties with a high N uptake efficiency for low-yield crops, and varieties with high NUEg for high-yield crops, although it is possible to simultaneously improve the N uptake efficiency and NUEg (**Figure 3A**; Moll
et al., 1982). The following plant traits were associated with TNPM and NUEg. TNPM could be estimated from primary plant parameters, as this measurement was significantly correlated with the tiller number, spikelet number, main culm panicle node number reflecting the potential tillers and leaves of a plant (Singh et al., 1998; Samonte et al., 2006). Moreover, Singh et al. (1998) observed that varieties with long growth durations had higher TNPM values compared with varieties with medium growth durations. In addition, deeper roots and greater root oxidation activities are important for N uptake at low N rates in both rice and wheat (Foulkes et al., 2009; Worku et al., 2012; Ju et al., 2015). In the present study, TNPM was significantly and positively correlated with total growth duration, plant height at heading, leaf area index at heading, crop growth rate before heading, biomass at heading, and grain filling percentage, but it was negatively correlated with biomass accumulation during the grain filling period and panicle N concentration at heading (**Table 6**). Consequently, genetically promoting plant growth prior to heading is important for improvements in the TNPM at low N rates.

N utilization efficiency is dependent on the N efficiency of biomass formation, the effect of N on carbohydrate partitioning, nitrate reduction efficiency, and remobilization of N from senescent tissues and storage functions (Foulkes et al., 2009). NUEg was significantly correlated with HI (**Figure 2D**), as HI was positively and significantly correlated with the dry matter translocation efficiency (Ntanos and Koutroubas, 2002). Mathematically, NUEg is equal to the ratio of the NHI and grain N concentration; thus, the NUEg was positively and significantly correlated with the NHI but was negatively correlated with the grain N concentration (**Figures 3C,D**). In the present study, the NHI ranged from 57.5 to 75.0%, which is consistent with the values reported in the studies of Tirol-Padre et al. (1996) and Singh et al. (1998). Consequently, it might be possible to further increase the NHI of rice to some extent (Sinclair and Vadez, 2002). Significant negative correlations between the grain
N concentration and NUEg have been widely demonstrated among different genotypes in rice and wheat (Singh et al., 1998; Inthapanya et al., 2000; Koutroubas and Ntanos, 2003). Moreover, Cassman et al. (1993) demonstrated that a lower N content grain in rice than that in bread wheat contributes to a higher NUEg in rice, particularly at high yield levels. The straw N concentration explained a large percentage of the genotypic variation in NUEg in the studies of Singh et al. (1998) and Koutroubas and Ntanos (2003). In the present study, we further demonstrated that the variation in NUEg was dependent on the changes in the leaf N concentration at maturity at low NUEg levels, while further increases in NUEg resulted from decreases in the stem N concentration (**Figures 3E,F**). These results are consistent with the findings in wheat, suggesting that delayed leaf senescence is a key trait for increasing NUEg at low N supply (Foulkes et al., 2009; Gaju et al., 2011). Moreover, NUEg was significantly and positively correlated with biomass accumulation during the grain-filling period, spikelets per panicle and daily grain yield (**Table 6**).
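The statement that NUEg equals the ratio of NHI to grain N concentration can be checked numerically. The sketch below assumes the usual definitions (NUEg = grain yield / TNPM, NHI = grain N / TNPM, grain N concentration = grain N / grain yield); the plot-level values are hypothetical and serve only to illustrate the identity.

```python
def nue_g(grain_yield, tnpm):
    """Grain produced per unit of N taken up (kg grain per kg N)."""
    return grain_yield / tnpm


def nue_g_from_nhi(nhi, grain_n_conc):
    """The same quantity expressed as NHI divided by grain N concentration."""
    return nhi / grain_n_conc


# Hypothetical values, chosen only to illustrate the identity.
grain_yield = 9000.0   # kg grain per ha
tnpm = 180.0           # kg N per ha, total N in the plant at maturity
grain_n = 117.0        # kg N per ha contained in the grain

nhi = grain_n / tnpm                   # N harvest index (fraction)
grain_n_conc = grain_n / grain_yield   # kg N per kg grain

direct = nue_g(grain_yield, tnpm)
indirect = nue_g_from_nhi(nhi, grain_n_conc)
print(direct, indirect)
```

Because grain N cancels in the ratio, a higher NHI or a lower grain N concentration each raises NUEg, which matches the signs of the correlations reported above.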

In conclusion, the present study determined the genotypic variation in NUE among newly developed elite rice varieties in China and demonstrated that genetic improvements in the yield potential under high nutrient input conditions also increased the TNPM and NUEg at a low N supply. The quadratic correlation between the grain yield and TNPM and between the grain yield and NUEg suggests that a further increase in N uptake results in a small increase in grain yield when TNPM is above 160 kg ha<sup>−1</sup>, and it is possible to simultaneously achieve a high grain yield and high NUEg under low N supply. Improvements in the NUE are likely to occur with simultaneous increases in TNPM and NUEg through the improvements in the daily grain yield. Plant traits associated with the rapid crop growth rate prior to heading could be used to increase TNPM, while biomass accumulation and a large panicle are essential for improvements in NUEg. Moreover, further improvements in NUEg depend on the increase in the translocation of N from the stems to delay leaf senescence during the grain-filling period.

#### AUTHOR CONTRIBUTIONS

LW, SY, LH, and FS conducted the field experiments. SP and FW designed the experiments. SF revised the manuscript. LW analyzed the data, and FW drafted the manuscript.

#### FUNDING

This work was financially supported by grants from the National High Technology Research and Development Program of China (the 863 Project no. 2014AA10A605), the Bill and Melinda Gates Foundation (OPP51587), and the Fundamental Research Funds for the Central Universities (Project 2015BQ002).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Wu, Yuan, Huang, Sun, Zhu, Li, Fahad, Peng and Wang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Overexpression of OsDof12 affects plant architecture in rice (Oryza sativa L.)

#### Qi Wu<sup>1,2†</sup>, Dayong Li<sup>2†</sup>, Dejun Li<sup>3</sup>, Xue Liu<sup>2,4</sup>, Xianfeng Zhao<sup>2</sup>, Xiaobing Li<sup>2</sup>, Shigui Li<sup>1</sup>\* and Lihuang Zhu<sup>2</sup>\*

*<sup>1</sup> Rice Research Institute, Sichuan Agricultural University, Chengdu, China, <sup>2</sup> State Key Laboratory of Plant Genomics and National Center for Plant Gene Research, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China, <sup>3</sup> Key Laboratory of Biology and Genetic Resources of Rubber Tree, Rubber Research Institute, Chinese Academy of Tropical Agricultural Sciences, Danzhou, China, <sup>4</sup> Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing, China*

#### Edited by:

*Rodomiro Ortiz, Swedish University of Agricultural Sciences, Sweden*

#### Reviewed by:

*Chenglin Chai, Louisiana State University-Baton Rouge, USA; Hao Peng, Washington State University, USA*

#### \*Correspondence:

*Shigui Li, Rice Research Institute, Sichuan Agricultural University, No. 211 Huimin Road, Chengdu 611130, China lishigui@sicau.edu.cn; Lihuang Zhu, Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, No.1 West Beichen Road, Chaoyang District, Beijing 100101, China lhzhu@genetics.ac.cn*

> *† These authors have contributed equally to this work.*

#### Specialty section:

*This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science*

Received: *17 August 2015* Accepted: *23 September 2015* Published: *08 October 2015*

#### Citation:

*Wu Q, Li D, Li D, Liu X, Zhao X, Li X, Li S and Zhu L (2015) Overexpression of OsDof12 affects plant architecture in rice (Oryza sativa L.). Front. Plant Sci. 6:833. doi: 10.3389/fpls.2015.00833*

Dof (DNA binding with one finger) proteins, a class of plant-specific transcription factors, are involved in plant growth and developmental processes and stress responses. However, their biological functions remain to be elucidated, especially in rice (*Oryza sativa* L.). Previously, we reported that *OsDof12* can promote rice flowering under long-day conditions. Here, we further investigated other important agronomic traits of transgenic plants overexpressing *OsDof12* and found that overexpressing *OsDof12* could lead to reduced plant height, erect leaves, shortened leaf blades, and smaller panicles resulting from decreased numbers of primary and secondary branches. These results imply that *OsDof12* is involved in the formation of rice plant architecture. Furthermore, we performed a series of brassinosteroid (BR)-responsive tests and found that overexpression of *OsDof12* could also result in BR hyposensitivity. Of note, in WT plants the expression of *OsDof12* was up-regulated by BR treatment, while in *OsDof12* overexpression plants two positive BR signaling regulators, *OsBRI1* and *OsBZR1*, were significantly down-regulated, indicating that *OsDof12* may act as a negative BR regulator in rice. Taken together, our results suggest that overexpression of *OsDof12* could lead to altered plant architecture by suppressing BR signaling. Thus, *OsDof12* might be used as a new potential genetic regulator for future rice molecular breeding.

#### Keywords: OsDof12, Dof transcription factor, plant architecture, rice (Oryza sativa L.)

## Introduction

DOF (DNA binding with one finger) proteins are plant-specific transcription factors (Yanagisawa, 2002; Moreno-Risueno et al., 2007; Noguero et al., 2013). The Dof domain consists of 52 amino acid residues encompassing the CX<sub>2</sub>CX<sub>21</sub>CX<sub>2</sub>C motif (Yanagisawa, 2002; Umemura et al., 2004). Dof transcription factors, with an exception in pumpkin, usually regulate the expression of their target genes by binding a core DNA motif with the essential sequence AAAG (Yanagisawa and Schmidt, 1999). Dof proteins are widespread and versatile regulators of various biological processes in plants, such as metabolism regulation, phytohormone response, seed germination and development, photoperiodic flowering, and plant patterning (Yanagisawa, 2002; Noguero et al., 2013). In carbohydrate metabolism, ZmDof1 and ZmDof2 acted antagonistically to control the expression of C4 phosphoenolpyruvate carboxylase (PEPC) in maize (Zea mays) (Yanagisawa, 2000). In Arabidopsis (Arabidopsis thaliana), AtDof1.1/OBP2 participated in the regulation of indole glucosinolate biosynthesis (Skirycz et al., 2006). Overexpression of OsDof25 in Arabidopsis changed nitrogen and carbon metabolism (Santos et al., 2012). In tobacco (Nicotiana tabacum), NtBBF1 (rolB domain B Factor) can bind to the rolB promoter in an auxin-regulated way to modulate its expression, which improves our understanding of the mechanism underlying auxin induction (Baumann et al., 1999). In addition, several studies have documented that Dof genes are implicated in seed germination and development. The DOF gene Affecting Germination-1 (DAG1) and DOF gene Affecting Germination-2 (DAG2) controlled seed germination in Arabidopsis via a maternal switch (Papi et al., 2000; Gualberti et al., 2002).
In rice, OsDof3 might enhance the expression of type 3 carboxypeptidase (CPD) under GA control in aleurones (Washio, 2001), and further investigation indicated that OsDof3 interacts with GAMYB to synergistically regulate the expression of RAmy1A and thereby mediate GA signaling during rice seed germination (Washio, 2003). RPBF (rice prolamin box binding factor) acted in concert with the rice basic leucine zipper factor RISBZ1 to maintain proper expression of rice seed storage protein genes during seed development (Kawakatsu et al., 2009).
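The two sequence signatures mentioned above, the CX2CX21CX2C zinc-finger motif and the AAAG core binding site, can each be expressed as a simple pattern search. The following is a minimal sketch using regular expressions; the function names and the toy sequences are hypothetical and do not correspond to any real Dof protein or promoter.

```python
import re

# CX2CX21CX2C: four cysteines separated by 2, 21 and 2 arbitrary residues.
DOF_ZINC_FINGER = re.compile(r"C.{2}C.{21}C.{2}C")

# A lookahead reports overlapping sites as well; CTTT is the reverse
# complement of AAAG, i.e. an AAAG core on the opposite strand.
AAAG_SITE = re.compile(r"(?=(AAAG|CTTT))")


def find_dof_finger(protein):
    """Return (start, end) of the first CX2CX21CX2C match, or None."""
    m = DOF_ZINC_FINGER.search(protein.upper())
    return (m.start(), m.end()) if m else None


def find_aaag_sites(dna):
    """Return (position, strand) for every AAAG core on either strand."""
    return [(m.start(), "+" if m.group(1) == "AAAG" else "-")
            for m in AAAG_SITE.finditer(dna.upper())]


# Hypothetical toy sequences for illustration only.
toy_protein = "MK" + "C" + "PR" + "C" + "S" * 21 + "C" + "KA" + "C" + "RR"
toy_promoter = "GGAAAGTTCTTTAA"
print(find_dof_finger(toy_protein))
print(find_aaag_sites(toy_promoter))
```

A four-base core such as AAAG occurs frequently by chance, so a scan like this only lists candidate sites; it does not by itself indicate that a Dof factor binds there.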

Moreover, Dof factors are involved in photoperiodic flowering. In Arabidopsis, Cycling Dof Factor-1 (CDF1) binds to the CONSTANS (CO) and FLOWERING LOCUS T (FT) promoter regions to block transactivation of these two flowering genes, whereas this inhibition can be released under long-day conditions through degradation of CDF1 mediated by the GIGANTEA-FLAVIN-BINDING, KELCH REPEAT, F-BOX1 (GI-FKF1) complex (Imaizumi et al., 2005; Sawa et al., 2007; Song et al., 2012). Fornara et al. (2009) systematically studied a subset of the Dof family related to CDF1 and found that CDF1, CDF2, CDF3, and CDF5 acted redundantly to repress flowering by decreasing the mRNA level of CO. In rice, RDD1 (rice Dof daily fluctuations 1) was controlled by the circadian clock, and repressing the expression of RDD1 led to delayed flowering (Iwamoto et al., 2009). Furthermore, several studies have revealed the importance of Dof factors in plant patterning. In Arabidopsis, AtDof5.1 modulated leaf adaxial–abaxial polarity by binding to the promoter of Revoluta (REV) and enhancing its expression (Kim et al., 2010). AtDOF4.2 and AtDOF4.4 were engaged in regulating shoot branching and seed development (Skirycz et al., 2007; Zou et al., 2013). OBF-binding factor-1 (OBP1) is a cell cycle regulator, and overexpressing OBP1 in Arabidopsis affects cell size and number, resulting in a dwarf plant morphology (Skirycz et al., 2008).

To date, of the 36 predicted Dof homologous genes in the whole Arabidopsis genome, 16 have been confirmed to participate in various biological processes (Noguero et al., 2013). However, in rice, to our knowledge, studies deciphering Dof factors are rather limited; only 5 of the 30 predicted Dof members have been characterized in detail (Washio, 2003; Yamamoto et al., 2006; Li et al., 2008, 2009b; Iwamoto et al., 2009; Santos et al., 2012). In a previous study, we characterized the function of OsDof12 in rice (Li et al., 2008, 2009b). We found a pair of sense-antisense transcripts at the locus of OsDof12 (LOC\_Os03g07360), denoted OsDof12 (sense transcript) and OsDof12os (antisense transcript), respectively. OsDof12 encodes a nuclear-localized protein of 440 amino acids (Li et al., 2008). Expression pattern analysis showed that OsDof12 and OsDof12os were co-expressed but reciprocally regulated by each other (Li et al., 2008). Moreover, overexpression of OsDof12 promoted flowering in rice under long-day conditions by up-regulating the rice florigen-encoding gene Hd3a and its downstream gene OsMADS14 (Li et al., 2009b). In wild-type rice plants, OsDof12 transcripts were detected in various tissues at different developmental stages (Li et al., 2009b), which suggests that OsDof12 might take part in various biological processes.

In the current study, we demonstrate that transgenic plants overexpressing OsDof12 indeed display pleiotropic phenotypes such as reduced plant height, shortened leaf length, more erect leaves, smaller panicle size, and decreased grain yield. Further molecular analyses indicated that these changes in plant architecture could be ascribed to attenuated BR signaling in OsDof12 overexpression plants.

## Materials and Methods

#### Plant Materials and Growth Conditions

To observe the influence of OsDof12 on plant architecture, the wild-type cultivar Nipponbare (Oryza sativa L. ssp. japonica cv. Nipponbare) and the two previously reported OsDof12-overexpressing lines (Line OD2 and Line OD5, Nipponbare background) (Li et al., 2009b) were grown in the research field of the Experimental Stations of the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, under natural field conditions. At the mature stage (approximately 120 days after sowing), we measured agronomic traits including plant height, internode length, leaf length, and panicle architecture.

For BR sensitivity tests, plants were grown on 1/2 Murashige and Skoog (MS) culture medium in a growth chamber under controlled conditions (16 h light at 30 °C and 8 h dark at 26 °C) for 7 or 8 days. For skotomorphogenesis analysis, the WT and OD plants were grown on 1/2 MS at 28 °C in darkness for 2 weeks. For OsDof12 induction analysis, 2-week-old WT plants grown on 96-well PCR plates were treated with 100 nM brassinolide (BL, the most bioactive BR compound, which is not synthesized in rice; Yokota, 1997; Fujioka and Yokota, 2003; Kim et al., 2008; Vriet et al., 2013) solution, and whole plants were then sampled at different time points for the expression assay. For expression analysis of BR-related genes, 7-day-old WT and OD plants were used.

#### Microscopic Observation

The middle part of the second internode of mature plants was harvested and fixed with 2.5% glutaraldehyde solution overnight at 4 °C, and then dehydrated sequentially in 40, 50, 60, 70, 80, 90, 95, and 100% ethanol solutions. The samples were dried by critical point drying with liquid carbon dioxide. The dry specimens were mounted on a stub, gold-coated with an ion sputter coater (Hitachi, Tokyo, Japan), and then imaged with a scanning electron microscope (Hitachi, Tokyo, Japan).

#### BR Treatment

The seeds were dehusked, sterilized with 75% ethanol for 1 min and 3% NaClO for 25 min, and then washed five times with sterile water. The seeds were grown on solid 1/2 MS medium containing a series of concentrations of BL (WAKO, Japan) for 7 or 8 days. The second leaf sheath length, second leaf length, seedling height, root length, and coleoptile length were measured for further analyses.

#### BR Measurement

After sterilization, the seeds of WT and OD were sown and grown on 1/2 MS culture medium for 10 days, then about 1 gram of fresh seedlings were collected for measurement of Castasterone (CS, one of the biologically active BR compounds, a likely end product of brassinosteroid biosynthetic pathway in rice, Kim et al., 2008; Vriet et al., 2013) according to the method described previously (Ding et al., 2013).

#### Lamina Inclination Assay

The seeds were immersed for 2 days, and then the germinated seeds were sown on 1/2 MS supplied with different concentrations of BL. After incubation in a growth chamber for 8 days, the second lamina joints of WT and OD plants were imaged and measured with ImageJ according to the method described by Tong et al. (2009).
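Measuring a joint angle from an image reduces to computing the angle between two rays that meet at a vertex. The sketch below shows that calculation for three landmark points; the function name and pixel coordinates are hypothetical, and which rays define the reported lamina joint angle follows Tong et al. (2009) and is not restated here.

```python
import math


def angle_at_vertex(p1, vertex, p2):
    """Angle in degrees (0 to 180) between the rays vertex->p1 and vertex->p2."""
    a1 = math.atan2(p1[1] - vertex[1], p1[0] - vertex[0])
    a2 = math.atan2(p2[1] - vertex[1], p2[0] - vertex[0])
    deg = abs(math.degrees(a1 - a2)) % 360.0
    return 360.0 - deg if deg > 180.0 else deg


# Hypothetical pixel coordinates: a point on the leaf blade, the lamina
# joint itself, and a point on the leaf sheath.
blade, joint, sheath = (120.0, 40.0), (100.0, 100.0), (100.0, 200.0)
print(round(angle_at_vertex(blade, joint, sheath), 1))
```

Because the result depends only on directions from the vertex, it is unaffected by image scale, so no pixel-to-millimeter calibration is needed for angle measurements.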

#### RNA Extraction and qRT-PCR Analysis

The samples were harvested and stored in liquid nitrogen until RNA extraction. Total RNA was isolated with Trizol reagent (Invitrogen, California, USA) following the corresponding protocol. DNA was digested with DNase I (Takara, Japan), and first-strand cDNA was synthesized with the GoScript Reverse Transcription System (Promega, http://www.promega.com/). With a rice ubiquitin gene (UBQ, LOC\_Os03g13170) as the internal control, qRT-PCR was performed using EvaGreen qPCR MasterMix (Abm, Canada) on a real-time PCR system (Bio-Rad CFX96) with the specific primers listed in **Table S2**. The qRT-PCR program consisted of 95 °C for 3 min and 42 cycles of 95 °C for 5 s and 60 °C for 10 s. The relative expression level of each examined gene was quantified by a relative quantification method.
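The text does not specify which relative quantification method was used. A common choice is the 2<sup>−ΔΔCt</sup> calculation, sketched below under that assumption and under the further assumption of roughly 100% amplification efficiency for both primer pairs. The function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene by the 2^-ddCt calculation.

    Assumes about 100% amplification efficiency for both primer pairs,
    which is an assumption of this sketch and not stated in the paper.
    """
    d_ct_sample = ct_target_sample - ct_ref_sample      # normalize to reference gene
    d_ct_control = ct_target_control - ct_ref_control   # same for the control sample
    return 2.0 ** -(d_ct_sample - d_ct_control)


# Hypothetical Ct values: a target gene and the UBQ reference in an
# overexpression line (sample) and in wild type (control).
fold = relative_expression(26.0, 18.0, 25.0, 18.0)
print(fold)
```

A fold change below 1 indicates lower expression in the sample than in the control after normalization to the reference gene.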

## Results

#### Overexpression of OsDof12 Reduces Plant Height

In our previous study, we developed OsDof12 overexpressing lines in rice (hereafter named OD) (Li et al., 2009b). The OD lines, OD2 and OD5, and WT plants were grown in a paddy field in Beijing (40° 10′ N, 116° 42′ E) under natural field conditions. To investigate the effects of OsDof12 overexpression on rice agronomic traits, we traced the growth performance of OD and WT plants and found that the plant height of OD was almost the same as that of WT before the heading stage. However, once plant height had stabilized at the mature stage, WT plants reached approximately 83.7 cm on average, whereas OD plants were about 65.3 cm, roughly 20% shorter than the WT (**Figures 1A,C**). Rice plant height consists of the internode lengths and the panicle length. We measured the internode and panicle lengths of OD plants and compared them with those of WT plants. Statistical analysis showed that the individual internode and panicle lengths in OD were remarkably shorter than those in WT (**Figures 1B,D**, **2B**), which consequently led to the decreased plant height of OD plants. Comparing the length of each internode as a percentage of the total culm between OD and WT, we found that the relative lengths of the second and fourth internodes of OD were strongly and slightly reduced, respectively (**Figure 1E**).
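The size of the height reduction can be checked directly from the two reported means. The short calculation below uses only the 83.7 cm and 65.3 cm values given in the text; the function name is illustrative.

```python
def percent_reduction(reference, value):
    """Reduction of `value` relative to `reference`, in percent."""
    return (reference - value) / reference * 100.0


wt_height_cm = 83.7   # mean WT plant height reported in the text
od_height_cm = 65.3   # mean OD plant height reported in the text

print(round(percent_reduction(wt_height_cm, od_height_cm), 1))
```

The two means correspond to a reduction of roughly 22%, in line with the approximate 20% stated above.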

Internode elongation is determined by cell division activity in the intercalary meristem, followed by cell elongation in the elongation zone (Yamamuro et al., 2000). To investigate which of these two factors caused the dwarf morphology of OD, we observed the longitudinal cell morphology of the OD and WT internodes. After the heading stage, we collected and fixed the middle sections of the second internodes and then observed the cell length under a scanning electron microscope (SEM). As shown in **Figure 1F**, there was no obvious difference in longitudinal cell length between WT and OD; thus, a reduction in longitudinal cell number in the elongation zone may account for the shortening of the internodes in OD.
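The inference from unchanged cell length to reduced cell number rests on a simple relation: along the axis, internode length is approximately the number of cells multiplied by their mean length. The sketch below makes that step explicit; the ratios are hypothetical and are not measurements from this study.

```python
def relative_cell_number(length_ratio, cell_length_ratio):
    """Ratio of longitudinal cell numbers (OD/WT) implied by
    internode length = cell number x mean cell length."""
    return length_ratio / cell_length_ratio


# Hypothetical ratios (OD relative to WT): the internode is 30% shorter
# while the mean cell length is unchanged.
print(relative_cell_number(0.70, 1.0))
```

If the cell length ratio is close to 1, the whole reduction in internode length is attributed to cell number under this simplified model.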

#### OsDof12 Overexpression Plants Produce Smaller Panicles

Besides plant height, we also analyzed the panicle structure and seed size of OD plants. As shown in **Figures 2A,B**, the panicles of OD were shorter and smaller than those of WT. We further investigated panicle traits, including the numbers of primary branches, secondary branches, and total spikelets, and found that the number of primary branches in OD was remarkably reduced (**Figure 2C**) and that OD plants produced significantly fewer secondary branches (**Figure 2D**). As expected, there was a highly significant reduction in total spikelet number per panicle in OD compared with WT (**Figure 2E**). Nevertheless, no obvious differences in seed size or weight were found between OD and WT (**Figure S1**). These results imply that overexpression of OsDof12 could lead to a smaller panicle with fewer primary branches, secondary branches, and total spikelets.

#### Abnormal Leaf Morphology in OsDof12 Overexpression Plants

Aside from the alterations in plant height and panicle structure, OD also exhibited other characteristic phenotypes. In comparison with WT plants, OD plants had a more compact plant architecture with more erect leaves (**Figure 1A**). As shown in **Figure 3**, the leaf joint angle of OD plants was clearly smaller than that of WT plants. Moreover, we measured the lengths of the flag leaves, penultimate leaves, and antepenultimate leaves of OD and WT plants. Statistical analysis indicated that the leaves of OD plants were remarkably shorter than those of WT plants (**Figure S2**).

#### Overexpression of OsDof12 Altered BR Sensitivity

Brassinosteroids (BRs) play pivotal roles in regulating plant growth and development (Müssig and Altmann, 2001; Fujioka and Yokota, 2003; Salas Fernandez et al., 2009; Tong and Chu, 2012; Wang et al., 2012; Zhu et al., 2013). By now, many BR metabolic and signaling-related genes have been well characterized (Yamamuro et al., 2000; Hong et al., 2002, 2003, 2005; Bai et al., 2007; Tanaka et al., 2009; Tong et al., 2009, 2012; Li et al., 2009a; Sakamoto et al., 2011; Thornton et al., 2011; Zhang et al., 2012). Notably, almost all of the corresponding mutants or misexpressors display abnormal plant height and leaf inclination phenotypes. Considering that the plants overexpressing OsDof12 exhibit reduced plant height and erect leaf morphology, we assumed that OsDof12 might be involved in BR metabolism or signal transduction. To test this hypothesis, we designed and conducted a series of experiments to evaluate the BR response of OD2 plants.

Firstly, we measured the endogenous BR levels by quantifying the CS content in WT and OD2 plants, respectively. As a matter of fact, no obvious difference in BR levels between WT and OD plants was observed (**Table S1**), suggesting that OsDof12 overexpression may not affect the metabolism of BRs, which was further confirmed by expression level analysis on BR metabolism related genes (**Figure S3**).

Then, we performed a lamina joint bending assay by applying BL to the plants. Notably, the lamina joints of OD plants were remarkably less enlarged than those of WT when both were subjected to a mock treatment without BL (**Figure 4A**). When treated with increasing doses of BL, the lamina joint angles of both OD and WT plants became larger, but the lamina inclination curve of OD ascended much more slowly than that of WT (**Figure 4B**), indicating that overexpression of OsDof12 may cause defects in the BR signal transduction pathway that impair leaf bending.

Besides the lamina joint abnormality, overexpression of OsDof12 also affected other aspects of BR-related morphology. For instance, the second leaf sheath of OD plants was clearly shorter than that of WT plants. In WT plants, treatment with 1 nM BL slightly but not significantly increased the second leaf sheath length relative to the mock treatment, whereas 10 nM or higher concentrations of BL resulted in evidently shorter second leaf sheaths (**Figure 5A**), suggesting that relatively high levels of exogenous BR inhibit the elongation of the leaf sheath. In contrast, BL at the various concentrations hardly altered the elongation of the second leaf sheath in OD (**Figure 5A**), indicating that the elongation of the second leaf sheath in OD was hyposensitive to exogenous BR. A similar pattern was observed for the second leaf length and seedling height. Both were clearly shorter in OD plants than in WT. Relatively high concentrations of BL evidently decreased the elongation of the second leaf and the seedling height in WT plants (**Figures 5B,C**), whereas BL at the various concentrations hardly affected either trait in OD plants (**Figures 5B,C**), indicating that OD plants may have lower sensitivity to exogenous BR.

#### Elongation of the Coleoptiles and Root in Response to BL

Coleoptile length and root elongation are also used to evaluate the BR sensitivity of plants (Yamamuro et al., 2000). Thus, we measured the coleoptile length of 8-day-old seedlings. As shown in **Figure 5D**, the coleoptile elongation of WT was promoted by BL treatment in a dose-dependent manner. However, the coleoptile elongation of OD plants showed no difference between BL and mock treatments.

OD and WT plants shared a similar root growth pattern when grown on medium containing 0 nM or 1 nM BL, with 1 nM BL slightly increasing root elongation (**Figure 5E**). When treated with 10 nM or 100 nM BL, the WT plants produced clearly shorter roots. In OD plants, however, 10 nM BL did not obviously reduce root elongation, and only 100 nM BL inhibited it (**Figure 5E**). These results also suggest that overexpression of OsDof12 may result in lower BR sensitivity.

#### Skotomorphogenic Phenotypes of OsDof12 Overexpressing Plants

Mesocotyl and internode elongation in darkness have been used as good criteria to determine whether a dwarf phenotype is related to BR (Hong et al., 2002, 2003). To test whether overexpression of OsDof12 affects the elongation of the mesocotyl and internodes, we grew seeds on half-strength MS in complete darkness. Two weeks later, we observed the phenotypes of WT and OD plants and found that the elongation of the mesocotyl and first internode in OD plants was similar to that in WT plants (**Figures 6A–C**); however, the second internode in OD plants was much shorter than that in WT plants (**Figures 6A,D**), indicating that overexpression of OsDof12 specifically inhibits the elongation of the second internode in darkness.

#### Expression Patterns of BR Signaling Related Genes Were Changed in Transgenic Plants Overexpressing OsDof12

As demonstrated above, OsDof12 may be a regulator for maintaining normal BR signaling. We then tested whether OsDof12 responds to BR treatment. As shown in **Figure 7A**, OsDof12 transcripts rapidly accumulated within 1 h of BL treatment; although the transcript level gradually declined from 2 to 4 h, it remained higher than that before treatment, and it peaked at 8 h after treatment. This result suggests that OsDof12 could be induced by exogenous BR, which indicates that OsDof12 might be involved in BR regulation.

Using qRT-PCR, we further analyzed the expression level of six major BR signaling components. As shown in **Figure 7B**, OsBRI1 and OsBZR1 were down-regulated, while OsBU1, OsLIC, and DLT were not affected, suggesting that OsDof12 is possibly involved in the BR signaling pathway.

[Figure legend fragment: significance at *P* < 0.05 and *P* < 0.01 was determined by Student's *t*-test.]
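The figure legends in this article report significance by Student's *t*-test. As a reminder of what that test computes, the sketch below calculates the pooled-variance two-sample t statistic and its degrees of freedom using only the standard library. The replicate values are hypothetical, and the step from the statistic to a P value (via the t distribution) is left out.

```python
import math
from statistics import mean, variance


def student_t(sample_a, sample_b):
    """Pooled-variance (Student's) two-sample t statistic and its df."""
    na, nb = len(sample_a), len(sample_b)
    df = na + nb - 2
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((na - 1) * variance(sample_a)
                  + (nb - 1) * variance(sample_b)) / df
    se = math.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (mean(sample_a) - mean(sample_b)) / se, df


# Hypothetical relative-expression replicates, for illustration only.
wt = [1.00, 1.10, 0.90, 1.05, 0.95]
od = [0.60, 0.55, 0.70, 0.65, 0.50]
t_stat, df = student_t(wt, od)
print(t_stat, df)
```

The pooled form assumes the two groups share a common variance; when that is doubtful, Welch's variant is the usual alternative.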

## Discussion

Dof genes are known to be involved in various processes at different developmental stages in plants (Yanagisawa, 2002; Noguero et al., 2013). In our previous study, we focused on the promotion of flowering by OsDof12 overexpression in rice under long-day conditions. In the present study, we demonstrate that overexpression of OsDof12 can alter rice plant architecture. Further evidence indicates that OsDof12 may participate in BR signaling to modulate rice architecture. Together, these results show that OsDof12 is a pleiotropic regulator of plant growth and development.

Higher plants have developed complex metabolic mechanisms, including biosynthetic and catabolic pathways, to maintain BR homeostasis for normal growth and development (Tanaka et al., 2005; Vriet et al., 2013). The abnormal phenotypes of OD plants are similar to those of BR-deficient mutants. Therefore, we investigated the possibility that OsDof12 might be involved in BR metabolism. Measurement of the endogenous BR levels indicated that the BR content in plants overexpressing OsDof12 was not affected. We also quantified the expression levels of BR metabolic genes, such as D2, D11, OsDWARF, OsDWARF1, OsDWARF4, and CYP734As, and found no obvious differences between OD and WT plants. These results excluded the possibility that OsDof12 might modulate plant architecture by affecting the BR metabolism pathway.

Notably, a range of BR response tests on OsDof12 overexpression plants, including the lamina joint assay, analysis of sheath, root, and coleoptile elongation patterns, and skotomorphogenesis analysis, all suggest that overexpression of OsDof12 reduces BR sensitivity. In addition, OsDof12 expression was induced by exogenous BL treatment, and the expression of two BR signaling components was suppressed in OsDof12 overexpression plants, which further suggests that OsDof12 is involved in coordinating rice BR signaling.

Rice plant architecture is composed of tiller number, internode elongation, leaf angle, and panicle structure, and a favorable architecture can help improve yield (Wang and Li, 2006, 2008; Yang and Hwa, 2008). Plant height is one of the most important agronomic traits (Sakamoto and Matsuoka, 2004; Wang and Li, 2008). Rice dwarf phenotypes have been categorized into several patterns (Yamamuro et al., 2000), and BR-associated mutants usually display dm-type dwarfism, in which the second internode is specifically shortened. The OsDof12 overexpression lines, OD2 and OD5, both had shortened internodes, among which the relative length of the second internode was significantly reduced; we therefore speculate that they belong to the dm-type dwarfs. Our microscopic observation of the elongation zone showed that, unlike in the two BR mutants d61-2 (Yamamuro et al., 2000) and dlt (Tong et al., 2009), in which longitudinal cell elongation is severely hampered, cell length in OD plants is not affected, which might indicate that a decrease in cell number is the major cause of the dwarfism in OD. These results suggest that the regulatory mechanisms underlying cell elongation and plant height are quite complicated, and the detailed mechanism by which OsDof12 regulates plant height should be studied further.

An erect leaf pattern is another desirable trait for ideal plant architecture. In dense plantings, erect leaves generally allow more light to penetrate the upper leaf layer, so that the lower leaf layer captures more light for photosynthesis and assimilation (Sinclair and Sheehy, 1999). Indeed, several studies have shown that erect leaves can improve yield (Morinaka et al., 2006; Sakamoto et al., 2006). For example, the weakest allele of OsBRI1 (d61-7), the counterpart of the Arabidopsis BR receptor BRI1, produced erect leaves and generated greater biomass under high-density planting conditions (Morinaka et al., 2006). The erect leaf trait of OsDof12 overexpression lines is favorable for ideal plant architecture. However, similar to d61-7 (Morinaka et al., 2006), OsDof12 overexpression lines produce fewer spikelets, at least under normal planting density. Therefore, whether the harvest index of OsDof12 overexpression lines increases under higher planting densities needs to be investigated further.

Overexpression of OsDof12 leads to shorter leaves and lower grain yield under normal planting density, which is unfavorable for use in plant breeding. Conversely, whether plants with reduced OsDof12 expression produce longer leaves and higher grain yield remains to be validated. We have made attempts in this direction but failed to obtain OsDof12-suppressed lines with phenotypes clearly distinguishable from WT (Li et al., 2009b). We speculate that this might result from functional redundancy among rice Dof genes; functional redundancy among Dofs has been demonstrated in Arabidopsis (Ahmad et al., 2013). Therefore, in future studies, we will try to better understand the full function of OsDof12 by using CRISPR/Cas9 (Belhaj et al., 2015) and Chimeric Repressor gene Silencing Technology (CRES-T) (Mitsuda et al., 2011).

In summary, we show that OsDof12 is involved in coordinating BR signaling in rice and that its overexpression affects rice plant architecture. These findings imply that OsDof12 might be a potential genetic module for future rice breeding strategies.

#### Author Contributions

SL and LZ conceived and designed the experiments. QW and DaL performed most of the experiments. QW, DaL, SL, and LZ wrote the manuscript. DeL and XZ performed phenotype observation and statistical analysis. XuL and XiL performed microscopic observation. All authors have read and approved the manuscript.

#### Acknowledgments

This work was supported by the grants from the National Natural Science Foundation of China (31471475), Ministry of Agriculture of China (2014ZX08009-001), and the State Key Laboratory of Plant Genomics, China (2015B0129-03). We thank Mr. Yanbao Tian (Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, China) for technical assistance with SEM observation, and we also thank Dr. Yi Xu (Rutgers University, USA) for critical reading on the manuscript.

#### Supplementary Material

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2015.00833

Figure S1 | Comparison of the grain size between wild type and OD lines. (A) There was no obvious difference between WT and *OsDof12* overexpression lines in grain size. Bar = 1.5 mm. (B) Statistical data of 1000-grain weight. Values are mean ± sd, *n* = 1000.

Figure S2 | Morphology and statistical analysis of the leaf length. (A) Gross morphology of flag leaves, penultimate leaves and antepenultimate leaves. Bar = 8 cm. (B) The flag leaves, penultimate leaves and antepenultimate leaves of *OsDof12* overexpression plants were shorter than those of WT plants. Values are mean ± sd (*n* = 30). Single and double asterisks stand for *P* < 0.05 and *P* < 0.01, respectively, determined by Student's *t*-test.


Figure S3 | Expression analysis of BR metabolism related genes in WT plants and transgenic plants overexpressing *OsDof12*. The expression level of *OsDof12* in OD plants was significantly higher than in WT plants. *D11* and *CYP734A6* were slightly but not obviously down-regulated in OD plants, while expression levels of the other BR metabolism related genes were comparable in both plants. n.d. means not detected. Double asterisks stand for *P* < 0.01 determined by Student's *t*-test.

Table S1 | Quantification of CS in WT and OD2. Means ± SD of two replicates are shown (ng·g<sup>−1</sup> F.W.).

Table S2 | The primers used for qRT-PCR in this study.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2015 Wu, Li, Li, Liu, Zhao, Li, Li and Zhu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Identification of Quantitative Trait Loci and Water Environmental Interactions for Developmental Behaviors of Leaf Greenness in Wheat

Delong Yang<sup>1</sup>\*, Mengfei Li<sup>1</sup>, Yuan Liu<sup>1</sup>, Lei Chang<sup>2</sup>, Hongbo Cheng<sup>1</sup>, Jingjing Chen<sup>1</sup> and Shouxi Chai<sup>2</sup>

<sup>1</sup> Gansu Provincial Key Lab of Aridland Crop Science/School of Life Science and Technology, Gansu Agricultural University, Lanzhou, China, <sup>2</sup> School of Agronomy, Gansu Agricultural University, Lanzhou, China

#### Edited by:

Rodomiro Ortiz, Swedish University of Agricultural Sciences, Sweden

#### Reviewed by:

Abu Hena Mostafa Kamal, University of Texas at Arlington, USA Cándido López-Castañeda, Colegio de Postgraduados, Mexico

> \*Correspondence: Delong Yang yangdl@gsau.edu.cn

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

Received: 19 November 2015 Accepted: 21 February 2016 Published: 08 March 2016

#### Citation:

Yang D, Li M, Liu Y, Chang L, Cheng H, Chen J and Chai S (2016) Identification of Quantitative Trait Loci and Water Environmental Interactions for Developmental Behaviors of Leaf Greenness in Wheat. Front. Plant Sci. 7:273. doi: 10.3389/fpls.2016.00273

The maintenance of leaf greenness in wheat, which strongly affects yield potential and resistance to drought stress, has been shown by traditional genetic analysis to be quantitatively inherited and prone to interaction with the environment. To further dissect the developmental genetic behavior of flag leaf greenness under terminal drought, unconditional and conditional QTL mapping were performed with a mixed linear model in 120 F8-derived recombinant inbred lines (RILs) from a cross between two Chinese common wheat cultivars (Longjian 19 × Q9086) grown in different water environments. A total of 65 additive QTLs (A-QTLs) and 42 pairs of epistatic QTLs (AA-QTLs) were identified, distributed on almost all 21 chromosomes except 5A and explaining from 0.24 to 3.29% of the phenotypic variation. Of these, 22 A-QTLs and 25 pairs of AA-QTLs were common to both mapping methods, whereas the others differed. The expression of these putative QTLs was essentially time- and environment-dependent. Some loci were expressed at two or more stages, but no single QTL was continually active throughout the whole measuring period. More loci were detected in early growth periods, whereas most QTL × water environment interactions (QEIs) occurred in the middle and late periods, where drought stress mostly acted by down-regulating QTL expression. Compared with other genetic components, epistatic effects and additive QEI effects could be predominant in regulating phenotypic variation during the ontogeny of leaf greenness. Several QTL cluster regions suggested tight linkage or pleiotropy in the inheritance of these traits. Some reproducibly expressed QTLs, or loci consistent with previously detected ones, would be useful for the genetic improvement of staygreen types in wheat through MAS, especially in water-deficit environments.

Keywords: Triticum aestivum, leaf greenness, drought stress, developmental genetics, QTL mapping, water environmental interactions

## INTRODUCTION

Wheat (Triticum aestivum L.) is one of the most important food crops in semiarid and arid areas around the world. Because current changes in global climate have increased precipitation variability, with frequent episodes of drought (Trenberth, 2011), wheat production in rainfed regions is strongly constrained by erratic drought stress (Gregersen et al., 2013). In particular, terminal drought occurring during the reproductive phase is responsible for poor grain set and development, which ultimately results in substantial reductions in grain yield (Nawaz et al., 2013). Although the principal explanations for these losses remain complicated, they are critically associated with drought-induced premature senescence of the flag leaf (Verma et al., 2004). Terminal drought is considered to accelerate leaf chlorophyll degradation and thus impede carbon fixation (Guóth et al., 2009) and assimilate remobilization (Gregersen et al., 2008). In this context, wheat genotypes with a functional staygreen characteristic, i.e., delayed leaf senescence, can maintain photosynthetic capacity and a favorable supply of assimilates to the grain for a longer time, ensuring better grain yield (Gong et al., 2005; Christopher et al., 2008; Chen et al., 2010). Therefore, the staygreen attribute of the flag leaf under terminal drought is of great importance in determining wheat yield potential and resistance to drought stress (Biswal and Kohli, 2013; Farooq et al., 2014).

To develop the staygreen trait of the flag leaf as an effective selection criterion in breeding drought-tolerant wheat, much effort has been made to understand the genetic mechanism of the trait in wheat (Kumar et al., 2010; Naruoka et al., 2012; Barakat et al., 2013) and other cereal crops (Yoo et al., 2007; Kassahun et al., 2010; Wang et al., 2012; Emebiri, 2013). As reviewed by Thomas and Ougham (2014), studies using mutants, gene expression profiling, and transgenic plants have shown that this phenotype is polygenic, involving functional genes and transcription factors. Alternatively, polygenes with quantitative effects can be identified by quantitative trait loci (QTL) analysis (Verma et al., 2004; Yoo et al., 2007; Kassahun et al., 2010; Kumar et al., 2010; Wang et al., 2012; Emebiri, 2013). In wheat, these putative QTLs have been mapped on almost all 21 chromosomes, with expression that varies widely across genetic populations and environments (Verma et al., 2004; Yang et al., 2007; Zhang et al., 2009a,b, 2010; Kumar et al., 2010; Li H. et al., 2012; Naruoka et al., 2012; Barakat et al., 2013; Czyczyło-Mysza et al., 2013). Genetic components estimated from segregating progenies of wheat crosses indicated that the leaf staygreen trait was governed by only a few major genes/QTLs with predominantly additive effects (Silva et al., 2000; Verma et al., 2004; Joshi et al., 2007; Kumar et al., 2010). However, most recent studies indicated that the phenotype was under polygenic control with minor additive effects, which varied across environments (Li H. et al., 2012; Naruoka et al., 2012; Barakat et al., 2013; Czyczyło-Mysza et al., 2013). In some cases, epistatic effects (Zhang et al., 2009a,b, 2010; Kumar et al., 2012) or QTL × environment interaction (QEI) effects (Yang et al., 2007; Peleg et al., 2009) were also highlighted in the modulation of its genetic variation. Thus, although flag leaf staygreen in wheat has been confirmed to be quantitatively inherited, few studies have adequately dissected its genetic components and QEI variability within the same experimental system, especially under terminal drought.

On the other hand, leaf staygreen per se is a complex developmental process (Thomas and Ougham, 2014). Statistical analysis has shown that the development of such a quantitative trait occurs through the actions and interactions of polygenes and their environmental interactions, which behave differently at different growth stages (Atchley and Zhu, 1997). Nevertheless, existing QTL information for leaf staygreen in wheat was obtained only at a specific time point, without considering sequential effects due to distinct gene expression at different developmental stages. Such information is inadequate for revealing the real genetic basis of the developmental processes of target quantitative traits (Wang et al., 2010). To dissect the dynamics of QTL expression, unconditional and conditional methods have been proposed (Zhu, 1995; Atchley and Zhu, 1997; Wu et al., 1999). Unconditional analysis is the traditional method for studying developmental behavior; it reveals the cumulative genetic effects from the initial time to time t, rather than the real effects of gene expression during ontogeny. Conditional analysis assesses the net genetic effects on trait development in the period from time (t-1) to time t. Because it is independent of the prior (causal) genetic effects and sensitive to developmental status and environments, conditional analysis is suited to identifying dynamic gene expression and new genetic variation arising in specific developmental periods (Cao et al., 2001; Wu et al., 2010). Recent studies have also verified that conditional effects in early growth periods, as cumulative components, can affect later unconditional effects (Wang et al., 2010; Wu et al., 2010; Li S. et al., 2012). Thus, the combination of unconditional and conditional analyses is employed to identify dynamic expression of developmental QTLs and reveal the comprehensive inheritance of quantitative traits (Wu et al., 2010; Li S. et al., 2012). The strategy has been applied to understanding the genetic basis of crop developmental traits, such as plant height (Wang et al., 2010; Wu et al., 2010), tiller number (Yang et al., 2006; Liu et al., 2010), grain weight (Han et al., 2012; Li S. et al., 2012), grain filling (Takai et al., 2005), and seed quality (Han et al., 2011). All these studies show that the genetic architecture of developmental traits involves time-dependent expression of polygenes through additive, epistatic, and QEI effects. However, few reports so far have used this methodology to analyze QTLs for flag leaf staygreen in wheat under terminal drought stress across diverse water environments.

In this study, we used recombinant inbred lines (RILs) of wheat grown under different water environments to explore the developmental genetic behavior of flag leaf staygreen. Trait performance at multiple observation times during the reproductive period was analyzed with both unconditional and conditional genetic models. The objectives of the work reported here were (1) to identify the QTLs with genetic main effects and QEI effects controlling flag leaf staygreen and (2) to unravel the dynamics of QTL expression during ontogeny, particularly under terminal drought stress. The findings might be valuable for understanding the genetic basis governing leaf staygreen development, and for improving drought tolerance in wheat by marker-assisted selection (MAS).

## MATERIALS AND METHODS

#### Plant Materials

A subset of 120 F8-derived RILs was developed from the cross between two Chinese winter wheat varieties, Longjian 19 and Q9086. The male parent, Longjian 19, is an elite drought-tolerant cultivar widely grown in rainfed areas (300–500 mm annual rainfall) in northwestern China and was released by Gansu Academy of Agricultural Sciences, Lanzhou, China. The female parent, Q9086, is a high-yielding cultivar adapted to relatively well-watered and fertile conditions but prone to premature senescence under terminal drought stress, and was released by Northwest Agriculture & Forestry University, Yangling, China. In addition, the two parents differ significantly in major agronomic and physiological characteristics under terminal drought stress, as described in our previous studies (Yang et al., 2012, 2014; Li et al., 2013; Ma et al., 2014; Hu et al., 2015; Ye et al., 2015).

## Field Trials

The two parents and the RILs were grown at Lanzhou (103°51′ E, 36°04′ N, 1520 m altitude), Gansu, China, in 2012–2014, denoted E1, E2, and E3, respectively. In each year, the experimental fields were subjected to drought stress (DS) and well-watered (WW) treatments. The DS treatment was equivalent to the rainfed condition, with a total of 99.6, 113.5, and 101.8 mm of rainfall during the growing season (from early October in the sowing year to late June in the harvesting year) in E1, E2, and E3, respectively. The WW treatment was irrigated with 750 m<sup>3</sup> ha<sup>−1</sup> of water at each of the pre-overwintering, jointing, and flowering stages. The field layout was a randomized complete block design with three replications. Each plot was 2 m long, with six rows spaced 20 cm apart and approximately 160 plants per row. Field management followed local practice for wheat production.

Leaf greenness, a surrogate measure of leaf chlorophyll content, was monitored using a Minolta Chlorophyll Meter SPAD 502 (Konica Minolta, Japan), which has been used extensively for accurate diagnosis of staygreen characteristics in many crops (Borrell et al., 2001; Jiang et al., 2004; Harris et al., 2007). In this study, SPAD readings were therefore used as a direct assessment of the degree of leaf greenness. In the middle of each plot, the flag leaves of 10 uniformly growing main shoots of each RIL were tagged for assessment of the dynamics of greenness. SPAD readings were taken at the central point of the tagged flag leaves at each measurement time. Starting at the onset of flowering, SPAD values were scored every 6 d until 24 d after flowering, a period that covered the peak grain-filling stages. The five measurements were designated S1, S2, S3, S4, and S5, respectively. For each plot, the mean of the 10 samples was calculated, and the plot means from the three replications were used in the data analysis.

### Data Analysis

According to the developmental genetic theory proposed by Zhu (1995), the actual SPAD data measured at the above-mentioned stages (S1–S5) were defined as unconditional values. Using the program QGA Station, which is based on a mixed model (Yang et al., 2006), conditional SPAD values were generated from the actual SPAD data at every two consecutive measurement times and designated S1|S0, S2|S1, S3|S2, S4|S3, and S5|S4, respectively. For phenotypic values measured at the first time point, the unconditional genetic effects were equivalent to those obtained from conditional analysis.
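The idea of a conditional value can be illustrated with a simplified numerical sketch. The snippet below is not the QGA Station algorithm, which works with mixed-model variance components; it only assumes a phenotype-level analogue in which the part of the SPAD value at time t that is linearly predictable from the value at time t-1 is removed, leaving the net change over the interval. The function name `conditional_values` and the toy SPAD readings are hypothetical.

```python
import numpy as np

def conditional_values(y_prev, y_curr):
    """Phenotype-level analogue of y(t | t-1): y(t) adjusted for y(t-1).

    Assumption: a simple linear adjustment, not the mixed-model procedure
    implemented in QGA Station.
    """
    y_prev = np.asarray(y_prev, dtype=float)
    y_curr = np.asarray(y_curr, dtype=float)
    # Slope of y(t) on y(t-1): sample covariance over sample variance.
    slope = np.cov(y_prev, y_curr, ddof=1)[0, 1] / np.var(y_prev, ddof=1)
    # Remove the part of y(t) predicted by deviations of y(t-1) from its mean.
    return y_curr - slope * (y_prev - y_prev.mean())

# Hypothetical SPAD readings of five lines at two consecutive stages.
s1 = np.array([52.0, 49.5, 55.1, 50.3, 53.8])
s2 = np.array([48.2, 44.0, 52.5, 47.9, 49.1])
s2_given_s1 = conditional_values(s1, s2)
```

By construction, the adjusted values keep the mean of the later stage and are uncorrelated with the earlier stage, which is what allows a conditional analysis to isolate variation arising within the interval.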

Basic statistics and Pearson phenotypic correlations between the traits were computed with SAS software (SAS Institute, 1996). The broad-sense heritability (*h*<sup>2</sup><sub>B</sub>) of flag leaf greenness was estimated with the method proposed by Toker (2004). To dissect the quantitative genetic basis of the developmental behavior of post-anthesis flag leaf greenness, the unconditional and conditional phenotypic data at five growth stages were subjected to QTL analysis. A genetic linkage map consisting of 524 simple sequence repeat (SSR) marker loci mapped on 21 chromosomes was available. The map covered 2266.7 cM, with an average distance of 4.3 cM between adjacent markers (Hu et al., 2015; Ye et al., 2015). QTL analysis was implemented by mixed linear model mapping (Wang et al., 1999), using the Windows version of the computer program QTLNetwork-2.0 (Yang et al., 2008). The genetic model partitions genetic effects into additive effects (A), epistatic effects (AA), and QEI effects (AE and AAE). An experiment-wise type I error of 0.05 was used for candidate interval selection and putative QTL detection. The critical F-value for declaring putative QTLs and controlling the genome-wise type I error was determined by 1000 permutation tests. Both the testing and filtration windows were set at 10 cM, with a walk speed of 2 cM. QTLs were named according to the rule 'QTL + trait + research department + chromosome'.
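The permutation step can be sketched as follows. This is a minimal illustration, assuming a single-marker regression scan rather than the interval-based mixed linear model used by QTLNetwork-2.0; the function names, the ±1 marker coding, and the simulated data are all hypothetical. Phenotypes are shuffled relative to genotypes, the largest F-statistic across the genome is recorded for each shuffle, and the (1 − α) quantile of these maxima serves as the genome-wise critical value.

```python
import numpy as np

def max_f_statistic(genotypes, phenotype):
    """Largest single-marker regression F-statistic across all markers."""
    g = genotypes - genotypes.mean(axis=0)
    y = phenotype - phenotype.mean()
    # Pearson correlation between each marker column and the phenotype.
    r = (g * y[:, None]).sum(axis=0) / np.sqrt((g ** 2).sum(axis=0) * (y ** 2).sum())
    n = len(y)
    # F-statistic of a simple linear regression with 1 and n - 2 d.f.
    f = r ** 2 * (n - 2) / (1.0 - r ** 2)
    return f.max()

def permutation_threshold(genotypes, phenotype, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wise critical F-value from the null distribution of maxima."""
    rng = np.random.default_rng(seed)
    maxima = np.array([
        max_f_statistic(genotypes, rng.permutation(phenotype))
        for _ in range(n_perm)
    ])
    return float(np.quantile(maxima, 1.0 - alpha))

# Simulated example: 120 lines, 50 markers coded -1/+1, no true QTL.
rng = np.random.default_rng(42)
genotypes = rng.choice([-1.0, 1.0], size=(120, 50))
phenotype = rng.normal(loc=50.0, scale=3.0, size=120)
threshold = permutation_threshold(genotypes, phenotype, n_perm=200)
```

Taking the maximum over all markers in each shuffle is what makes the threshold genome-wise: it accounts for the many tests performed along the map rather than for a single test.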

## RESULTS

## Phenotypic Variation and Trait Correlation

The genotypic means for SPAD values, as a measure of flag leaf greenness, in all treatments are summarized in **Table 1**. The phenotypic values of both the RILs and the two parents declined progressively as growth advanced across water environments. Moreover, phenotypic values in the DS were significantly lower than those in the WW. The parents, Longjian 19 and Q9086, differed at consecutive stages in their response to the water regimes. Under the WW, the phenotypic values of Q9086 at most stages were higher than those of Longjian 19, whereas under the DS the reverse was true and the difference was much more noticeable. This suggested that Longjian 19 has a stronger staygreen capacity to withstand drought stress. Across all measuring stages, the mean values of the RILs showed a consistent reduction and were intermediate between those of the two parents. High phenotypic variability was found in the population, with coefficients of variation (CV) ranging from 6.31 to 27.00% in the DS and from 4.91 to 15.76% in the WW, depending on stage and water environment. Some lines had more extreme values than the parents, showing substantial transgressive segregation. All skewness and kurtosis values were less than 1.0 in all treatments, suggesting continuous distributions and a quantitative basis.

TABLE 1 | Phenotypic performance for the greenness (SPAD values) of flag leaf of the parents and RILs at five growth stages in different water environments.

The numbers at the left of the slash ("/") are the phenotypic values of traits identified in the drought stress, and the numbers at the right indicate the well-watered condition. <sup>a</sup>E1–E3 represent the location at Lanzhou (103°51′ E, 36°04′ N, 1520 m altitude), Gansu, China, in 2012–2014, respectively. <sup>b</sup>S1–S5 indicate the first to the fifth measuring stage, respectively. <sup>c</sup>CV(%) means the coefficient of variation.
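The descriptive statistics quoted for the RIL population (CV, skewness, kurtosis) can be computed as in the sketch below. The source does not state which estimators SAS applied, so simple moment-based definitions are assumed here, and the SPAD values are invented for illustration.

```python
import numpy as np

def describe(values):
    """Mean, coefficient of variation (%), skewness, and excess kurtosis."""
    x = np.asarray(values, dtype=float)
    mean = x.mean()
    # CV: sample standard deviation expressed as a percentage of the mean.
    cv = 100.0 * x.std(ddof=1) / mean
    dev = x - mean
    m2 = (dev ** 2).mean()
    # Moment-based skewness and excess kurtosis (both 0 for a normal distribution).
    skewness = (dev ** 3).mean() / m2 ** 1.5
    kurtosis = (dev ** 4).mean() / m2 ** 2 - 3.0
    return {"mean": mean, "cv_percent": cv, "skewness": skewness, "kurtosis": kurtosis}

# Hypothetical SPAD means of eight lines at one stage in one environment.
stats = describe([41.2, 44.8, 39.5, 47.1, 43.0, 45.6, 40.9, 42.7])
```

Absolute skewness and kurtosis values below 1.0 are the criterion the text uses to argue that the trait is approximately normally distributed and therefore suitable for QTL analysis.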

Results of the variance component analysis (**Table 2**) showed that all variances for both unconditional and conditional values in the RILs were significant at the 0.05 or 0.01 level, except for the genotype × environment and genotype × environment × water interaction variances. The dominant source of variation for unconditional and conditional traits was the water regime, which accounted for 83.82–93.26% and 83.82–95.42% of the total variation, respectively. This indicated that water environments exert a considerable influence on the ontogeny of flag leaf greenness. Despite the substantial variation, the *h*<sup>2</sup><sub>B</sub> estimates were reasonable for both unconditional and conditional values across environments, ranging from 0.34 to 0.64 and from 0.31 to 0.64, respectively, and declined gradually with developmental stage.
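The source does not reproduce the formula from Toker (2004). As a hedged illustration only, one commonly used entry-mean formulation of broad-sense heritability for multi-environment trials divides the genotypic variance by the phenotypic variance of a line mean. Whether this matches the estimator used here is an assumption, and the variance components below are invented for the example.

```python
def broad_sense_heritability(var_g, var_ge, var_error, n_env, n_rep):
    """Entry-mean heritability: var_g / (var_g + var_ge/e + var_error/(r*e)).

    Assumed formulation, not necessarily the one from Toker (2004).
    """
    phenotypic = var_g + var_ge / n_env + var_error / (n_rep * n_env)
    return var_g / phenotypic

# Hypothetical variance components, three environments, three replications:
# 4 / (4 + 3/3 + 9/9) = 4 / 6
h2 = broad_sense_heritability(var_g=4.0, var_ge=3.0, var_error=9.0, n_env=3, n_rep=3)
```

Under this formulation, heritability falls when the error or interaction variance grows relative to the genotypic variance, which is one possible reading of the decline in estimates at later stages.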

Correlations between different growth stages, based on both unconditional and conditional SPAD values in the RILs across water environments, are given in **Table 3**. All correlations were positive, with coefficients varying widely from 0.13 to 0.98∗∗ across environments. However, the correlation coefficients clearly decreased as growth progressed and were weak at the last stage. For most stages across water environments, unconditional correlations (r<sup>2</sup> = 0.21–0.98∗∗) were more significant and substantially higher than conditional ones (r<sup>2</sup> = 0.13–0.51∗∗). On the other hand, as shown in **Table 4**, most correlations between unconditional and conditional data were positive but rather low, with the exception of significant correlations between the conditional period S1|S0 and all unconditional stages, and between conditional periods and their corresponding unconditional stages. It was inferred that the ontogeny of flag leaf greenness is highly dynamic and environmentally influenced.
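A stage-by-stage correlation matrix of the kind summarized in Table 3 can be obtained as sketched below. The data are simulated under the hypothetical assumption that each stage carries over most of the previous stage plus new variation; they are not the values from this study, which were computed in SAS.

```python
import numpy as np

rng = np.random.default_rng(0)
n_lines = 120

# Hypothetical SPAD values: each stage inherits part of the previous one.
stages = [rng.normal(50.0, 3.0, n_lines)]
for _ in range(4):
    stages.append(0.9 * stages[-1] + rng.normal(0.0, 2.0, n_lines))
spad = np.column_stack(stages)          # shape (120, 5): columns S1..S5

# Pearson correlations between stages; rowvar=False treats columns as variables.
corr = np.corrcoef(spad, rowvar=False)  # 5 x 5 symmetric matrix
```

Applying the same call to conditional values instead of raw values would give the conditional half of such a table.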

### Unconditional QTL Analysis for Leaf Greenness Development

A total of 50 additive QTLs (A-QTLs) with significant A effects and/or AE effects were detected from unconditional SPAD data at different growth stages across water environments (**Table 5**). These loci were mapped on almost all chromosomes except 1D–4D, 5A, and 6D (**Figure 1**), individually explaining from 0.24 to 1.80% of the phenotypic variation. Half of them had positive A effects of 0.28∗ to 0.94∗∗∗, with favorable alleles from Q9086, whereas the others had negative A effects of −0.26∗ to −1.10∗∗∗, with favorable alleles from Longjian 19. A majority of the putative A-QTLs (34 of 50) were identified at only one specific stage, while the remaining 16 loci were detectable at two to four stages, implying that no QTL was continually active during the whole period of growth. For example, three of them, Qspad.acs-1B.1, Qspad.acs-2A.3, and Qspad.acs-5B.2, were detected at four stages from S1 to S4. Two A-QTLs, Qspad.acs-1A.2 and Qspad.acs-6A.2, were detected at three stages, from S1 to S3 and from S2 to S4, respectively. The other 11 were associated with just two stages. Interestingly, the A-QTLs expressed at more than one stage always had A effects from the same parental source, whereas the effect values of each QTL progressively decreased with advancing stages. The number of A-QTLs per stage changed from 11 at S1 to 24 at S3, and finally to 5 at S5. This suggested that the expression of A-QTLs governing leaf greenness was highly modulated by developmental stage and was more active in early periods. In addition, 27 A-QTLs showed significant QEI effects, explaining from 0.87 to 5.42% of the phenotypic variation (**Table 5**), implying their genetic sensitivity to environments. Of these additive QEIs (A-QEIs), about 90% were detected in the period after S2, and few were detected at S1. During the period from S1 to S3, nine A-QTLs had negative AE effects (−0.25∗ to −1.03∗∗∗) in the WW but positive effects (0.65∗∗∗ to 1.12∗∗∗) in the DS. This indicated that the AE effects of loci expressed during the early period could be up-regulated by the DS. However, the opposite was true for most of the remaining loci expressed after S3, which had negative AE effects (−0.23∗ to −0.87∗∗∗) in the DS but positive effects (0.22∗ to 0.85∗∗∗) in the WW, suggesting that they were up-regulated by the WW in the later period.

TABLE 2 | Analyses of variance (ANOVA) for the greenness of flag leaf in RILs at five growth stages.

The numbers at the left of the slash ("/") are mean squares or heritability for the unconditional phenotype of RILs, and the numbers at the right indicate the conditional phenotype of RILs. \*P ≤ 0.05 and \*\*P ≤ 0.01. S1–S5 are as shown in Table 1. S|S-1 indicates the time interval from S-1 to S.

TABLE 3 | Correlation coefficients of the greenness of flag leaf in RILs between different growth stages in different water environments.

The numbers at the left of the slash ("/") are correlation coefficients for the unconditional phenotype of RILs, and the numbers at the right indicate the conditional phenotype of RILs. \*P ≤ 0.05 and \*\*P ≤ 0.01. Numbers in the upper right segment apply to the drought stress; those at the lower left are for the well-watered condition. E1–E3 and S1–S5 are as shown in Table 1, and S1|S0 to S5|S4 are as shown in Table 2.

A total of 38 pairs of epistatic QTLs (AA-QTLs) with significant AA effects were detected for flag leaf greenness, accounting for 1.08–3.29% of the phenotypic variation depending on stage and environment (Table S1). The loci involved in epistasis were distributed on all chromosomes apart from 1B and 5A. Among them, 13 pairs had positive AA effects (0.82∗∗∗ to 1.32∗∗∗), indicating that parent-type effects were higher than recombinant-type effects, and 23 pairs had negative effects (−0.42∗∗∗ to −1.51∗∗∗), indicating that recombinant-type effects were higher than parent-type effects. The remaining two pairs changed effect direction with stage: one pair had positive effects (0.78∗∗∗) at S4 but negative effects (−0.45∗∗∗) at S5, and the other had positive effects (1.34∗∗∗) at S1 but negative effects (−0.93∗∗∗ and −0.75∗∗∗) at S2 and S3, respectively. Similar to the A-QTLs, the putative AA-QTLs were also expressed dynamically and selectively. For example, the number of significant AA-QTLs per stage decreased from 24 at S1 to 2 at S5. Moreover, 20 pairs were detected at only one specific stage, while the other pairs were mostly identified at two or three early consecutive stages (S1 to S2 or S3), except for three pairs detected after S3. Thus, epistatic effects on the ontogeny of leaf greenness were short-lived, with AA effects progressively decreasing and few new effects appearing in the later period. Apart from AA effects, significant AAE effects were also detected for 24 AA-QTLs, explaining 0.82–2.54% of the phenotypic variation. Over 76% of these epistatic QEIs (E-QEIs) occurred in the period after S2. Of the 24 pairs, just two had positive AAE effects (0.66∗∗∗ and 0.52∗∗∗, respectively) in the DS. The other 22 pairs had negative AAE effects (−0.33∗∗ to −0.95∗∗∗) in the DS but positive effects (0.31∗∗ to 0.94∗∗∗) in the WW. This indicated that AAE effects were mostly up-regulated by the WW.

The numbers at the left of the slash ("/") are correlation coefficients estimated in the drought stress, and the numbers at the right indicate the well-watered condition. \*P ≤ 0.05 and \*\*P ≤ 0.01. E1–E3 and S1–S5 are as shown in Table 1, and S1|S0–S5|S4 are as shown in Table 2. Underlined values highlight the higher and more significant correlation coefficients.

Concerning the source of epistatic loci, only 10 significant A-QTLs participated in epistatic interactions and therefore exhibited pleiotropic functions. However, most epistatic interactions (nearly 90%) involved non-individual QTLs, i.e., loci involved in epistasis without any significant A effects of their own. These loci even formed QTL-interaction networks at different interaction levels (**Figure 2**) to produce different AA effects. For instance, five A-QTLs interacting with 16 non-individual loci formed seven smaller networks of three-locus interactions. The remaining 13 non-individual QTLs made up three relatively larger networks of four- or five-locus interactions. Almost 60% of the interactions exhibited negative AA effects of unequal magnitude at one to three stages. This further suggested that the genetic control of flag leaf greenness development is complex and, to a certain extent, operates through QTL networks.
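The sign convention for AA effects used above (positive when parent-type combinations exceed recombinant-type combinations) can be checked with a small worked example. The sketch assumes the standard two-locus additive × additive parametrization for homozygous lines; the allele coding and all effect sizes are hypothetical and are not estimates from this study.

```python
from itertools import product

def genotypic_value(x1, x2, mu=45.0, a1=0.5, a2=-0.4, aa=0.9):
    """Two-locus additive x additive model for a homozygous line.

    x1, x2 are +1 for the Q9086 allele and -1 for the Longjian 19 allele
    (an assumed coding, chosen to match the sign convention of the A effects).
    """
    return mu + a1 * x1 + a2 * x2 + aa * x1 * x2

values = {(x1, x2): genotypic_value(x1, x2) for x1, x2 in product((1, -1), repeat=2)}

# Parent types carry both alleles from the same parent; recombinant types mix them.
parental = (values[(1, 1)] + values[(-1, -1)]) / 2
recombinant = (values[(1, -1)] + values[(-1, 1)]) / 2
difference = parental - recombinant  # equals 2 * aa, so its sign follows aa
```

The additive terms cancel within each class, so the parental-minus-recombinant contrast depends only on the AA effect; a negative `aa` would reverse the ordering.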

## Conditional QTL Analysis for Leaf Greenness Development

Based on conditional mapping, a total of 37 A-QTLs with significant A effects and/or QEI effects were identified for the ontogeny of flag leaf greenness across water environments (**Table 6**). These loci were distributed on nearly the same chromosomes as the unconditional ones (**Figure 1**), individually explaining from 0.36 to 1.29% of the phenotypic variation. Nearly half of the favorable alleles were derived from Q9086, with significantly positive A effects (0.30∗∗ to 0.94∗∗∗), whereas the other half came from Longjian 19, with significantly negative effects (−0.31∗∗ to −0.88∗∗∗). This result was consistent with the finding from unconditional mapping, further confirming that favorable alleles were evenly dispersed between the two parents. All conditional A-QTLs were detected in just one specific period. The number of QTLs detected in each period gradually decreased from 11 at S1 to 2 at S5. This indicated that genes governing leaf greenness were expressed selectively, and more so in the early developmental period. In addition, 18 loci were associated with A-QEIs, individually accounting for 0.82–3.80% of the phenotypic variation. Most of them (nearly 85%) occurred in the periods from S2|S1 to S5|S4, suggesting that net gene expression in the period S1|S0 was less influenced by water environments, whereas it was strongly influenced thereafter. In this regard, all A-QTLs interacting with the DS showed significantly negative AE effects (−0.32∗∗ to −0.79∗∗∗) in different periods across environments. By contrast, most of those interacting with the WW across environments exhibited significantly positive AE effects (0.25∗ to 0.80∗∗∗), with the exception of two QTLs, Qspad.acs-2A.1 and Qspad.acs-5B.2, which showed significantly negative AE effects (−0.25∗ to −0.99∗∗∗) in S1|S0. This indicated that the net expression of a locus in a specific period was sensitive to water supply, with alternative effect directions, and was generally up-regulated by the WW.

TABLE 5 | Unconditional additive and interacting effects of QTL × water environment of identified QTLs for the greenness of flag leaf.

<sup>a</sup>S1–S5 are as shown in Table 1. <sup>b</sup>A, the additive effect. A positive value indicates the genetic effect from Q9086 allele, and a negative value represents the genetic effect from Longjian 19 allele; \*P ≤ 0.01, \*\*P ≤ 0.005, and \*\*\*P ≤ 0.001; H<sup>2</sup> (A) (%) indicates the proportion of phenotypic variance explained by additive QTL. <sup>c</sup>AE, the additive QTL × environment interaction effects in drought stress (DS) and the well-watered (WW) conditions in E1–E3 shown in Table 1. H<sup>2</sup> (AE)(%) indicates the phenotypic variance explained by additive QTL × environment interaction.


<sup>a</sup>S1|S0–S5|S4 are as shown in Table 2. <sup>b</sup>A, the additive effect. A positive value indicates the genetic effect from Q9086 allele, and a negative value represents the genetic effect from Longjian 19 allele; \*P ≤ 0.01, \*\*P ≤ 0.005, and \*\*\*P ≤ 0.001; H<sup>2</sup> (A) (%) indicates the proportion of phenotypic variance explained by additive QTL. <sup>c</sup>AE, the additive QTL × environment interaction effects in drought stress (DS) and the well-watered (WW) conditions in E1–E3 shown in Table 1. H<sup>2</sup> (AE)(%) indicates the phenotypic variance explained by additive QTL × environment interaction.

A total of 29 epistatic pairs were mapped on nearly the same chromosomes as the unconditional loci, except chromosome 1D, accounting for 0.66 to 3.29% of the phenotypic variation (Table S2). Of these, 10 pairs showed significantly positive AA effects (0.72∗∗∗ to 1.32∗∗∗), whereas the other 19 pairs showed significantly negative effects (−0.58∗∗∗ to −1.51∗∗∗). Apart from one pair detected in two consecutive periods (S2|S1 and S3|S2), the other 28 pairs were each identified in a single period, mostly in S1|S0 (25 pairs). This further confirmed that epistatic effects were short-lived but highly predominant in the early period. In addition, 14 pairs were involved in E-QEIs, explaining 0.86 to 4.08% of the phenotypic variation. Nearly 43% of them occurred in S1|S0; the remainder occurred in other periods. Although the expression patterns of AA-QTLs differed from one period to another in response to water environments, all of them exhibited negative effects (−0.54∗∗∗ to −0.95∗∗∗) in the DS but positive effects (0.52∗∗∗ to 0.98∗∗∗) in the WW. This indicated that the expression of conditional AA-QTLs was also enhanced by the WW, in accordance with the results of the unconditional analysis.

As for the composition of conditional epistatic loci, most epistatic interactions involved non-individual QTLs, along with seven significant A-QTLs that took part in epistasis. QTL networks therefore formed at different interaction levels (**Figure 3**), but they were simpler than those of the unconditional AA-QTLs. For example, two significant A-QTLs and 10 non-individual QTLs participated in four three-locus interaction networks. One A-QTL and eight other non-individual QTLs were involved in two larger networks of four- or five-locus interactions. In these QTL networks, most epistatic interactions (nearly 65%) had positive AA effects and more occurred in S1|S0, indicating a predominance of parent-type effects and of time-independent expression in the first period.

## Comparative Analysis between Conditional and Unconditional QTLs

Following the above mapping results, 22 A-QTLs (**Tables 5**, **6**) and 25 pairs of AA-QTLs (Tables S1, S2) were common to the unconditional and conditional mapping strategies across the five measuring stages. Of these common loci, 11 A-QTLs and 24 AA-QTLs were detected at the first stage, where both the genetic effects and the contribution rates to the phenotypic variation of the unconditional loci were exactly equal to those of the conditional ones. However, the remaining loci, expressed at one or more stages in response to specific stages and water environments, differed markedly in A and AE effects, as well as in contribution rates, between the two mapping strategies. This suggested that common loci also behaved differently in the inheritance of leaf greenness, with cumulative or net genetic effects strongly modulated by developmental course and water environment. In contrast to these common loci, 28 A-QTLs and 13 pairs of AA-QTLs were detected only by unconditional mapping, whereas 15 A-QTLs and 4 pairs of AA-QTLs were identified only by conditional mapping. Therefore, combining the two mapping strategies based on time-dependent evaluation made more loci detectable and further exposed the dynamic expression of the polygenes governing the ontogeny of leaf greenness. Furthermore, an interesting feature was that both common and specifically expressed A-QTLs detected by the two mapping strategies tended to cluster in specific neighboring marker intervals on several chromosomes (**Figure 1**). For example, two to seven loci shared neighboring intervals flanked by markers Xgwm357 to Xwmc304 on chromosome 1A, Xwmc85 to Xgwm374 on chromosome 1B, Xwmc296 to Xmag1730 on chromosome 2A, and so on. This indicated that specific marker intervals might carry a wealth of genetic information for the ontogeny of leaf greenness.
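The locus counts in this comparison can be cross-checked with simple arithmetic. The sketch below is only a bookkeeping aid under the assumption that "common", "unconditional only", and "conditional only" are disjoint classes; the dictionary keys and the `summarize` helper are illustrative names, not part of the original analysis.

```python
# Locus counts as reported in the text (A-QTLs and pairs of AA-QTLs).
# Assumption: the three classes are mutually exclusive.
counts = {
    "A-QTL": {"common": 22, "unconditional_only": 28, "conditional_only": 15},
    "AA-QTL pair": {"common": 25, "unconditional_only": 13, "conditional_only": 4},
}


def summarize(c):
    """Return per-strategy and overall totals for one class of loci."""
    return {
        "unconditional": c["common"] + c["unconditional_only"],
        "conditional": c["common"] + c["conditional_only"],
        "total": c["common"] + c["unconditional_only"] + c["conditional_only"],
    }


summary = {name: summarize(c) for name, c in counts.items()}
for name, s in summary.items():
    print(name, s)
```

Under that assumption, the conditional totals come out as 37 A-QTLs and 29 AA-QTL pairs, which agrees with the numbers given in the conditional analysis, and the overall totals come out as 65 A-QTLs and 42 AA-QTL pairs, which agrees with the totals stated in the Discussion.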

With regard to the general effects of the genetic components from the two mapping analyses, the dynamics of QTL expression were clearly visible over the measured duration of ontogeny (**Figures 4A,B**). The general effects were almost all negative, but their absolute values changed greatly between periods, increasing noticeably up to the third phase and thereafter decreasing to a minimum in the final period. Except for the equal effect values in the first period, the effect values of unconditional QTLs were considerably higher than those of conditional ones. This implies that the genetic regulation of leaf greenness development by cumulative effects was stronger than that by net effects. On the other hand, both unconditional and conditional AAE effects were much higher than the other genetic components in the first period. During the second period, the predominant genetic components were AA effects in unconditional mapping, but AE and AAE effects in conditional mapping. Thereafter, the predominant components tended to be similar between the two mapping strategies, namely AA and/or AE effects. This suggested that, in each developmental period, QTLs governing the ontogeny of leaf greenness were also expressed dynamically, owing to the specific effect strength of each genetic component.

The general contribution rates to the phenotypic variation further illustrated the dynamic characteristics of QTL expression for the ontogeny of leaf greenness (**Figures 4C,D**). The rates declined over the whole measuring duration, although the magnitude of change was smaller for unconditional QTLs than for conditional ones. During the early period (S1–S3), unconditional QTLs showed higher general contribution rates (69.06–75.24%), which then decreased to the lowest value (17.82%) at the last stage. By comparison, the maximum for conditional loci (71.23%) occurred in S1|S0 and then declined sharply to the minimum (3.28%) in S5|S4. This indicated that cumulative genetic effects were maintained longer and more strongly, whereas net genetic effects were weaker and short-lived. With respect to the contribution rates of the individual genetic components, the predominant contributions came mainly from AA effects at S1–S2 and AE effects at S3–S5 in unconditional mapping, and from AA effects in S1|S0 and AE effects in S2|S1 to S5|S4 in conditional mapping. Taken together, both genetic effects and their contribution rates changed dynamically, with a similar trend of progressive reduction over the whole measuring period, and were expressed mainly in the early period. Furthermore, AA or AE effects were highly predominant in determining the developmental genetics of leaf greenness.

## DISCUSSION

According to the theory of developmental genetics, functional genes are expressed dynamically in response to different growth stages. Furthermore, the expression pattern essentially arises through the actions and interactions of polygenes during ontogeny and is also flexibly modified by environments (Atchley and Zhu, 1997; Wu et al., 1999). Thus, the combination of unconditional and conditional mapping has been shown to be an efficient approach to dissect the dynamic expression of developmental QTLs and reveal the inheritance of quantitative traits (Wu et al., 2010; Li S. et al., 2012). The strategy has been successfully applied to developmental QTL analysis for agronomic and physiological traits in many crops (Liu et al., 2010; Wu et al., 2010; Han et al., 2011; Li S. et al., 2012). Although many previous studies have identified a wealth of QTLs for leaf staygreen and its associated traits by unconditional mapping (Verma et al., 2004; Kumar et al., 2010; Li H. et al., 2012; Naruoka et al., 2012; Barakat et al., 2013; Czyczyło-Mysza et al., 2013), genetic information on the ontogeny of leaf staygreen in wheat remains rather obscure because of its complexity. In this study, by employing unconditional and conditional mapping, a total of 65 A-QTLs and 42 pairs of AA-QTLs for the ontogeny of leaf staygreen after flowering were detected across water environments on almost all 21 chromosomes except 5A, individually explaining 0.24 to 3.29% of the phenotypic variation, which is indicative of a typical quantitative trait controlled by minor-effect polygenes. Regardless of cumulative or net genetic effects, all loci showed time-dependent expression. Most of them were associated with a specific developmental stage, and no locus was continually detectable over the measuring time, although a few loci were active in two or more stages. More QTLs were detected in earlier developmental stages, and these showed larger genetic effects (**Figure 4**). The selective expression of QTLs might favor the complicated genetic regulation responsible for ontogeny (Wu et al., 1999, 2010; Li S. et al., 2012). Likewise, the result could explain why the phenotypic values of leaf greenness decreased progressively during the measuring period (**Table 1**). It suggests that the maintenance of leaf greenness might depend greatly on early-expressed QTLs. In addition, some major QTLs could easily be overlooked if they were evaluated only from phenotypic data at a single stage, especially in the later period. Similar findings were also observed for wheat plant height (Wu et al., 2010), soybean pod number (Sun et al., 2006), and rice tiller traits (Yang et al., 2006).

AE, the additive QTL × environment interaction effects; AA, the epistatic effect; and AAE, the interaction effect of epistatic QTL × environment. H<sup>2</sup>(A), H<sup>2</sup>(AE), H<sup>2</sup>(AA), and H<sup>2</sup>(AAE) indicate the phenotypic variance explained by the corresponding genetic effect. S1–S5 are as shown in Table 1, and S1|S0 to S5|S4 are as shown in Table 2.

A total of 22 A-QTLs and 25 pairs of AA-QTLs were common to the two mapping methods. In addition, more specifically and reproducibly expressed QTLs were detected by the unconditional analysis (**Table 5**, Table S1). Each reproducibly expressed A-QTL or AA-QTL showed significant genetic effects with the same effect direction. For instance, one A-QTL, Qspad.acs-1B.1, showed negative A effects (−0.41∗∗∗ to −1.10∗∗∗) across four stages (S1–S4). One pair, Qspad.acs-3D.2 × Qspad.acs-6A.5, exhibited positive AA effects (1.00∗∗∗ to 1.32∗∗∗) across three stages (S1–S3). This can be explained by the fact that unconditional QTLs reflect the cumulative expression from the initial time to stage t (Zhu, 1995). Conditional QTLs, on the other hand, reflect the net gene expression in the specific period from stage t−1 to t (Zhu, 1995; Atchley and Zhu, 1997). Here, almost all conditional QTLs were expressed in only one specific period (**Table 6**, Table S2). Apart from QTLs detected at the first stage, conditional loci were clearly distinct from unconditional ones; for example, some loci showed unconditional effects but no conditional effect, and vice versa. Even where QTLs were common to the two mapping strategies, their expression profiles varied with the measuring stage. For example, Qspad.acs-1B.2 showed an unconditional A effect of −0.76∗∗∗ at S2, but a conditional effect of 0.59∗∗∗ in S3|S2. This indicated that the parental contribution of favorable alleles at the same map position also varied during the development of leaf greenness. The above evidence therefore suggests that QTL expression for the ontogeny of leaf greenness is time-dependent. By combining unconditional QTL mapping with conditional QTL mapping of time-dependent measures, it is possible to reveal the dynamic gene expression underlying the development of leaf staygreen.
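The distinction between cumulative (unconditional) and net (conditional) values can be illustrated numerically. The following is a minimal sketch, assuming a simple regression-style adjustment in which the value at stage t is corrected for its linear dependence on the value at stage t−1; it is not the mixed-model procedure of Zhu (1995), which estimates the required variance components differently and is not reproduced here. The function name and the SPAD-like readings are hypothetical.

```python
from statistics import mean


def conditional_values(y_prev, y_curr):
    """Approximate net values y(t | t-1) for a set of lines.

    Assumption: y(t | t-1) = y(t) - b * (y(t-1) - mean(y(t-1))),
    where b = cov(y(t-1), y(t)) / var(y(t-1)) is estimated from the sample.
    """
    n = len(y_prev)
    m_prev, m_curr = mean(y_prev), mean(y_curr)
    cov = sum((p - m_prev) * (c - m_curr) for p, c in zip(y_prev, y_curr)) / (n - 1)
    var = sum((p - m_prev) ** 2 for p in y_prev) / (n - 1)
    b = cov / var
    return [c - b * (p - m_prev) for p, c in zip(y_prev, y_curr)]


# Hypothetical greenness readings for five lines at two consecutive stages.
s1 = [52.1, 49.8, 55.0, 47.3, 50.6]   # stage t-1 (cumulative value)
s2 = [48.0, 44.9, 52.2, 41.0, 46.5]   # stage t (cumulative value)

s2_given_s1 = conditional_values(s1, s2)

# By construction, the adjusted values are uncorrelated with the previous stage,
# so they carry only the variation that arose between the two stages.
m1, mc = mean(s1), mean(s2_given_s1)
residual_cov = sum((p - m1) * (c - mc) for p, c in zip(s1, s2_given_s1)) / (len(s1) - 1)
print(abs(residual_cov) < 1e-9)
```

In this simplified picture, a QTL mapped on `s2` captures everything accumulated up to stage t, whereas a QTL mapped on `s2_given_s1` captures only what is new in the interval, which is why a locus can show an unconditional effect without a conditional one, and vice versa.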

The ability of a genotype to adapt its phenotype to different environments is referred to as phenotypic plasticity (Ungerer et al., 2003). The phenotypic plasticity of quantitative traits arises in nature from QEIs at the molecular level (Campbell et al., 2003). Several examples of QEIs for developmental traits have shown that the expression of particular chromosome regions differs across environments (Wu et al., 2010; Li S. et al., 2012). In this study, 53.0% of A-QTLs and 62.8% of AA-QTLs for the ontogeny of leaf greenness interacted significantly with water environments, explaining 0.82 to 5.42% and 0.82 to 4.08% of the phenotypic variation, respectively. This indicated that the expression of QTLs governing the ontogeny of leaf greenness was susceptible to water environments and, to a certain extent, environmentally dependent. The way QTL expression was influenced by water availability differed significantly among developmental stages and between the two mapping strategies. Firstly, although all QEIs reacted to at least one water environment, the stages at which they occurred varied widely. Generally, most occurred in the periods after the first stage. For example, about 90% of unconditional A-QEIs and 85% of conditional A-QEIs occurred in the middle and late stages, indicating that water environments strongly affected the expression of developmental QTLs in the later period of growth. Secondly, more QEIs occurred for unconditional QTLs than for conditional ones. For example, 27 A-QEIs and 24 E-QEIs were detected by unconditional QTL mapping, whereas only 18 and 14 QEIs were detected by conditional QTL mapping. In particular, in unconditional QTL mapping, four A-QEIs and seven E-QEIs had continually expressed AE or AAE effects at two or three stages, whereas in conditional QTL mapping this was observed for only one E-QEI. QEI effects therefore differed greatly between the two strategies, except for the QEIs at the first stage. This indicated that cumulative genetic effects were more prone to interact with water environments than net genetic effects. Thirdly, the behavior of QEIs differed in response to specific water environments. For example, nine unconditional A-QTLs expressed during the period from S1 to S3 had negative AE effects in WW environments but positive effects in DS environments; thereafter, the opposite was the case. This indicated that the AE effects of these loci during the early period could be up-regulated by the DS. However, the other QEIs showed negative interaction effects with the DS but positive effects with the WW. This suggested that putative QTL expression for the ontogeny of leaf greenness was environmentally dependent and significantly up-regulated by WW environments. These results provide detailed information on the variable performance of quantitative loci controlling the development of leaf greenness under different water environments.

Regarding the genetic components of leaf staygreen in wheat, several previous studies have shown that additive effects are predominant (Silva et al., 2000; Verma et al., 2004; Joshi et al., 2007; Kumar et al., 2010). In some cases, epistatic effects (Zhang et al., 2009a,b, 2010; Kumar et al., 2012) and QEI effects (Yang et al., 2007; Peleg et al., 2009) were also considered important. However, current genetic gains for the leaf staygreen trait rely only on traditional mapping analysis based on phenotypic data at a single time point, which is inadequate for dissecting the genetic components and their dynamic behavior during the ontogeny of leaf staygreen. In the present study, the behavior of the respective genetic components was clearly dynamic and time-dependent across all measuring stages (**Figure 4**). General effects were almost all negative but changed markedly in the third period, which might explain why the phenotypic values of leaf greenness decreased throughout growth, with a significant reduction around stage S3 across environments (**Table 1**). Considering the respective effects of the genetic components, AA and QEI effects were larger than those of the other genetic components in both mapping analyses, while effect directions and magnitudes varied with the specific period. A similar result was observed for the developmental behavior of rice tiller number (Liu et al., 2010). On the other hand, the general contribution rates showed a progressive reduction over the measuring time (**Figure 4**), consistent with findings on the developmental genetics of plant height (Wang et al., 2010; Wu et al., 2010) and grain weight (Li S. et al., 2012) in wheat. Thus, genetic effects active in early stages might play a critical role in modulating the phenotypic variation of developmental traits, owing to their higher contribution rates. Like the genetic effects, the respective contribution rates of the genetic components were highly variable, although their magnitudes did not fully correspond to those of the genetic effects. It has been suggested that some specific effects might counteract or reinforce one another during the development of quantitative traits (Wu et al., 2010). Nevertheless, the contribution rates of AA and AE effects were more predominant than those of the other effects. This finding supports the view that the action of a specific gene on a quantitative phenotype is the collective property of a network of polygenes and/or its tight interactions with environments, rather than the behavior of a single gene (Wade, 2002; Malmberg et al., 2005).

In this study, although putative A-QTLs were widely dispersed on almost all chromosomes except 1D–4D, 5A, and 6D, they were concentrated in specific neighboring marker intervals on several chromosomes. Moreover, these marker intervals harbored many QTLs that were reproducibly expressed in two or more periods (**Figure 1**). Using a wheat microsatellite consensus map (Somers et al., 2004) as a reference, some QTLs controlling leaf greenness in the present work were found to have been previously mapped to similar chromosomal regions. For example, four A-QTLs were detected in the marker interval from Xgwm357 to Xwmc304 on chromosome 1A, which overlapped with the location of a staygreen QTL (Qsg.bhu-1A) reported by Kumar et al. (2010). The reproducibly expressed QTL Qspad.acs-1B.1 shared a similar interval (Xgwm11–Xwmc626) with putative loci controlling leaf chlorophyll content (Czyczyło-Mysza et al., 2013). Another reproducibly expressed QTL, Qspad.acs-2A.1, was possibly the same as Qchl a+b.igdb-2A (Li H. et al., 2012), as both were very close to the marker Xgwm339. Likewise, other loci, such as Qspad.acs-2B.1, Qspad.acs-3B.2, Qspad.acs-4A.3, Qspad.acs-5B.1, Qspad.acs-5D.2, Qspad.acs-7B.3, Qspad.acs-7D.1, and Qspad.acs-7D.2, were identical or adjacent to corresponding loci governing leaf chlorophyll content or its components observed in different wheat populations (Zhang et al., 2009a; Kumar et al., 2010; Li H. et al., 2012; Czyczyło-Mysza et al., 2013). These common QTLs and their tightly linked molecular markers would be of great importance for MAS. Compared with the results reported by Czyczyło-Mysza et al. (2013), some typical marker intervals harboring QTL clusters, especially on chromosomes 2B, 3A, 4B, 5B, 6B, and 7A (**Figure 1**), were also co-located with a considerable number of loci related to chlorophyll fluorescence parameters, carotenoid content, and even agronomic traits. For example, in the marker interval from Xwmc494 to Xgwm193 on chromosome 6B, five A-QTLs were mapped in this study; loci for related traits, such as the amount of excitation energy trapped in PSII reaction centers (ET0/CSm), overall performance index of PSII photochemistry (PI), maximum photochemical efficiency (Fv/Fm), light energy absorption (ABS/CSm), and grain yield of the main stem (GWE), were also located in this interval. Another marker interval, from Xwmc11 to Xbarc19, harboring five loci for leaf greenness, was possibly equivalent to the QTL-rich interval from Xwmc11 to Xcfa2262, where Czyczyło-Mysza et al. (2013) identified several QTLs controlling leaf carotenoid content, ET0/CSm, number of active reaction centers (RC/CSm), and GWE. This indicated that these QTL hot-spot regions could carry a wealth of genetic information on leaf greenness and its associated traits in wheat. Therefore, further studies on whether tight linkage or pleiotropy underlies the QTL-rich regions will be important to elucidate the genetic nature of leaf greenness and to use these regions in wheat improvement programs.

## CONCLUSION

Flag leaf greenness of wheat in the reproductive phase was controlled by minor-effect polygenes, which were expressed selectively in a time- and environment-dependent pattern during ontogeny. No single QTL was continually active throughout the measuring period, but more loci were identified in early developmental periods, and these showed larger genetic effects. QEIs occurred mainly in the middle and late stages of development, where drought stress tended to regulate QTL expression negatively. AA and AE effects were predominant in regulating phenotypic variation during the ontogeny of leaf greenness. Cumulative genetic effects were maintained longer and more strongly, whereas net genetic effects were weaker and short-lived. Several QTL cluster regions were suggestive of tight linkage or pleiotropy in the inheritance of these traits. Some reproducibly expressed QTLs, or common loci consistent with previously detected ones, would be useful for the genetic improvement of staygreen types in wheat through MAS, especially in water-deficit environments.

#### AUTHOR CONTRIBUTIONS

DY designed the experiments and wrote the manuscript. ML and YL performed the statistical analysis. LC, JC, and HC carried out the phenotypic observation and measurement. SC managed the field experiments. All authors have read the manuscript.

#### ACKNOWLEDGMENTS

This work was supported by grants from the National Natural Science Foundation of China (31460348, 30960195), the Project of Application Development and Research of Agricultural Biotechnology of Gansu Province (GNSW-2015-18), the Research Program Sponsored by Gansu Provincial Key Laboratory of Aridland Crop Science, Gansu Agricultural University (GSCS-2010-04), and the Fuxi Youth Talent Program of Gansu Agricultural University (FXRC20130102).

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00273




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Yang, Li, Liu, Chang, Cheng, Chen and Chai. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# VaERD15, a Transcription Factor Gene Associated with Cold-Tolerance in Chinese Wild Vitis amurensis

Dongdong Yu<sup>1,2,3†</sup>, Lihua Zhang<sup>1,2,3†</sup>, Kai Zhao<sup>1,2,3</sup>, Ruxuan Niu<sup>1,2,3</sup>, Huan Zhai<sup>1,2,3</sup> and Jianxia Zhang<sup>1,2,3</sup>\*

<sup>1</sup> College of Horticulture, Northwest A&F University, Yangling, China, <sup>2</sup> Key Laboratory of Horticultural Plant Biology and Germplasm Innovation in Northwest China, Ministry of Agriculture, Yangling, China, <sup>3</sup> State Key Laboratory of Crop Stress Biology in Arid Areas, Northwest A&F University, Yangling, China

Early responsive to dehydration (ERD) genes can be rapidly induced to counteract abiotic stresses, such as drought, low temperature or high salinity. Here, we report on an ERD gene (VaERD15) related to cold tolerance from Chinese wild Vitis amurensis accession 'Heilongjiang seedling'. The full-length VaERD15 cDNA is 685 bp, including a 66 bp 5′-untranslated region (UTR), a 196 bp 3′-UTR and a 423 bp open reading frame encoding 140 amino acids. The VaERD15 protein shares a high amino acid sequence similarity with ERD15 of Arabidopsis thaliana. In our study, VaERD15 was shown to localize to the nucleus and to have transcriptional activation activity. Semi-quantitative PCR and Western blot analyses showed that VaERD15 was constitutively expressed in young leaves, stems and roots of V. amurensis accession 'Heilongjiang seedling' plants, and that expression levels increased after low-temperature treatment. We also generated a transgenic Arabidopsis Col-0 line that over-expressed VaERD15 and carried out a cold-treatment assay. Real-time quantitative PCR (qRT-PCR) and Western blot analyses showed that, as the duration of cold treatment increased, both transcript and protein levels increased continuously in the transgenic plants, while almost no expression was detected in wild type Arabidopsis. Moreover, the plants that over-expressed VaERD15 showed higher cold tolerance and accumulation of proline, soluble sugars, proteins, malondialdehyde and three antioxidant enzymes (superoxide dismutase, peroxidase, and catalase). Lower levels of relative ion leakage also occurred under cold stress. Taken together, our results indicate that the transcription factor VaERD15 was induced by cold stress and was able to enhance cold tolerance.

#### Keywords: grapevine, Vitis amurensis, VaERD15, cold tolerance, functional analysis

#### Edited by:

John Doonan, Aberystwyth University, UK

#### Reviewed by:

Antonio Ferrante, University of Milan, Italy; Liezhao Liu, Southwest University, China

\*Correspondence:

Jianxia Zhang zhangjx666@126.com

†These authors have contributed equally to this work.

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

Received: 08 July 2016 Accepted: 17 February 2017 Published: 07 March 2017

#### Citation:

Yu D, Zhang L, Zhao K, Niu R, Zhai H and Zhang J (2017) VaERD15, a Transcription Factor Gene Associated with Cold-Tolerance in Chinese Wild Vitis amurensis. Front. Plant Sci. 8:297. doi: 10.3389/fpls.2017.00297


**Abbreviations:** AtGAPDH, Arabidopsis thaliana glyceraldehyde 3-phosphate dehydrogenase; CAT, catalase; CBF, C-repeat-binding factor; cDNA, complementary DNA; DREB1, dehydration responsive element-binding factor 1; ERD, early responsive to dehydration; EST, expressed sequence tag; GFP, green fluorescent protein; GUS, β-glucuronidase; MDA, malondialdehyde; MS, Murashige and Skoog; NCBI, National Center for Biotechnology Information; ORF, open reading frame; PCR, polymerase chain reaction; POD, peroxidase; qRT-PCR, real-time quantitative polymerase chain reaction; ROS, reactive oxygen species; SOD, superoxide dismutase; WT, wild type.

#### INTRODUCTION


Grapevine (Vitis vinifera L.) is one of the most important multiuse fruiting plants of the world, and its berries are used in wine-making, or eaten fresh or dried. In addition to their value as a food (providing sugars, roughage etc.), grapes and their processed products have also been shown to have nutraceutical benefits including those attributed to resveratrol, which has a role in preventing and treating cardiovascular disease and cancer (Jeandet et al., 1995; Yang et al., 2009). These beneficial properties have fostered the recent growth in the grape industry.

The majority of commercial grape cultivars belong to the European grape (V. vinifera L.). While these cultivars have excellent organoleptic qualities, they suffer relatively poor tolerance to the cold experienced during winter, resulting in significant damage to grapevines growing in the cooler regions of the world, including northern America and northern China.

China has abundant resources of wild grape germplasm. These include V. amurensis, a species that can tolerate very low winter temperatures approaching −32 °C. Therefore, this species has great potential as a germplasm resource for cold-resistant breeding (He and Niu, 1989). Study of the cold-tolerance genes in V. amurensis has significantly contributed to understanding the mechanisms of cold-tolerance, as well as transgenic breeding (Xu et al., 2014a,b).

In Arabidopsis, ERD genes can be induced within 1 h by drought stress (Kiyosue et al., 1994). Kiyosue et al. (1994) divided a total of 26 ERD cDNA clones into 16 different gene families based on their expression in response to drought. Early research on ERDs concentrated on Arabidopsis, and many studies have shown that ERD genes can be induced by a variety of stresses including drought (Rai et al., 2012), low temperature (Kiyosue et al., 1998), salinity (Rai et al., 2015) and abscisic acid (ABA) (Aalto et al., 2012). In recent years, ERD genes have been isolated from corn (Liu et al., 2009), soybean (Alves et al., 2011a), tobacco (Shao et al., 2014), and tomato (Ziaf et al., 2016), and their roles in response to a range of stresses have been identified. However, of these ERDs, only ERD6, ERD10, and ERD15 are related to cold tolerance in Arabidopsis. The first of these, ERD6, encodes a sugar carrier protein and is induced by low temperatures and moisture stress (Kiyosue et al., 1998). AtERD10 is induced by cold and regulates CBF transcription factors, and a transfer DNA (T-DNA) insertion line in which AtERD10 is silenced shows reduced stress tolerance compared with wild type Arabidopsis plants (Kim and Nam, 2010). An earlier report indicates that, at low temperatures, SpERD15 can protect the cell membrane, improve photosynthetic efficiency and promote the accumulation of soluble substances (Ziaf et al., 2011). However, another report (Kariola et al., 2006) showed that over-expression of AtERD15 decreased sensitivity to ABA and reduced tolerance to drought and low temperatures. Conversely, the AtERD15 mutant was more sensitive to ABA and showed enhanced tolerance to salinity and drought (Kariola et al., 2006). Therefore, ERD15 may have different functions in different plants. There have been no previous reports on ERD genes in Chinese wild Vitis spp., which is why we set out to explore the role of ERD15 in V. amurensis.

We have already confirmed that V. amurensis is the most cold-resistant species of the 18 wild grape species native to China and the seven wild grape species native to North America (Zhang et al., 2012). Subsequently, a cold-induced cDNA library was constructed using potted plants of V. amurensis accession 'Heilongjiang seedling', from which one EST sequence encoding the ERD protein was obtained, and the gene was named VaERD15. Real-time quantitative PCR (qRT-PCR) analyses revealed that VaERD15 was induced by cold stress (Zhang et al., 2013). In this study, we clone the full-length VaERD15 and confirm its function in response to cold stress.

#### MATERIALS AND METHODS

#### Plant Materials and Growth Conditions

Plants of Chinese wild V. amurensis accession 'Heilongjiang Seedling', which originates in northeast China, were maintained in the grape germplasm repository at the Northwest A & F University, Yangling, Shaanxi, the People's Republic of China. This accession is highly resistant to cold (Zhang et al., 2012).

In early January 2010, 1-year-old shoots were taken from mature vines and stored in sand at below 4 °C. At the end of March, the shoots were retrieved and used for cuttings, soaked for 2 h in Transplantone (500 mg/L), allowed to develop roots, and cultivated in a greenhouse (25 °C, light intensity 12,000 lux). In July, well-grown and healthy potted plants were selected for cold-stress treatment and for isolation of total RNA and protein.

### Isolation and Sequence Analysis of the VaERD15 Gene

'Heilongjiang seedling' plants growing in pots were placed in a pre-chilled growth chamber at 4 °C. Young leaves, stems and roots were harvested at 0, 2, 6, 12, 24, and 48 h after exposure to 4 °C. Samples were frozen in liquid nitrogen prior to extraction of RNA and protein.

Total RNA was extracted using the improved sodium dodecyl sulfate (SDS)/phenol method (Zhang et al., 2003). The first-strand cDNA was synthesized using the Easy Script First-Strand cDNA Synthesis Super Mix (Transgen, China), according to the manufacturer's protocol. The cDNA templates for 5′- and 3′-Rapid Amplification of cDNA Ends (RACE) were synthesized using the SMART™ RACE cDNA Amplification Kit (Clontech, Palo Alto, CA, USA). Primers for 5′- and 3′-RACE are listed in **Table 1**. All amplified RACE fragments were sequenced three times for each sample. Based on the 5′-RACE and 3′-RACE results, a pair of full-length primers, VaERD15-F/R (including initiation and termination codons), was designed (**Table 1**). Amplification and sequencing of the full-length cDNA of VaERD15 were repeated using two replicates.

The cDNA sequence of VaERD15 gene from V. amurensis was translated into amino acid sequences on the NCBI website<sup>1</sup>. The protein conserved domain analysis website<sup>2</sup> was used to predict conserved domains, theoretical molecular weights and isoelectric points. The homologies of ERD15 in V. amurensis and in corn, pepper and Arabidopsis were analyzed by means of DNAMAN analysis software. Finally, phylogenetic analyses were generated using MEGA 5.0 software<sup>3</sup>.

<sup>1</sup>http://www.ncbi.nlm.nih.gov/gorf/gorf.html

<sup>2</sup>http://prosite.expasy.org/

#### TABLE 1 | Primers used in this study.

fpls-08-00297 March 4, 2017 Time: 16:56 # 3

#### Expression Pattern Analysis of VaERD15

Total RNA was extracted from the various grapevine tissues after exposure to low-temperature stress (4 °C) for 0, 2, 6, 12, 24, and 48 h, and first-strand cDNA was synthesized, as previously described. Semi-quantitative PCR was carried out using the primers rtVaERD15-F and rtVaERD15-R (**Table 1**). The volume of the semi-quantitative PCR amplification was 20 µl, and Actin1 (accession no. AY680701) was used as the internal reference gene. All reactions were repeated for three biological replicates.

Western blotting was carried out to further analyze the expression pattern of VaERD15. Total protein was extracted from the various samples according to the method of Méchin et al. (2006). The extracted protein concentration was determined using the method of Bradford (1976). Protein samples (25 µg) were separated by SDS-polyacrylamide gel electrophoresis (PAGE) and blotted onto a polyvinylidene fluoride membrane (Roche, product no. 03010040001). Immune antibodies were prepared in our laboratory, as previously reported (Zhang et al., 2014).

#### Subcellular Localization of VaERD15

The ORF of VaERD15 without a termination codon was obtained by PCR amplification using specific primers VaERD15-XbaI-F and VaERD15-KpnI-R (**Table 1**). The PCR products digested by XbaI and KpnI were cloned into a pMD18-T vector (TaKaRa, Japan). The fragments were fused to the N-terminus of GFP in an expression vector driven by the 35S promoter. The vector carrying 35S::GFP was used as a control. The plasmids of the 35S::VaERD15-GFP fusion construct and 35S::GFP were purified for subsequent experiments. Plasmids were then transformed into onion epidermal cells using the particle bombardment method, as described by Varagona et al. (1992). Transformed onion epidermal cells were cultured on MS media under dark conditions for 24 h at 25 °C. Expression of the genes transformed into the onion epidermal cells was observed using confocal laser scanning microscopy (LSM 510 META, ZEISS, Germany).

#### Transcriptional Activation Assay

For the transcriptional activation assay, the ORF of VaERD15 was generated and fused in frame with the GAL4 DNA-binding domain at the NcoI and BamHI sites of the pGBKT7 vector by recombination reactions (Invitrogen, USA). The expression vector pGBKT7 carrying the GAL4 gene, constructed by our laboratory, was used as a positive control and the empty pGBKT7 vector was used as a negative control. These constructs were then transformed into the yeast strain AH109. The resulting transformants were streaked onto Synthetic Defined (SD)/-Trp medium. After incubation at 30 °C for 3 days, transformed strains on the SD/-Trp plates were selected and streaked onto SD/-Trp/-His/-Ade plates containing X-α-gal; the level of transcriptional activation was evaluated by color reaction.

#### Generation and Detection of Transgenic Arabidopsis Seedlings

The full-length cDNA of VaERD15 was amplified by PCR and cloned into the BglII/NcoI site of pCAMBIA3301, generating pCAMBIA3301-VaERD15. The specific primers VaERD15-NcoI-F/R are shown in **Table 1**. Constructs were verified by sequencing. The constructed plasmid was introduced into Agrobacterium tumefaciens GV3101 cells by electroporation. Transgenic Arabidopsis plants were obtained using the floral dipping method (Clough and Bent, 1998). Putative transgenic Arabidopsis plants harboring the pCAMBIA3301-VaERD15 construct were selected on MS plates containing 10 mg/L Basta.

Homozygous T3 Arabidopsis strains were tested by PCR and genomic DNA was extracted from 3-week-old leaves of the putative transgenic Arabidopsis seedlings using the cetyltrimethyl ammonium bromide (CTAB) method with appropriate modification (Wu et al., 1998). The specific primers VaERD15-NcoI-F/R were used for amplifying the exogenous gene, with pCAMBIA3301-VaERD15 plasmid DNA as a positive control, and DNA from WT Arabidopsis (Col-0) leaves as a negative control. At the same time, leaves of VaERD15-positive Arabidopsis plants were taken for GUS staining, with WT Arabidopsis plants serving as controls.

<sup>3</sup>http://www.megasoftware.net

Southern blot analysis was carried out as previously described (Southern, 1975). After plant genomic DNA was extracted and purified, 50 µg DNA was digested with the restriction enzyme HindIII in 40 µl for 12 h. Electrophoresis and Southern blot analysis were then carried out according to standard methods.

### Cold Tolerance, qRT-PCR and Western Blot Assay

Two T3 Arabidopsis lines (L1, L2) were used for the cold tolerance assay. Transgenic and WT 3-week-old Arabidopsis plants of similar vigor were used for the cold treatment. Plants were placed in a pre-chilled chamber at 4 °C to cold acclimate for 48 h, and then transferred to a pre-chilled chamber at −6 °C for 72 h. Plants suffering from cold stress were transferred to room temperature (approximately 23 °C) for 5 days to recover. The phenotypic changes of Arabidopsis plants were observed and photographed during this period. Leaf samples were collected at 0, 2, 4, 8, 12, 24, 48, and 72 h after exposure to cold stress and stored at −80 °C for qRT-PCR and Western blot analyses. qRT-PCR was carried out to determine expression changes of VaERD15 and AtERD15 [primers: rtVaERD15-F/R and rtAtERD15-F/R (**Table 1**)], in vivo, using the Takara SYBR Premix Ex Taq™ II (Perfect Real Time) on a Bio-Rad IQ5 Real-Time PCR Detection System (Bio-Rad Laboratories, Hercules, CA, USA). The volume used for qRT-PCR amplification was 20 µl, and AtGAPDH (accession no. 101214) was used as the endogenous reference gene. Western blotting was carried out as previously described. All experiments were carried out for three biological replicates.
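The text does not state how relative expression was derived from the qRT-PCR data. One widely used approach is the 2^(−ΔΔCt) method, which normalizes the target gene to a reference gene (here AtGAPDH) and then to a calibrator sample such as the 0 h time point. The sketch below assumes that method and 100% amplification efficiency; the function name and all Ct values are hypothetical and are not taken from the study.

```python
def relative_expression(ct_target, ct_ref, ct_target_cal, ct_ref_cal):
    """Fold change by the 2^(-ddCt) method (an assumed method, not stated in the text).

    ct_target / ct_ref: Ct of the target and reference gene in the sample.
    ct_target_cal / ct_ref_cal: the same two Ct values in the calibrator sample.
    """
    delta_ct_sample = ct_target - ct_ref              # normalise sample to reference gene
    delta_ct_calibrator = ct_target_cal - ct_ref_cal  # normalise calibrator to reference gene
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: a cold-stressed sample compared with the 0 h calibrator.
fold = relative_expression(ct_target=22.0, ct_ref=18.0,
                           ct_target_cal=25.0, ct_ref_cal=18.0)
print(fold)  # 8.0
```

With these invented numbers the target gene appears eight-fold more abundant than in the calibrator. If primer efficiencies differ appreciably from 100%, an efficiency-corrected model would be more appropriate.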

### Biochemical Indicator Assays of Transgenic Arabidopsis

Biochemical indices relating to cold stress were determined in 3-week-old Arabidopsis seedlings in both transgenic and control plants. All plants had been subjected to the cold treatment described above. Determinations in this part of the study were carried out for three biological replicates.

Relative electrolyte leakage was assessed according to the previously described method (Weigel et al., 2001). Proline content was determined following the method of Shan et al. (2007). MDA content was measured according to the method of Puckette et al. (2007) with minor modifications. About 200 mg of leaves frozen in liquid nitrogen was homogenized in 4 ml of 10% trichloroacetic acid (TCA), then centrifuged at 10,000 rpm for 10 min. The supernatant (2 ml) was mixed with 2 ml of thiobarbituric acid (TBA) and heated at 95 °C for 30 min, quickly cooled on ice and then centrifuged at 10,000 rpm for 10 min. The soluble protein and soluble sugar contents were measured according to the methods of Bradford (1976) and Machado et al. (2013), respectively.
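The formula used for relative electrolyte leakage is not given in the text, which cites Weigel et al. (2001). A commonly used definition expresses the conductivity of the bathing solution before tissue disruption as a percentage of the conductivity after the tissue has been fully disrupted (for example by boiling). The sketch below assumes that definition; the function name and the conductivity readings are hypothetical.

```python
def relative_electrolyte_leakage(initial_conductivity, total_conductivity):
    """Relative electrolyte leakage in percent (assumed definition: initial / total * 100).

    initial_conductivity: reading after incubating intact leaf tissue in water.
    total_conductivity: reading after complete tissue disruption (e.g. boiling).
    Both readings must be in the same unit, e.g. microsiemens per cm.
    """
    if total_conductivity <= 0:
        raise ValueError("total conductivity must be positive")
    if initial_conductivity < 0:
        raise ValueError("initial conductivity cannot be negative")
    return 100.0 * initial_conductivity / total_conductivity


# Hypothetical readings for one leaf sample.
print(relative_electrolyte_leakage(30.0, 120.0))  # 25.0
```

Under this definition a higher percentage indicates greater membrane damage, which is how the conductivity results are interpreted in the Results.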

The activity of SOD was measured using the nitroblue tetrazolium (NBT) method (Giannopolitis and Ries, 1977; Puyang et al., 2015). POD activity was measured using the method of Pagariya et al. (2012). CAT activity was determined as described by Tseng et al. (2007).

## Statistical Analysis

All physiological data were analyzed using IBM SPSS Statistics 18.0 software. Differences between the transgenic samples and the corresponding wild type samples were tested with the independent-samples t-test. Significant differences are indicated by \*P < 0.05 and \*\*P < 0.01.
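The analysis itself was run in SPSS. As an illustration of the comparison being made, the following sketch computes a pooled-variance (Student's) independent-samples t statistic for three transgenic and three WT replicates and maps it to the asterisk notation. It assumes equal variances and a two-tailed test, uses the standard critical values of t for 4 degrees of freedom, and runs on invented measurements; the function names are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Two-tailed critical values of Student's t for df = 4 (3 + 3 replicates).
T_CRIT_P05_DF4 = 2.776
T_CRIT_P01_DF4 = 4.604


def pooled_t(sample_a, sample_b):
    """Return (t, df) for an equal-variance independent-samples t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / standard_error, df


def significance_stars(t, df):
    """Map a t statistic to '', '*' (P < 0.05) or '**' (P < 0.01); df = 4 only."""
    if df != 4:
        raise ValueError("critical values are tabulated for df = 4 only")
    if abs(t) > T_CRIT_P01_DF4:
        return "**"
    if abs(t) > T_CRIT_P05_DF4:
        return "*"
    return ""


# Invented measurements for one time point (three biological replicates each).
transgenic = [10.0, 11.0, 12.0]
wild_type = [7.0, 8.0, 9.0]
t_value, dof = pooled_t(transgenic, wild_type)
print(round(t_value, 3), dof, significance_stars(t_value, dof))  # 3.674 4 *
```

A statistics package would report an exact P value instead of a threshold comparison, and would usually also offer a variant that does not assume equal variances.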

## RESULTS

## Sequence Analysis of VaERD15

Specific primers were designed according to the acquired ORF of VaERD15, which had a total length of 423 bp with a 5′-untranslated region (UTR) of 66 bp, a 3′-UTR of 196 bp and an intron size of 88 bp. VaERD15 encoded a predicted polypeptide of 140 amino acids with a molecular weight of 16.2 kD (Zhang et al., 2014), and pI of 4.80. Alignment analysis with the grapevine genome (Jaillon et al., 2007) showed that VaERD15 was initially located on chromosome 13.
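The relationship between the reported ORF length and polypeptide length can be checked directly: 423 bp corresponds to 141 codons, that is, 140 amino acids plus one stop codon. The sketch below translates an ORF with the standard genetic code and performs that length check. The function names are hypothetical, and the toy sequence is invented for illustration; it is not the VaERD15 sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base varies fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_orf(orf):
    """Translate an ORF (start codon to stop codon inclusive) into a protein string."""
    orf = orf.upper().replace("U", "T")
    if len(orf) % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    # Raises KeyError on ambiguous bases such as N.
    protein = "".join(CODON_TABLE[orf[i:i + 3]] for i in range(0, len(orf), 3))
    if not protein.endswith("*"):
        raise ValueError("ORF does not end with a stop codon")
    return protein[:-1]  # drop the stop symbol


def residues_from_orf_length(orf_bp):
    """Number of amino acids encoded by an ORF of orf_bp bases that includes the stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_bp // 3 - 1


print(translate_orf("ATGGCTTAA"))     # MA
print(residues_from_orf_length(423))  # 140
```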

Multiple sequence alignment analysis (**Figure 1A**) showed that VaERD15 shared 36% homology with the ZmERD15 predicted protein (accession no. ACG25626.1), 35% homology with the CaERD15 predicted protein (accession no. ABB89735.1) and 33% homology with the AtERD15 predicted protein (accession no. AAM64638.1). Phylogenetic analysis (**Figure 1B**) grouped VaERD15 and the ERDs from other plant species into two families. VaERD15 was most closely related to the ERD15 proteins of Vitis vinifera and Cucumis sativus.
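Percentages of this kind are usually obtained by counting identical residues across the columns of a pairwise alignment. How DNAMAN defines the denominator is not stated in the text, so the sketch below makes an explicit assumption: identity is the number of matching residue columns divided by the number of alignment columns that are not gaps in both sequences. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity of two pre-aligned sequences of equal length.

    Assumed definition: identical residue columns / columns that are not
    gap-only, times 100. Other tools may use a different denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


# Toy alignments, not real ERD15 sequences.
print(percent_identity("MKTA", "MRTA"))  # 75.0
print(percent_identity("MK-A", "MKTA"))  # 75.0
```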

## Expression Pattern Analysis of VaERD15

Semi-quantitative PCR and Western blotting were carried out to determine the expression patterns of VaERD15. Relative transcript levels are shown in **Figure 2A**; the transcript was detected in all tissues measured. At 0 h, VaERD15 expression was high in stems, low in leaves and undetectable in roots. After 2 h of cold stress, the expression of VaERD15 in stems decreased significantly, but recovered slightly after 12 h. The expression of VaERD15 in leaves and roots showed a rising trend that peaked at 24 h and then decreased.

The Western blot analysis results are shown in **Figure 2B**. We observed 16.2 kD bands on the polyvinylidene difluoride membrane, indicating that the tissue proteins in roots, stems and leaves of V. amurensis accession 'Heilongjiang seedling' specifically reacted with VaERD15 antibodies. VaERD15 expression in leaves increased gradually and peaked at 48 h under the cold stress treatment. VaERD15 expression in stems showed a downward trend up to 12 h, recovered at 24 h, and subsequently decreased. In the roots, VaERD15 expression levels increased, with a maximum at 6 h and a second small peak after 24 h.

ADP37978.1), Solanum lycopersicum (SlERD15, NP\_001234461), Zea mays (ZmERD15, ACG25626.1), and Capsicum annuum (CaERD15, ABB89735.1).

## VaERD15 Functions as a Transcription Factor

The subcellular localization results showed that the 35S::VaERD15-GFP fusion expression vector was transiently expressed in onion epidermal cells. Green fluorescence was only observed in the nuclei, while in the control green fluorescence was visible throughout the entire onion cell (**Figure 3**). This indicates that VaERD15 is localized to the nuclei.

The AH109 strains carrying the recombinant plasmid pGBKT7-GAL4 (positive control), pGBKT7-VaERD15 or the empty pGBKT7 vector (negative control) were all able to grow well on the SD/-Trp medium (**Figure 4A**). This demonstrates that the pGBKT7-VaERD15 recombinant plasmid and the positive and negative controls were all transferred into the yeast. The strains transformed with VaERD15 grew well on the SD/-Trp/-His/-Ade + X-α-gal selective medium and turned blue. In contrast, the negative control did not grow on the SD/-Trp/-His/-Ade medium (**Figure 4B**), indicating that VaERD15 could activate the expression of the reporter genes and so allow synthesis of the histidine and adenine required for the normal growth of yeast AH109. Taken together, these results illustrate that VaERD15 could function as a transcriptional activator in yeast.

## Molecular Detection of Transgenic Plants

Arabidopsis plants over-expressing VaERD15 were subjected to PCR. Five T3 Arabidopsis plants that survived on selection medium were used for the PCR experiments, and four expected bands were observed (**Figure 5A**). We also carried out a GUS staining assay. Results showed that transgenic Arabidopsis leaves were stained blue (**Figure 5B**), indicating that the GUS was expressed in the plant in vivo. This demonstrates that the plant genome had successfully integrated the target gene VaERD15.

Southern blot analysis was performed using four VaERD15 positive lines to further confirm that VaERD15 had been integrated into the Arabidopsis genome. The Southern blot results showed that specific hybridization bands could be clearly observed in all positive lines (**Figure 5C**), but not in the WT plants. The four transgenic lines all had a specific hybridization signal, but the band sizes were not all the same. This dissimilarity

FIGURE 3 | Subcellular localization of 35S::VaERD15-GFP fusion protein and control in onion epidermal cells. Cells were observed using fluorescent microscopy.

## Response of VaERD15 to Cold Treatment

qRT-PCR and Western blotting were carried out to determine expression changes of VaERD15 in vivo. As shown in **Figure 6A**, VaERD15 transcripts gradually increased in L1 and L2 lines with longer durations of cold stress, but the transcripts were barely detectable in the WT plants. We also examined changes in endogenous AtERD15 after transferring VaERD15 into the Arabidopsis plant. The qRT-PCR results showed that there were no significant differences in the expression of endogenous AtERD15 between wild type and transgenic Arabidopsis, except that the expression in transgenic plants was nearly four times higher than that in WT Arabidopsis at 12 h (**Supplementary Figure S1**). Western blot analysis showed that VaERD15 expression in both transgenic Arabidopsis lines tended to increase with longer durations of cold stress, and the VaERD15 expression level in L1 was higher than that in L2 (**Figure 6B**). However, only weak bands were observed in the WT plants, indicating that expression levels of VaERD15 in WT and in transgenic Arabidopsis were different.

## VaERD15 Enhanced Cold Tolerance in Arabidopsis

The cold tolerance assay showed that both transgenic and WT Arabidopsis suffered freezing injury, which worsened with longer durations of cold stress, but the injuries were more serious in WT Arabidopsis. Almost all the transgenic plants suffering from cold stress were able to resume normal growth after being transferred to room temperature (∼23 °C). However, the same recovery did not occur with the WT plants (**Figure 7**).

Cold tolerance is strongly correlated with a number of physiological parameters in plants (Liu et al., 2010; Li et al., 2014; Xu et al., 2014a). To investigate whether the cold tolerance of Arabidopsis lines over-expressing VaERD15 was improved, we measured several cold-related physiological indices: relative electrolyte leakage; the contents of proline, MDA, soluble sugars and soluble proteins; and the activities of SOD, POD and CAT.

Results showed that relative electrolyte leakage in both transgenic and WT Arabidopsis plants showed an upward trend with longer durations of low temperature treatment (**Figure 8A**). Both transgenic and WT Arabidopsis plants showed similar conductivities in the early stages of cold stress but after 2 h, the conductivity in the WT plants was significantly higher than that in the transgenic plants. This indicates that WT plants suffered greater cell membrane damage than the transgenic plants.

We also investigated proline and MDA contents (**Figures 8B,C**). At the start of the stress period, the proline content of the transgenic plants was slightly higher than that of the WT plants but the difference was not significant. After 4 h of cold stress, proline in the transgenic plants accumulated rapidly, and both proline abundance and its rate of increase were significantly higher than in the WT plants. Changes in proline content were consistent with the phenotypic changes under cold stress. While the MDA content in the transgenic plants was lower than in the WT plants after 2 h of cold stress, it increased rapidly with longer durations of cold treatment and was then consistently higher than in the WT plants.

Similar to conductivity, the soluble sugar contents of both transgenic and WT Arabidopsis plants increased with longer durations of cold treatment (**Figure 8D**). The soluble sugar content of the transgenic plants was slightly lower than in the WT plants for the first 12 h of cold stress, but then accumulated rapidly to become significantly higher than in the WT plants.

As shown in **Figure 8E**, the soluble protein content of the transgenic plants was consistently higher than that of the WT plants during cold stress. However, the pattern of change was very similar in both plant types, falling to a minimum during the first 12 h, and then gradually increasing.

We also measured the activity of antioxidant enzymes. The results showed that SOD activity in transgenic Arabidopsis was consistently higher than in WT plants (**Figure 9A**), indicating that plants over-expressing VaERD15 generally had elevated SOD activity. POD and CAT activities showed a late increase. The POD activity curve of the transgenic plants was relatively flat, but that of the WT plants decreased over the first 8 h and then increased significantly after 12 h (**Figure 9B**). CAT activity in the transgenic plants was lower than in the WT plants at first, but rose slightly after 2 h (**Figure 9C**).

## DISCUSSION

Grapes are of considerable economic importance and are grown over large areas of the world. However, close to the cooler limits of where this crop can be grown in the higher latitudes, both north and south of the equator, and at higher altitudes, chilling and frost damage can result in major economic loss. Therefore, the study of genes relating to cold tolerance is crucially important.

ERD genes were first isolated from Arabidopsis suffering from drought stress (Kiyosue et al., 1994). A large number of studies have shown that over-expressing ERD genes can improve the ability of plants to withstand biotic and abiotic stresses. AtERD10 and AtERD14 proteins have been reported to interact with phospholipid vesicles and protect membranes during conditions of high salinity, drought, and low temperature stress (Kovacs et al., 2008). Brassica juncea ERD4, which encodes an RNA-binding protein, is induced by dehydration, ABA, salicylic acid, sodium chloride, cold and heat treatments, and overexpressing BjERD4 can improve the tolerance of Arabidopsis plants to salt and dehydration stresses (Rai et al., 2015). Overexpressing Arabidopsis ERD10 can activate CBF/DREB1 genes and enhance the tolerance of plants to cold stress, while a T-DNA insertion mutant of ERD10 is more sensitive than WT plants to cold stress (Kim and Nam, 2010). However, studies involving ERD15 and responses to cold stress are few. In our study, a putative transcription factor, VaERD15, was isolated from a cDNA library of V. amurensis induced by low temperatures, and its major function in cold tolerance was investigated.

Our results indicate that VaERD15 is expressed in diverse plant tissues, which suggests that VaERD15 is not specific to grapevine. Semi-quantitative PCR results show that accumulation of VaERD15 transcripts in stems is significantly higher than in roots or leaves after 0 h of chilling. Similar research has shown that transcript accumulation of SpERD15 under non-stress conditions is higher in roots and old leaves of tobacco (Ziaf et al., 2011). Increasing the duration of cold stress led to down-regulation in the expression of the desired gene in stems, with a concomitant increase in the expression of the gene in leaves and roots. As the expression patterns of VaERD15 in leaves, roots and stems were different, we speculated that after low-temperature treatment of 'Heilongjiang seedling', VaERD15 could be involved in different regulatory pathways. The study by Yan et al. (2006) analyzed mRNA and protein level expression of 44 genes from Oryza sativa, and of the 27 up-regulated genes at the protein level, only five were up-regulated at a transcriptional level due to low-temperature treatment. Baginsky et al. (2005) also found that the correlation in expression at the mRNA and protein levels was lower when Arabidopsis chloroplasts and pollen were studied. This phenomenon, i.e., asynchrony in transcription and translation, can be caused by post-translational modifications in the protein-expression process, as well as by operator error (Baldi and Long, 2001). These previous reports support the idea that the expressions of VaERD15 at the mRNA and protein levels are not identical.

FIGURE 6 | Assessment of ERD15 expression levels in VaERD15-over-expressed Arabidopsis lines at different times of cold stress at −6 °C. (A) The relative expression changes of VaERD15 in WT and two transgenic Arabidopsis plants under cold stress. (B) Western blotting was carried out to detect expression changes of target protein in VaERD15-over-expressing Arabidopsis plants induced by cold stress. WT Arabidopsis plants were used as controls. All experiments were carried out for three biological replicates. Asterisks indicate a significant difference (\*P < 0.05; \*\*P < 0.01) compared with the WT Arabidopsis.

Alves et al. (2011b) suggested that GmERD15, as a transcription factor localized to the nucleus and cytoplasm, can activate N-rich protein (NRP) gene expression under osmotic stress. GmERD15 can specifically bind to a 187 bp fragment of the NRP-B promoter in yeast, and activate the expression of downstream genes. Ziaf et al. (2011) reported that SpERD15, located mainly in the nucleus, could enhance the plant's ability to resist external stress by increasing the accumulation of intracellular solutes and by inhibiting lipid peroxidation. In this study, the analysis of subcellular localization in onion epidermis showed that VaERD15 was mainly located in the nucleus. Transcription activation experiments confirmed that VaERD15 was able to activate reporter genes in yeast. Together, these results indicate that VaERD15 acts as a transcription factor.

To verify the function of VaERD15 in mitigating cold stress, Arabidopsis plants over-expressing VaERD15 were generated. A survival assay indicated that Arabidopsis plants over-expressing VaERD15 survived better than WT plants under cold stress. We measured the expression levels of the VaERD15 gene in transgenic Arabidopsis and WT plants and the results showed that the transcripts of VaERD15 increased dramatically with longer durations of cold stress in transgenic Arabidopsis compared with WT plants. This observation is consistent with a number of other studies of cold-related genes in transformed plants (Park et al., 2010; Checker et al., 2012; Kidokoro et al., 2015). In addition, qRT-PCR results showed that the expression levels of endogenous AtERD15 had a slight upward trend in WT plants with increasing duration of cold stress, while the expression of AtERD15 showed a peak at 12 h in transgenic Arabidopsis, which indicated that the introduction of VaERD15 could enhance the expression of endogenous AtERD15. These results illustrate that these transgenic plants are more tolerant to low temperatures. Further investigation indicated that transgenic Arabidopsis over-expressing VaERD15 also showed improved cold tolerance.

Physiological assessment showed that cell concentrations of proline, MDA, soluble sugars and proteins were higher in the over-expressing plants than in the WT plants, especially in the later stages of cold stress. These solutes act in different ways to mitigate stress, including protection of cellular structures, scavenging of ROS and detoxification of enzymes (Xiong et al., 2002; Verma and Dubey, 2003). It has been shown that increases in proline content under drought, cold and salt stresses can help to protect plants from damage (Wanner and Junttila, 1999; Trovato et al., 2008; Xu et al., 2014b). In our study, rapid increases in proline occurred in transgenic plants after 4 h of cold stress, and after 72 h, proline was 1.5 times higher than in the WT plants. Many studies have reported similar results in plants overexpressing VaICE1 or VaICE2, OsCOIN, OsDREB1 or DREB1, and Osmyb4 under cold stress (Ito et al., 2006; Liu et al., 2007; Pasquali et al., 2008; Xu et al., 2014a). Proline accumulation in transgenic plants has been widely reported (Gao et al., 2011; Movahedi et al., 2015; Zhou et al., 2015). Moderate proline accumulations have been observed in transgenic tobacco plants over-expressing SoMYB18, and a tendency for proline to decrease after cold stress has ceased has also been reported. These inconsistent results may arise because free proline is not a unique physiological index of osmotic potential reduction (Mao et al., 2011).

Electrolyte leakage is a key indicator of membrane injury caused by stress (Dexter et al., 1932; Jaglo-Ottosen et al., 1998). In the early stages, our VaERD15-over-expressing plants and control plants were both able to tolerate cold stress even though their conductivity levels increased. However, under more prolonged stress, the conductivity in the control plants was higher than that in the transgenic plants, indicating greater damage in the controls. The study by Parvanova et al. (2004) was consistent with our results and supported the idea that increased conductivity is indicative of membrane dysfunction.

Malondialdehyde is an end product of lipid peroxidation, and its level is therefore a key indicator of cold stress injury in plants (Jouve et al., 1993; Zhang and Kirkham, 1994). Previous studies on over-expression of OsAPXa, SpERD15 in citrus, rice and tobacco plants reported that, compared with controls, lower levels of MDA were found in transgenic plants after exposure to cold stress (Hara et al., 2003; Sato et al., 2011; Ziaf et al., 2011). In contrast with the above, our results showed that after 2 h of being subjected to cold temperatures, the MDA content of transgenic plants was consistently higher than that of WT plants. However, a similar finding to ours was reported for transgenic tobacco by Parvanova et al. (2004); their results suggested plants develop stress tolerance by producing large amounts of MDA.

Reactive oxygen species accumulation can lead to membrane peroxidation and thus destroy cell structure and function (Mittler et al., 2004). One way in which plants respond to stress is to accelerate free radical scavenging by increasing the activity of protective enzymes. The activities of three antioxidant-related enzymes were measured in our assay. In general, POD activity increased slowly, CAT activity increased rapidly while SOD activity initially increased, then decreased. The time courses of the activity trends for SOD and CAT in our transgenic plants were generally similar to those in WT plants, but the enzyme activities were significantly higher in the transgenic plants. Many reports on environmental stress have recorded the activities of antioxidant enzymes. For example, Yang et al. (2012) confirmed that the activities of POD, SOD and CAT increased in plants over-expressing OsMYB2, under salt stress. However, Yuan et al. (2015) reported that under cold stress, antioxidase activity was not significantly different in VaPAT1-over-expressing plants compared to WT plants. Shingote et al. (2015) found POD and CAT activities fell, while SOD activity increased in transgenic plants under cold stress. This indicates there may be synergy between the various antioxidant enzymes in the presence of active oxygen scavenging. Therefore, we hypothesize that within 24 h of the imposition of cold stress, the continuing decline in SOD activity that we observed was the result of over-accumulation of CAT in the cell.

We found that in V. amurensis the transcription factor VaERD15 can significantly improve the tolerance of plants to low temperatures. Previous reports have shown that over-expression of individual genes can improve tolerance to cold stress (Saijo et al., 2000; Mukhopadhyay et al., 2004; Mishra et al., 2013). Other studies have also indicated that transformation of cold-resistant genes can activate and enhance the expression of related genes under cold stress (Ziaf et al., 2011; Xu et al., 2014a). Therefore, clarifying the interaction between VaERD15 and other genes appears to be a promising direction for future research in this field. In summary, our findings confirm the significant value of continued investigation into the function and mechanisms of ERD genes in grapes, for the further development of cold-tolerant strains.

#### AUTHOR CONTRIBUTIONS

DY: Expression and analysis of VaERD15 in Arabidopsis, determining the physiological and biochemical indices for transgenic plants. LZ: RT-PCR analysis of AtERD15, data collation and manuscript writing. KZ: Subcellular localization of VaERD15 and transcriptional activation analysis of VaERD15. RN: Cloning and sequence analysis of VaERD15 gene, as well as RT-PCR analysis for V. amurensis. HZ: Semi-quantitative RT-PCR analysis for V. amurensis. JZ: Experimental design, plant material preparation and manuscript modification.

#### ACKNOWLEDGMENTS

This work received financial support from The National Science-Technology Support Plan Projects from the Ministry of Science and Technology of the People's Republic of China (grant no. 2013BAD02B04-06).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2017.00297/full#supplementary-material

FIGURE S1 | The relative expression changes of VaERD15 and AtERD15 in transgenic and WT Arabidopsis plants under cold stress. All experiments were carried out for three biological replicates.

#### REFERENCES

in Arabidopsis. Funct. Integr. Genomics 11, 445–465. doi: 10.1007/s10142-011-0218-3




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Yu, Zhang, Zhao, Niu, Zhai and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# Methodology for High-Throughput Field Phenotyping of Canopy Temperature Using Airborne Thermography

David M. Deery<sup>1</sup>\*, Greg J. Rebetzke<sup>1</sup>, Jose A. Jimenez-Berni<sup>2</sup>, Richard A. James<sup>1</sup>, Anthony G. Condon<sup>1</sup>, William D. Bovill<sup>1</sup>, Paul Hutchinson<sup>2†</sup>, Jamie Scarrow<sup>2</sup>, Robert Davy<sup>3</sup> and Robert T. Furbank<sup>1,4</sup>

#### Edited by:

*Gustavo A. Lobos, University of Talca, Chile*

#### Reviewed by:

*Jeff W. White, Agricultural Research Service (USDA), USA Lee Hickey, The University of Queensland, Australia*

> \*Correspondence: *David M. Deery david.deery@csiro.au*

† Present Address: *Paul Hutchinson, Hussat Pty Ltd., Australia*

#### Specialty section:

*This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science*

Received: *04 October 2016* Accepted: *16 November 2016* Published: *06 December 2016*

#### Citation:

*Deery DM, Rebetzke GJ, Jimenez-Berni JA, James RA, Condon AG, Bovill WD, Hutchinson P, Scarrow J, Davy R and Furbank RT (2016) Methodology for High-Throughput Field Phenotyping of Canopy Temperature Using Airborne Thermography. Front. Plant Sci. 7:1808. doi: 10.3389/fpls.2016.01808*

*<sup>1</sup> CSIRO Agriculture and Food, Canberra, ACT, Australia, <sup>2</sup> High Resolution Plant Phenomics Centre, Australian Plant Phenomics Facility, CSIRO Agriculture and Food, Canberra, ACT, Australia, <sup>3</sup> CSIRO Information Management and Technology, Canberra, ACT, Australia, <sup>4</sup> ARC Centre of Excellence for Translational Photosynthesis, Australian National University, Canberra, ACT, Australia*

Lower canopy temperature (CT), resulting from increased stomatal conductance, has been associated with increased yield in wheat. Historically, CT has been measured with hand-held infrared thermometers. Using the hand-held CT method on large field trials is problematic, mostly because measurements are confounded by temporal weather changes during the time required to measure all plots. The hand-held CT method is laborious and yet the resulting heritability is low, thereby reducing confidence in selection in large-scale breeding endeavors. We have developed a reliable and scalable crop phenotyping method for assessing CT in large field experiments. The method involves airborne thermography from a manned helicopter using a radiometrically-calibrated thermal camera. Thermal image data are acquired from large experiments in the order of seconds, thereby enabling simultaneous measurement of CT on potentially 1000s of plots. The effects of temporal weather variation that arise when phenotyping large experiments with hand-held infrared thermometers are therefore reduced. The method is designed for cost-effective and large-scale use by the non-technical user and includes custom-developed software for data processing to obtain CT data on a single-plot basis for analysis. Broad-sense heritability was routinely >0.50, and as high as 0.79, for airborne thermography CT measured near anthesis on a wheat experiment comprising 768 plots of size 2 × 6 m. Image analysis based on the frequency distribution of temperature pixels to remove the possible influence of background soil did not improve broad-sense heritability. Total image acquisition and processing time was *ca.* 25 min and required only one person (excluding the helicopter pilot). The results indicate the potential to phenotype CT on large populations in genetics studies or for selection within a plant breeding program.

Keywords: field experiments, wheat, thermal imaging, image analysis, data processing, pixel histogram analysis

## 1. INTRODUCTION

The gaseous exchange of water for carbon occurs at the stomata. From this exchange, plant surfaces, particularly leaves, are cooled by evaporation, and their temperatures typically decrease with increased evaporation. Stomatal closure and reduced transpiration manifest as a warmer canopy temperature (CT), while cooler CT is related to more open stomata and higher transpiration. Cooler CT has been associated with genetic gains in wheat yield, higher stomatal conductance, and maximum photosynthetic rates under non-water-limited conditions (Fischer et al., 1998). Similarly, cooler CT has been associated with increased grain yield in warm, irrigated conditions in Mexico (Reynolds et al., 1994; Amani et al., 1996; Ayeneh et al., 2002), and in a study comparing a selection of spring wheat cultivars from Australia and the International Maize and Wheat Improvement Center (CIMMYT) (Rattey et al., 2011). Similar findings were reported in water-limited environments, with cooler CT in wheat associated with increased yield (Blum et al., 1989; Rashid et al., 1999; Olivares-Villegas et al., 2007). When measured during grain-filling, cooler CT has been associated with increased rooting depth (Reynolds et al., 2007a), water use, and grain yield (Lopes and Reynolds, 2010). Conversely, warmer CT has been associated with conservative water use in different crops. In wheat, Pinter et al. (1990) reported that varieties with warmer CT in well-watered conditions had reduced stomatal conductance, used less water and were higher yielding when grown under water limitation.

Researchers have investigated the genetic basis underpinning CT in wheat using different populations. For example, Saint Pierre et al. (2010) studied five populations grown in three environments (water-limited, well-watered, and heat stress) and reported that gene effects were mostly additive with some dominance. Genetic mapping has revealed multiple quantitative trait loci for CT that are often pleiotropic with other important agronomic traits including yield and biomass (Pinto et al., 2010; Bennett et al., 2012; Mason et al., 2013; Rebetzke et al., 2013b). These studies generally report a strong association between cooler CT and yield, particularly when CT is measured during grain-filling. However, the polygenic control, together with the environmental sensitivity of stomatal conductance and CT (Rebetzke et al., 2013b), may reduce the heritability of the trait and hence the utility of CT for selection within a breeding program. Mason and Singh (2014) investigated CT as an indirect selection criterion for wheat under water limitation and heat stress environments. They concluded that the most useful application of CT within a breeding program would occur in the early generations, where yield testing is not performed and therefore indirect selection would be beneficial.

In the aforementioned studies, CT was measured with hand-held infrared thermometers. Use of hand-held instruments in large experiments is laborious, time-consuming and sensitive to weather fluctuations over short periods of time. Moreover, difficulties associated with maintaining a constant view angle and avoiding "contamination" from soil further complicate hand-held CT measurements. To address these issues, infrared thermography has been proposed as a method for CT phenotyping, owing to the advent of relatively affordable thermal cameras and user-friendly software for image processing (Jones et al., 2009; Takai et al., 2010; Prashar et al., 2013; Prashar and Jones, 2014). Recent studies have used unmanned aerial vehicles (UAVs) for the acquisition of thermal images for quantifying water stress in various field crops including cotton (Sullivan et al., 2007) and perennials including olives, mandarins, oranges, and apples (Berni et al., 2009a,b; Zarco-Tejada et al., 2012; Gómez-Candón et al., 2016). Chapman et al. (2014) demonstrated the use of UAVs for various phenotyping applications, including CT in sugarcane using thermal imaging.

For successful deployment of CT phenotyping within breeding programs, a scalable and reliable methodology must first be developed and validated. Such a methodology must enable acquisition of CT from a large number of plots in a short time period (in the order of seconds), to reduce variance associated with weather fluctuations. The method must be accurate and precise to enable reliable and confident discrimination between genotypes. Moreover, the method must enable fast data acquisition and timely data processing. It must also be routine in delivery and readily accessible.

In this paper, we evaluate such a method developed for assessing CT on large field experiments. The method involves (i) airborne thermography from a manned helicopter using a radiometrically-calibrated thermal camera to acquire CT data for large experiments in the order of seconds, and then (ii) data processing within minutes. The aim of this paper is to demonstrate the repeatability, scalability, and operative nature of the airborne thermography method for potential use in plot-scale phenotyping within a genetics study or within a plant breeding program.

## 2. MATERIALS AND METHODS

## 2.1. Field Experiments

A field experiment containing contrasting wheat genotypes was grown in two successive years at the Managed Environment Facility (MEF) (Rebetzke et al., 2013a), located at Yanco (34.62° S, 146.43° E, elevation 164 m) in SE Australia. The soil at the Yanco MEF has been classified as chromosol and has a clay-loam texture (Isbell, 1996). The experiment was sown on 28th May in 2013 and 11th June in 2014 following canola or field pea break-crops and then managed with adequate nutrition and chemical controls as required for pest, weed, and leaf diseases. The experiments comprised 768 experimental plots, of size 2 × 6 m with 18 cm row spacing (orientated North-South), and included a range of germplasm that conformed to the following criteria described in Rebetzke et al. (2013a): contemporary high-yielding, elite germplasm with agronomically-acceptable flowering time and plant height, to minimize confounding variation in CT with canopy architecture.

Genotypes were sown into a partial-replicate design trial (average number of replicates was 1.4) at a sowing density of 200 seeds per m<sup>2</sup> . As described in Rebetzke et al. (2013a), two irrigation treatments (384 experimental plots per treatment) were used to simulate appropriate target environments, namely: Treatment 1, where irrigation was supplied to achieve a water limitation pattern close to the long-term climate median for the site and; Treatment 2, where irrigation was supplied to achieve the equivalent of a decile eight rainfall (wettest 20% of years) for the site. The mean grain yields for Treatment 1 and Treatment 2 were 2.2 and 2.0 t/ha in 2013 and 1.7 and 2.1 t/ha in 2014, respectively. For the majority of entries and for both treatments, the anthesis growth stage occurred between the 19th and 24th of September in 2013 and between the 24th of September and 2nd of October in 2014.

#### 2.2. Hand-Held Thermography

Hand-held CT measurements were made by a single operator walking through the plots with an infrared thermometer (Mikron 1600, Mikron Infrared Instrument Co., Inc., Oakland, NJ, USA). To minimize capturing soil in the instrument's field-of-view, the infrared thermometer was held obliquely to each plot and scanned across the canopy at an angle of ca. 20° (above the horizontal) for ca. 4 s to derive an average CT value for each plot (after Rebetzke et al., 2013b). Measurements were taken on the morning of 18th October 2013, between 11:00 and 11:30 (Treatment 2 only), and on the afternoon of 25th October 2013, between 13:50 and 14:20 (Treatment 2 only). The majority of entries in the experiment were in the grain-filling growth stage. Weather conditions on both days were sunny and clear, and winds light (≤20 km/h). Air temperature, relative humidity, and wind speed were recorded at a weather station located ca. 400 m from the experiment site. Weather conditions were only recorded prior to the commencement of hand-held CT measurements on the 18th October 2013, while on the 25th October 2013, weather conditions were recorded prior to the start of measurements and at the completion of measurements.

## 2.3. Airborne Thermography

Airborne thermal images were acquired using the system described below on the 24th October 2013 at 10:00, 11:30, 12:30, and 13:30 and on the 2nd October 2014 at 09:00, 10:00, 11:00, 12:00, 13:00, and 14:00. On the 24th October 2013, the majority of entries in the experiment were in the grain-filling growth stage. On the 2nd October 2014, the majority of entries in the experiment were at anthesis. Weather conditions were sunny and clear, and winds light (≤20 km/h) on all days. The image acquisition and processing pipeline is depicted in **Figures 1**, **2**, and the major steps are described below.

#### 2.3.1. Image Acquisition

Thermal images were acquired using a thermal infrared camera (FLIR® SC645, FLIR Systems, Oregon, USA, for which the technical specifications are: ±2 °C or ±2% of reading; <0.05°C pixel sensitivity; 640 × 480 pixels; 0.7 kg without lens; 13.1 mm lens). The camera was mounted in a commercially-available helicopter cargo pod (R44 Helipod II Slim Line Top Loader, Simplex Aerospace, Oregon, USA) and fitted to a Robinson R44 Raven helicopter (**Figure 1**). Highly visible infra-red (IR) targets, made of black fabric and of size ca. 1 m<sup>2</sup>, were systematically positioned throughout the field to distinguish the experiment from adjacent co-located experiments. The IR targets were initially used for flight navigation and later for spatial referencing in post-processing of the thermal images. In contrast to other studies (e.g., Gómez-Candón et al., 2016), the IR targets were not used for temperature correction of the thermal images. Prior to acquiring thermal images, a GPS tracking line (an "AB line") for subsequent flights was recorded by flying ca. 10 m above ground level (AGL) directly along the middle of the intended flight line.

In order to capture the experiment in a single flight pass whilst maximizing image resolution and avoiding motion blur, images were typically acquired at heights of 60 to 90 m AGL and at a flight velocity of 25–35 knots (45–65 km/h). Using the camera described above, at 60 m AGL, an image swath of 43.6 by 32.1 m was obtained with a pixel size of 7 × 7 cm, which equated to 204 temperature pixels per m<sup>2</sup>. At 90 m AGL, an image swath of 65.3 by 48.1 m was obtained with a pixel size of 10 × 10 cm, which equated to 100 temperature pixels per m<sup>2</sup>.
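The relationship between swath, pixel size, and pixel density quoted above can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming a nadir view over flat ground and the 640-pixel sensor width given in the camera specification; the function names are illustrative and are not part of the acquisition software. Note that the densities quoted in the text (204 and 100 pixels per m<sup>2</sup>) follow from the rounded pixel sizes of 7 and 10 cm.

```python
SENSOR_WIDTH_PX = 640  # across-track pixel count from the camera specification


def ground_pixel_size_m(swath_width_m, sensor_width_px=SENSOR_WIDTH_PX):
    """Ground size of one pixel (m), assuming a nadir view over flat ground."""
    return swath_width_m / sensor_width_px


def pixels_per_square_metre(pixel_size_m):
    """Number of square pixels of the given side length that fit in 1 m^2."""
    return 1.0 / pixel_size_m ** 2


def pixels_per_plot(pixel_size_m, plot_width_m=2.0, plot_length_m=6.0):
    """Approximate pixel count for a whole plot (default 2 x 6 m, no buffer)."""
    return plot_width_m * plot_length_m / pixel_size_m ** 2


if __name__ == "__main__":
    for height_m, swath_m, rounded_px_m in ((60, 43.6, 0.07), (90, 65.3, 0.10)):
        exact = ground_pixel_size_m(swath_m)
        print(height_m, "m AGL:",
              "pixel size from swath = %.3f m," % exact,
              "density at rounded size = %.0f px/m^2," % pixels_per_square_metre(rounded_px_m),
              "pixels per 2 x 6 m plot = %.0f" % pixels_per_plot(rounded_px_m))
```

Under these assumptions a 2 × 6 m plot is covered by roughly 2400 pixels at 60 m AGL and 1200 pixels at 90 m AGL, before any buffer is removed.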

Thermal images were recorded on a laptop computer with FLIR® ResearcherIR™ software, which was also used to control the camera. This proprietary software is provided for camera control and comprises basic image analysis features. The laptop and camera were manually operated by the helicopter passenger. Immediately prior to acquiring data for a particular experiment, the passenger would manually apply the shutter-based non-uniformity correction (NUC) and focus the camera, thereby ensuring image sharpness and that the NUC was not automatically applied during the run. Whilst acquiring thermal images, the passenger checked the images for complete coverage of the experiment using the IR targets and, in this way, provided real-time assessment of the images and feedback on the helicopter flight path to the pilot (**Figure 2A**).

This method enabled capture of multiple high quality single images with at least 30% frame overlap in the direction of travel. Image acquisition with this system took < 10 s for the experiment described above comprising 768 plots.

#### 2.3.2. Image Processing

The thermal images were pre-processed with FLIR® ResearcherIR™ software using the basic image analysis and processing features provided. Pre-processing included trimming of the image stack, to exclude extraneous images, and conversion from the RAW file format to Matlab (MAT) file format. This processing took ca. 2 min and was independent of experiment size.

FIGURE 1 | Airborne thermography image acquisition system comprising a helicopter cargo pod with thermal camera and acquisition kit mounted on the skid of a Robinson R44 Raven helicopter. Photo insert shows the inside of the helicopter cargo pod with arrow denoting FLIR® SC645 thermal camera: ±2 °C or ±2% of reading; <0.05°C pixel sensitivity; 640 × 480 pixels; 0.7 kg without lens.

experiment comprising 1000 plots of size 2 × 6 m.

Experimental plots were segmented from each thermal image using custom software developed with Python 2.7 (Python Software Foundation, https://www.python.org/); alias "ChopIt". The ChopIt software works on a frame-by-frame basis extracting data from the raw imagery, whereby the user navigates through the image stack to ensure that each plot in the experiment has been sampled. A screenshot of the ChopIt graphical user interface is shown in **Figure 2B**. The ChopIt software is designed for semi-automated plot segmentation whereby the user controls the area sampled within plots by placement of bounding corners. The software also assigns a unique identifying number to each plot. The core geometric algorithm in ChopIt divides a four-sided region into a predefined number of rows and columns based on the placement of the bounding corners. The algorithm uses the concept of vanishing points and thus can accommodate situations where the image plane is not parallel to the ground. For a given row and column value, a plot rectangle is defined with a surrounding buffer, and the CT data are extracted from within the plot rectangle. The ChopIt software produces two output files comprising the CT data for each plot rectangle assigned by the user: (1) an SQLite database file comprising all the CT pixel values for each experimental plot rectangle; and (2) an Excel file comprising a descriptive statistical summary for each experimental plot rectangle.
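The idea of dividing a four-sided region into rows and columns under perspective can be illustrated as follows. This is only a sketch and not the ChopIt source code: it assumes that the field is planar and uses a projective mapping (homography) from a unit square to the four bounding corners, which keeps grid lines converging toward the same vanishing points as the region's edges. The function names, the corner ordering, and the `buffer_frac` parameter are hypothetical.

```python
import numpy as np


def square_to_quad_homography(corners):
    """Return the 3x3 matrix mapping the unit square onto an image quadrilateral.

    corners: four (x, y) image points ordered top-left, top-right,
    bottom-right, bottom-left (an assumed convention).
    """
    unit = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
    rows, rhs = [], []
    for (u, v), (x, y) in zip(unit, corners):
        # x = (h0*u + h1*v + h2) / (h6*u + h7*v + 1), and similarly for y.
        rows.append([u, v, 1, 0, 0, 0, -u * x, -v * x])
        rhs.append(x)
        rows.append([0, 0, 0, u, v, 1, -u * y, -v * y])
        rhs.append(y)
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)


def project(H, u, v):
    """Map a unit-square coordinate (u, v) to image coordinates (x, y)."""
    x, y, w = H @ np.array([u, v, 1.0])
    return x / w, y / w


def plot_polygons(corners, n_rows, n_cols, buffer_frac=0.1):
    """Split the quadrilateral into n_rows x n_cols plot polygons.

    buffer_frac is the fraction of each cell trimmed from every side, a
    stand-in for the surrounding buffer mentioned in the text.
    Returns {(row, col): [four (x, y) image points]}.
    """
    H = square_to_quad_homography(corners)
    plots = {}
    for r in range(n_rows):
        for c in range(n_cols):
            u0, u1 = (c + buffer_frac) / n_cols, (c + 1 - buffer_frac) / n_cols
            v0, v1 = (r + buffer_frac) / n_rows, (r + 1 - buffer_frac) / n_rows
            cell = ((u0, v0), (u1, v0), (u1, v1), (u0, v1))
            plots[(r, c)] = [project(H, u, v) for u, v in cell]
    return plots
```

Pixels falling inside each returned polygon could then be collected for that plot; how ChopIt itself rasterizes and stores them is not described beyond the two output files noted above.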

The process of plot segmentation and extraction of CT for each individual plot for statistical analysis took ca. 20 min for the experiment described above comprising 768 plots. Total image acquisition and processing time was ca. 25 min.

#### 2.3.3. Image Quality Control

The custom-developed ChopIt software provides a high level of quality control for the user to manually exclude poor quality sections of the plot or removed sections (e.g., where biomass cuts have been earlier sampled). This user-enabled flexibility in the image analysis protocol is demonstrated in **Figure 3**, where a section comprising a previous biomass sampling has been excluded on a plot with approximate dimensions of 2 × 6 m. In this fashion, sections of plots comprising biomass samples were excluded in the study reported herein. In addition to this feature of manually excluding poor quality sections of the plot during plot segmentation, post-processing of the temperature pixels is also possible, as all the pixel data for each plot are stored in a SQLite database file.

analysis protocol, whereby the user can manually exclude exposed soil patches within a field plot. In this example, the exposed soil patch is from a biomass sample taken earlier on a 2 × 6 m field plot of wheat. (A) ChopIt user interface, where the user has avoided exposed soil within the plot. (B) Magnified view showing the exposed soil patch. (C) The CT histogram is therefore void of pixels from the exposed soil patch. Compare with (D), where, for the same plot as (A), the user has included the exposed soil, evident in the magnified view (E), and pixels from the soil patch are evident in the CT histogram (F). *x̄* denotes the respective mean for (C,F). The *x̄* from (C) is 0.87°C cooler than that from (F).

## 2.4. Analysis of the Pixel Frequency Distribution

#### 2.4.1. Rationale

The water limitation imposed on the crop in the MEF can often result in incomplete crop ground cover. The incomplete ground cover may have implications for the airborne thermography measurements through the potential aggregation of crop canopy and the background soil temperatures, which in the case of dry soil is often warmer than the crop canopy. The potential for the background soil temperature to bias estimates of CT is exacerbated when the size of the image pixels is the same as, or greater than, the individual plant organs that comprise the crop canopy. In such cases, a pixel is likely to comprise both soil and plant canopy temperatures, thereby resulting in "mixed pixels". The presence of mixed pixels is likely to bias the observed temperature toward the soil background temperature (Jones and Sirault, 2014).

In the airborne thermography system described above (Section 2.3), at an above-ground altitude of ca. 60 m, the pixel size is ca. 7 × 7 cm. This pixel size is several times greater than the leaf width of a typical wheat plant (ca. 1 cm) and, together with variation in plant establishment and canopy architecture, can result in mixed pixels. Image analysis may therefore be needed to remove temperature pixels arising from the background soil, which can bias the intrinsic measures of plant-based CT. Methods for handling thermal images containing mixed pixels were reviewed by Jones and Sirault (2014). These methods include automated thresholding algorithms such as the Otsu method (Otsu, 1979) and work best when discrete peaks are present in the histogram, representing multiple frequency distributions of temperature pixels.
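For readers unfamiliar with automated thresholding, the following sketch shows the general form of an Otsu-style threshold applied to a set of pixel temperatures. It is an illustration of the cited technique only, written with NumPy under our own assumptions (histogram with a user-chosen number of bins, threshold returned as a bin edge); it was not part of the processing pipeline described here.

```python
import numpy as np


def otsu_threshold(values, bins=256):
    """Return the histogram split that maximizes between-class variance."""
    counts, edges = np.histogram(values, bins=bins)
    centres = 0.5 * (edges[:-1] + edges[1:])

    w0 = np.cumsum(counts)            # pixels in the cooler class for each split
    w1 = w0[-1] - w0                  # pixels in the warmer class
    s0 = np.cumsum(counts * centres)  # running sum of temperatures

    valid = (w0 > 0) & (w1 > 0)       # both classes must be non-empty
    mu0 = s0 / np.maximum(w0, 1)
    mu1 = (s0[-1] - s0) / np.maximum(w1, 1)
    between = np.where(valid, w0 * w1 * (mu0 - mu1) ** 2, -1.0)

    k = int(np.argmax(between))
    return edges[k + 1]               # upper edge of the last cooler-class bin
```

When the histogram has two well-separated peaks (e.g., canopy and soil), the returned threshold falls between them. When the histogram is unimodal, the function still returns a value, but the split no longer corresponds to a physical boundary between soil and canopy.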

A bimodal distribution with discrete peaks representing soil and plant canopy was not evident in the data acquired. Rather, pixel frequency distributions were unimodal with a long tail of warm temperature pixels, as shown in **Figures 3**, **4**. The unimodal distribution may have resulted from the ca. 7 × 7 cm pixel resolution, whereby no clear difference between the mean temperature of the background soil and the mean temperature of the plant canopy was evident (Jones and Sirault, 2014). To account for the unimodal pixel distribution, the below-described methods of pixel frequency analysis were used.

#### 2.4.2. Methods for Analysis of the Pixel Frequency Distribution

The frequency distribution of the temperature pixels from a given plot rectangle produced from the ChopIt software was analyzed to determine if the observed temperature was biased by the background soil and whether this influenced the measurement repeatability. Three methods for analysis of pixel temperature bias were evaluated, depicted in **Figure 4**, and hereafter referred to as M1, M2, and M3:

	- a. For a given set of plot pixel temperatures, x, the mode of the distribution was estimated.
	- b. Then, a filter cut-off, c, was calculated according to: c = min(x) + 2(mode(x) − min(x))
	- c. The set, x, was then filtered by retaining only values where x < c.
	- d. The mean of this filtered set was then calculated.
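Steps a to d above can be sketched in a few lines of NumPy. The text does not state how the mode was estimated, so the histogram-based estimator below (centre of the most populated bin) is an assumption, as are the function names and the default number of bins.

```python
import numpy as np


def estimate_mode(x, bins=50):
    """Assumed mode estimator: centre of the most populated histogram bin."""
    counts, edges = np.histogram(x, bins=bins)
    i = int(np.argmax(counts))
    return 0.5 * (edges[i] + edges[i + 1])


def mode_filtered_mean(x, bins=50):
    """Mean of plot pixel temperatures after the mode-based cut-off filter.

    Returns (filtered_mean, cutoff).
    """
    x = np.asarray(x, dtype=float)
    mode = estimate_mode(x, bins)                # step a
    cutoff = x.min() + 2.0 * (mode - x.min())    # step b: c = min + 2(mode - min)
    kept = x[x < cutoff]                         # step c: retain values below c
    return kept.mean(), cutoff                   # step d


if __name__ == "__main__":
    # Synthetic plot: a canopy peak near 20 degrees C plus a short warm tail.
    pixels = np.array([20.0] * 10 + [19.0, 21.0] + [30.0] * 2)
    filtered, cutoff = mode_filtered_mean(pixels, bins=11)
    print("all-pixel mean:", pixels.mean())
    print("cut-off:", cutoff, "filtered mean:", filtered)
```

Because the cut-off mirrors the distance between the minimum and the mode, the filter removes only the long warm tail of a unimodal distribution and leaves the main peak intact.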

From the representative example of pixel frequency distribution shown in **Figure 4**, the difference between M1 and M2 was 2.4°C, the difference between M1 and M3 was 0.6°C and the difference between M3 and M2 was 1.7°C.

#### 2.5. Statistical Analysis

CT data were analyzed after first checking for normality and error variance homogeneity at each date-by-time sampling event. Each event was analyzed separately, with the best spatial models determined by first fitting the experimental design and then modeling the residual variation with autoregressive row and column terms in the Genstat® statistical program (https://www.vsni.co.uk/software/genstat/). Significant spatial effects were identified and residuals assessed before determining whether other (e.g., linear) effects needed to be fitted (Gilmour et al., 1997). Generalized heritabilities were then estimated after Holland et al. (2003).

## 3. RESULTS

### 3.1. Hand-Held Thermography

The results from the hand-held thermography are summarized in **Table 1**, and box-plots for each sample time are shown in **Figure 5A**. The range in plot CT was large in each sampling event. Broad-sense heritabilities for CT using the hand-held thermography method were 0.17 and 0.13 for the morning (18 October 2013) and afternoon measurements (25 October 2013), respectively (Treatment 2 only). The time taken to measure CT using hand-held thermography on 384 plots was ca. 30 min on both days. On the 25th October 2013, the air temperature changed from 17.8°C prior to the start of measurements to 19.0°C at the completion of measurements. At the same time, the relative humidity remained constant at 28%, and the wind speed remained constant at 17 km/h. Weather conditions were only recorded prior to the commencement of measurements on the 18th October 2013: air temperature was 11.9°C, relative humidity was 48% and wind speed was 11 km/h.

#### 3.2. Airborne Thermography

Box-plots summarizing the airborne thermography CT data for each flight time and irrigation treatment are shown for 2013 and 2014 in **Figures 5C,D**, respectively. Each box-plot represents CT data from 384 experimental plots (384 experimental plots per treatment and 768 experimental plots in total). The CT for box-plots shown in **Figures 5C,D** were derived from each experimental plot using M1, the mean of all pixels from a given plot rectangle produced from the ChopIt software with no pixels removed. For 2013 (**Figure 5C**) and 2014 (**Figure 5D**), Treatment 2 is consistently cooler than Treatment 1, owing to the greater water limitation in Treatment 1. Pearson correlations between the hand-held CT, obtained on the 18th October 2013 11:00 and the 25th October 2013 14:00, and the airborne thermography CT, obtained on the 24th October 2013, are shown in **Figure 5B**. The correlations between hand-held CT and airborne CT were <0.25.

Broad-sense heritabilities for airborne thermography CT for each flight time and irrigation treatment are shown for 2013 and 2014 in **Figure 6**. For the 2013 data, broad-sense heritability was calculated using CT derived from each experimental plot using M1 (**Figure 6A**). For the 2014 data, to test the influence of soil temperature bias on measurement repeatability, broad-sense heritability was calculated for CT estimated from M1, M2, and M3 (**Figure 6B**). With the exception of two early morning measurements on Treatment 1 in 2014, broad-sense heritability was high and ranged from 0.52 to 0.79. **Figure 6B** shows that for a given flight time, there was very little difference in broad-sense heritability for the three pixel handling methods. This result is in accordance with Figure S1, which shows the Pearson correlation calculated for all methods at each flight time. At any given flight time, all three pixel handling methods were highly correlated, with Pearson correlations exceeding 0.86 and averaging 0.93. The background soil temperature did influence the observed CT but, in this example, did not influence measurement repeatability (i.e., broad-sense heritability).

TABLE 1 | Summary of hand-held CT sampling events, weather conditions and resulting broad-sense heritabilities.

*Hand-held CT measurements were made on two occasions in 2013 on Treatment 2 only. In Treatment 2, irrigation was supplied to achieve the equivalent of a decile eight rainfall (wettest 20% of years) for the site. Treatment 2 comprised 384 plots. Values in parentheses denote the time of day when air temperature, relative humidity, and wind speed were recorded. Note that weather conditions were only recorded prior to the commencement of measurements on the 18th October 2013.*

## 3.3. Analysis of the Pixel Frequency Distribution

The aerial CT data incorporates influences from the background soil and the plant canopy. To investigate the significance of the effect of background soil, pairwise difference plots between M1, M2, and M3 (i.e., M1 and M2, M1 and M3, M3 and M2) were generated for airborne thermography data captured from Yanco MEF, 2nd October 2014, using the method of Bland and Altman (1986):


The difference against mean plots are shown in Figure S2. For M1 and M2 (Figure S2A), and M3 and M2 (Figure S2C), the differences increased with time of day until 11:00 h, then from 12:00 to 14:00 h the differences decreased (M1 and M2 mean decreased 0.19°C). From the mean difference calculated across all sample times, M1 and M3 were on average 1.13°C and 0.94°C warmer, respectively, than M2. Further, M1 and M3 were as much as ca. 3.0°C warmer than M2 at sample times close to solar noon (11:00, 12:00, and 13:00 h). The majority of differences between M1 and M3 (Figure S2B) were close to zero and the mean difference across all sample times was 0.19°C.
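The quantities behind a difference against mean comparison can be computed as sketched below. This is a generic illustration of the Bland and Altman approach for two sets of per-plot CT values, not the script used to produce Figure S2. The function name is hypothetical, and the limits of agreement (mean difference ± 1.96 SD) are included as a common convention rather than as something reported in this study.

```python
import numpy as np


def bland_altman(a, b):
    """Return per-plot means, per-plot differences, bias, and limits of agreement.

    a, b: per-plot CT values from two pixel handling methods, in the same
    plot order. Differences are computed as a - b.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    mean = (a + b) / 2.0          # x-axis of the difference against mean plot
    diff = a - b                  # y-axis of the difference against mean plot
    bias = diff.mean()            # mean difference between the two methods
    sd = diff.std(ddof=1)         # sample standard deviation of the differences
    limits = (bias - 1.96 * sd, bias + 1.96 * sd)
    return mean, diff, bias, limits
```

Plotting `diff` against `mean` for each pair of methods and each flight time would give plots of the kind summarized above, with `bias` corresponding to the mean difference quoted for each pair.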

supplied to achieve a water limitation close to the long-term climate median for the site.

## 4. DISCUSSION

## 4.1. High Broad-Sense Heritability Obtained from Airborne Thermography Methodology

The main finding reported herein is the large broad-sense heritability obtained for CT from the airborne thermography method, which contrasts with the low heritabilities reported with hand-held thermography sampling methods. Further, this was demonstrated in a large experiment comprising diverse wheat germplasm typical of a commercial wheat breeding program. Across both years, the broad-sense heritability for the airborne thermography ranged from 0.34 to 0.79, while for the hand-held infrared thermometer, broad-sense heritability ranged from 0.13 to 0.17. Further, aside from two early morning measurements (09:00 and 10:00 h) on Treatment 1 in 2014, which ranged from 0.34 to 0.46, broad-sense heritability for the airborne thermography ranged from 0.52 to 0.79. The larger broad-sense heritabilities obtained from the airborne thermography can be attributed to the acquisition of thermal images of the entire experiment at effectively a single point in time, thereby overcoming confounding changes in local weather conditions during sampling to provide reliable assessment of CT for large experiments comprising hundreds of 10 m<sup>2</sup> sized plots. Moreover, by measuring CT at effectively a single point in time, statistical analysis need only account for the spatial variation in CT, likely due to the below-ground effects of soil structure and water availability, which can be accommodated by the experiment design and spatial analysis (Gilmour et al., 1997). In contrast, for the hand-held thermography method, the spatial analysis is confounded by temporal variation in weather conditions, which is more difficult to account for in the statistical analysis.

Broad-sense heritabilities obtained from the hand-held infrared thermometer were small, ranging from 0.13 to 0.17, and typical of our experience with experiments of similar size previously undertaken at the Yanco site (data not shown). Further, Pearson correlations between the hand-held (18-Oct-2013 11:00 and 25-Oct-2013 14:00) and airborne thermography (24-Oct-2013) measures were <0.25, and the correlation between the two hand-held measurement events was low (0.13) (**Figure 5B**). It is likely that during the time required to measure all plots with the hand-held thermography method (ca. 30 min), the seemingly small changes in local weather conditions confounded the CT measurements, thereby resulting in low broad-sense heritabilities (**Table 1**). Another contributing factor might also be the range in area and canopy structure sampled by the user as they moved through the experiment. By contrast, an airborne thermography measurement of each plot in the entire experiment took approximately 3 s – a measurement of CT for 768 experimental plots at effectively a single point in time and at a common height above the ground. The heritability of hand-held CT can potentially be improved by using the time of sampling in the statistical analysis. For example, Rebetzke et al. (2013b) improved the heritability of hand-held CT by fitting "time of sampling" as a fixed linear effect in a mixed linear model.

To the best of our knowledge, heritability of CT is typically small on a single-plot basis and seldom reported in the literature. More commonly reported is heritability of CT estimated on a line-mean basis where multiple environments are included in the calculation. Heritabilities of CT calculated on a line-mean basis are often small to moderate in size for both diverse germplasm and related families such as recombinant inbred lines (RILs) and doubled-haploid (DH) lines. For example, Rebetzke et al. (2013b), using hand-held CT in three wheat populations containing 144–178 DH lines assessed in four irrigated environments, reported small narrow-sense heritabilities (0.12–0.32) on a single-plot basis and moderate to high line-mean heritability ranging from 0.38 to 0.91. Pinto et al. (2010) reported broad-sense heritability of 0.49 for CT measured during grain-filling on a RIL wheat population comprising 167 lines grown in six field experiments under drought and heat environments. In a separate study under similar environmental conditions, Lopes and Reynolds (2012) reported moderate broad-sense heritability on both a wheat population comprising 169 RILs (0.34) and 294 elite wheat lines from CIMMYT (0.38). Others have reported moderate line-mean heritability for diverse wheat germplasm calculated from studies comprising RILs in multiple environments (e.g., Reynolds et al., 2007b; Rattey et al., 2011; Lopes et al., 2012). In the above-mentioned studies, CT was measured using hand-held infrared thermometers. In the study reported herein, **Figure 5** shows that the broad-sense heritability for the airborne thermography method, calculated on a single-plot basis, was typically >0.50 and as high as 0.79, which is considerably greater than literature-reported calculations of CT heritability on both a single-plot and line-mean basis.
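The distinction between single-plot and line-mean heritability can be illustrated with a minimal sketch. It assumes the standard variance-component definitions (genotypic variance divided by phenotypic variance), which are not spelled out in this section. The function names and the example variance components are hypothetical and are not taken from the study.

```python
def single_plot_heritability(var_g, var_e):
    """Heritability on a single-plot basis: var_g / (var_g + var_e)."""
    return var_g / (var_g + var_e)


def line_mean_heritability(var_g, var_e, n_reps, var_ge=0.0, n_envs=1):
    """Heritability of line means over n_envs environments with n_reps replicates each."""
    return var_g / (var_g + var_ge / n_envs + var_e / (n_envs * n_reps))


# Hypothetical variance components (illustrative values only, not from the study)
var_g, var_e = 0.25, 0.75
h2_plot = single_plot_heritability(var_g, var_e)
h2_mean = line_mean_heritability(var_g, var_e, n_reps=2, n_envs=4)
print(round(h2_plot, 2), round(h2_mean, 2))
```

Because the residual variance is divided by the number of plots contributing to each line mean, the same variance components give a larger line-mean value than single-plot value, which is why the two bases should not be compared directly.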

## 4.2. Analysis of the Temperature Pixel Frequency Distribution Did Not Improve Broad-Sense Heritability

In this study, methods based on filtering the frequency distribution of the temperature pixels to remove the influence of background soil did not improve broad-sense heritability (**Figure 5**). However, it is likely that the accuracy of the CT data was improved with the CT derived from M2. The difference-against-mean plots (Figure S2) show that for M1 and M2 (Figure S2A), and M3 and M2 (Figure S2C), the differences increased with the time of day until 11:00 h. This is possibly because the soil temperature increased more than the plant temperatures, thereby biasing the CT derived from M1 and M3. For M1 and M2, and M3 and M2, the decrease in differences from 12:00 to 14:00 h (for M1 and M2, the mean decreased by 0.19 °C) may have been due to the lower sun angle in the afternoon increasing the shaded portion of soil and thereby cooling it. That many of the differences between M1 and M3 (Figure S2B) were close to zero indicates that M2 was more effective at deriving plant-based CT than M1 and M3. In contrast to M1 and M3, M2 was derived after discarding the warmest 70th percentile and it is therefore unlikely to be biased by the soil temperature, which, for dry soil, is likely to be warmer than the plant canopy. This approach of sampling cooler pixels is likely to result in M2 more accurately approximating the actual plant CT than M1 and M3. The potential for improved CT accuracy may be beneficial in applications using energy balance equations to calculate stomatal conductance or transpiration, where soil-biased CT can lead to significant errors (Leinonen et al., 2006; Guilioni et al., 2008).
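The idea of sampling only the cooler pixels can be sketched as follows. This is a minimal illustration rather than the ChopIt implementation: it interprets "discarding the warmest 70th percentile" as keeping the coolest 30% of pixels, which is an assumption, and the function name and pixel values are hypothetical.

```python
def cool_pixel_mean(pixels, keep_percent=30):
    """Mean of the coolest keep_percent of the pixel temperatures in one plot."""
    if not pixels:
        raise ValueError("no pixels supplied")
    ordered = sorted(pixels)
    # Integer arithmetic avoids floating-point surprises in the pixel count
    n_keep = max(1, len(ordered) * keep_percent // 100)
    kept = ordered[:n_keep]
    return sum(kept) / len(kept)


# Hypothetical plot: six canopy pixels near 22 degrees C and four warmer soil pixels
plot_pixels = [21.8, 22.0, 22.1, 21.9, 22.2, 22.0, 30.5, 31.0, 29.8, 30.2]
plain_mean = sum(plot_pixels) / len(plot_pixels)
filtered_mean = cool_pixel_mean(plot_pixels)
print(round(plain_mean, 2), round(filtered_mean, 2))
```

In this made-up plot the unfiltered mean is pulled upward by the soil pixels, whereas the filtered mean stays close to the canopy temperature.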

There may be phenotyping applications where the background soil could significantly reduce the accuracy and precision of CT measurements. For example: when multiple biomass samples have been taken from a plot, leaving large areas of exposed soil; in plots with poor plant establishment; in early generation breeding trials or situations where seed number is limited and phenotyping is required on single plants or spaced rows; where row spacing is too wide for the canopy to completely cover the soil; and in experiments that use raised beds with wide row spacings. To remove the influence of background soil, the custom-developed ChopIt image processing software provides a high level of quality control to manually exclude poor quality sections of the plot or sections of the plot where biomass samples have been removed (**Figure 3**). In addition to this feature, post-processing based on the pixel frequency distribution (e.g., M2 and M3) is possible as all the pixels for a particular plot rectangle are stored in a SQLite database file.
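Storing every pixel per plot is what makes later distribution-based post-processing possible. The sketch below shows the general pattern with Python's built-in `sqlite3` module. The table name, column names, and values are hypothetical, since the actual ChopIt database layout is not described here.

```python
import sqlite3

# Hypothetical schema: the real ChopIt database layout is not described in the text
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pixel (plot_id INTEGER, temp_c REAL)")
conn.executemany(
    "INSERT INTO pixel (plot_id, temp_c) VALUES (?, ?)",
    [(1, 22.0), (1, 22.4), (1, 30.0), (2, 23.0), (2, 23.2), (2, 29.0)],
)


def plot_pixels(connection, plot_id):
    """Return all stored pixel temperatures for one plot, coolest first."""
    rows = connection.execute(
        "SELECT temp_c FROM pixel WHERE plot_id = ? ORDER BY temp_c", (plot_id,)
    )
    return [row[0] for row in rows]


# Any distribution-based statistic can be recomputed later from the stored pixels
per_plot_min = {pid: plot_pixels(conn, pid)[0] for pid in (1, 2)}
print(per_plot_min)
```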

## 4.3. Frame by Frame Image Processing

The ChopIt software enables processing of the images on a frame-by-frame basis and was custom built for the application of field phenotyping of CT. Our image processing method contrasts with the widely used mosaicking method, where a large number of single frame images containing many plots are used to create a mosaic from which plot level information is extracted (e.g., Berni et al., 2009b; Chapman et al., 2014; Gómez-Candón et al., 2016). Mosaics were attempted with thermal images obtained from our image acquisition system. However, the use of mosaics presents a number of issues, namely: mosaicking software tends to modify pixel values in favor of the visual result; the mosaicking process is computationally intensive; and mosaicking requires accurate measurements of the external orientation of the images via the integration of the camera with a GPS and inertial measurement unit (IMU). For our application, processing thermal images on a single frame basis confers a number of advantages over mosaicking, including: a reduction in image processing time; higher CT accuracy from working with original temperature values from the raw images without the application of any pixel interpolation or blending; and no mosaicking "black box", which introduces another layer of measurement uncertainty to the process.

Conversely, the requirement to process the images on a frame-by-frame basis introduced a trade-off between encompassing the entire experiment in a single helicopter pass and maximizing the pixel resolution by flying no higher than necessary. However, encompassing an entire experiment in a single pass conferred many advantages, including reduced helicopter flight time and cost, faster image processing, fewer image processing errors, and a minimal influence of changing weather conditions on the observed CT.

## 4.4. Unmanned Aerial Vehicles

Unmanned aerial vehicles (UAVs) and tethered balloons have also been used for the acquisition of thermal images in field phenotyping applications (e.g., Sullivan et al., 2007; Berni et al., 2009a,b; Jones et al., 2009; Zarco-Tejada et al., 2012; Chapman et al., 2014; Gómez-Candón et al., 2016). The smaller form and, in some jurisdictions, non-requirement for a licensed operator may enable opportunistic sampling on small experiments, whereas the manned helicopter system used in this study might otherwise be considered too expensive to hire or may not be locally available. However, UAVs are often limited to a small camera payload (e.g., 1.5–1.1 kg in Chapman et al., 2014 and 3.0 kg in Gómez-Candón et al., 2016); have limited endurance (e.g., 30–60 min in Chapman et al., 2014); are highly susceptible to wind; are often required to be operated within line of sight; and sometimes require a license to operate. Moreover, the image mosaicking process often reported in the literature with UAVs necessitates multiple passes of the experiment to achieve sufficient image overlap. For example, Chapman et al. (2014) used a transect width of 10 m, while Gómez-Candón et al. (2016) used track and cross-track overlaps of 80 and 60%, respectively. As discussed above, such a requirement for multiple passes increases the required flight time for a given experiment and increases the likelihood that changes in local weather conditions will confound measurements of CT. However, the ChopIt frame-by-frame image processing software could potentially be used with images acquired from a UAV platform, provided the image acquisition considerations described in Section 2.3.1 are adhered to.

In contrast to many UAVs described in the literature, the thermal image acquisition system used in this study, comprising a manned helicopter fitted with a helicopter cargo pod (**Figure 1**), has a payload limit of 45 kg. The large payload limit permits the use of a radiometrically calibrated thermal camera with high accuracy and pixel-to-pixel sensitivity, which negates the need for ground infrared calibration targets and temperature correction during post-processing (Gómez-Candón et al., 2016). Together, these simplify the image processing. Moreover, the large payload provides the option to add more cameras and sensors if required for additional tasks and enables carriage of a high-capacity battery sufficient for several hours of operating time. Further, the use of a manned helicopter enables acquisition of CT measurements from multiple large field trials in a short time, which would otherwise require a UAV to fly beyond visual line of sight, which is not permitted in some jurisdictions.

## 4.5. Potential for Deployment of CT within Commercial Breeding Programs

In breeder's trials, plot size is often smaller than those sown in this study (e.g., Rebetzke et al., 2014). A single 10 s pass of the helicopter-mounted thermal camera can capture up to 5000 individual 4 m<sup>2</sup> plots in a breeder's yield trial. This application is ideally suited to the airborne thermography method, which we have shown readily scales up to experiments comprising 1000 individual 10 m<sup>2</sup> plots. For 1000 plots of 2 × 6 m, acquisition of CT on a per plot basis using the airborne thermography and data handling described here takes ca. 25 min and aside from the helicopter pilot, requires only one person. The method could be used within a breeding program to assess spatial uniformity of yield experiments and provide guidance to appropriate statistical spatial models. The demonstrated link between CT and grain yield (Reynolds et al., 1994; Amani et al., 1996; Fischer et al., 1998; Ayeneh et al., 2002; Rattey et al., 2011; Rebetzke et al., 2013b) should provide opportunity to select for CT in early generation screening. For example, selection of CT in early generations was demonstrated in studies reporting reasonable genetic correlation between small plot CT, leaf porosity and full plot yield (Condon et al., 2004, 2007). In addition, augmenting breeder's visual selection with early generation measurements of CT can potentially identify a greater number of equally high yielding lines compared to breeder's visual selection alone (van Ginkel et al., 2004). Importantly, economic analysis indicates that the incorporation of CT measurements within a wheat breeding program is likely to provide an economic benefit (Brennan et al., 2007).

To assist uptake by breeders, several improvements in the airborne thermography method described here are possible, including: remote automation of the image acquisition process; use of a smaller manned helicopter to reduce the operating cost (e.g., Robinson R22 Raven helicopter); and use of GPS georeferencing to improve image processing. Differences in canopy architecture that may influence CT could be accounted for by making use of measurements of fractional ground cover, from digital camera (e.g., Li et al., 2010), and canopy height that can now be routinely measured by ground-based LiDAR (e.g., Deery et al., 2014) but possibly aerial LiDAR in the future. Together, these potential improvements could reduce the cost per plot of the airborne thermography method.

The high helicopter operating cost, AU\$1000/h, may prohibit the use of the airborne thermography method within some breeding (and research) programs. However, the cost per plot of the airborne method, on 3000 plots of size 10 m<sup>2</sup>, equates to AU\$0.39 (ca. US\$0.30) (Table S1), which is within 30% of the hand-held cost per plot reported by Brennan et al. (2007) (US\$0.19 in 2007, which equates to US\$0.22 in 2016 after adjusting for inflation). Given the similar cost per plot of the two methods, together with the greater repeatability of the airborne CT method compared with the hand-held CT method, the airborne CT method could be a cost-effective CT phenotyping method for use within breeding (and research) programs.
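The per-plot cost comparison can be checked with a few lines of arithmetic. The two costs are taken from the text. Which cost serves as the denominator for the "within 30%" statement is not stated, so both readings are computed here and the variable names are illustrative.

```python
airborne_usd = 0.30  # approx. US$ per plot, airborne method (from the text)
handheld_usd = 0.22  # US$ per plot, hand-held method, adjusted to 2016 (from the text)

difference = airborne_usd - handheld_usd
rel_to_airborne = difference / airborne_usd  # difference as a share of the airborne cost
rel_to_handheld = difference / handheld_usd  # difference as a share of the hand-held cost
print(f"{rel_to_airborne:.0%} of airborne cost, {rel_to_handheld:.0%} of hand-held cost")
```

Under these rounded figures the difference is below 30% only when it is expressed relative to the airborne cost, so that reading appears to be the one consistent with the statement above.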

## 5. CONCLUSION

CT, as a surrogate measure for stomatal conductance and potentially photosynthesis, has been associated with genotypic variation in grain yield in numerous studies and therefore mooted as a possible phenotypic selection tool for use in genetics studies or in breeder's trials. For this to be realized, an inexpensive, scalable, and reliable CT methodology is required. The airborne thermography methodology described herein is such a method. The method is highly repeatable, as evidenced by the high broad-sense heritabilities obtained. The method is scalable: for an experiment comprising 768 plots of size 2 × 6 m, it takes ca. 25 min to obtain a CT measurement for each individual plot for statistical analysis. Moreover, the method requires only one person (not including the helicopter pilot) and utilizes purpose built image processing software for use by a non-technical user.

## AUTHOR CONTRIBUTIONS

All authors contributed to the conception of the study. GR designed the field experiment and undertook statistical analysis. PH, JS, and JJ designed and integrated the helicopter cargo pod system components. JS, PH, and JJ developed the image acquisition protocol. PH, RD, JS, JJ, and DD designed and built the ChopIt image processing software. DD, GR, JJ, RJ, AC, WB, RF contributed to the conception of the article. DD made the figures and wrote the paper with input and advice from the co-authors.

## FUNDING

This research was funded by the Australian Government National Collaborative Research Infrastructure Strategy (Australian Plant Phenomics Facility) and the Grains Research and Development Corporation (GRDC).

## ACKNOWLEDGMENTS

We thank the following members of the Griffith Aeroclub for their support in initiating this work: Rob Robilliard (pilot) and Sam Hutchinson (pilot). We thank the proprietors of Riverina Helicopters, Gerry and Sally Wilcox, for their ongoing support of this work and the helicopter pilots for their skilful flying. We thank Kathryn Bechaz for dedicated assistance with the collection of hand-held thermography measurements. We would also like to thank staff at the Yanco Managed Environment Facility, Yanco NSW for excellent assistance with management of field experiments.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.01808/full#supplementary-material

## REFERENCES


with stomatal conductance and grain yield in wheat. Funct. Plant Biol. 40, 14–33. doi: 10.1071/FP12184


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Deery, Rebetzke, Jimenez-Berni, James, Condon, Bovill, Hutchinson, Scarrow, Davy and Furbank. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Assessing Wheat Traits by Spectral Reflectance: Do We Really Need to Focus on Predicted Trait-Values or Directly Identify the Elite Genotypes Group?

Miguel Garriga<sup>1</sup>, Sebastián Romero-Bravo<sup>1</sup>, Félix Estrada<sup>1</sup>, Alejandro Escobar<sup>1</sup>, Iván A. Matus<sup>2</sup>, Alejandro del Pozo<sup>1</sup>, Cesar A. Astudillo<sup>3</sup> and Gustavo A. Lobos<sup>1</sup>\*

#### Edited by:

Edmundo Acevedo, University of Chile, Chile

#### Reviewed by:

Hamid Khazaei, University of Saskatchewan, Canada Luis Morales-Salinas, University of Chile, Chile

> \*Correspondence: Gustavo A. Lobos globosp@utalca.cl

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

Received: 30 September 2016 Accepted: 15 February 2017 Published: 09 March 2017

#### Citation:

Garriga M, Romero-Bravo S, Estrada F, Escobar A, Matus IA, del Pozo A, Astudillo CA and Lobos GA (2017) Assessing Wheat Traits by Spectral Reflectance: Do We Really Need to Focus on Predicted Trait-Values or Directly Identify the Elite Genotypes Group? Front. Plant Sci. 8:280. doi: 10.3389/fpls.2017.00280

<sup>1</sup> Facultad de Ciencias Agrarias, Plant Breeding and Phenomic Center, PIEI Adaptación de la Agricultura al Cambio Climático, Universidad de Talca, Talca, Chile, <sup>2</sup> CRI-Quilamapu, Instituto de Investigaciones Agropecuarias, Chillán, Chile, <sup>3</sup> Department of Computer Science, Faculty of Engineering, Universidad de Talca, Curicó, Chile

Phenotyping, via remote and proximal sensing techniques, of the agronomic and physiological traits associated with yield potential and drought adaptation could contribute to improvements in breeding programs. In the present study, 384 genotypes of wheat (Triticum aestivum L.) were tested under fully irrigated (FI) and water stress (WS) conditions. The following traits were evaluated and assessed via spectral reflectance: Grain yield (GY), spikes per square meter (SM2), kernels per spike (KPS), thousand-kernel weight (TKW), chlorophyll content (SPAD), stem water-soluble carbohydrate concentration and content (WSC and WSCC, respectively), carbon isotope discrimination (Δ<sup>13</sup>C), and leaf area index (LAI). The performances of spectral reflectance indices (SRIs), four regression algorithms (PCR, PLSR, ridge regression RR, and SVR), and three classification methods (PCA-LDA, PLS-DA, and kNN) were evaluated for the prediction of each trait. For the classification approaches, two classes were established for each trait: The lower 80% of the trait variability range (Class 1) and the remaining 20% (Class 2 or elite genotypes). Both the SRIs and regression methods performed better when data from FI and WS were combined. The traits that were best estimated by SRIs and regression methods were GY and Δ<sup>13</sup>C. For most traits and conditions, the estimations provided by RR and SVR were the same as, or better than, those provided by the SRIs. PLS-DA showed the best performance among the categorical methods and, unlike the SRI and regression models, most traits were relatively well-classified within a specific hydric condition (FI or WS), proving that the classification approach is an effective tool to be explored in future studies related to genotype selection.

Keywords: phenomic, high-throughput phenotyping, phenotyping, carbon isotope discrimination, reflectance

## INTRODUCTION

Wheat is one of the most important cereals in the human diet worldwide. This cereal is consumed in different types of processed foods, providing around 20% of total daily calories (Shewry, 2009). Due to world population growth, it is expected that current wheat production will need to be doubled by the middle of the century (Tilman et al., 2011; FAO, 2015). To accomplish this level of production, wheat yield must increase by 1.60% per year (Dixon et al., 2009), which is far from the 1.26% increase that was reached during the last decade (FAOSTAT, 2013). Additionally, the current effects of climate change on weather patterns, and unexpected events, are threatening maximum thresholds in many areas (Ayeneh et al., 2002; Azimi et al., 2010; Rebetzke et al., 2012; Hernández-Barrera et al., 2016).

This challenging scenario should encourage wheat breeders to accelerate the development and release of new high-yield cultivars that are adapted to more complex environmental conditions (Velu and Prakash, 2013). One strategy for improving and expediting the selection of these elite genotypes is the acquisition of high-dimensional phenotypic data (high-throughput phenotyping) (Bowman et al., 2015; Camargo and Lobos, 2016; Crain et al., 2016).

Remote sensing techniques, such as spectrometry, are increasingly used for plant phenotyping (Cabrera-Bosquet et al., 2012; Araus and Cairns, 2014). Spectral reflectance or the spectrum of energy reflected by the plant is closely associated with absorption at certain wavelengths that are linked to specific characteristics or plant conditions (Lobos and Hancock, 2015). Spectrometers can acquire detailed information regarding the electromagnetic spectrum in a short time, making this technology ideal for assessing hundreds or thousands of genotypes within a few hours (Cabrera-Bosquet et al., 2012). This would enable researchers and breeders to estimate multiple morpho-physiological and physico-chemical traits, which would otherwise be impossible to evaluate due to the time and cost involved (Lobos and Hancock, 2015). This would be reflected in reduced breeding program costs and, by allowing for the early selection of genetic material of interest, increase the chances of releasing improved cultivars in less time (Lobos and Hancock, 2015; Camargo and Lobos, 2016).

For the estimation of wheat traits, such as grain yield, biomass, leaf area index, plant height, and isotopic carbon discrimination, the majority of previous studies have resorted primarily to the use of "Spectral Reflectance Indices" (SRIs) (Aparicio et al., 2002; Babar et al., 2006a,b; Marti et al., 2007; Prasad et al., 2007; Gutierrez et al., 2010; Lobos et al., 2014; Hernández et al., 2015), whereas there has been less attention paid to the development of multivariate regression models, using part or all of the spectral reflectance (Pimstein et al., 2011; Dreccer et al., 2014; Li F. et al., 2014; Li X. et al., 2014; Hernández et al., 2015; Siegmann and Jarmer, 2015; Yao et al., 2015).

Current research into plant phenotyping and phenomics for plant breeding has focused on using the spectral signature to estimate predicted trait-values rather than exploring other tools that could directly identify elite genotypes for the desired trait. The use of reflectance data and categorical methods for breeding purposes has been scarcely addressed by the scientific community. Nonetheless, some successful experiments have been carried out: to classify lines for waxy alleles in durum wheat (Delwiche et al., 2006; Lavine et al., 2014) and bread wheat (Delwiche et al., 2011); to identify wheat lines possessing wheat-rye translocations (Delwiche et al., 1999); to classify barley varieties (Porker et al., 2017); and to select haploid kernels from hybrid kernels in maize (Jones et al., 2012).

The aim of the present study was, based on plant reflectance data, to assess the feasibility of using a categorical approximation to select featured genotypes, by comparing the performance of a large set of SRIs, multivariate regression models (PCR, PLSR, ridge regression, and SVR), and categorical models (PCA-LDA, PLS-DA, and kNN) in the prediction of grain yield, agronomic yield components, and physiological traits.

## MATERIALS AND METHODS

## Plant Material and Experimental Conditions

A set of 384 cultivars and advanced lines of spring bread wheat (Triticum aestivum L.) with good agronomic characteristics and disease tolerance were evaluated (list at del Pozo et al., 2016). These genotypes were sourced from three wheat-breeding programs (INIA-Chile, INIA-Uruguay, and CIMMYT-Mexico) and are currently being used to breed for adaptation to drought stress and to develop suitable genotypes for wheat production in the drylands of Chile and other countries.

Experiments were conducted in 2012 in two contrasting Mediterranean environments of Chile: Cauquenes (35° 58′ S, 72° 17′ W), with typical rain-fed conditions such that the plants were grown under water stress (WS); and Santa Rosa (36° 32′ S, 71° 55′ W), under fully irrigated (FI) conditions. The precipitation in these locations during the experiments was 183 and 700 mm, respectively (**Table 1**). Under FI conditions, four furrow irrigations of approximately 50 mm each were applied at the end of tillering (Zadoks Stage 21; Z21), the flag leaf stage (Z37), heading (Z50), and middle grain filling (Z70) (Zadoks et al., 1974). There was a difference of 28 days (77–105 days) between the earliest and latest genotypes in reaching the heading stage; 89% of the genotypes reached heading in 81–94 days.

The experiment was conducted in an incomplete block design (α-lattice), with two replicates per genotype (n = 384 × 2). Plots had five rows, each 2 m in length; the distance between the rows was 0.2 m, and the distance between plots was 0.4 m. Similar agronomic practices were performed at the two locations. Sowing dates (seeding rate 20 g m<sup>−2</sup>) were 23 May at Cauquenes and 7 August at Santa Rosa. Before sowing, the plots were fertilized with 260 kg ha<sup>−1</sup> of ammonium phosphate (46% P<sub>2</sub>O<sub>5</sub> and 18% N), 90 kg ha<sup>−1</sup> of potassium chloride (60% K<sub>2</sub>O), 200 kg ha<sup>−1</sup> of sul-po-mag (22% K<sub>2</sub>O, 18% MgO and 22% S), 10 kg ha<sup>−1</sup> of boronatrocalcite (11% B), and 3 kg ha<sup>−1</sup> of zinc sulfate (35% Zn). During tillering, an extra 153 kg ha<sup>−1</sup> of N was applied. Flufenacet + Flurtamone + Diflufenican (96 g a.i.) was applied for pre-emergence weed control and a further application of MCPA (525 g a.i.) + Metsulfuron-methyl (5 g a.i.) was used for post-emergence weed control.

TABLE 1 | Monthly maximum, minimum, and mean temperature, and monthly rainfall at the two experimental sites in central Chile during the trial (May 2012–January 2013).

#### Trait Measurements

#### Grain Yield and Agronomic Yield Components

The number of spikes per m<sup>2</sup> (SM2) was determined for a 1 m length of an inside row. The number of kernels per spike (KPS) and thousand-kernel weight (TKW) were determined in 25 spikes selected at random from the central row. Grain yield (GY) was assessed by harvesting the whole plot.

#### Leaf Chlorophyll and Water-Soluble Carbohydrate

Chlorophyll (Chl) content was determined using the SPAD index for five flag leaves per plot, at anthesis (an) and at grain filling (gf), with a portable leaf chlorophyll-meter (SPAD 502, Minolta Spectrum Technologies Inc., Plainfield, IL, USA). Water-soluble carbohydrate was determined using the anthrone reactive method (Yemm and Willis, 1954), for five main stems (excluding leaf laminas and sheaths) per plot, at an and at maturity (m), and expressed as concentration (WSC, mg g<sup>−1</sup> DW) and content (WSCC, mg stem<sup>−1</sup>).

#### Carbon Isotope Discrimination

For kernels sampled randomly at m, the stable carbon (<sup>13</sup>C/<sup>12</sup>C) isotope ratio was measured using an elemental analyzer (ANCA-SL, PDZ Europa, UK) coupled with an isotope ratio mass spectrometer, at the Laboratory of Applied Physical Chemistry at Ghent University (Belgium). The carbon isotope discrimination (Δ<sup>13</sup>C) was calculated as follows:

$$\Delta^{13}\mathrm{C}\ \text{(‰)} = \frac{\delta^{13}\mathrm{C}_a - \delta^{13}\mathrm{C}_p}{1 + \delta^{13}\mathrm{C}_p/1000}$$

where a and p refer to air and plant, respectively (Farquhar et al., 1989). δ<sup>13</sup>C<sub>a</sub> was taken as 8.0‰.
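The discrimination formula can be written as a short function, sketched here for illustration. The function name and the plant value are hypothetical. The air value is entered as −8.0‰ on the assumption that the isotopic composition of atmospheric CO<sub>2</sub> is expressed on the usual negative δ scale.

```python
def carbon_isotope_discrimination(delta_air, delta_plant):
    """Carbon isotope discrimination (per mil) from air and plant delta-13C (per mil)."""
    return (delta_air - delta_plant) / (1.0 + delta_plant / 1000.0)


# Hypothetical kernel sample; the plant value is illustrative only.
# The air value of -8.0 reflects an assumed sign convention (see the note above).
d13c = carbon_isotope_discrimination(delta_air=-8.0, delta_plant=-26.0)
print(round(d13c, 2))
```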

#### Leaf Area Index

The leaf area index (LAI) at an (under FI conditions only) was determined by measuring the incident light falling on the crop and the amount of light in each plot at ground level, using a BF5 Sunshine Sensor and SunScan Canopy Analyser (Delta-T, Cambridge, UK). The radiation transmitted and dispersed by the canopy was recorded, and the LAI then calculated.

## Spectral Reflectance Measurements

Canopy spectral reflectance (350–2,500 nm) was measured using a portable spectroradiometer (FieldSpec 3 JR, ASD, Boulder, CO, USA) at two developmental stages: Anthesis (AN; denoted with capital letters to avoid confusion with trait measurement stages) and grain filling (GF). The optical fiber (2.3 mm diameter with a 25° full conical angle) was placed 80 cm above the canopy, at a 45° angle. From 11:00 to 17:00 h on clear sunny days, measurements were taken by moving (sweeping) the sensor over the plot, covering the three central rows. The equipment was set up to take three scans per plot and the average for each wavelength was considered in further analyses. To limit variations in reflectance induced by changes in the angle of the sun, radiometric calibration was performed every 15 min, using a white barium sulfate panel as the reference (Spectralon, ASD, Boulder, CO, USA). Prior to modeling, exploratory analysis and spectral noise deletion were performed using the software Spectral Knowledge (SK-UTALCA) (Lobos and Poblete-Echeverría, 2017).

#### Modeling Analysis

Spectral reflectance assessed at AN and GF stages was used to estimate the traits that were evaluated at anthesis (an) (Chl content, WSC, WSCC, and LAI), grain filling (gf) (Chl content), and maturity (m) (SM2, KPS, TKW, GY, WSC, WSCC, and Δ<sup>13</sup>C). The following analyses were considered.

#### Spectral Reflectance Indices

Spectral reflectance was used to assess the predictive performance of a set of 255 SRIs loaded on SK-UTALCA (Lobos and Poblete-Echeverría, 2017). Using the linear regression analysis option, the relationships between each of the measured traits and each of the SRIs at AN and GF were examined using the coefficient of determination (r<sup>2</sup>) and the root mean square error (RMSE). WS and FI conditions were analyzed independently, but also as one environment (WS+FI).
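The trait-versus-index evaluation can be illustrated with a minimal sketch. It uses the normalized difference vegetation index purely as a familiar example of an SRI, without implying anything about the specific indices in SK-UTALCA. The reflectance and yield values are hypothetical, and the RMSE is computed with n in the denominator, which is an assumption.

```python
from math import sqrt


def ndvi(r_nir, r_red):
    """Normalized difference vegetation index from two reflectance values."""
    return (r_nir - r_red) / (r_nir + r_red)


def linear_fit_stats(x, y):
    """Ordinary least-squares fit y = a + b*x; returns (a, b, r2, rmse)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = my - b * mx
    ss_res = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return a, b, 1.0 - ss_res / ss_tot, sqrt(ss_res / n)


# Hypothetical plots: (NIR reflectance, red reflectance, grain yield in t/ha)
plots = [(0.45, 0.08, 4.1), (0.50, 0.06, 5.0), (0.38, 0.10, 3.2),
         (0.55, 0.05, 5.9), (0.42, 0.09, 3.8)]
index = [ndvi(nir, red) for nir, red, _ in plots]
gy = [g for _, _, g in plots]
a, b, r2, rmse = linear_fit_stats(index, gy)
print(round(r2, 3), round(rmse, 3))
```

Repeating the same fit for every index and every trait, within each environment and for the combined data, gives the kind of comparison described above.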

#### Multivariate Regression Methods

Four different regression methods were considered: Principal Components Regression (PCR), Partial Least Square Regression (PLSR), Ridge Regression (RR), and Support Vector Machine Regression (SVR). Prior to modeling with R 3.1.2 software (R Development Core Team, 2011), any samples with missing values were excluded, outliers were identified by Local Outlier Factor (LOF) detection, and the data were normalized.

PCR is a combination of Principal Component Analysis (PCA) and Multiple Linear Regression (MLR), which first reduces the dimensionality of the spectral data, concentrating the information into so-called principal components. The transformed data is then used to train a MLR model that fits a linear equation (Jolliffe, 2002).

PLSR reduces the dimensionality of the data by constructing so-called latent factors. Unlike PCA, PLSR produces a set of factors that take into consideration the values of the independent and dependent variables simultaneously. In this sense, PLSR finds vectors that not only represent the variance of the data but that are also related to the response (Wold et al., 2001; Hastie et al., 2005).

RR works in a similar way to least squares fitting, but adds a term that penalizes the values of the coefficients. The role of the penalization term is to "shrink" the estimates of the coefficients toward zero (Hastie et al., 2005). This penalization can be controlled using a tuning parameter λ, which has to be estimated independently. The optimization of λ was performed using a grid of 100 possible values of λ in a range of [10<sup>−2</sup>, 10<sup>10</sup>] with 10-fold cross-validation. The best λ identified was used to build the model.
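The λ search can be sketched in a few lines. This is a simplified illustration and not the R code used in the study: it assumes the 100 grid values are spaced evenly on a log scale, it uses a single predictor so that the ridge estimate has a closed form, and the data and function names are hypothetical.

```python
def lambda_grid(low_exp=-2.0, high_exp=10.0, n=100):
    """n lambda values spaced evenly on a log10 scale (log spacing is an assumption)."""
    step = (high_exp - low_exp) / (n - 1)
    return [10.0 ** (low_exp + i * step) for i in range(n)]


def ridge_fit(x, y, lam):
    """Ridge fit for one centered predictor; returns (slope, mean of x, mean of y)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / (sxx + lam), mx, my  # the penalty lam shrinks the slope toward zero


def cv_error(x, y, lam, k=10):
    """Mean squared prediction error of the ridge fit under k-fold cross-validation."""
    n, total = len(x), 0.0
    for fold in range(k):
        test = [i for i in range(n) if i % k == fold]
        train = [i for i in range(n) if i % k != fold]
        b, mx, my = ridge_fit([x[i] for i in train], [y[i] for i in train], lam)
        total += sum((y[i] - (my + b * (x[i] - mx))) ** 2 for i in test)
    return total / n


# Hypothetical data: one predictor and a linear response with alternating noise
x = [i / 10.0 for i in range(40)]
y = [2.0 * xi + (0.3 if i % 2 else -0.3) for i, xi in enumerate(x)]
grid = lambda_grid()
best_lambda = min(grid, key=lambda lam: cv_error(x, y, lam))
print(best_lambda)
```

With full spectra the slope becomes a coefficient vector and the closed form involves a matrix inverse, but the grid search and the cross-validated selection of λ follow the same pattern.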

SVR is a method derived from the Support Vector Machine (SVM). The SVM transforms data into a new high-dimensional space using a kernel function. In this newly created space, a predictive model is built using a subset of representative instances called support vectors. SVR estimates a linear dependency by fitting an optimal approximating hyperplane to the training samples in the multidimensional feature space. In the present study, several kernels (linear, polynomial, radial basis function, and sigmoidal) were automatically tested and selected based on a minimization criterion. The parameters C (regularization parameter) and ε (loss function parameter) were fixed to 1 and 0.1, respectively.
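The roles of C and ε can be made concrete with a minimal sketch of the standard linear SVR objective. It is given here for illustration only and does not reproduce the software used in the study. The function names and example values are hypothetical.

```python
def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Zero inside the epsilon tube around the target, linear outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_objective(weights, pairs, c=1.0, epsilon=0.1):
    """Primal linear SVR objective: 0.5*||w||^2 + C * sum of epsilon-insensitive losses."""
    penalty = 0.5 * sum(w * w for w in weights)
    loss = sum(epsilon_insensitive_loss(t, p, epsilon) for t, p in pairs)
    return penalty + c * loss


# Hypothetical (observed, predicted) pairs: one outside the tube, one inside it
pairs = [(1.0, 1.5), (2.0, 2.05)]
objective = svr_objective([1.0, 2.0], pairs)
print(round(objective, 3))
```

Residuals smaller than ε contribute nothing to the objective, while C sets how strongly larger residuals are weighed against the size of the coefficients.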

All models were validated by 10-fold cross-validation (10xCV) and their performances evaluated by the coefficient of determination (R<sup>2</sup>), the root mean square error (RMSE), and the Index of Agreement (IA) in calibration and validation. The IA (Willmott, 1981) is a standardized measure for estimating the prediction error of the model and ranges from 0 (faulty model) to 1 (perfect fit).
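The three performance statistics can be computed as in the sketch below. The function name `regression_metrics` is hypothetical, and R<sup>2</sup> is computed here as one minus the ratio of residual to total sum of squares, which is an assumption (some software reports the squared correlation instead). The IA follows Willmott (1981): one minus the sum of squared errors divided by the "potential error".

```python
import math

def regression_metrics(observed, predicted):
    """Return R2, RMSE, and Willmott's Index of Agreement for paired values."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((p - o) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    # Potential error: distances of predictions and observations
    # from the observed mean, summed before squaring.
    potential = sum((abs(p - mean_obs) + abs(o - mean_obs)) ** 2
                    for o, p in zip(observed, predicted))
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "rmse": math.sqrt(ss_res / n),
        "ia": 1.0 - ss_res / potential,
    }

perfect = regression_metrics([1, 2, 3, 4], [1, 2, 3, 4])
mean_only = regression_metrics([1, 2, 3, 4], [2.5, 2.5, 2.5, 2.5])
```

A model that always predicts the observed mean obtains IA = 0 and R<sup>2</sup> = 0 under these definitions, while a perfect fit obtains 1 for both.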

#### Multivariate Classification Methods

Spectral reflectance data were also modeled by three different supervised classification methods: Principal Components-Linear Discriminant Analysis (PCA-LDA), Partial Least Squares Discriminant Analysis (PLS-DA), and the k-Nearest Neighbor (kNN) algorithm. Two different categories were established by taking into account the total variation of each trait, as measured at each of the developmental stages and in each of the environments (FI, WS, or WS+FI). The first category, labeled as "Class 1," corresponds to instances with values in the lower 80% of the trait range. The remaining 20% of instances were considered as belonging to the elite group and were labeled as "Class 2" (Supplementary Table 1). The goal of this dichotomization was to develop predictive models that were able to identify those elite genotypes that had the highest trait performance (the upper 20% of the trait range). Model calculations were done using the Classification Toolbox (Version 4.2) developed by the Milano Chemometrics and QSAR Research Group (Ballabio and Consonni, 2013) and implemented in Matlab 8.2.0 (The MathWorks Inc., MA, USA).
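The dichotomization can be sketched as below. The sketch assumes that "trait range" means the interval between the minimum and maximum observed values, so that the threshold sits at 80% of that interval; if the classes were instead defined by quantiles of the genotypes, the threshold would differ. The function name `label_elite` and the example values are hypothetical.

```python
def label_elite(values, elite_fraction=0.20):
    """Label values in the upper part of the observed range as Class 2, others as Class 1."""
    lo, hi = min(values), max(values)
    # Threshold at the lower 80% of the min-max range (assumed interpretation).
    threshold = lo + (1.0 - elite_fraction) * (hi - lo)
    return [2 if v > threshold else 1 for v in values]

# Illustrative trait values: the range is 0 to 10, so the threshold is 8.
labels = label_elite([0, 2, 5, 7, 9, 10])
```

Under this range-based interpretation the two classes need not contain 80 and 20% of the samples, which would produce unbalanced classes.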

PCA-LDA is a classification technique based on the linear discriminant functions. PCA is used to reduce the dimensionality of the spectral matrix and LDA acts as the classifier. Classes are separated by maximizing the variance between the groups, and minimizing the variance within the groups, to determine the best fit of parameters for the classification (Lehmann et al., 2015). Before calculating the PCA-LDA models, the input data were mean-centered and the optimal number of principal components was searched in the interval 1–20, with 10xCV, on the basis of minimizing the error rate of validation. The discrimination of classes was established as linear.

PLS-DA is a pattern recognition method that combines the properties of PLSR with the discriminating power of the Discriminant Analysis technique (Ballabio and Consonni, 2013). PLS-DA works by finding the latent variables that describe the variance in both the independent X variables (spectra) and the dependent Y variables (classes) and are able to separate the data into two or more classes (Barker and Rayens, 2003). In PLS-DA, a model is developed for each class and the probability that a sample belongs to a specific class is calculated based on the estimated class values (Ballabio and Consonni, 2013). In the present study, the PLS-DA models were calculated using mean-centered data and the optimal number of latent variables was searched in the range [1, 20], with 10xCV.

Finally, the k-nearest neighbors (kNN) method is based on the determination of the distances between an instance whose identity is assumed to be unknown and each instance belonging to the training set. Once the distances are computed, the elements are ranked according to their proximity to the query instance, and the k closest elements are selected. The category of the query item is then estimated using a majority voting scheme among the labels of the k selected items (Cunningham and Delany, 2007). In general, the distance function can be any mathematical function that expresses dissimilarity but, for simplicity, a common choice is the Euclidean distance. In this study, the data were mean-centered and the best value for the parameter k was obtained from values in [1, 10], with 10xCV.
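The ranking and voting steps can be sketched in a few lines of standard-library Python. This is an illustration only, not the Matlab toolbox routine used in the study; `knn_predict` and the toy data are hypothetical, and ties in the vote are resolved in favor of the label encountered first among the nearest neighbors.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training instances."""
    # Rank training instances by Euclidean distance to the query.
    ranked = sorted((math.dist(x, query), label) for x, label in zip(train_X, train_y))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy two-class example (illustrative values only).
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = [1, 1, 1, 2, 2, 2]
near_class_1 = knn_predict(train_X, train_y, (0.2, 0.3), k=3)
near_class_2 = knn_predict(train_X, train_y, (5.5, 5.2), k=3)
```

Mean-centering shifts all instances by the same vector and therefore does not change Euclidean distances; it is omitted here for brevity.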

The predictive power of the categorical PCA-LDA, PLS-DA, and kNN models was evaluated by calculating accuracy, error rate, and prediction rates (determined by the sensitivity of each class) for both classes in calibration and validation.
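These statistics can be sketched as follows. The sketch assumes that the error rate is defined as one minus the mean of the class sensitivities (the non-error rate), a definition that is independent of class size; this is an assumption about the toolbox output rather than something stated in the text. The function name and the example counts are hypothetical.

```python
def classification_metrics(true_labels, predicted_labels, classes=(1, 2)):
    """Accuracy, per-class sensitivity (prediction rate), and class-balanced error rate."""
    n = len(true_labels)
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    sensitivity = {}
    for c in classes:
        # Predictions made for the samples that truly belong to class c.
        members = [p for t, p in zip(true_labels, predicted_labels) if t == c]
        sensitivity[c] = sum(p == c for p in members) / len(members)
    non_error_rate = sum(sensitivity.values()) / len(classes)
    return {
        "accuracy": correct / n,
        "sensitivity": sensitivity,
        "error_rate": 1.0 - non_error_rate,
    }

# Unbalanced toy example: 80 Class 1 samples (76 correct), 20 Class 2 samples (5 correct).
truth = [1] * 80 + [2] * 20
predicted = [1] * 76 + [2] * 4 + [2] * 5 + [1] * 15
metrics = classification_metrics(truth, predicted)
```

In this example the accuracy is 0.81 while the error rate is 0.40: with unbalanced classes, a model can be accurate overall yet miss most of the minority (elite) class, so the two statistics need not sum to one.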

#### RESULTS

The range of values, and their means, for each of the traits evaluated in the 384 wheat genotypes grown under FI and WS conditions, are presented in **Table 2**.



<sup>y</sup>SM2, spikes m<sup>−2</sup>; KPS, kernels spike<sup>−1</sup>; TKW, thousand kernel weight; GY, grain yield; Chl, SPAD index; WSC, water soluble carbohydrates concentration; WSCC, water soluble carbohydrates content; Δ<sup>13</sup>C, isotopic discrimination of <sup>13</sup>C; LAI, leaf area index.

<sup>z</sup>Trait measurement at anthesis (an), grain filling (gf), or maturity (m).

## Spectral Reflectance Indices

Coefficients of determination greater than 0.8 were only reached when the hydric conditions were combined (WS+FI) for GYm (AN: 0.82 and GF: 0.92) and Δ<sup>13</sup>Cm (AN: 0.82 and GF: 0.92) (**Table 3**; Supplementary Table 2). Among the 255 SRIs tested, NWI-3 [(R970–R920)/(R970+R920)] performed best at AN and WI2 (R970/R900) performed best at GF. When the hydric conditions were kept separate, predictions with r<sup>2</sup> values greater than 0.25 were achieved only for WS conditions and spectral measurements performed at GF (NWI-3; GYm: 0.51 and Δ<sup>13</sup>Cm: 0.26).

Coefficients of determination between 0.40 and 0.79 were found with combined hydric conditions for the following traits: SM2m [AN MTCI ((R800–R750)/(R750–R670)): 0.63; GF SAVI2 (1.5∗(R807–R736)/(R807+R736+0.5)): 0.66]; Chlan [AN TCARI2 (3∗((R700–R600)−((0.2∗(R700–R550))∗(R700/(R850+R670))))): 0.59; GF MTCI: 0.60]; Chlgf [AN NWI-3: 0.42; GF NWI-3: 0.44]; WSCan [AN MTCI: 0.41; GF NDSI4 ((R933–R948)/(R933+R948)): 0.44]; and WSCCan [AN MTCI: 0.47; GF WI2: 0.50]. When the hydric conditions were kept separate, r<sup>2</sup> values between 0.40 and 0.79 were found only for LAIan under FI conditions [AN DATT ((R850–R710)/(R850–R680)): 0.44] (**Table 3**).
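As a worked illustration, several of the indices named above can be computed directly from reflectance values, and their relationship with a trait summarized by r<sup>2</sup>. The sketch assumes that reflectance is stored in a dictionary keyed by wavelength in nm and that r<sup>2</sup> is the squared Pearson correlation of a simple linear regression; the function names and the reflectance values are hypothetical.

```python
def nwi3(r):
    """Normalized water index 3: (R970 - R920) / (R970 + R920)."""
    return (r[970] - r[920]) / (r[970] + r[920])

def wi2(r):
    """Water index 2: R970 / R900."""
    return r[970] / r[900]

def savi2(r):
    """SAVI2 as given in the text: 1.5 * (R807 - R736) / (R807 + R736 + 0.5)."""
    return 1.5 * (r[807] - r[736]) / (r[807] + r[736] + 0.5)

def r_squared(x, y):
    """Coefficient of determination of a simple linear regression of y on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)

# Hypothetical canopy reflectance (fraction of incident light) at selected wavelengths.
canopy = {736: 0.30, 807: 0.48, 900: 0.50, 920: 0.49, 970: 0.45}
index_values = {"NWI-3": nwi3(canopy), "WI2": wi2(canopy), "SAVI2": savi2(canopy)}
```

In practice each index would be computed per plot and regressed against the measured trait, with `r_squared` applied to the resulting pairs of values.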

#### Multivariate Regression Methods

As observed for the SRIs, the four multivariate regression models (PCR, PLSR, RR, and SVR) showed the highest predictive power for most traits when data from both hydric conditions were combined (WS+FI), with the exception of TKWm at AN under FI conditions (**Table 3**, Supplementary Figure 1, and Supplementary Table 2). The R<sup>2</sup><sub>cv</sub> values were similar between RR and SVR, and greater than those for the other two multivariate models. Using RR or SVR, R<sup>2</sup><sub>cv</sub> values greater than 0.8 were found for GYm (AN: 0.90 and GF: 0.93) and Δ<sup>13</sup>Cm (AN: 0.92 and GF: 0.94). In addition, R<sup>2</sup><sub>cv</sub> values between 0.40 and 0.79 were found for the following traits when using SVR with combined hydric conditions: SM2m (AN: 0.73 and GF: 0.74), Chlan (AN: 0.59 and GF: 0.66), Chlgf (AN: 0.44 and GF: 0.51), WSCan (AN: 0.48 and GF: 0.49), and WSCCan (AN: 0.53 and GF: 0.56). When the hydric conditions were kept separate, values in this same range were achieved only for GYm (GF: 0.56) under WS conditions, and for LAI (AN: 0.45) under FI conditions (**Table 3**).

### Multivariate Classification Methods

The general performances of the categorical models were very similar in terms of model accuracy with 10xCV (**Figure 1A**); PCA-LDA, kNN, and PLS-DA showed average accuracies of 0.81, 0.76, and 0.71, respectively. PLS-DA, however, showed the lowest error rate of validation (**Figure 1B**); the average errors were approximately 0.42, 0.42, and 0.30 for PCA-LDA, kNN, and PLS-DA, respectively. Moreover, the error rate of validation showed differences within each model, being similar for the two hydric conditions when these were kept separate, but higher when the WS and FI conditions were combined.

The general performance of the three categorical models was evaluated based on the prediction rate of cross-validation for both classes (**Figure 2**), which differed between models. PCA-LDA and kNN showed high prediction levels for samples included in Class 1 (0.96 and 0.88, respectively), but very low prediction levels for samples from Class 2 (0.21 and 0.27, respectively) (**Figures 2A, B**). PLS-DA, in contrast, showed lower prediction levels for Class 1, but its prediction rates were similar for Class 1 (∼0.70) and Class 2 (∼0.71) (**Figure 2C**).

Considering the prediction rates for both classes, the best genotype discriminations were obtained for most traits by PLS-DA when both hydric conditions were combined, with the exception of WSCm and WSCCm, at WS (AN and GF). Prediction rates from cross-validation by PLS-DA for Class 1 under WS+FI conditions ranged from 0.64 (KPSm at AN and GF, and WSCCm at AN) to 0.77 (SM2m at AN and GF), while prediction rates for Class 2 were between 0.52 (WSCm at AN) and 0.98 (GY at GF). However, the prediction rates for several traits were greater when the hydric conditions were kept separate, when compared to those rates achieved with combined WS+FI conditions. This was observed mainly for Class 1 (e.g., KPSm AN-FI, GYm GF-WS, and Δ<sup>13</sup>Cm GF-WS) but also for Class 2 (e.g., TKWm AN-FI, WSCm GF-WS, and WSCCm AN-WS).

Importantly, unlike the SRI and multivariate regression methods, the PLS-DA model generated high prediction levels for the individual hydric conditions, in both classes, for most of the traits evaluated. Under FI conditions, the prediction rates for Class 1 ranged from 0.56 (Chlan at GF) to 0.85 (LAIan at AN), while the prediction rates for Class 2 were between 0.47 (KPSm at AN) and 0.82 (TKWm and LAIan at AN). Under WS conditions, the prediction rates for Class 1 were between 0.56 (WSCCan at GF) and 0.85 (GYm at GF), while those for Class 2 ranged from 0.43 (KPSm at AN) to 0.78 (GYm at GF). Prediction levels between 0.40 and 0.79 for both classes were found for most traits when the hydric conditions were combined, although some of these showed prediction levels greater than 0.80 in one of the two classes: Class 1 (SM2m AN-FI and GF-FI, GYm GF-WS, Δ<sup>13</sup>Cm GF-WS, and LAIan GF-FI) and Class 2 (TKWm AN-FI and Chlgf AN-FI). Prediction levels greater than 0.80 for both classes under the individual hydric conditions were only achieved for LAIan with prediction rates of 0.85 and 0.82 for Class 1 and Class 2, respectively (**Table 3**; Supplementary Table 3).

## DISCUSSION

Despite the present study being conducted in a single year, it generated an interesting database for testing approximation methodologies. This was due to the use of a large number of cultivars/advanced lines of wheat that were grown under two contrasting hydric conditions and evaluated for a large number of traits, with spectral reflectance assessed at two developmental stages (AN and GF).

Unlike other studies, this work covers a large proportion of the SRIs reported in the remote sensing literature (Lobos and Poblete-Echeverría, 2017). In general, the regression analysis between the traits and SRIs showed an important increase in predictive potential when the experimental data from both hydric conditions were combined. Nonetheless, when compared with previous reports (Lobos et al., 2014), lower coefficients of determination were found for GYm and Δ<sup>13</sup>Cm when the individual environments were considered.

**Table 3** footnotes: <sup>w</sup>Spectral reflectance measurement at anthesis (AN) and grain filling (GF). <sup>x</sup>Hydric conditions were water stress (WS), full irrigated (FI), and the combination (WS+FI). <sup>y</sup>Spectral Reflectance Index: see Lobos and Poblete-Echeverría (2017). <sup>z</sup>In bold, values (R<sup>2</sup> or r<sup>2</sup>) between 0.40 and 0.79 (red) and ≥0.80 (blue).

**FIGURE 2** | Performance of the categorical models (A: PCA-LDA, B: kNN, and C: PLS-DA) on the basis of the average prediction rate of cross-validation calculated across all traits and classes, and estimated by spectral reflectance at anthesis (AN) and grain filling (GF). Wheat genotypes growing under two hydric conditions (FI: fully irrigated and WS: water stress); combination of both conditions (WS+FI) for modeling purposes. Vertical bars represent the standard error.

The developmental stage at which spectral reflectance was assessed influenced the relationship between the traits and the SRIs, which has been reported previously (Marti et al., 2007; Lobos et al., 2014). The best predicted traits, GYm and Δ<sup>13</sup>Cm, had a greater r<sup>2</sup> at GF, while TKWm and LAIan had a greater r<sup>2</sup> at AN; no major changes were observed for the other traits. The better prediction of GYm at GF could be related to the fact that the three main yield components (SM2, KPS, and TKW) are determined in the crop during this stage. In the case of Δ<sup>13</sup>Cm, both stomatal conductance and carboxylation rate influence the carbon isotope ratio (<sup>13</sup>C/<sup>12</sup>C) in kernels, and are affected by WS in Mediterranean environments at the GF stage (Condon et al., 2004).

Regarding GYm and Δ<sup>13</sup>Cm assessed at GF, among the 255 SRIs tested, water indices explained the most variability in each environment: NWI-3 under WS and WI2 under WS+FI, while WI and NWI-3 stood out under FI. Water indices, which combine near-infrared (NIR) wavelengths, do not directly measure water content but instead detect the changes in leaf anatomy and cell structure that are induced by the state of hydration (Lobos et al., 2014), which influences the productivity of the plant. For GYm, similar results have been reported by other authors (Babar et al., 2006a; Prasad et al., 2007; Gutierrez et al., 2010; Lobos et al., 2014), while for Δ<sup>13</sup>Cm, this has been reported just once (Lobos et al., 2014). This highlights the effectiveness of water indices over vegetation indices. Traits such as SM2m, Chlan, Chlgf, WSCan, and WSCCan correlated better with vegetation and chlorophyll SRIs, which combine visible and NIR wavelengths, but also with water indices.

Although SRIs are easy to calculate, they are limited by the use of few wavelengths. Of the multivariate regression models studied (PCR, PLSR, RR, and SVR) (**Table 3**, Supplementary Figure 1, and Supplementary Table 2), PCR and PLSR generally performed as well as, or worse than, the SRIs. Although PLSR, which is the most popular technique used in studies of this kind, can reduce the effect of collinearity among spectral signatures by reducing the dimensionality of the data (Hastie et al., 2005), our results show that SRIs may perform similarly, or better, when used in plant phenotyping. On the other hand, the RR and SVR models performed as well as, or better than, the SRIs (e.g., GYm and Δ<sup>13</sup>Cm estimations increased by 8 and 10%, respectively, when using SVR). This highlights that there are multivariate regression analysis models other than PLSR that should be used in plant phenotyping to improve prediction statistics in plant breeding.

RR is a method of multivariate linear regression that includes a contraction of the multivariate model regression coefficients (Hastie et al., 2005). Although there are fewer reports of RR in remote sensing studies, this method performed better than PLSR for the estimation of plant biomass from satellite images (Cai et al., 2009) and was successfully used by Hernández et al. (2015) to predict GY in a large set of wheat genotypes. RR is known to be useful when the number of observations is much lower than the number of variables (James et al., 2013). In the present study, the number of observations is ∼800, while the number of variables (reflectance values space) is around 2,000. Furthermore, RR is recognized to be an effective prediction model when there is high collinearity in the data (James et al., 2013), which is typical of spectroscopy studies where a full spectrum of reflectance is used. Our results provide empirical evidence that RR performs well in the context of spectral reflectance data, and better than more popular methods such as PLSR.

Optimization of a SVR model does not depend on the dimensionality of the input space (Smola and Vapnik, 1997). Thus, it has the ability to handle complex non-linear dependencies in high-dimensional feature spaces (Smola and Vapnik, 1997; Hastie et al., 2005), such as those modeled in this study, where it performs better than PCR and PLSR. SVR has been the subject of several comparisons in spectral studies. Wang et al. (2011) achieved more accurate estimates of rice LAI when using LS-SVM (Least Squares Support Vector Machines) rather than PLSR and MLR models. Similarly, better estimations of the nitrogen, phosphorus and potassium content in plant leaves were obtained by Zhai et al. (2013) when using SVR rather than PLSR. On the other hand, Yao et al. (2015) showed similar performance for PLSR and SVR in the prediction of nitrogen concentration in wheat leaves.

In addition to the GYm and Δ<sup>13</sup>Cm traits, improvements of between 6 and 22% were shown for the estimation of the traits SM2m, WSCan, and WSCCan, when assessed with spectral reflectance at AN, and SM2m and TKWm, when assessed at GF. The best improvement in prediction was achieved for TKWm; this was 17 and 22% when using RR and SVR, respectively. Additionally, RR, SVR, and SRIs showed similar trends in their estimations of most traits when assessed with spectral data measured at AN or GF. The best predicted traits, GYm and Δ<sup>13</sup>Cm, as well as SM2m, KPSm, Chlan, Chlgf, and WSCCan, were better-predicted using measurements of reflectance at GF. Meanwhile, LAIan was better-predicted with measurements of reflectance at AN, although this trait was evaluated only for well-watered plants.

The SRIs and multivariate regression models all performed much better when the data from both hydric conditions were combined (WS+FI). This was likely due in part to the increase in the number of samples, but the wider trait range produced by the two contrasting environments was potentially more important (**Table 2**). The increase in the trait range obtained by combining contrasting environments, and its effect on modeling improvement, has been reported previously in wheat (Aparicio et al., 2002; Royo et al., 2003; Lobos et al., 2014). Because of this, special attention was paid to identification of elite genotypes (Class 2) in individual environments with a categorical approximation (PCA-LDA, PLS-DA, and kNN). Although the quality of a model is usually shown by its accuracy and error rate, the main difference in model performance was found to be the model's ability to identify samples from either individual (WS or FI) or combined (WS+FI) environments as Class 2, with PLS-DA being shown to be the stronger methodology (**Figure 2C**, **Table 3**). There are currently a few reports regarding the use of reflectance data and categorical methods for cereal breeding purposes (Delwiche et al., 1999, 2006, 2011; Lavine et al., 2014; Porker et al., 2017); however, all of these studies were carried out using the reflectance information from kernels or ground meal.

The selection of wheat genotypes suitable for water deficit-prone environments has traditionally been based on grain yield under irrigation conditions (yield potential) and under water deficit conditions (Araus et al., 2008; Cattivelli et al., 2008; Araus and Cairns, 2014). Even though both selected sites in this study were relatively close (70 km apart), the environmental conditions were different. Clearly the environment where the plants grew influenced the phenotype of each genotype, and therefore the spectral signature of a given genotype at each site. This could explain, in part, the differences in the estimation of each character; traits such as SM2m, TKWm, Chlgf, WSCan, and WSCCan showed higher predictions at FI, while GYm, WSCm, WSCCm, and Δ<sup>13</sup>Cm were better estimated at WS.

The aim of this study was to assess the feasibility of using a categorical approximation to select featured genotypes, by comparing the performance of a large set of SRIs, multivariate regression models, and categorical models in the prediction of several traits using plant reflectance data. Even though information from only 1 year was considered, the data set used for modeling was large enough to determine the potential of each approach in plant breeding programs. Unfortunately, there are no previous studies covering a comparable number of genotypes, SRIs, or regression/categorical methods.

Although data from additional studies and a greater number of years are needed, the present results suggest that future work oriented toward plant breeding should focus on identification of elite genotypes in preference to predicting specific trait values. The assessment of agricultural and physiological traits, such as those examined in this study, could contribute to the improvement of plant breeding programs and accelerate the selection and release of wheat genotypes/cultivars with greater adaptation to adverse environmental conditions.

### CONCLUSIONS AND FUTURE PERSPECTIVES

Field measurement of canopy spectral reflectance is an efficient and fast way to collect plant status information for a large number of genotypes simultaneously. Analysis of reflectance data, gathered from different hydric conditions and developmental stages, by SRIs, multivariate regression, and categorical models, allows for the prediction of agricultural and physiological traits that are related to wheat yield and water deficit adaptation. The categorical model PLS-DA proved to be a useful tool for identifying elite genotypes grown under FI or WS conditions, improving upon genotype selection based on SRIs and multivariate regression methods.

Although GY and some of the other traits evaluated in this study were predicted using SRI, multivariate regression, and categorical models, there remains a need for assessing other secondary traits that have yet to be explored in plant breeding programs.

To improve trait prediction, it will be crucial to consider other tools, such as machine learning approaches (e.g., random forest or tree-based neural networks), or include other variables that are usually assessed by remote sensing (e.g., plant temperature).

## AUTHOR CONTRIBUTIONS

AdP and IM designed the experiments, selected the germplasm, and participated in field evaluations. SR-B, FE, and AE contributed to experimental measurements and data collection. GL and AE contributed to the development of a tool for spectral analysis. CA was in charge of the implementation of modeling tools. MG and SR-B contributed to data analysis and development of the models. GL and MG were in charge of writing the manuscript, to which all the authors contributed.

#### ACKNOWLEDGMENTS

This work was supported and financed by the research program "Adaptation of Agriculture to Climate Change (A2C2)" and the "Núcleo Científico Multidisciplinario" from the Universidad de Talca. We also received important funds from the National Commission for Scientific and Technological Research CONICYT (FONDEF IDEA 14I10106, FONDECYT N° 1150353 and 11121350, and FONDEQUIP IQM 130073). We would like to express our gratitude to Genberries Ltda. for equipment support, Alejandra Rodriguez and Alejandro Castro for technical assistance in field experiments, and Boris Muñoz for the analysis of soluble carbohydrates.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2017.00280/full#supplementary-material

#### REFERENCES

Jolliffe, I. (2002). Principal Component Analysis. New York, NY: Springer-Verlag.

Jones, R. W., Reinot, T., Frei, U. K., Tseng, Y., Lübberstedt, T., and McClelland, J. F. (2012). Selection of haploid maize kernels from hybrid kernels for plant breeding using near-infrared spectroscopy and SIMCA analysis. Appl. Spectrosc. 66, 447–450. doi: 10.1366/11-06426

Shewry, P. R. (2009). Wheat. J. Exp. Bot. 60, 1537–1553. doi: 10.1093/jxb/erp058


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer LMS and handling Editor declared their shared affiliation, and the handling Editor states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2017 Garriga, Romero-Bravo, Estrada, Escobar, Matus, del Pozo, Astudillo and Lobos. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Precision Automation of Cell Type Classification and Sub-Cellular Fluorescence Quantification from Laser Scanning Confocal Images

Hardy C. Hall<sup>1</sup>\*†, Azadeh Fakhrzadeh<sup>2</sup>, Cris L. Luengo Hendriks<sup>2</sup> and Urs Fischer<sup>1</sup>

<sup>1</sup> Department of Forest Genetics and Plant Physiology, Umeå Plant Science Centre, Swedish University of Agricultural Sciences, Umeå, Sweden, <sup>2</sup> Centre for Image Analysis, Uppsala University, Uppsala, Sweden

#### Edited by:

John Doonan, Aberystwyth University, UK

#### Reviewed by:

Andrew French, University of Nottingham, UK Hao Peng, Washington State University, USA

\*Correspondence: Hardy C. Hall, hardy.hall@umu.se

†Present Address: Hardy C. Hall, Department of Plant Physiology, Umeå Plant Science Centre, Umeå University, Umeå, Sweden

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

Received: 24 October 2015 Accepted: 22 January 2016 Published: 09 February 2016

#### Citation:

Hall HC, Fakhrzadeh A, Luengo Hendriks CL and Fischer U (2016) Precision Automation of Cell Type Classification and Sub-Cellular Fluorescence Quantification from Laser Scanning Confocal Images. Front. Plant Sci. 7:119. doi: 10.3389/fpls.2016.00119

While novel whole-plant phenotyping technologies have been successfully implemented into functional genomics and breeding programs, the potential of automated phenotyping with cellular resolution is largely unexploited. Laser scanning confocal microscopy has the potential to close this gap by providing spatially highly resolved images containing anatomic as well as chemical information on a subcellular basis. However, in the absence of automated methods, the assessment of the spatial patterns and abundance of fluorescent markers with subcellular resolution is still largely qualitative and time-consuming. Recent advances in image acquisition and analysis, coupled with improvements in microprocessor performance, have brought such automated methods within reach, so that information from thousands of cells per image for hundreds of images may be derived in an experimentally convenient time-frame. Here, we present a MATLAB-based analytical pipeline to (1) segment radial plant organs into individual cells, (2) classify cells into cell type categories based upon Random Forest classification, (3) divide each cell into sub-regions, and (4) quantify fluorescence intensity to a subcellular degree of precision for a separate fluorescence channel. In this research advance, we demonstrate the precision of this analytical process for the relatively complex tissues of Arabidopsis hypocotyls at various stages of development. High speed and robustness make our approach suitable for phenotyping of large collections of stem-like material and other tissue types.

Keywords: automated image analysis, confocal microscopy, Arabidopsis, hypocotyl, automated phenotyping, code:matlab

### INTRODUCTION

Rapid and cheap sequencing technologies have dramatically changed plant breeding and functional genomics in the last decade. The availability of abundant genotyping data shifts the focus of genetic screening from more efficient genotyping to automated phenotyping technologies. Progress has been made on whole-plant phenotyping solutions, which for example record plant growth, photosynthesis rates, or stress markers (Furbank and Tester, 2011; Dhondt et al., 2013). Whole-plant phenotyping solutions have become commercially available for indoor and outdoor use and are now an integral part of numerous large breeding programs (Cobb et al., 2013; Rahaman et al., 2015). While such whole-plant phenotyping technologies are useful to facilitate breeding for higher yield, many important qualitative and developmental traits cannot be assessed by these macroscopic approaches. In particular, genetic screens for chemical composition, anatomical, and mechanical properties of plant raw materials still rely on laborious low-throughput manual phenotyping. A wide range of molecular markers enable the spatially highly resolved study of such traits. Many of these markers can be fluorescently imaged, either through their own inherent fluorescence, via fluorescent fusion proteins, stains, probes, or through immunofluorescence. With the wide range of fluorescence tools, Laser Scanning Confocal Microscopy (LSCM) has become the method of choice to localize and quantify fluorescent markers. LSCM imaging provides fast, sensitive, inexpensive, spatially highly resolved images where pixel intensity reflects target abundance over a wide dynamic range. Fluorescent imaging of morphogen gradients in the Drosophila embryo and of auxin transport proteins in the Arabidopsis shoot and root tip have, for example, greatly contributed to our understanding of pattern formation and development (Gregor et al., 2007; Kierzkowski et al., 2013). Nevertheless, many of these studies rely on comparison of fluorescence intensity between manually defined regions of interest (ROI; Nilufar and Perkins, 2014), e.g., between different cell types. Manual segmentation into ROIs is labor-intensive and subject to human subjectivity and inconsistency that may severely limit the interpretability of LSCM data.

Computer-assisted quantification of fluorescent targets on a cellular scale over large spatial ranges requires both accurate automatic segmentation and quantification of fluorescence in each individual segment (Luengo Hendriks et al., 2006). Multiple fluorescence sources, including a counterstain for segmentation, can be imaged simultaneously within a single field of view, enabling the correlation of the segmented image with a host of other fluorescence targets. Whereas for animal tissues the application of automated image analysis has become popular in the last decade and several software packages for such an approach are freely available (Wiesmann et al., 2015), the adoption of automated image analysis is lagging behind in plants. This may be related to the limited optical transparency of most plant tissues and the resulting need for thin sectioning of plant specimens, which leads to low throughput. In animal tissues the predominant strategy to segment tissues automatically makes use of fluorescently labeled nuclei and region growing algorithms; such an approach may fall short when a fluorescent target is localized to the plasma membrane or the cell wall. In plants, the few attempts to automatically segment tissue have made use of the cell wall stain propidium iodide and the plasma membrane marker FM4-64 (Federici et al., 2012; Pound et al., 2012; Band et al., 2014; Yoshida et al., 2014). These live stains are suitable for embryonic and meristematic tissue but not for mature plant tissues, such as the Arabidopsis hypocotyl and stem, which contain terminally differentiated, dead cells with disrupted plasma membranes and which, in comparison to animal tissues, have a limited optical penetration depth. As an alternative, Sankar et al. (2014) suggested a protocol based on non-fluorescent differential interference contrast (DIC) images. However, the applicability of this approach is limited by insufficient accuracy, very long computing times, and incompatibility with confocal imaging.

In hypocotyl and stem, emerging models for wood formation and stem cell research in plants (Jouannet et al., 2015), derivatives of stem cells differentiate into several different cell types of the xylem (inner tissue) and phloem (outer tissue). New divisions of stem cells push daughter cells either toward the inside or the outside, and, with increasing distance from the stem cells, derivatives gradually differentiate. A morphogen-like gradient of the plant hormone auxin has been suggested to regulate stem cell activity and differentiation from cell expansion to cell wall thickening (Uggla et al., 1996; Bhalerao and Fischer, 2014). Changes in morphology and wall composition are indicative of the degree of differentiation and cell type (Liebsch et al., 2014). Compositional changes in the walls of stems have been successfully monitored with the help of monoclonal antibodies against specific wall epitopes (Hall et al., 2013). However, exploitation of such data is currently hampered by manual segmentation, classification and quantification of fluorescent signals. As a consequence, genetic improvement of woody feedstock, e.g., decreased lignin content in xylem fibers, is limited by the absence of automated high-throughput phenotyping tools.

Here, we provide an image analysis pipeline, which (i) accurately segments hypocotyls and stems into individual cells and subcellular regions, (ii) assigns each segment to a cell type, (iii) quantifies fluorescence intensity of the cell wall counterstain and, from a separate channel, quantifies several aspects of fluorescence intensity from cell wall epitopes for each individual segment and cell type, and also (iv) extracts a wealth of morphometric data. The pipeline is coded in a single software environment (MATLAB) and the data can easily be exported and, for example, be used in modeling or multivariate statistics. Short processing times permit large data sets, as required for mutant screens or association mapping, to be analyzed.

## MATERIALS AND METHODS

## Preparation of Plant Material

Wild-type (Col-0) and knat1<sup>bp-9</sup> seeds were planted on soil under an 18:6 h (light:dark) photoperiod at 21 °C. Germination times were recorded and hypocotyls excised from plants at 21 and 31 days after germination (dag). Hypocotyls were identified as the 5 mm region below the cotyledons. The 5 mm hypocotyls were immersed in 150 µL 1X PME (stock 2X PME: 50 mM PIPES, 2 mM MgSO<sub>4</sub>, 2 mM EGTA) fixation buffer, within 0.2 mL dome-cap thermal cycler tubes (Thermo Scientific, www.thermoscientificbio.com). Hypocotyls were then subjected to three consecutive cycles of 5 min vacuum infiltration at 21 °C and 68 kPa, and washed three times in 1X PME (21 °C, 68 kPa) prior to storage at 4 °C in 1X PME. Segments were individually encased in 1 cm<sup>3</sup> blocks of 5% agar at 65 °C, and stored at 4 °C to set. Transverse sections (40 µm thick) were cut from segments using a VT100S vibrating microtome (Leica), separated from the agar encasement using a sable hair ("00") brush, then blocked for at least 1 h in 5% bovine serum albumin in 1X TBST (10 mM Tris, 0.25 M NaCl, 0.1% Tween). Sections were mixed to randomize developmental differences, and randomly allocated from each biological replicate pool, together with 100 µL fresh blocking solution, to wells of a 96-well plate (Ibidi, www.bdbiosciences.com). Blocking solutions were swapped with 5 µL of 1:36 dilutions of the LM10 antibody (Complex Carbohydrate Research Center, University of Georgia, US) using gel-loading tips, then sections were incubated at 4 °C for 16 h. Hypocotyls were washed twice in 100 µL 1X TBST, then incubated for 1 h at 21 °C in the dark in 10 µL of 2 µg/µL Alexa Fluor™ 488 donkey anti-rat IgG (H + L) (Agrisera, Sweden). Sections were again washed twice in 40 µL 1X TBST prior to counter-staining with 0.015% Calcofluor White (Sigma-Aldrich, www.sigmaaldrich.com). Sections were again washed twice in 100 µL 1X TBST to remove excess counter-stain and unbound secondary antibody.

## Confocal Imaging

Hypocotyl sections were imaged using a Zeiss LSM780 point-scanning confocal laser scanning microscope at 1024 × 1024 pixels (pixel size, 0.6–0.83 µm) with a 10X objective (a plan-apochromat objective with a numerical aperture of 0.45) within the 96-well plate (Ibidi, Germany) fitted with 180 µm-thick coverslip bottoms. Immunofluorescence of AlexaFluor 568 was excited with a 561 nm laser, and emitted light filtered at 575–600 nm. Calcofluor White was subsequently scanned on an independent channel with a 405 nm laser and emission observed at 420–430 nm. Images were saved as "LSM" files with file names that included plate location, antibody, genotype, tissue type, and biological replicate separated by underscores (e.g., C08R6\_LM10\_Col\_21-day-old\_Hyp\_BR1) to permit automated cataloging in the supplied MATLAB analysis pipeline.
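The file-naming convention can be illustrated with a short sketch. The following Python snippet is not part of the supplied MATLAB pipeline; it only shows how an underscore-delimited name could be split into experimental factors. The field names, the `ImageRecord` type and the `parse_image_name` function are hypothetical, and the presence of a separate age field is inferred from the example file name rather than from the listed factors.

```python
from typing import NamedTuple


class ImageRecord(NamedTuple):
    """Experimental factors encoded in one image file name (field names are assumed)."""
    plate_location: str
    antibody: str
    genotype: str
    age: str
    tissue: str
    replicate: str


def parse_image_name(filename: str) -> ImageRecord:
    """Split an underscore-delimited image name into its six fields."""
    # Drop a trailing ".lsm" extension if present (case-insensitive).
    stem = filename[:-4] if filename.lower().endswith(".lsm") else filename
    parts = stem.split("_")
    if len(parts) != 6:
        raise ValueError(
            f"expected 6 underscore-separated fields, got {len(parts)}: {filename!r}"
        )
    return ImageRecord(*parts)


record = parse_image_name("C08R6_LM10_Col_21-day-old_Hyp_BR1.lsm")
print(record)
```

A catalog of such records, one per image, is sufficient to select images by factor level later in the analysis.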

#### Image Analysis–General

The following methods describe the analytical steps taken, and do not serve as an operation manual for processing images. Instead, refer to Supplemental Presentation 2 ("Precision Cell Classification and Quantification Manual") and Supplemental Video 1 in Presentation 1 (https://vimeo.com/148871821, password: Matlab4Segment) for details on system configuration, experimental setup, parameter optimization and data processing. The image analysis pipeline was implemented in MATLAB using the MATLAB Image Processing toolbox and the DIPimage toolbox (http://www.diplib.org/).

## Experimental Setup for Image Analysis

A working title for the experiment was entered to automatically generate a time-stamped folder to deposit analysis output (Supplemental Figure 5A). The target image files for the experiment that existed within a user-defined source folder were automatically cataloged within the database "ExperInfo" (Supplemental Figure 5). This database recorded file location and levels for experimental factors (age, tissue type, genotype) for each image file.

## Training Set Generation

Three to four training set images occupying a separate folder were selected that presented the range of morphological variation expected to be encountered in the experimental (testing) image set. These images were imported into MATLAB, and users were prompted to enter parameters for image smoothing and segmentation (Supplemental Table 1). After data smoothing of the CFW channel (Supplemental Figure 5), tissue centers were manually selected via a user interface (Supplemental Figure 5), triggering the segmentation algorithm to generate ROIC (entire cell), ROIL (cell lumen), and ROIW (cell wall) for all objects. Prior to delineating the cell borders by watershed segmentation, Gaussian filtering with a variance of 1 pixel was applied in order to remove background noise. Oversegmentation was corrected by merging regions where the difference in intensity between their minima and the first pixel on the watershed dam touching the two regions was <10. Lumen boundaries within each watershed region were precisely identified by applying Otsu's thresholding (Otsu, 1979). In some cases, we cropped the image to restrict the amount of tissue or the range of cell types to be examined. This restricted the ROICs to those that fell within the cropped region, and also restricted computation of ROIL and ROIW to those within the cropped region (Supplemental Figure 1). Measurements for all ROIs were saved for later access by the classification algorithm. The number of cell types and their names were subsequently defined, then selected within each training image (Supplemental Figure 5E). A graphical output of the selections was recorded (example in **Figure 2**) along with a MAT-file containing the locations and dimensions of those ROIs. In an iterative process (Supplemental Figures 5F–G), features (**Table 1**) and cell classes were selected and Random Forest classification (Breiman, 2001) executed for the chosen training set (cell selections). This model was then used to generate class predictions and confidence interval scores for all ROIs in the training set images. An overlay of the entire classification on the original cell selections was generated for each training image (data not shown), along with classification results at varying confidence interval thresholds (as in **Figures 6D–F**). Each training set iteration was output to a distinct time-stamped folder for subsequent evaluation. The "ExperInfo" database was updated to include a record of all iterations, their parameters, and the locations of the model data sufficient for classification of test images. Quantitative data for the CFW channel were saved but not further utilized in the experiment. Immunofluorescence quantitation was not performed in the training set generation.
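The lumen-identification step relies on Otsu's thresholding (Otsu, 1979), which picks the gray level that maximizes the between-class variance of the two resulting pixel populations. The sketch below is a plain-Python illustration of that principle for the pixels of a single watershed region; it is not the DIPimage implementation used in the pipeline, and the function name `otsu_threshold`, the 8-bit intensity range and the example pixel values are assumptions made for illustration.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes between-class variance.

    Pixels with value <= t form the dark class (here: lumen); pixels with
    value > t form the bright class (here: CFW-stained wall).
    """
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1

    total = len(pixels)
    sum_all = sum(level * count for level, count in enumerate(hist))

    weight_dark = 0      # number of pixels at or below the candidate threshold
    sum_dark = 0         # intensity sum of those pixels
    best_t, best_var = 0, -1.0

    for t in range(levels):
        weight_dark += hist[t]
        sum_dark += t * hist[t]
        if weight_dark == 0:
            continue                 # no dark pixels yet
        weight_bright = total - weight_dark
        if weight_bright == 0:
            break                    # every pixel is already in the dark class
        mean_dark = sum_dark / weight_dark
        mean_bright = (sum_all - sum_dark) / weight_bright
        # Between-class variance (up to a constant factor 1 / total**2).
        var_between = weight_dark * weight_bright * (mean_dark - mean_bright) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


# Hypothetical intensities from one watershed region: dark lumen, bright wall.
region = [12, 15, 10, 14, 11, 13, 180, 200, 190, 185]
t = otsu_threshold(region)
lumen_pixels = [p for p in region if p <= t]
wall_pixels = [p for p in region if p > t]
print(t, len(lumen_pixels), len(wall_pixels))
```

Because CFW fluorescence is confined to the walls, applying the threshold region by region adapts the lumen boundary to local staining intensity rather than to a single global cutoff.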

## Quantification of Test Images

#### TABLE 1 | Features available in Random Forest classification analysis pipeline.

<sup>a</sup>Feature name as it appears in the diagnostic plotting.

<sup>b</sup>Brief description of the measurement, including appropriate units.

Test images that were suitable for classification by a common training set iteration were processed in a similar manner as the training set (smoothing, defining tissue centers, segmenting) based upon parameters defined for the chosen training set (Supplemental Figures 5H–I). In an iterative approach, classification of the testing set was explored with different training set iterations (as in **Figure 3**). Quantification data and diagnostic plotting of the CFW channel similar to the training set iterations were stored in time-stamped folders. To quantify the signal intensities attributable to the immunolabeling, immunofluorescence channel images were then segmented using ROIC, ROIL, and ROIW generated from the CFW channel segmentation, which acted as a mask. In addition, the ROIs were divided into four quadrants (**Figure 1E**) for higher resolution-based quantification of signal as detailed in **Table 2**. These data were exported into separate time-stamped folders for later access. Importantly, the locations of MAT-files containing quantification data for each image were stored in the "ExperInfo" database for access during data assembly and export.
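The masking idea described in this section, in which ROIs derived from the counterstain channel are applied to the immunofluorescence channel, can be sketched as follows. This is a minimal Python illustration, not the pipeline's MATLAB code: the image is a nested list, each ROI is a set of (row, column) pixel coordinates, and the function name `quantify_roi` and the toy intensities are hypothetical.

```python
from statistics import mean, median


def quantify_roi(signal_image, roi_pixels):
    """Summarize signal-channel intensities under one ROI mask."""
    values = [signal_image[row][col] for row, col in roi_pixels]
    if not values:
        raise ValueError("ROI is empty")
    return {"mean": mean(values), "median": median(values), "sum": sum(values)}


# Toy 3 x 3 immunofluorescence image of one cell: bright wall ring, dark lumen.
signal = [
    [40, 42, 41],
    [43,  2, 39],
    [41, 40, 42],
]

# Masks as they might come from the counterstain segmentation (assumed here).
roi_lumen = {(1, 1)}
roi_wall = {(r, c) for r in range(3) for c in range(3)} - roi_lumen

wall_stats = quantify_roi(signal, roi_wall)
lumen_stats = quantify_roi(signal, roi_lumen)
print(wall_stats, lumen_stats)
```

Keeping segmentation and quantification on separate channels means the signal of interest never influences where the ROI boundaries are drawn.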

#### Data Review and Assembly

The final step in the pipeline is to assemble the data for export based upon information stored in "ExperInfo" regarding which images were processed, and where the associated data is stored. From a user prompt, we selected the levels of each factor (ex. specific antibodies from the "antibody" factor), and files with those properties were concatenated into a common file ("DataRawCompile") to be used in downstream (multivariate) analysis in MATLAB or another environment. We assembled a pipeline that facilitates iterative summary plotting of spatial maps of features for any image present in the assembled data set. "DataRawCompile" was then used to generate means and standard deviations for each cell class within each image (as in **Figures 5**, **6**). The pipeline also permits iterative comparative plotting of the summary statistics (bar plots) for any combination of images present in the output data set. The structure of the output file "DataRawCompile" is detailed in Supplemental File 2.
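The per-image, per-class summary described above amounts to a grouped mean and standard deviation. The sketch below shows one way to compute it in Python; the record layout (dictionaries with `image`, `cell_class` and feature keys) and the function name `summarise` are assumptions for illustration and do not reflect the actual structure of "DataRawCompile", which is documented in Supplemental File 2.

```python
from collections import defaultdict
from statistics import mean, stdev


def summarise(records, feature):
    """Return {(image, cell_class): (mean, sd)} for one feature."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["image"], rec["cell_class"])].append(rec[feature])
    summary = {}
    for key, values in groups.items():
        # The sample standard deviation needs at least two observations.
        sd = stdev(values) if len(values) > 1 else 0.0
        summary[key] = (mean(values), sd)
    return summary


# Hypothetical compiled records, one per classified ROI.
records = [
    {"image": "img1", "cell_class": "xylem vessel", "s.area": 2.0},
    {"image": "img1", "cell_class": "xylem vessel", "s.area": 4.0},
    {"image": "img1", "cell_class": "cambium", "s.area": 1.0},
]
summary = summarise(records, "s.area")
print(summary)
```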

## RESULTS

#### Automated Image Segmentation of Confocal Counterstain Channel

Embryonic and meristematic plant tissues have been successfully segmented with the help of propidium iodide. However, since propidium iodide is not retained in the cell wall of dead cells, this fluorescent stain is not suitable as a counterstain for segmentation of mature or fixed plant tissues. As an alternative to propidium iodide, we tested calcofluor white (CFW), which binds to cellulose and chitin in cell walls of plants, fungi and bacteria. In order to visualize boundaries between cells we counterstained cell walls of 21-day-old Arabidopsis hypocotyls with CFW and acquired images with a CLSM in a separate reference channel (Channel 1). CFW fluorescence was restricted to cell walls and not detected in the cell lumen. After smoothing the reference channel, the watershed segmentation algorithm identified the outer boundaries of cells. Watershed dams matched cell-cell boundaries closely with little over-segmentation, and the ROIs for entire cells, denoted ROIC, matched morphology closely (**Figures 1A,B**). The lumen boundary within each watershed region was precisely identified using Otsu's threshold algorithm to define ROIL (**Figure 1C**). Cell wall regions, referred to as ROIW, could be derived as the set difference of ROIC and ROIL (**Figures 1D,F**), thus giving us two distinct regions of the cell (wall/ROIW and lumen/ROIL). With the help of a manually selected center of tissue, ROIC, ROIL, and ROIW were subdivided into interior/exterior and left/right (lateral) quadrants in order to study polar or axial distribution of fluorescent signals (**Figure 1E**).
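The derivation of ROIW and the quadrant subdivision can be sketched in a few lines of Python. The set difference follows directly from the text; the quadrant rule, however, is an assumption, since the exact geometry is not specified here. In this sketch each pixel is assigned by the angle between its offset from the cell centroid and the radial axis running from the tissue center through that centroid, with 45° sectors. The function name `quadrant` and the labels `lateral_a`/`lateral_b` are hypothetical.

```python
import math


def quadrant(pixel, centroid, tissue_center):
    """Assign a pixel to a quadrant relative to the radial axis of its cell."""
    # Radial axis: from the tissue center to the cell centroid.
    rx, ry = centroid[0] - tissue_center[0], centroid[1] - tissue_center[1]
    # Offset of the pixel from the cell centroid.
    px, py = pixel[0] - centroid[0], pixel[1] - centroid[1]
    # Signed angle (degrees) from the radial axis to the pixel offset.
    angle = math.degrees(math.atan2(rx * py - ry * px, rx * px + ry * py))
    if abs(angle) <= 45:
        return "exterior"   # facing away from the tissue center
    if abs(angle) >= 135:
        return "interior"   # facing the tissue center
    return "lateral_a" if angle > 0 else "lateral_b"


# ROIW as the set difference of the whole-cell and lumen masks.
roi_c = {(x, y) for x in range(8, 13) for y in range(-2, 3)}
roi_l = {(x, y) for x in range(9, 12) for y in range(-1, 2)}
roi_w = roi_c - roi_l

tissue_center = (0, 0)
centroid = (10, 0)
wall_by_quadrant = {}
for pixel in roi_w:
    wall_by_quadrant.setdefault(quadrant(pixel, centroid, tissue_center), set()).add(pixel)
print({name: len(pixels) for name, pixels in wall_by_quadrant.items()})
```

Which lateral side counts as "left" or "right" depends on the image coordinate convention, so the two lateral labels are deliberately left neutral.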

## Manual Selection of Training Set Cells for Classification Model

Supervised learning algorithms have been shown to be successful for classifying image segments corresponding to individual cells into cell type categories (Field et al., 2010; Sankar et al., 2014). These algorithms require the creation of training sets, i.e., in our case manually assigned classification of cells into groups of user-defined cell types. ROICs of reference images provide sufficient context for manual classification of representative cells for each of the cell types. We defined six different cell types in 21-day-old hypocotyls: xylem vessels and parenchyma, cells of the cambial zone, phloem fibers, phloem, and cortical cells (**Figure 2A**). For each class, cells were chosen that best represented the class, avoiding selection of cells that lay on vague boundaries between cell types or exhibited morphology that was intermediate between cell classes.

During development, organ types such as the Arabidopsis hypocotyl undergo substantial change in tissue composition (additional cell types) and architecture (e.g., cell morphology and relative position in the tissue context). With Arabidopsis hypocotyls, there is added complexity in 31-day-old hypocotyls with the addition of new cell types in the outer xylem (xylem II), demanding another, optimized training. For 31-day-old hypocotyls, we defined eight different cell types that included xylem I vessels, xylem I parenchyma, xylem II vessels, xylem II fibers, cambium, phloem fibers, phloem parenchyma, and cortex (**Figure 2B**). These selections were used as the basis for subsequent trials to determine the combination of cell types and features to use for classification.

## Feature Computation and Pre-Selection

TABLE 2 | Fluorescence measures available for quantification in analysis pipeline.

<sup>a</sup>Name of fluorescence measure as it appears in diagnostic plotting.

<sup>b</sup>Regions of interest (ROIs) to which the measure applies. Those marked as "derived" are computed from other measures generated from the ROIs.

<sup>c</sup>Description of how each measure is computed.

Given accurate identification of cell boundaries (ROIC), lumen (ROIL), and cell wall (ROIW), abundant, precise morphometric data can be derived that collectively provide a rich set of features to classify segments into different cell type categories. Using the DIPimage package, we measured 22 different features that could be derived from ROIs, covering aspects of position within the tissue, shape, and fluorescence intensity for each segment (**Table 1**). Many of these ROI measures correlate well with tissue morphology, while some of these features, such as "m.cx" and "m.cy" (xy-coordinates), provide no meaningful information with respect to cell identity in a radially symmetrical organ (Supplemental Figures 2A,B). Feature selection is generally considered an important step in improving the accuracy of classification (Janecek et al., 2008). Apart from excluding the "m.cx" and "m.cy" measures, we lacked a priori evidence to eliminate other variables without first examining their importance to successful classification. It was therefore necessary to take an iterative approach to feature selection, based upon the output of the classification, to arrive at an optimal set of features.
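A small numerical example shows why raw xy-coordinates carry little information about cell identity in a radially symmetrical organ, whereas the distance from the tissue center does. The Python sketch below is illustrative only; the function name `radial_displacement` and the coordinates are hypothetical, and the exact definition of the pipeline's "radialV" feature is assumed to be the Euclidean distance between the cell centroid and the manually selected tissue center.

```python
import math


def radial_displacement(centroid, tissue_center):
    """Euclidean distance of a cell centroid from the tissue center."""
    return math.hypot(centroid[0] - tissue_center[0], centroid[1] - tissue_center[1])


tissue_center = (100.0, 100.0)

# Two hypothetical cells of the same tissue ring on different sides of the section.
cell_east = (150.0, 100.0)
cell_north = (100.0, 50.0)

# Their xy-coordinates differ completely ...
print(cell_east, cell_north)
# ... but their radial displacement is identical.
print(radial_displacement(cell_east, tissue_center),
      radial_displacement(cell_north, tissue_center))
```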

#### Classification

We then chose to compare two different supervised learning algorithms: Support Vector Machine (SVM), originally designed for binary classification problems, and Random Forest, developed specifically for multiclass problems. We tested the accuracy of the classification outputs employing all the above-mentioned features. Random Forest outperformed SVM using normalized measures, distance-scaled measures, and untransformed measures (Supplemental Figure 3). Interestingly, the Random Forest model with the untransformed data resulted in the best fit. We therefore focused on Random Forest for the optimization of the classification procedure.

As a first step of optimization we assessed the impact of removing features on the classification result, using 21-day-old hypocotyls as a guide. In the first case we admitted 18 features into the model, i.e., all except the Cartesian coordinates ("m.cx," "m.cy," "Xnew," and "Ynew"). The Random Forest model yielded rank scores of the importance of these features (**Figure 3A**), indicating that the radial displacement from the center of the tissue ("radialV") was the most discriminating feature, reflecting the radial organization of the tissue types. Other features that contributed substantially to the discrimination between the different cell types were median fluorescence intensity of ROIC and ROIW ("medianROIC," "medianROIW") and the size of the luminal area ("s.area"). The incline angle ("inclV"), which was used as a discriminating feature by Sankar et al. (2014), played a minor role. We then used spatial mapping of features (Supplemental Figure 2) to remove six features that we considered redundant with others, leaving a 12-feature set. Again, "radialV" was dominant, followed by cell wall and cell intensity ("medianROIW," "medianROIC," respectively; **Figure 3A**). Finally, we reduced the selection to the five features that were ranked highest in the 12-feature set. Again, "radialV" was dominant, while rankings for the remaining features remained similar to those in the 12-feature set (**Figure 3A**).

FIGURE 2 | (A) Representative 21-day-old wild-type hypocotyl tissues showing selections for xylem I vessels (red), xylem I parenchyma (green), cambium (navy blue), phloem fibers (yellow), phloem parenchyma (light blue), cork (purple), and epidermis (orange). (B) Representative 31-day-old wild-type hypocotyl tissues showing xylem I vessels (red), xylem I parenchyma (green), xylem II vessels (navy blue), xylem II fibers (yellow), cambium (light blue), phloem fibers (purple), phloem parenchyma (orange), and cortex (olive).

The Random Forest algorithm, as with other classification methods, classifies all objects. This invariably results in misclassified objects. However, the Random Forest model assigns a "confidence interval score" to each object such that misclassifications can be largely avoided through filtering. We tested the performance of confidence filtering at 50, 70, and 90% confidence by examining misclassification in cells that were color-coded according to class in an overlay of the original reference channel, considering 18-, 12-, and 5-feature selection sets (**Figures 3B–J**). It is evident from these panels that increased confidence filtering reduces selection of cells at the boundaries between differing cell types, such as the cork and phloem parenchyma. The incidence of misclassifications is also diminished by confidence filtering where cell types are interspersed, such as with xylem vessels and xylem parenchyma. Conversely, removing low-ranked and seemingly redundant features can lead to increased misclassification (inset of **Figures 3B–J**).
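Confidence filtering of this kind can be sketched as follows. How the pipeline computes its "confidence interval score" is not detailed here; the Python sketch below assumes the common convention that the score is the fraction of trees in the forest voting for the winning class. The function names and the example votes are hypothetical.

```python
from collections import Counter


def classify_with_confidence(tree_votes):
    """Return (majority class, fraction of trees voting for it)."""
    label, n_votes = Counter(tree_votes).most_common(1)[0]
    return label, n_votes / len(tree_votes)


def filter_by_confidence(predictions, threshold):
    """Keep only cells whose confidence score reaches the threshold."""
    return {
        cell: label
        for cell, (label, confidence) in predictions.items()
        if confidence >= threshold
    }


# Hypothetical votes from a 10-tree forest for two cells.
votes = {
    "cell_A": ["vessel"] * 9 + ["parenchyma"],           # clear-cut cell
    "cell_B": ["vessel"] * 6 + ["parenchyma"] * 4,        # cell near a boundary
}
predictions = {cell: classify_with_confidence(v) for cell, v in votes.items()}

for threshold in (0.5, 0.7, 0.9):
    print(threshold, filter_by_confidence(predictions, threshold))
```

Raising the threshold trades coverage for reliability: ambiguous cells such as `cell_B` are left unclassified rather than assigned to a possibly wrong class.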

#### Training Set Versatility

A common scenario in developmental biology is the need to survey (to "phenotype") many genotypes. Automated quantitative morphometrics and fluorescence channel screening offer a means to circumvent the logistic bottleneck in quantifying traits from microscopic tissues. Yet it is not efficient to develop distinct training sets for each genotype (as with a screen of a mutagenized population). To examine the potential of using a common training set for genotypes with greatly differing tissue organization and morphometric characteristics, we chose to carry out a reciprocal examination between wild type (Col-0) and the knat1<sup>bp-9</sup> mutant, which exhibits irregular radial organization of tissues in the hypocotyl (e.g., reduced xylem fiber formation) and altered luminal areas of xylem vessels (Liebsch et al., 2014). In this examination, we produced 12-feature training set models for each genotype, and then compared the results for each genotype (**Figure 4**). For the wild-type test image (**Figures 4A,B**), the knat1<sup>bp-9</sup> training set classified the vast majority of cortical, phloem parenchyma, phloem fiber, xylem I fiber and vessel cells correctly. However, at the boundary between xylem I and xylem II, relatively more misclassifications or absent classifications (low confidence) of vessels and fibers occurred in the wild-type tissue with the knat1<sup>bp-9</sup> training set. Similarly, the wild-type training model applied to knat1<sup>bp-9</sup> performed well on the classification of all cell types except for xylem fibers and vessels at the boundary between xylem I and II, relative to the knat1<sup>bp-9</sup> training set (**Figures 4C,D**). Generally, with the wild-type training set the boundary between xylem I and II is moved toward the outside of knat1<sup>bp-9</sup> hypocotyls (Liebsch et al., 2014), and misclassifications at the boundary most likely reflect the dominant nature of "radius" in the classification model. Yet, such misclassifications make up a small proportion of the classified cells, which still present a wealth of information for comparative morphometrics and fluorescence quantification.

In order to see if automated classification could be used as the basis to conduct comparative morphometrics of genotypes, we chose to examine cell area ("s.area") of xylem vessels of wild type (Col-0) and the knat1<sup>bp-9</sup> mutant, as knat1<sup>bp-9</sup> is known to have smaller cell areas (Liebsch et al., 2014). In this case, three separate sections from the same specimen were quantified and filtered at the 70% confidence interval threshold, and the means and standard deviations of the "s.area" measures of all remaining regions were computed (**Figure 5**). In line with previously published data (Liebsch et al., 2014), xylem II vessels of knat1<sup>bp-9</sup> were significantly smaller than in wild type, providing evidence that multi-genotype morphometric comparisons are feasible.

#### Fluorescence Quantification

With robust, accurate classification and subdivision of ROIs to predefined classes, each ROI (and sub-ROI) provides a mask to conduct a variety of measurements of fluorescence (**Table 2**). While sub-region-specific ROIs provide a high resolution of fluorescence distribution, relative or summative distribution of these intensities can be more informative. As a result, "derived" measures are computed from values in sub-regions (1–4) of various ROIs. For the purpose of demonstration, we probed tissues with an antibody specific to xylan in the secondary cell wall. A fluorescent secondary antibody permitted visualization of epitope localization. Fluorescence was quantified by computing pixel-wise intensities of each ROI, then "derived" measures were subsequently plotted as spatial maps (Supplemental Figure 4). In 21-day-old hypocotyls, secondary cell walls occur exclusively in the xylem vessels, thereby providing a clear case of cell type-specific fluorescence to validate the quantification methodology (**Figure 6**). Comparison of the fluorescence channel (**Figure 6B**) with a spatial heatmap of fluorescence intensity ("derived\_wallsignal") mapped to ROIC (**Figure 6C**) demonstrates that the quantification replicates the spatial distribution of fluorescence in the source image. As evidence that the signal is predominantly in the walls, the "derived\_lumensignal" values are only moderate in the outermost (youngest) vessels and completely absent in older vessels (Supplemental Figure 4). This is likely due to the presence of epitope within the lumen, as it is defined by the segmentation process, of living xylem vessel cells during wall assembly.

#### FIGURE 4 | Continued

Wild-type (A,B) and knat1<sup>bp-9</sup> (C,D) genotypes classified with training sets of either knat1<sup>bp-9</sup> (B,D) or wild-type (A,C) 31-day-old hypocotyls. (A–D) Classifications passing the 70% confidence interval threshold. Insets in wild type (A,B) and knat1<sup>bp-9</sup> (C,D) are arbitrarily chosen regions at the boundary between xylem I and II that demonstrate the effect of selection of training set iteration on classification result (classification of selected cells of clear identity are indicated as correct [checkmarks], not classified ["+" sign], and misclassified ["x"]).

FIGURE 5 | … all cells passing the 70% confidence interval filtering. \*t-test, wild type vs. mutant, p < 0.05.

As proof-of-concept that the classification provides a meaningful basis to group cells for fluorescence characterization, we next examined the means of fluorescence intensities ("derived\_wallsignal") values for all cells of each cell class, filtering at 50, 70, and 90% (**Figures 6D–G**, respectively). From this series, it is clear that the xylem vessels are the dominant cell type that exhibits a fluorescence signal. Not evident with visual examination of the spatial map of "derived\_wallsignal" (**Figure 6C**), the xylem parenchyma exhibits a weak signal, too. As increasing stringency reduces the contribution of the xylem parenchyma to the overall signal, we presume that misclassification is the primary cause of signal bleed into that cell type. Yet, stringent filtering (**Figure 6G**) does not eliminate this signal entirely and thus parenchymal cells of the xylem still retain signal where we expect none (no xylan in this cell type). One explanation for the signal is that there is inaccuracy in establishing the watershed boundaries between cell types that differ greatly in cell wall thickness (**Figure 1A**, inset).

FIGURE 6 | … means of relative signal intensities. Three biological replicates.

## High Throughput Data Processing Package

This methodological proposal presents the reader with a MATLAB-based data analysis pipeline (Supplemental File 1) that provides the complete set of scripts necessary to prepare quantitative data from a complete fluorescence imaging experiment for downstream (multivariate statistical) analysis, while also providing a wealth of images and plots of diagnostic value (**Figure 7**, detailed in Supplemental Figure 5). This set of scripts employs the MATLAB Image Processing toolbox as well as the DIPimage toolbox (http://www.diplib.org/), which together provide flexibility and efficiency in a variety of standard image analysis procedures. However, it is the unique assemblage of novel custom scripts using the Random Forest classification model that provides the core analytical steps for generating quantitative data from raw images of plant tissue. Importantly, the core scripts are nested within a graphical user interface that minimizes command line interaction and prompts the user with all of the important parameters for image processing, segmentation, classification and filtering (Supplemental Table 1).

## DISCUSSION

Here, we provide an image analysis toolkit to accurately segment all the cells in transverse sections of hypocotyls, to precisely classify the individual segments into 6 to 8 cell type classes with a specified degree of certainty (confidence interval scores) and to extract precise morphometric data and fluorescence intensity of two different channels for each cell type. By analyzing the distribution of fluorescence emitted by a xylem-vessel-specific marker we obtained accurate segmentation and classification of xylem vessels and parenchymatic cells. This is a relevant improvement over previous attempts at automated cell type classification, which could not distinguish xylem vessels from parenchyma (Sankar et al., 2014; Montenegro-Johnson et al., 2015). These recent methods relied heavily on spatial localization of cell types within the tissue. However, using coordinates as the sole criterion for classification obviously cannot discriminate between different spatially dispersed cell types, such as xylem vessels. Furthermore, the relative location of a cell to the manually selected origin of coordinates can change during development, e.g., the cambium is progressively pushed away from the center of the hypocotyl. Our approach is based not only on the position of each cell within the tissue context but also on features such as cell wall thickness or shape and is therefore expected to be less sensitive to variation in growth than strategies relying on coordinates only.

FIGURE 7 | Schematic of data processing pipeline showing main processing steps for training set development (A–D), image quantification (E–K), and data assembly and export (L–O). (A) Counterstain channel of two-channel image is first smoothed and segmented on one of the training set images. (B) The user then defines the number of cell classes, provides names for cell classes, and selects cells for each class in each of the training set images. The cell selections are stored for the experiment. (C) The user is presented with spatial maps of the features and chooses which features to use in the classification. This data is stored as an "iteration" for the chosen training set. (D) The classification is carried out on the chosen features to produce a model to be used in classifying images of an experimental treatment class in (E–K). (E) The user selects a set of images that will use a common training set iteration, then (F) images are similarly processed (as in A) and (G) measurements for the features chosen in the training set iteration are computed for all ROIs obtained by the segmentation in (F). (H) Cell classification is carried out using the model generated by the chosen training set iteration and (I) spatial maps are generated for 50, 70, and 90% confidence intervals. Classification and confidence scores are stored for each image along with morphological measures used in classification. (J) After similar image pre-processing as the counterstain channel, fluorescence measures listed in ROIC.

Automated image analysis is required for large screens involving hundreds of samples, such as mutant screens and association mapping. Whereas segmentation should not be affected by the morphological variation across the samples, classification might require time-consuming extension of training sets. Application of a training set derived from wild type to hypocotyls of knat1<sup>bp-9</sup>, a mutant which is characterized by severely distorted radial organization, indicated that applying stringent confidence filtering can reduce misclassifications in the mutant efficiently. However, the boundaries between xylem I and II cell types were inaccurately recognized when using a wild-type training set on mutant hypocotyls. This underlines the dominant nature of the radius feature, which measures the radial displacement of a segment from the center of the hypocotyl. Omission of the radius feature, improvements to training sets (image and cell choices) or simply scaling of "radius" are promising means to improve the classification of cells at the boundary between xylem I and II.
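The suggestion of scaling the radius feature can be made concrete with a small sketch. The approach below is only one possible interpretation and has not been evaluated in this study: each cell's radial displacement is divided by the largest radial displacement observed in the same section, used here as a proxy for the tissue radius, so that hypocotyls of different size share a common 0–1 scale. The function name `scale_radial` and the example values are hypothetical.

```python
def scale_radial(radial_values):
    """Scale radial displacements to the range 0-1 within one section."""
    r_max = max(radial_values)
    if r_max <= 0:
        raise ValueError("tissue radius must be positive")
    return [r / r_max for r in radial_values]


# Hypothetical radial displacements (in pixels) of three tissue boundaries
# in a large and a small hypocotyl with the same relative organization.
large_section = [50.0, 100.0, 200.0]
small_section = [25.0, 50.0, 100.0]

print(scale_radial(large_section))
print(scale_radial(small_section))
```

After scaling, a model trained on one genotype would see boundaries at the same relative positions in another genotype that differs only in overall size; it would not compensate for genuine changes in relative tissue proportions.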

Although plant cell walls are considered to be dynamic structures, deposition of wall components is usually stable and irreversible, as opposed to changes in the abundance of short-lived gene products, which can occur in the range of minutes. This, together with easy fixation and efficient preservation of walls as compared to the cytoplasm, makes chemical cell wall properties a good marker for irreversible decisions in cell differentiation. The onset of cell death in xylem vessels is, for example, marked by the lignification of secondary vessel walls (Smith et al., 2013). Our image analysis toolkit permits the analysis of large numbers of commercially available antibodies against different cell wall epitopes. Alternative methods to assess the chemical composition of walls of single cells, which theoretically offer the same spatial resolution, are vibrational spectroscopy and ToF-SIMS (Gorzsás et al., 2011; Gerber et al., 2014; Felten et al., 2015). However, long acquisition times, rather poor image quality and absence of automated image analysis are obstacles that have not yet been overcome by these spectroscopic methods. While in specific cases methods of spatially resolved spectroscopy can provide important chemical information, they are, in contrast to the method presented here, not suitable for larger genetic screens or highly resolved time courses.

Here, we present proof-of-concept studies employing images from Arabidopsis hypocotyls. While we expect that automated segmentation of similarly processed plant material, irrespective of the species, should be feasible with little or no adaptation of the script, there may be a need to optimize tissue fixation, sectioning and mounting for materials other than transverse sections of Arabidopsis hypocotyls or stems. On the other hand, application of our pipeline in the analysis of epitope distribution in wholemount samples, e.g., root tip, or in live imaging of fluorescent markers should be within reach. Images derived from wholemount samples or live imaging may, however, be more difficult to segment due to lower signal-to-noise ratios.

In its current form, our data analysis pipeline efficiently and accurately provides a wealth of morphometric data for automatically categorizing cell types of transverse sections of Arabidopsis hypocotyls at various growth stages. Furthermore, our pipeline provides a robust means to accurately quantify immunofluorescence for specific cell types, filterable by confidence scores for individual cells. This manuscript, along with the accompanying script package, thus presents an initial exploration into the application of this MATLAB-based analytical approach of segmentation, classification and quantification of confocal images; one that foreseeably quantifies any number of fluorescence targets on separate channels in more sophisticated fluorescence-based experiments on living or fixed tissues.

#### AUTHOR CONTRIBUTIONS

HH and UF designed the research. HH prepared the specimens and captured the images. HH, AF and CL wrote the code and analyzed the data. HH and UF wrote the manuscript.

#### ACKNOWLEDGMENTS

This work was supported by Bio4Energy and the Berzelii Centre (Vinnova). We would like to thank Björn Sundberg for helpful discussions.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00119



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Hall, Fakhrzadeh, Luengo Hendriks and Fischer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Phenotyping of Eggplant Wild Relatives and Interspecific Hybrids with Conventional and Phenomics Descriptors Provides Insight for Their Potential Utilization in Breeding

Prashant Kaushik, Jaime Prohens \*, Santiago Vilanova, Pietro Gramazio and Mariola Plazas

Instituto de Conservación y Mejora de la Agrodiversidad Valenciana, Universitat Politècnica de València, Valencia, Spain

#### *Edited by:*

Rodomiro Ortiz, Swedish University of Agricultural Sciences, Sweden

#### *Reviewed by:*

Michael Benjamin Kantar, University of Hawaii, USA Ezio Portis, DISAFA–University of Torino, Italy

> *\*Correspondence:* Jaime Prohens jprohens@btc.upv.es

#### *Specialty section:*

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

> *Received:* 30 March 2016 *Accepted:* 02 May 2016 *Published:* 19 May 2016

#### *Citation:*

Kaushik P, Prohens J, Vilanova S, Gramazio P and Plazas M (2016) Phenotyping of Eggplant Wild Relatives and Interspecific Hybrids with Conventional and Phenomics Descriptors Provides Insight for Their Potential Utilization in Breeding. Front. Plant Sci. 7:677. doi: 10.3389/fpls.2016.00677

Eggplant (Solanum melongena) is related to a large number of wild species that are a source of variation for breeding programmes, in particular for traits related to adaptation to climate change. However, wild species remain largely unexploited for eggplant breeding. Detailed phenotypic characterization of wild species and their hybrids with eggplant may allow promising wild species to be identified and may provide information on the genetic control and heterosis of relevant traits. We characterized six eggplant accessions, 21 accessions of 12 wild species (the only primary genepool species S. insanum and 11 secondary genepool species) and 45 interspecific hybrids of eggplant with wild species (18 with S. insanum and 27 with secondary genepool species) using 27 conventional morphological descriptors and 20 fruit morphometric descriptors obtained with the phenomics tool Tomato Analyzer. Significant differences were observed among cultivated, wild and interspecific hybrid groups for 18 conventional and 18 Tomato Analyzer descriptors, with hybrids generally having intermediate values. Wild species were generally more variable than cultivated accessions, and interspecific hybrids displayed intermediate ranges of variation and coefficient of variation (CV) values, except for fruit shape traits, in which the hybrids were the most variable. The multivariate principal components analysis (PCA) revealed a clear separation of wild species and cultivated accessions. Interspecific hybrids with S. insanum plotted closer to cultivated eggplant, while hybrids with secondary genepool species generally clustered together with wild species. Many differences were observed among wild species for traits of agronomic interest, which allowed identifying species of greatest potential interest for eggplant breeding. Heterosis values were positive for most vigor-related traits, while for fruit size values were close to zero for hybrids with S. insanum and highly negative for hybrids with secondary genepool species. Our results allowed the identification of potentially interesting wild species and interspecific hybrids for introgression breeding in eggplant. This is an important step for broadening the genetic base of eggplant and for breeding for adaptation to climate change in this crop.

Keywords: descriptors, genepools, interspecific hybrids, introgression breeding, phenomics, *Solanum melongena*, Tomato Analyzer

## INTRODUCTION

Eggplant (Solanum melongena L.) is an important vegetable in tropical and subtropical regions across the world, where it is a source of dietary fiber, micronutrients and bioactive compounds (Mennella et al., 2010; Niño-Medina et al., 2014; San José et al., 2014). At present eggplant is the sixth most important vegetable after tomato, watermelon, onion, cabbage, and cucumber, and the most important Solanum crop native to the Old World (FAO, 2016). At the global level, it has been one of the crops with the greatest increase in production in recent years, with total production rising by 59% in a decade, from 31.0·10<sup>6</sup> t in 2004 to 49.3·10<sup>6</sup> t in 2013 (FAO, 2016).

The narrow genetic base of eggplant, probably a consequence of a genetic bottleneck during its domestication in Southeast Asia (Meyer et al., 2012), limits major breeding advances. This limited genetic diversity contrasts with the large morphological and genetic variation present in the eggplant wild relatives (Meyer et al., 2012; Vorontsova et al., 2013; Vorontsova and Knapp, in press). Phylogenetically, eggplant is a member of the so-called "spiny solanums" group (Solanum subgenus Leptostemonum), which contains many wild species from the Old World, most of them from Africa (Vorontsova et al., 2013; Vorontsova and Knapp, in press). These wild species could represent a source of variation for developing a new generation of eggplant cultivars with dramatically improved yield and quality, as well as for addressing the challenges posed by adaptation to climate change. In this respect, resistance and tolerance to several major diseases and pests are found among wild eggplant relatives (Daunay and Hazra, 2012; Rotino et al., 2014), and these species grow in a wide range of environmental conditions, including desertic and semidesertic areas and environments with extreme temperatures (Knapp et al., 2013; Vorontsova and Knapp, in press). Some eggplant wild relatives are known to possess high levels of chlorogenic acid and other bioactive compounds of interest for human health (Mennella et al., 2010; Meyer et al., 2015). However, with a few exceptions (Rotino et al., 2014; Liu et al., 2015), eggplant breeders have largely neglected the potential of wild species for eggplant breeding and, contrary to other crops like tomato (Díez and Nuez, 2008), wild relatives have not made a relevant contribution to the development of new eggplant cultivars.

Eggplant can be crossed with a large number of wild relatives (Daunay and Hazra, 2012; Rotino et al., 2014; Plazas et al., 2016). The closest wild relative of eggplant is S. insanum (Knapp et al., 2013; Vorontsova et al., 2013), which is naturally distributed in Southeast Asia, Madagascar and Mauritius (Knapp et al., 2013; Vorontsova and Knapp, in press), where it is frequently found as a weed (Mutegi et al., 2015). Solanum insanum is considered the wild ancestor of eggplant and is the only species in the primary genepool of cultivated eggplant (Syfert et al., 2016). Hybrids of eggplant with S. insanum are easily obtained; fruits from interspecific hybridization have many seeds, which have high germination rates, and the hybrid plants are fully fertile (Davidar et al., 2015; Plazas et al., 2016). Interspecific hybrids have also been obtained with many wild species from the secondary genepool (Daunay and Hazra, 2012; Rotino et al., 2014; Plazas et al., 2016), which includes some 50 African and Southeast Asian species (Vorontsova et al., 2013; Syfert et al., 2016). The degree of success of interspecific sexual hybridization between eggplant and secondary genepool species, as well as hybrid fertility, varies depending on the species involved and the direction of the cross (Plazas et al., 2016).

The characterization of wild species and interspecific hybrids for traits of interest for breeders is a fundamental step for the efficient utilization of crop wild relatives in breeding. Combined data on the cultivated and wild species and their interspecific hybrids not only allow identifying sources of variation and materials of potential interest, but also provide information on the inheritance of some traits present in the wild species, as has been demonstrated in crosses between S. incanum and eggplant (Prohens et al., 2013). Also, characterization of these materials for vigor traits may allow identification of materials potentially useful as rootstocks. In this respect, highly vigorous eggplant wild relatives and interspecific hybrids are increasingly used for eggplant grafting, as they induce precocity and higher yield and many of them are tolerant to biotic and abiotic stresses (Gisbert et al., 2011; Daunay and Hazra, 2012). In the case of eggplant wild relatives, there are a number of studies on their taxonomic and phylogenetic relationships (Vorontsova et al., 2013; Vorontsova and Knapp, in press) and on resistance or tolerance to diseases and pests (Bubici and Cirulli, 2008; Daunay and Hazra, 2012; Naegele et al., 2014). However, to our knowledge there are no comprehensive studies on the morphological and agronomic traits of interest in a set of wild species of the primary and secondary genepools of eggplant and their interspecific hybrids with cultivated eggplant.

Several characterization studies in eggplant have shown that the standardized morphological and agronomic descriptors developed by the European Eggplant Genetic Resources Network (EGGNET; van der Weerden and Barendse, 2007) and the International Board for Plant Genetic Resources (IBPGR, 1990) are well suited to providing a useful morphological and agronomic characterization for eggplant breeders (Prohens et al., 2005; Muñoz-Falcón et al., 2009; Boyaci et al., 2015). EGGNET and IBPGR descriptors have been successfully used for evaluating segregating generations of interspecific crosses between eggplant and related species (Prohens et al., 2012, 2013). In addition to conventional morphological descriptors, fruit phenomics data provide eggplant breeders with relevant information for evaluating the variation in fruit morphology. In this respect, the phenomics tool Tomato Analyzer (Rodríguez et al., 2010) has proved useful for the detailed morphometric analysis of fruit size and shape of eggplant and related materials (Prohens et al., 2012; Hurtado et al., 2013).

Here we characterize cultivated eggplant, wild relatives from the primary and secondary genepools and interspecific hybrids between cultivated eggplant and wild relatives using conventional and Tomato Analyzer descriptors. Apart from providing a characterization of the three types of materials studied and their differences, we aim to evaluate the interest for breeding of different wild relatives using characterization data of the wild relatives and of their interspecific hybrids with eggplant. The information obtained may also provide clues on the interest of wild species and hybrids as potential rootstocks for eggplant.

## MATERIALS AND METHODS

### Plant Material

The plant material included six accessions of cultivated eggplant (S. melongena), 21 accessions of a total of 12 wild species, and 45 interspecific hybrids between the eggplant accessions and seven of the wild species (**Table 1**). The eggplant accessions include materials from both the Occidental (Ivory Coast) and Oriental (Sri Lanka) cultivated genepools (Vilanova et al., 2012; Cericola et al., 2013). Among the wild relatives, three accessions belong to the primary genepool (GP1) species S. insanum, and 18 accessions to secondary genepool (GP2) species, namely S. anguivi (n = 2), S. campylacanthum (n = 3), S. dasyphyllum (n = 1), S. incanum (n = 1), S. lichtensteinii (n = 2), S. lidii (n = 2), S. linnaeanum (n = 2), S. pyracanthos (n = 1), S. tomentosum (n = 1), S. vespertilio (n = 2), and S. violaceum (n = 1). All the accessions are deposited at the germplasm bank of the Universitat Politècnica de València (València, Spain). The 45 interspecific hybrids were obtained after reciprocal crossings between cultivated eggplant and wild relatives (Plazas et al., 2016), resulting in 18 hybrids between eggplant and primary genepool species and 27 hybrids between eggplant and secondary genepool species (**Table 1**). Five plants per accession or interspecific hybrid were grown under open field conditions during the summer season of 2015 at the Universitat Politècnica de València (Valencia, Spain; GPS coordinates of the plot: 39° 28′ 55″ N, 0° 22′ 11″ W; altitude 7 m a.s.l.). Plants were spaced 1.2 m between rows and 1.0 m within the row and distributed according to a completely randomized design. Drip irrigation was applied and 80 g plant<sup>−1</sup> of a 10N–2.2P–24.9K plus micronutrients fertilizer (Hakaphos Naranja; Compo Agricultura, Barcelona, Spain) was applied during the whole cultivation period through the irrigation system. Plants were trained with bamboo canes and pruned when needed. Weeds were removed manually and no phytosanitary treatments were needed.

TABLE 1 | Accessions of cultivated eggplant (*Solanum melongena*) and wild relatives of the primary and secondary genepools, and interspecific hybrids between cultivated eggplant and wild relatives used for the morphological and phenomics characterization.

For the interspecific hybrids, the first and second parentals included in the hybrid code correspond to the female and male, respectively.

### Characterization

All plants were characterized using 27 conventional morphological descriptors based on EGGNET (van der Weerden and Barendse, 2007) and IBPGR (IBPGR, 1990) descriptors (**Table 2**). These morphological descriptors describe different traits of the whole plant (4), leaf (7), inflorescence and flower (7) and fruit (9) and in general display limited GxE interaction (IBPGR, 1990). Except for descriptors concerning the whole plant (e.g., plant growth habit), for which one measurement was taken per plant (i.e., one measurement per replicate), five measurements were taken from each individual plant in order to obtain individual plant averages for the conventional morphological descriptors (i.e., five measurements per replicate). Using a similar approach, five fruits per plant (replicate) were collected at the commercially ripe stage (i.e., physiologically immature) for cultivated eggplant and at a similar physiological stage (when they had attained full size but were not physiologically mature) for wild species and interspecific hybrids. The fruits were cut open longitudinally and scanned using an HP Scanjet G4010 photo scanner (Hewlett Packard, Palo Alto, CA, USA) at a resolution of 300 dpi. Scanned images were subjected to fruit morphometric analysis with the fruit shape phenomics tool Tomato Analyzer version 4 software (Rodríguez et al., 2010). A total of 20 fruit morphometric descriptors were recorded using this tool (**Table 2**).

### Data Analyses

For each trait, the mean, range and coefficient of variation (CV, %) were calculated using average accession or hybrid values of cultivated eggplant (n = 6), wild relatives (n = 21) and interspecific hybrids (n = 45). Means of each accession or hybrid were subjected to analyses of variance (ANOVA) to detect differences among the three groups considered. Significance of differences among group means was evaluated using the Student-Newman-Keuls multiple range test at P = 0.05. Heterosis over mid parent (H; %) for the traits of greater agronomic importance was studied in the interspecific hybrids using the formula H = 100 × ((F<sub>1</sub> − MP)/MP), where F<sub>1</sub> = hybrid mean and MP = mean of the parents. Values of H above 100% indicate that the hybrid is superior to the highest parent, and therefore presents positive heterosis over the highest parent. Principal components analyses (PCA) were performed using pairwise Euclidean distances among accession or hybrid means for standardized characterization data. All the statistical analyses were performed using the Statgraphics Centurion XVI software (StatPoint Technologies, Warrenton, VA, USA).
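As an illustration of the mid-parent heterosis formula given above, the following minimal Python sketch computes H from a hybrid mean and the two parental means. The function name and the numeric values are hypothetical and are not taken from the study, in which the analyses were run in Statgraphics.

```python
def mid_parent_heterosis(f1_mean, parent1_mean, parent2_mean):
    """Heterosis over mid parent, H = 100 * (F1 - MP) / MP, in percent."""
    mid_parent = (parent1_mean + parent2_mean) / 2.0
    if mid_parent == 0:
        raise ValueError("mid-parent value is zero; H is undefined")
    return 100.0 * (f1_mean - mid_parent) / mid_parent


# Hypothetical trait means, for illustration only (not data from this study).
eggplant_mean, wild_mean, hybrid_mean = 60.0, 40.0, 75.0
print(mid_parent_heterosis(hybrid_mean, eggplant_mean, wild_mean))

# A hybrid that scores 0 for a trait (e.g., no prickles) while the parents
# average a positive value gives H = -100, whatever the parental values are.
print(mid_parent_heterosis(0.0, 2.0, 4.0))
```

Because H is expressed relative to the mid-parent value, the sketch refuses a mid-parent value of zero rather than returning an arbitrary number.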

## RESULTS

### Differences between Eggplant, Wild Relatives and Interspecific Hybrids

Significant differences (P < 0.05) were found among average values for the groups constituted by cultivated eggplant, wild relatives and interspecific hybrids for 18 out of the 27 conventional descriptors (**Table 3**). Generally, wild species and interspecific hybrids had larger plant size, greater leaf prickliness, more flowers per inflorescence, and less elongated fruits than the cultivated species. The cultivated species and interspecific hybrids had more anthocyanin pigmentation, larger leaf size, and a greater number of flower parts than the wild species. Flower, fruit pedicel and fruit size had the greatest average values in the cultivated species and the smallest in the wild species, with the interspecific hybrids having intermediate values. The ranges of the three groups overlap for all conventional descriptors except for Leaf Pedicel Length, Corolla Diameter, Fruit Pedicel Length, Fruit Pedicel Diameter, and Fruit Weight, in which all the accessions of the cultivated species presented higher values than any of the wild species.
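The group comparison underlying these results is a one-way ANOVA on accession or hybrid means. A minimal sketch of how the F statistic is obtained is given below; it is written in plain Python with hypothetical data, it does not compute the P value or the Student-Newman-Keuls post hoc test used in the study, and it is not the Statgraphics procedure itself.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` is a list of lists, one list of trait values per group.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    group_means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variation of the group means around the grand mean.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    # Variation of the observations around their own group mean.
    ss_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    )

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical values for three groups, for illustration only.
cultivated = [1.0, 2.0, 3.0]
wild = [2.0, 3.0, 4.0]
hybrids = [5.0, 6.0, 7.0]
print(one_way_anova_f([cultivated, wild, hybrids]))
```

With group sizes of 6, 21, and 45, as for most descriptors here, the same function would report 2 and 69 degrees of freedom.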

All Tomato Analyzer descriptors evaluated, except two (Rectangular and Shoulder Height), displayed significant (P < 0.05) differences among average values for the three groups (**Table 4**). For the seven Tomato Analyzer descriptors related to fruit size the cultivated eggplant presented significantly higher values than wild species, while for Ovoid it had lower values; interspecific hybrids presented intermediate values, in most cases being significantly different from both cultivated eggplant and wild species (**Table 4**). Cultivated eggplant had greater Distal Fruit Blockiness and Ellipsoid values than either wild species or interspecific hybrids, while wild species had higher values for Triangular than either cultivated species or interspecific hybrids. As with the conventional descriptors, the three groups overlap in the ranges of variation for all Tomato Analyzer descriptors except for Perimeter, Area, Height Mid-width, Maximum Height, Curved Height and Circular, in which there is no overlap between the range of variation of cultivated and wild species, with the values of the former being larger than those of the latter (**Table 4**).

### Variation in Eggplant, Wild Relatives, and Interspecific Hybrids

Variation for the conventional and Tomato Analyzer descriptors was found in the materials studied (**Tables 3**, **4**; **Figure 1**). For most traits, more variation both in terms of range and CV was found in the wild species, compared to the cultivated eggplant accessions. For all conventional descriptors there was more variation in the wild species than in the cultivated eggplant, except for Shoot Tip Anthocyanin Intensity and the number of flower parts. Conversely, in the case of Tomato Analyzer descriptors, the range of variation was greater in wild species than in the cultivated eggplant for only five out of the 20 descriptors evaluated (Perimeter, Width Mid-height, Maximum Width, Rectangular, and Ovoid), while for the CV the wild species had a greater value than cultivated eggplant for nine of the descriptors, of which seven are related to fruit size (**Table 4**).
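The variation parameters compared in this section (mean, range and CV per group) and the range-overlap comparisons between groups can be sketched as follows. The example uses only the Python standard library; the function names and fruit weights are hypothetical, and the use of the sample standard deviation for the CV is an assumption, since the text does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize(values):
    """Mean, range and coefficient of variation (CV, %) of a set of means."""
    m = mean(values)
    return {
        "mean": m,
        "min": min(values),
        "max": max(values),
        # Assumption: CV is based on the sample standard deviation (n - 1).
        "cv_percent": 100.0 * stdev(values) / m,
    }


def ranges_overlap(a, b):
    """True if the closed ranges spanned by two groups share any value."""
    return min(a) <= max(b) and min(b) <= max(a)


# Hypothetical fruit weights (g) of accession means, for illustration only.
cultivated = [150.0, 200.0, 250.0]
wild = [2.0, 10.0, 30.0]
print(summarize(cultivated))
print(summarize(wild))
print(ranges_overlap(cultivated, wild))
```

Because the CV is scaled by the mean, a group with a narrow absolute range but a small mean (such as the hypothetical wild fruits here) can still show the larger CV, which is why range and CV are reported separately.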

#### TABLE 2 | Descriptors used for phenotyping.


The list displays conventional morphological descriptors based on EGGNET (van der Weerden and Barendse, 2007) and IBPGR (1990) descriptors lists and phenomics fruit morphometric descriptors based on Tomato Analyzer software (Rodríguez et al., 2010) used for the characterization of accessions of cultivated eggplant (S. melongena; n = 6), wild relatives (n = 21) and interspecific hybrids between cultivated eggplant and wild relatives (n = 45).

#### TABLE 3 | Variation parameters for conventional morphological descriptors.



Values represent the mean, range (between brackets), and coefficient of variation (CV; %) for the conventional morphological descriptors studied in accessions of cultivated eggplant (S. melongena; n = 6), wild relatives (n = 21) and interspecific hybrids between cultivated eggplant and wild relatives (n = 45 except for fruit traits in which n = 42) and significance of mean differences among the three groups.

<sup>a</sup>Means within rows separated by different letters are significantly different according to the Student-Newman-Keuls test.

For interspecific hybrids a large range of variation was observed for many conventional descriptors, with variation parameters generally larger than those of the cultivated species and smaller than those of the wild species. In this respect, the range of variation was larger than that of the cultivated eggplant for all but nine conventional descriptors, while compared to wild species it was larger for 11 descriptors (**Table 3**). The coefficient of variation for conventional descriptors was also larger than in the cultivated species for all traits except nine (Plant Growth Habit, Stem Diameter, Leaf Blade Lobing, Leaf Prickles, Number of Flower Prickles, Number of Sepals, Number of Petals, Number of Stamens, and Fruit Apex Shape) and larger than that of the wild species for eight descriptors (Plant Height, Leaf Blade Tip Angle, Fruit Pedicel Length, Fruit Pedicel Diameter, Fruit Length/Breadth Ratio, Fruit Cross Section, Fruit Apex Shape, and Fruit Calyx Prickles; **Table 3**).

Regarding the variation for Tomato Analyzer traits, the range of variation in the interspecific hybrids was greater than those of cultivated eggplant and wild species for all traits except five in the case of cultivated eggplant, which correspond to fruit shape indexes and Circular, and only one (Ovoid) in the case of wild species (**Table 4**). Also, larger values were obtained in the CV for Tomato Analyzer descriptors in the interspecific hybrids compared to the cultivated species for all traits but seven. When compared to wild species the interspecific hybrids also presented higher CV for all traits, except four (**Table 4**).

### Multivariate Analysis

The first three components of the principal components analysis performed with all conventional and Tomato Analyzer descriptors accounted for 58.8% of the total variation among accession means, with the first, second and third components accounting for 37.2, 12.0, and 9.5% of the total variation, respectively (**Table 5**). The first principal component was positively correlated to Corolla Diameter, fruit size and elongated fruit shape (**Table 5**). The second principal component was positively correlated to Plant Height and to obovoid fruit shape. The third principal component was positively correlated to Plant Growth Habit (i.e., prostrate habit), to multiple plant, leaf and corolla size traits, to a higher number of flower parts (sepals, petals and stamens) and to an increased prickliness in leaves, and flower and fruit calyces (**Table 5**).
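To make the notion of "percentage of total variation accounted for by a component" concrete, the sketch below standardizes a small trait matrix, builds its correlation matrix and extracts the first principal component by power iteration. It assumes that the PCA on standardized data corresponds to an eigen-decomposition of the correlation matrix; the data and function names are hypothetical, only the first component is extracted, and this is not the Statgraphics implementation used in the study.

```python
from statistics import mean, stdev


def standardize(rows):
    """Scale each trait (column) to mean 0 and sample standard deviation 1."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sds = [stdev(c) for c in cols]
    return [[(x - mu) / sd for x, mu, sd in zip(row, mus, sds)] for row in rows]


def correlation_matrix(z):
    """Correlation matrix of already standardized data (rows = accessions)."""
    n, p = len(z), len(z[0])
    return [
        [sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
        for i in range(p)
    ]


def first_component(corr, iterations=500):
    """Dominant eigenvector and eigenvalue of a symmetric matrix (power iteration)."""
    p = len(corr)
    # Non-uniform start vector; a start exactly orthogonal to the dominant
    # eigenvector would fail, which this simple sketch does not guard against.
    v = [float(i + 1) for i in range(p)]
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    eigenvalue = sum(
        v[i] * sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)
    )
    return v, eigenvalue


# Hypothetical accession means for three traits, for illustration only.
data = [
    [10.0, 200.0, 1.2],
    [12.0, 260.0, 1.0],
    [30.0, 40.0, 2.5],
    [28.0, 20.0, 2.9],
    [20.0, 120.0, 1.8],
]
corr = correlation_matrix(standardize(data))
loadings, eigenvalue = first_component(corr)
# The trace of a correlation matrix equals the number of traits, so the
# share of total variation explained by the component is eigenvalue / p.
print(loadings, 100.0 * eigenvalue / len(corr))
```

Standardizing first gives every descriptor equal weight regardless of its units, which matters when, as here, counts, lengths, weights and dimensionless shape indexes are analyzed together.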

The projection of eggplant, wild species and interspecific hybrids in the PCA plot reveals that although considerable diversity exists in both eggplant (black squares) and wild species (white symbols), the interspecific hybrids (gray symbols) present a more scattered distribution in the PCA plot (**Figures 2**, **3**). Interspecific hybrids with the primary genepool species S. insanum plot closer to the cultivated eggplant and are intermingled with it in the PCA graphs. In contrast, interspecific hybrids with secondary genepool species plot closer to the wild species and are also intermingled with them (**Figures 2**, **3**). The first component separates the group formed by eggplant and the interspecific hybrids with the primary genepool species S. insanum, which present positive values for this component, from the group formed by all the wild species and interspecific hybrids with secondary genepool species. Among the interspecific hybrids with secondary genepool species, those with S. incanum and S. lichtensteinii are the closest to eggplant in this first component (**Figures 2**, **3**). When considering the second component, all eggplant accessions but one have positive values, while interspecific hybrids with S. insanum are equally distributed in the positive and negative values of this second component (**Figure 2**). Primary genepool wild species S. insanum and all secondary genepool species, except S. campylacanthum, S. pyracanthos, S. tomentosum and one accession of each of S. anguivi and S. lidii, have negative values for this second component. When considering interspecific hybrids with secondary genepool species, although they are intermingled with the wild species for this second component, most of the hybrids present positive values, with the exceptions being the hybrids with S. lichtensteinii (four out of five), S. linnaeanum and one of each of the interspecific hybrids with each of the species S. anguivi and S. incanum (this latter with a value very close to 0). Notably, the highest values for this second component correspond to interspecific hybrids with S. anguivi (**Figure 2**). For the third component, both eggplant and the interspecific hybrids with S. insanum are scattered and display positive or negative values (**Figure 3**). Most wild species accessions have negative values for this third component, except the accessions of S. dasyphyllum, S. linnaeanum, S. pyracanthos, and S. violaceum, as well as one accession of S. incanum (with values close to 0). The lowest values for this component are those of S. lidii, S. vespertilio and S. tomentosum (**Figure 3**). On the other hand, all interspecific hybrids with secondary genepool species, with the exception of two interspecific hybrids with S. anguivi, present positive values for this third component. In this case, the highest values for the third component correspond to interspecific hybrids with S. dasyphyllum, S. lichtensteinii, and S. incanum (**Figure 3**).

#### TABLE 4 | Variation parameters for Tomato Analyzer phenomics fruit descriptors.

Mean, range (between brackets), and coefficient of variation (CV; %) for the Tomato Analyzer phenomics fruit morphometric descriptors studied in accessions of cultivated eggplant (S. melongena; n = 6), wild relatives (n = 21) and interspecific hybrids between cultivated eggplant and wild relatives (n = 42) and significance of mean differences among the three groups. <sup>a</sup>Means within rows separated by different letters are significantly different according to the Student-Newman-Keuls test.

genepool S. insanum (p1); wild species of secondary genepool S. anguivi (s1), S. campylacanthum (s2), S. dasyphyllum (s3), S. incanum (s4), S. lichtensteinii (s5), S. lidii (s6), S. linnaeanum (s7), S. pyracanthos (s8), S. tomentosum (s9), S. vespertilio (s10), and S. violaceum (s11); interspecific hybrids between eggplant and primary genepool species S. insanum (hp1); and, interspecific hybrids between eggplant and secondary genepool species S. anguivi (hs1), S. dasyphyllum (hs2), S. incanum (hs3), S. lichtensteinii (hs4), S. linnaeanum (hs5), and S. tomentosum (hs6). Fruits are not depicted at the same scale; the size of the grid cells is 1 × 1 cm.

### Traits of Agronomic Interest in Wild Species

The 12 wild species evaluated presented considerable differences for traits of agronomic interest (**Table 6**). Important differences were found for vegetative traits. For example, the tallest plants were those of S. anguivi, which also presented thick stems (**Table 6**). Important differences were also found for Leaf Blade Lobing. The greatest leaf prickliness was observed in S. dasyphyllum, S. pyracanthos, and S. violaceum, while S. anguivi and S. tomentosum did not present prickles in the leaves. The largest leaf blades were those of S. dasyphyllum and S. campylacanthum, while the smallest were those of S. tomentosum, with a Leaf Blade Length of 5.2 cm (**Table 6**). When considering flower and fruit traits, the two species with the largest number of flowers per inflorescence were S. lidii and S. vespertilio, with more than 13 flowers/inflorescence, while the smallest number was found in S. insanum (**Table 6**). Important differences were also observed for Corolla Color. All wild species had five petals (and sepals and stamens), except S. lidii and S. vespertilio, which had only four. The largest fruits were those of S. incanum and S. lichtensteinii, with average values above 25 g, more than 10-fold heavier than those of S. anguivi, S. lidii, S. pyracanthos, S. tomentosum, S. vespertilio, and S. violaceum. The highest calyx prickliness was observed in S. linnaeanum, S. pyracanthos, and S. violaceum, while S. anguivi, S. lidii, and S. vespertilio did not present calyx prickles (**Table 6**). The most elongated fruits were those of S. incanum, while the most flattened ones were those of S. dasyphyllum and S. lidii (**Table 6**).

### Heterosis in Interspecific Hybrids

Interspecific hybrids between eggplant and its wild relatives generally displayed positive heterosis for plant size traits, with average heterosis values of up to 90.5% for Plant Height and 46.2% for Stem Diameter in the hybrids of eggplant with S. dasyphyllum (**Table 7**). The only negative value observed for these traits was for Stem Diameter in the interspecific hybrid with S. linnaeanum. Most interspecific hybrids presented higher prickliness than their parent species, and in consequence, very high average values for heterosis for Leaf Prickles are observed, with values between 91.0% for S. dasyphyllum and 800.0% for S. tomentosum. Leaf size traits were also, in general, heterotic in the interspecific hybrids, with the exception of Leaf Pedicel Length in S. dasyphyllum and S. linnaeanum. The same phenomenon was observed for the Number of Flowers per Inflorescence, with values of up to 87.7% in the hybrids with S. tomentosum (**Table 7**). The pigmentation of the corolla (Corolla Color) also presented average positive heterosis values in the hybrids of eggplant with five out of the seven wild species, the exception being interspecific hybrids with S. anguivi and S. tomentosum. The number of flower parts, represented by the Number of Petals, displayed low absolute values for heterosis in all cases (**Table 7**).

Values represent the correlation coefficients for the three first principal components in the collection of eggplant (S. melongena), wild relatives and interspecific hybrids evaluated. Only correlations with absolute values ≥0.150 have been listed.

Regarding Fruit Weight, considerable differences were observed between the hybrids with the primary genepool species (S. insanum) on one hand, and the hybrids with secondary genepool species on the other. While the hybrids with S. insanum displayed a small negative average heterosis (−5.5%), not significantly different from 0, the heterosis for Fruit Weight in the hybrids with secondary genepool species was highly negative, with values ranging from −60.4% for hybrids with S. dasyphyllum to −98.6% for hybrids with S. tomentosum (**Table 7**). As occurred for Leaf Prickles, positive heterosis values, although of smaller magnitude, were observed for Fruit Calyx Prickles, with the exception of the hybrids with S. anguivi, which did not present prickles in the calyx, and in consequence had a heterosis value of −100%. Finally, for fruit shape, the hybrids with the primary genepool species S. insanum presented positive heterosis, while those with secondary genepool species had negative heterosis values (**Table 7**).
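The heterosis percentages reported above can be read with the help of a short sketch. It assumes the usual definition of heterosis over the mid-parent value, H = 100 × (F1 − MP)/MP, where MP is the mean of the two parents; the function name and the numeric inputs are hypothetical and are not data from this study.

```python
def mid_parent_heterosis(f1_mean: float, parent_a_mean: float, parent_b_mean: float) -> float:
    """Return heterosis over the mid-parent value, in percent.

    Assumes the conventional definition H = 100 * (F1 - MP) / MP,
    with MP the arithmetic mean of the two parental means.
    """
    mid_parent = (parent_a_mean + parent_b_mean) / 2.0
    if mid_parent == 0:
        raise ValueError("mid-parent value is zero; heterosis is undefined")
    return 100.0 * (f1_mean - mid_parent) / mid_parent


# Hypothetical trait means (illustrative only, not values from Table 7).
# A hybrid scoring 0 for a trait that is present in at least one parent
# always gives -100%, whatever the parental values are.
print(mid_parent_heterosis(f1_mean=0.0, parent_a_mean=1.0, parent_b_mean=3.0))  # -100.0
print(mid_parent_heterosis(f1_mean=3.0, parent_a_mean=1.0, parent_b_mean=3.0))  # 50.0
```

Under this definition, a hybrid that shows no calyx prickles at all necessarily has a heterosis value of −100%, which matches the value reported for the hybrids with S. anguivi.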

#### DISCUSSION

Crop wild relatives are widely recognized as an invaluable genetic resource for breeding, in particular for broadening the genetic base of crops with narrow genetic diversity, and as sources of variation for traits of interest in breeding crops, including adapting them to the challenges posed by climate change (Dempewolf et al., 2014). Modern varieties of many important crops carry introgressions from wild species resulting from breeding programmes performed in the last 100 years (Hajjar and Hodgkin, 2007). One of the most outstanding examples is tomato, where modern commercial hybrids carry different combinations of 15 different introgressions from different wild species (Díez and Nuez, 2008; Sabatini et al., 2013). However, in the case of eggplant, despite being one of the most important vegetables and being intercrossable with many wild relatives, there are few reports on the use of the variation available in the wild species for eggplant breeding (Daunay and Hazra, 2012; Rotino et al., 2014; Liu et al., 2015) and no modern commercial varieties of eggplant carrying introgressions from wild species are known to us.

In our study we have evaluated six accessions of cultivated eggplant, 21 accessions of 12 wild species, and 45 interspecific hybrids of cultivated eggplant with seven wild species. This represents the largest study up to now on morphological and agronomic traits for breeding of this type of materials. As expected, many differences were found within and among cultivated eggplant, wild relatives and the interspecific hybrids for the conventional descriptors used, confirming the utility of the EGGNET (van der Weerden and Barendse, 2007) and IBPGR (1990) conventional morphological descriptors and Tomato Analyzer traits (Rodríguez et al., 2010) used for evaluating eggplant wild relatives and interspecific hybrids (Prohens et al., 2013).

Also, many differences were found for the traits studied among cultivated eggplant, wild species and interspecific hybrids. Although many of the wild species of eggplant thrive in arid and semi-arid conditions (Knapp et al., 2013; Vorontsova and Knapp, in press), when grown under the favorable conditions of cultivated environments, the wild species and their interspecific hybrids generally display a high vigor, expressed as average values for plant height and stem diameter above those of cultivated eggplant. This is of interest for developing new rootstocks, which generally require high vigor (Gisbert et al., 2011), and opens the way to exploiting several of the wild species evaluated and their interspecific hybrids as potential new rootstocks for eggplant. Another important trait of agronomic interest for which there were considerable differences among groups was prickliness, which was much greater in wild species and interspecific hybrids, confirming that alleles from the cultivated eggplant are recessive (Doganlar et al., 2002; Gramazio et al., 2014; Portis et al., 2015). The number of flowers per inflorescence was also much greater in wild species and interspecific hybrids. This trait is very important in eggplant breeding, as a reduced value of this trait results in increased fruit size uniformity (Sękara and Bieniasz, 2008). Fruit size and shape, which are of great relevance for breeding (Daunay and Hazra, 2012; Portis et al., 2015), also differed considerably among the three groups, with the interspecific hybrids presenting intermediate values, although in most cases they were closer to those of the wild species, indicating dominance of the genes of the latter (Doganlar et al., 2002).

The much higher variation observed in wild species and interspecific hybrids for vegetative, flower and inflorescence traits compared to cultivated eggplant was expected, as we were comparing a single species with an admixture of different wild species or hybrids, which present a much higher genetic diversity (Meyer et al., 2012; Särkinen et al., 2013; Vorontsova et al., 2013). However, for traits related to the fruit size and shape much higher variation was observed in the cultivated eggplant than in the wild species, confirming the general observation that the morphological variation in the organ for which a crop is domesticated (in this case the fruit) increases during domestication (Meyer and Purugganan, 2014). Amazingly, in the case of interspecific hybrids a larger variation was found for most fruit size and shape traits than in the cultivated eggplant. Although most interspecific hybrids were more similar to the wild species, in some cases they were intermediate, revealing that different genic control mechanisms must exist for fruit size and shape among the wild relatives of eggplant. In this respect, the multivariate analysis clearly shows that interspecific hybrids with the primary genepool species S. insanum are morphologically closer to the cultivated eggplant, while the hybrids with secondary genepool species present a general morphology closer to that of the wild species. These results may support the hypothesis that S. insanum is the wild ancestor of cultivated eggplant (Knapp et al., 2013), as domestication should be easier when genes for domestication traits from the wild species display intermediate dominance rather than full dominance.

The study of individual wild species suggests that S. anguivi, S. campylacanthum, S. pyracanthos, and S. violaceum may be of interest for increasing the vigor of cultivated eggplant or for use as rootstocks. Also, wild eggplant species usually have undesirable traits (e.g., prickliness, small fruit size, etc.) that have to be removed during breeding (Rotino et al., 2014). In this case, the most desirable wild species are those that are most similar to the crop for these traits. For example, the lack of prickles or very low prickliness of S. anguivi, S. campylacanthum, and S. tomentosum is a very favorable trait for breeders (Daunay and Hazra, 2012). Regarding fruit weight, the wild species with greater fruit weight should be the most interesting for breeders in order to recover fruit size in few backcross generations. In this case, S. insanum, S. dasyphyllum, and S. lichtensteinii should be the most interesting candidates if a rapid recovery of fruit size is desired. In any case, Prohens et al. (2013) showed that fruit size recovers quickly even in first backcrosses with the wild species S. incanum, which has an intermediate fruit size among wild species.

Although differences were observed among interspecific hybrids from different wild species, hybrids were in general vigorous, displaying heterosis for vigor traits. This phenomenon had already been described in interspecific hybrids with S. incanum (Gisbert et al., 2011; Prohens et al., 2013), and our results suggest that this is a common phenomenon in the hybrids between eggplant and wild relatives. Amazingly, most interspecific hybrids were highly heterotic for prickliness, with heterosis values over 100%. Prickles even appeared in interspecific hybrids with wild species that were not prickly, like S. tomentosum. In previous works, heterosis for prickliness had already been described in interspecific crosses in eggplant (Prohens et al., 2012; Devi et al., 2015; Plazas et al., 2016). Several studies with segregating populations of S. linnaeanum and S. insanum show that differences in prickliness between cultivated eggplant and wild relatives are under the control of a few QTL (Doganlar et al., 2002; Gramazio et al., 2014) and therefore prickliness should be easily removed in backcross generations. Although for fruit size traits negative heterosis was generally observed in the interspecific hybrids, indicating a greater similarity to the wild species, interspecific hybrids with the primary genepool species S. insanum presented values close to zero, similarly to intraspecific hybrids of eggplant (Rodríguez-Burruezo et al., 2008), indicating intermediate dominance and values intermediate between both parental species. However, hybrids with wild species from the secondary genepool displayed highly negative heterosis, in some cases close to −100%, as in interspecific hybrids with S. anguivi and S. tomentosum, suggesting that in these materials it may be more difficult to recover fruit size in the backcross generations.

TABLE 7 | Heterosis over mid parent values (%; ±SE) based on accession and interspecific hybrid means.

Values are presented for traits of agronomic interest in the interspecific hybrids of eggplant with seven wild relatives (one from the primary genepool, S. insanum; and six from the secondary genepool).

<sup>a</sup>For S. dasyphyllum data are available for four accessions for plant traits and only for one accession for fruit traits.

In conclusion, the characterization with conventional descriptors and the Tomato Analyzer phenomics tool has allowed a detailed characterization of eggplant, close wild relatives and their interspecific hybrids. The high variation among wild species allowed the identification of sources of variation and of the most promising species for traits of interest for eggplant breeding. The fact that interspecific hybrids with the primary genepool species S. insanum are intermediate or close to eggplant for many traits may facilitate the use of this species in introgression breeding and supports previous evidence that this species is the ancestor of cultivated eggplant. Also, the high vigor of most interspecific hybrids may be directly exploited by using them as rootstocks. The information obtained here on phenotypic characteristics and heterosis of wild species and interspecific hybrids is of interest for eggplant breeding. Given the adaptation of many wild species to stressful conditions, their utilization in eggplant breeding may result in the development of a new generation of cultivars adapted to climate change challenges.

### AUTHOR CONTRIBUTIONS

JP, SV, PG, and MP conceived and designed the research; PK and MP performed the phenotypic and phenomics characterization; PK, JP, and PG analyzed the data; JP, SV, PG, and MP wrote the manuscript. All authors read and approved the manuscript.

## FUNDING

This work was undertaken as part of the initiative "Adapting Agriculture to Climate Change: Collecting, Protecting and Preparing Crop Wild Relatives" which is supported by the Government of Norway. The project is managed by the Global Crop Diversity Trust with the Millennium Seed Bank of the Royal Botanic Gardens, Kew and implemented in partnership with national and international gene banks and plant breeding institutes around the world. For further information see the project website: http://www.cwrdiversity.org/. This work has also been funded in part by European Union's Horizon 2020 research and innovation programme under grant agreement No 677379 (G2P-SOL) and from Spanish Ministerio de Economía y Competitividad and FEDER (grant AGL2015-64755-R). Prashant Kaushik is grateful to ICAR for a pre-doctoral grant. Pietro Gramazio is grateful to Universitat Politècnica de València for a pre-doctoral (Programa FPI de la UPV-Subprograma 1/2013 call) contract.

#### Kaushik et al. Phenotyping of Eggplant Relatives and Hybrids

#### REFERENCES


on eggplant domestication. Mol. Phylogenet. Evol. 63, 685–701. doi: 10.1016/j.ympev.2012.02.006


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Kaushik, Prohens, Vilanova, Gramazio and Plazas. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Direct Comparison of Remote Sensing Approaches for High-Throughput Phenotyping in Plant Breeding

#### Maria Tattaris <sup>1</sup> , Matthew P. Reynolds <sup>1</sup> \* and Scott C. Chapman<sup>2</sup>

*1 International Maize and Wheat Improvement Center, Texcoco, Mexico, <sup>2</sup> CSIRO Agriculture, Queensland Bioscience Precinct, Queensland, QLD, Australia*

Remote sensing (RS) of plant canopies permits non-intrusive, high-throughput monitoring of plant physiological characteristics. This study compared three RS approaches: a low-flying UAV (unmanned aerial vehicle), proximal sensing, and satellite-based imagery. Two physiological traits were considered, canopy temperature (CT) and a vegetation index (NDVI), to determine the most viable approaches for large scale crop genetic improvement. The UAV-based platform achieves plot-level resolution while measuring several hundred plots in one mission via high-resolution thermal and multispectral imagery measured at altitudes of 30–100 m. The satellite measures multispectral imagery from an altitude of 770 km. Information was compared with proximal measurements using IR thermometers and an NDVI sensor at a distance of 0.5–1 m above plots. For robust comparisons, CT and NDVI were assessed on panels of elite cultivars under irrigated and drought conditions, in different thermal regimes, and on un-adapted genetic resources under water deficit. Correlations between airborne data and yield/biomass at maturity were generally higher than equivalent proximal correlations. NDVI was derived from high-resolution satellite imagery only for larger plots (8.5 × 2.4 m) due to restricted pixel density. Results support the use of UAV-based RS techniques for high-throughput phenotyping for both precision and efficiency.

#### Keywords: UAV, multispectral, thermal, indices, airborne imagery, high-throughput phenotyping

#### Edited by:

*Rodomiro Ortiz, Swedish University of Agricultural Sciences, Sweden*

#### Reviewed by:

*Michael Abberton, International Institute of Tropical Agriculture, Nigeria; Anyela Valentina Camargo Rodriguez, Aberystwyth University, UK*

#### \*Correspondence:

*Matthew P. Reynolds, m.reynolds@cgiar.org*

#### Specialty section:

*This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science*

Received: *23 April 2016* Accepted: *15 July 2016* Published: *03 August 2016*

#### Citation:

*Tattaris M, Reynolds MP and Chapman SC (2016) A Direct Comparison of Remote Sensing Approaches for High-Throughput Phenotyping in Plant Breeding. Front. Plant Sci. 7:1131. doi: 10.3389/fpls.2016.01131*

## INTRODUCTION

High-throughput phenotyping, particularly through the application of remote sensing tools, offers a rapid and non-destructive approach to plant screening (White et al., 2012). Recent advances in remote sensing technologies as well as in data processing have increased applications in both field and controlled growing conditions (Leinonen and Jones, 2004; Jones et al., 2007; Möller et al., 2007; Swain and Zaman, 2012; Araus and Cairns, 2014) with important consequences for crop improvement.

Remotely sensed spectral readings are based on the interaction between incoming radiation and target objects, resulting in a characteristic signature of reflected light. Such signatures are typically used to calculate spectral indices, which are a function of the light absorption properties of the plant at given wavelengths (e.g., see Tables 7.1–7.3 in Mullan, 2012 and Table 2 in Zarco-Tejada et al., 2013). Two commonly used traits for high-throughput screening are the Normalized Difference Vegetation Index (NDVI) and canopy temperature (CT). NDVI is calculated using wavelengths within the NIR (near infrared) and VIS (visible) regions of the electromagnetic spectrum. NDVI relates to chlorophyll content due to absorption features of the molecule, and hence to the photosynthetic capacity of the plant. CT, which is measured from emitted infrared radiation, can be used as a tool to indirectly evaluate the transpiration rate of a plant (Berliner et al., 1984; Peñuelas et al., 1992). Based mainly on ground-based proximal sensing approaches, CT shows a robust association with plant performance, especially under stress, being intimately associated with water status and stomatal conductance (Blum et al., 1982; Berliner et al., 1984; Amani et al., 1996), while NDVI can estimate relative crop biomass at different growth stages (Babar et al., 2006) as well as N deficiency and crop senescence rate (Blum et al., 1982; Reynolds et al., 1994, 1998; Raun et al., 2001; Babar et al., 2006; Olivares-Villegas et al., 2007).

Notwithstanding the examples cited above, proximal remote sensing methods can lose precision at high-throughput due to changes in environmental conditions between the start and end of measurements (typically a time period of one to several hours for breeding trials). Satellite imagery has the advantage of covering large areas instantaneously, but generally does not offer the spatial (sub-meter) and temporal (weekly/daily) resolution required for breeding experiments. Low level, airborne remote sensing measurements have the advantage in that resolution is at plot level while at the same time providing the possibility of instantaneously capturing multiple plots at a practical breeding scale at a high temporal resolution (Araus and Cairns, 2014; Chapman et al., 2014).

While a body of literature has shown the value of airborne derived spectral indices to estimate environmentally determined performance traits for a number of crops (Shanahan et al., 2001; Champagne et al., 2002; González-Dugo et al., 2006; Berni et al., 2009; Zhang et al., 2009; Dupin et al., 2011; Swain and Zaman, 2012; Zarco-Tejada et al., 2012), the use of UAVs to increase throughput for breeding purposes and the focus on genetic effects within one agronomic treatment is relatively new (Lelong et al., 2008; Chapman et al., 2014; Díaz-Varela et al., 2015; Zaman-Allah et al., 2015). The UAV approach has obvious potential to increase throughput but the issue of precision relative to other approaches has not been examined. Moreover, data from UAVs have not been compared with satellite derived imagery for phenotyping applications.

The work presented here aims to demonstrate the potential of low level thermal and multispectral UAV imagery and high-resolution multispectral satellite imagery for the derivation of spectral indices of experiments that comprise hundreds of plots (**Table 1**) growing in realistic field environments. A methodology was developed with three main objectives. The first was to compare data derived from the UAV with proximal sensors, to determine how well they relate to each other, and their relative ability to predict biomass and yield of wheat. A second objective was to compare data derived from the UAV with satellite imagery for different sized experimental breeding plots, as well as with equivalent data at ground level. The focus of the UAV measurements was the derivation of NDVI and a spectral index relating to canopy temperature, and similarly NDVI and CT were measured using proximal instruments on the ground. For the satellite imagery, NDVI was calculated. A third objective was to evaluate the robustness of the UAV derived indices as selection tools by examining their relationship with crop performance characteristics of different classes of breeding material growing in different simulated target environments. Specifically, the screening traits were measured on advanced breeding lines under optimal, heat stressed, and water deficit conditions, while un-adapted genetic resources were evaluated under water deficit.

## MATERIALS AND METHODS

#### Study Site

Trials were located at an experiment station of the International Maize and Wheat Improvement Center (CIMMYT) in the Sonoran desert, close to Ciudad Obregon, NW Mexico (27°20′ N; 109°54′ W; 38 m above sea level). Environmental and management details of this area are given in Sayre et al. (1997). Five trials made up of elite lines and un-adapted genetic resources were studied in three different environments (**Table 1**): optimal irrigated (OPT), drought stress (DRT) (Gutiérrez-Rodríguez et al., 2004), and hot-irrigated (HOT) (Pinto et al., 2010). The trials named Elite OPT, Elite HOT 1, Elite HOT 2, and Elite DRT are made up of advanced spring wheat lines from CIMMYT adapted to the optimal, hot-irrigated and drought stress environments, respectively. The trial denoted as Gen Res DRT, sown under drought stress, is made up of landraces mainly from Mexico, northern Africa and western central Asia, chosen for potential expression of drought adaptive traits. All trials were sown under an alpha-lattice design, with either two or three replications.

#### Proximal Data Collection

Grain yield (g m<sup>−2</sup>) and dry biomass weight (g m<sup>−2</sup>) were estimated at maturity for each plot following the methods described in Pask et al. (2012) (see **Table 1** for harvest dates). Key phenological stages of emergence, heading, anthesis, and physiological maturity were recorded for each plot (Pask et al., 2012).

Canopy temperature (CT) was recorded at ground level using the Sixth Sense LT300 handheld infrared thermometer. Measurements were made along each of the plots from a distance of ∼0.5 m above canopy, angled to avoid bare soil (about 60° to nadir) and directed specifically at the part of the plot most exposed to the sun (i.e., with the sun behind the observer), when cloud cover was minimal and at times of low wind speed (Pask et al., 2012).

TABLE 1 | Details of the five trials under the three environments of DRT (drought), OPT (irrigated) and HOT (hot irrigated).

*Harvest date indicates the approximate date at which harvest was made for yield and biomass estimates. Measurement dates of ground-based, UAV, and satellite data used for comparisons.*

NDVI was measured at ground level with the Trimble GreenSeeker 505 Hand-Held active sensor. This instrument emits and measures light at 656 and 774 nm. Measurements were made close to noon, when the plant canopy and soil surface were dry, with the sensor held horizontally about 0.5 m above the canopy such that the field of view (FOV) is directly above the plot and centered over the middle row (Pask et al., 2012). NDVI allows for the estimation of vegetation present in each measurement via Equation 1 (Rouse et al., 1973):

$$NDVI = \frac{NIR - R}{NIR + R} \tag{1}$$

where NIR and R are the measured reflectance in the NIR and red spectral bands, respectively (774 and 656 nm in the case of the GreenSeeker). **Table 1** details the measurement dates for the proximal instruments.
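As an illustration of Equation 1, the following minimal sketch computes NDVI from paired NIR and red reflectance values. The function name, the use of NumPy, and the choice to return NaN where NIR + R is zero are assumptions made for this example; the reflectance values are hypothetical.

```python
import numpy as np


def ndvi(nir, red):
    """Compute NDVI = (NIR - R) / (NIR + R) element-wise (Equation 1).

    Entries where NIR + R equals zero are returned as NaN
    (a choice made for this sketch).
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    denom = nir + red
    out = np.full(denom.shape, np.nan)
    # Divide only where the denominator is non-zero; other entries stay NaN.
    np.divide(nir - red, denom, out=out, where=denom != 0)
    return out


# Hypothetical reflectance values: a dense canopy, a sparse canopy, bare soil.
nir_values = [0.50, 0.35, 0.30]
red_values = [0.05, 0.15, 0.25]
print(ndvi(nir_values, red_values))
```

Dense green canopies reflect strongly in the NIR and absorb red light, so their NDVI approaches 1, whereas bare soil, with similar reflectance in both bands, yields values close to 0.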

#### UAV Data Collection

Aerial imagery was collected via the AscTec Falcon 8 Unmanned Aerial Vehicle (UAV) (**Figure 1**). The 8-rotor UAV has a maximum 750 g payload; hence it has the ability to fly small, lightweight instruments. The flight system includes an onboard in-built GPS and a Mobile Ground Station (**Figure 1**, inset). Aerial images were collected with two cameras mounted separately on the UAV: the Tetracam ADC Lite multispectral camera (2048 × 1536 pixels for red, green, and NIR bands together) and the FLIR Tau 640 LWIR uncooled thermal imaging camera (640 × 512 pixels). See **Table 2** for specifications of the cameras. An 8000 mAh lithium battery powers the UAV and cameras, providing ∼15-min flight time. Several batteries allow for multiple flights in one session. The ADC Lite Tetracam takes photos in the green, red and NIR regions of the electromagnetic spectrum (**Figure 2A**), allowing for the calculation of NDVI, while the FLIR thermal camera is used to derive a thermal index relating to the CT of the target plots. In the specification used, the thermal camera records analog video (integrated over 7.5–13 µm), which is subsequently converted to still images for processing (**Figure 2B**).

the Mobile Ground Station (inset).

TABLE 2 | Specifications of the two cameras mounted on the UAV.


FIGURE 2 | (A) Raw image of the Gen Res DRT trial within the drought environment, taken using the ADC Lite Tetracam on the UAV at a height of approximately 100 m. Ground dimensions of plots are 2 × 0.8 m, with arrows representing the direction of proximal measurements. Assuming a measurement time of 10 s per plot, the time taken to complete measurements using proximal sensors is ∼69 min for this trial, compared to several seconds with the UAV. (B) Raw image of a "HOT" trial extracted from video footage from the FLIR Tau thermal camera. Flight altitude was ∼30 m. Ground dimensions of plots are 2 × 0.8 m. (C) Pan-sharpened WV-2 imagery of Elite OPT. (D) Pan-sharpened imagery of a trial containing smaller plots, which did not allow for the extraction of NDVI as plots were mixed within pixels.

#### Satellite Imagery Description

Satellite imagery, acquired on 6th April 2013, was obtained from the commercial DigitalGlobe WorldView-2 (WV-2) satellite. The imagery includes an 8-band multispectral image (bands between 396 and 1043 nm) and a panchromatic image (447–808 nm), with spatial resolutions of 1.85 and 0.46 m, respectively. The georeferenced, unprocessed images cover ∼25 km<sup>2</sup>, including the whole of the CIMMYT research station.

**Table 1** presents the measurement dates for the relevant remotely sensed data for each of the trials. Proximal and airborne data were compared using the closest available measurement dates.

#### Image Processing and Analysis

#### UAV Imagery

Processing was carried out using ENVI version 5.0 (Exelis Visual Information Solutions, Boulder, Colorado). Radiometric distortions, e.g., lens vignetting, were corrected by applying a cross-track illumination correction, which removes broadband variation without affecting narrowband features. Geometric distortions were corrected using a "warping" procedure, by which an image is resampled to match the geometry of a "base" image or a vector map via the selection of Ground Control Points (GCPs). Images were subsequently mosaicked together by identifying overlapping regions within images.

FIGURE 3 | Example of the image processing using the UAV-mounted FLIR Tau image of the "HOT" trial shown in Figure 2B, where a mask is applied to remove any non-vegetation pixels by applying a threshold for each pixel value. This is followed by the detection of each plot using pre-defined location parameters (red rectangles) and the removal of high variance pixels (using the histogram of the pixel values of each plot). An average of pixel values over each band is taken to get a value per band per plot. This value is subsequently used to calculate the target indices.

A camera-specific mask is applied to the image/mosaic of each trial, via pixel band ratio thresholds, to differentiate between vegetation and non-vegetation pixels and to remove the latter (e.g., soil), as different materials can be distinguished by their pixel signal. An algorithm is then applied for the automatic detection of plots using user-defined parameters, for example the plot size in pixels and the distance between plots in pixels. An average across all bands is calculated for each pixel, and pixels within each plot that exhibit high variance are removed to eliminate any non-vegetation pixels that the mask may have missed, as well as pixel mixing effects. The average of each plot at each band is then taken to derive the target indices at plot level (**Figure 3**). For the ADC Lite multispectral camera, NDVI is calculated as $\frac{TM4 - TM3}{TM4 + TM3}$, where TM4 (≈ 760–900 nm) and TM3 (≈ 630–690 nm) denote the Landsat bands.
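The plot-level steps described above (masking, removal of outlying pixels, band averaging, index calculation) can be sketched as follows for a single plot. This is not the ENVI workflow used in the study: the NIR/red ratio threshold of 1.5, the rule of discarding pixels whose band-averaged value lies more than two standard deviations from the plot mean, and the function name are all assumptions chosen for illustration.

```python
import numpy as np


def plot_ndvi(plot_pixels, ratio_threshold=1.5, k_sd=2.0):
    """Return NDVI for one plot from an (n_pixels, 2) array of (red, NIR) values.

    ratio_threshold and k_sd are hypothetical parameters for this sketch.
    """
    px = np.asarray(plot_pixels, dtype=float)
    red, nir = px[:, 0], px[:, 1]

    # Step 1: band-ratio mask; keep pixels whose NIR/red ratio exceeds the threshold.
    veg = px[nir > ratio_threshold * red]
    if veg.shape[0] == 0:
        return float("nan")  # no vegetation pixels detected in this plot

    # Step 2: average across bands for each pixel, then drop pixels whose
    # average deviates strongly from the plot mean (residual soil, mixed pixels).
    brightness = veg.mean(axis=1)
    sd = brightness.std()
    if sd > 0:
        veg = veg[np.abs(brightness - brightness.mean()) <= k_sd * sd]

    # Step 3: average each band over the remaining pixels, then compute the index.
    red_mean, nir_mean = veg.mean(axis=0)
    return float((nir_mean - red_mean) / (nir_mean + red_mean))


# Hypothetical plot: three vegetation pixels and two soil pixels.
pixels = [[0.10, 0.50], [0.10, 0.50], [0.10, 0.50], [0.30, 0.30], [0.30, 0.30]]
print(plot_ndvi(pixels))
```

Averaging the bands before computing the index, as in the text, gives one value per band per plot; computing NDVI per pixel and then averaging would be an alternative design that weights pixels differently.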

The processed images collected from the thermal camera onboard the UAV were used to derive a temperature index, which relates to the CT of each plot. The temperature index T<sub>I</sub> was calculated using the sum of the green and blue bands of the plot averaged values of the processed images acquired from the recorded analog video:

$$T_I = T_G + T_B \tag{2}$$

where T<sub>G</sub> and T<sub>B</sub> are the averaged "plot" values at the green and blue bands respectively.
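Equation 2 amounts to a per-plot sum of two band averages. A minimal sketch is given below, assuming the plot-averaged values of the still images are stored as an array with columns in (red, green, blue) order; this channel order, the function name, and the numbers are hypothetical.

```python
import numpy as np


def thermal_index(plot_rgb_means):
    """Return T_I = T_G + T_B (Equation 2) for each plot.

    plot_rgb_means: (n_plots, 3) array of plot-averaged values, assumed to be
    ordered as (red, green, blue).
    """
    means = np.asarray(plot_rgb_means, dtype=float)
    return means[:, 1] + means[:, 2]


# Hypothetical plot-averaged digital numbers for two plots.
print(thermal_index([[10, 20, 30], [5, 5, 5]]))
```

The result is a relative index in the units of the image digital numbers, not a temperature in degrees, so it is suited to ranking plots within a flight rather than to comparing absolute values between flights.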

#### Satellite Imagery

The Fast Line-of-sight Atmospheric Analysis of Spectral Hypercubes (FLAASH), an ENVI atmospheric correction tool, was applied to the satellite imagery used here. FLAASH incorporates the Moderate Resolution Atmospheric Transmission V4 (MODTRAN4) radiative transfer model to simulate the spectral radiance at pixel sensor level using user defined input variables. For a detailed description on the methods used by FLAASH see Adler-Golden et al. (1999).

The panchromatic (spatial resolution 0.46 m) and the multispectral imagery (spatial resolution 1.85 m) were fused together to create a single high resolution multispectral image, via the ESRI "pan-sharpening" algorithm, using ESRI ArcMap 10.1 (Mishra and Zhang, 2013). The pan-sharpened image, with a spatial resolution of 0.46 m, was used to derive NDVI (using 833 nm and 659 nm as the NIR and red wavelengths, respectively) so that it could be compared with the NDVI derived from the UAV (ADC Lite) and proximal (GreenSeeker) measurements.

#### Statistical Analysis

All satellite, airborne and proximal data were spatially corrected for row and column variation across the experiments using Multi Environmental Trial Analysis (META) for SAS (Vargas et al., 2013). Adjusted means were computed individually at each measurement date for the given traits based on the lattice design of the trials. Phenotypic correlations among adjusted genotype means per trial per date were determined to compare the relationship between the airborne, proximal, and agronomic traits. When more than one reading of data was available for both airborne and proximal data, multiple correlations are presented. Statistical analysis was carried out using R 3.1.2 (R Core Team, 2014). Differences between the phenotypic correlations of the proximal and UAV indices against yield or biomass were tested for significance with a Student's t-test. The Holm-Bonferroni method was applied to the p-values of the t-tests to account for multiple comparisons (Holm, 1979). In addition, in order to investigate the interactions of the UAV and proximal phenotypic correlations against yield and biomass under the three different environments, for the CT/thermal index and NDVI, a multifactor analysis of variance (ANOVA) was performed using R 3.1.2.
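For readers unfamiliar with the Holm-Bonferroni step-down procedure, a minimal sketch of the adjustment follows. The study itself used R; this Python version is only an illustration, and the function name and the example p-values are hypothetical.

```python
def holm_adjust(p_values):
    """Return Holm-Bonferroni adjusted p-values in the original order.

    The i-th smallest of m p-values (i = 0, 1, ...) is multiplied by (m - i),
    capped at 1, and forced to be non-decreasing along the sorted sequence.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[idx] = running_max
    return adjusted


# Hypothetical p-values from three t-tests (not results from this study).
raw = [0.01, 0.04, 0.03]
print([round(p, 4) for p in holm_adjust(raw)])  # [0.03, 0.06, 0.06]
```

A comparison is declared significant at level α when its adjusted p-value is at most α; in the example, only the first test remains significant at α = 0.05 after adjustment.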

## RESULTS

#### Comparison of Data from Airborne and Proximal Sensing Approaches

**Table 3** shows phenotypic correlations between airborne and proximal sensed data, using mean values of genotypes for both the thermal index and NDVI measured from the UAV compared with the equivalent traits (CT and NDVI) measured using proximal sensors. Correlations between the UAV derived thermal index and the proximal CT are, in general, significant for all trials. This adds confidence to the use of the airborne thermal index. Difficulties can arise when comparing the two different methodologies due to the sensitivity of CT to external environmental factors (time of day, temperature, radiation, wind, irrigation status, VPD, etc.), particularly wind speed. Nonetheless, of all the comparisons made, only one did not show a correlation, that between CT and the thermal index for the trial Elite DRT, probably due to variations in wind speed during measurements (as was noted at the time of observation). Significant correlations were also observed between the UAV and proximal NDVI measurements for all trials.

TABLE 3 | Phenotypic correlations between genotype means for the airborne/satellite derived thermal index/NDVI, against the corresponding ground-based CT/NDVI and between the genotypic means for the aerial derived indices and yield/biomass.

*Also shown are equivalent correlations with ground-based indices.* <sup>+</sup>, \*, and \*\* *represent significance levels of 0.1, 0.05, and 0.01, respectively.*

## Association of Traits with Yield and Biomass Comparing Airborne and Proximal Sensing Approaches

Phenotypic correlations were estimated between the UAV derived NDVI and thermal index and both biomass and yield of genotypes (**Table 3**). For comparison, the corresponding correlations of proximal NDVI and CT with yield and biomass are also shown. Correlations between the UAV derived thermal index and yield/biomass are significant for almost all trials, and are generally larger than the corresponding proximal CT correlations with yield and biomass. Note that negative correlations were observed between CT/UAV thermal index and yield/biomass, as cooler canopies are generally associated with better adaptation. Similarly, the UAV derived NDVI index generally shows stronger correlations with biomass and yield compared with the respective proximal NDVI.

The "Gen Res DRT" trial is made up of diverse genetic resources expressing non-homogeneous height and which are not necessarily well adapted to the photoperiod and other conditions of the screening environment. This could help explain the relatively lower, although still significant, correlations between NDVI and yield for this trial compared to the hot irrigated and drought data sets from elite material, i.e., there was large variation in development stage and morphology (**Table 3**). The lower correlation between CT and biomass for this trial compared to yield could also be attributed to confounding effects due to variations in height within the trial and their attendant influence on boundary air layers that affect transpiration rate when there is no breeze. Note that this is not the case for NDVI, which is free from such confounding effects.

For the Elite OPT trial, correlations with yield and NDVI are of lower significance (p < 0.1) compared to those of the hot irrigated trials, also made up of elite lines (**Table 3**). However, these results are consistent with previous observations that these techniques are most effective as selection tools under abiotic stress (Pinto et al., 2010).

When considered together, correlations between the UAV derived indices and yield/biomass were significantly different from the equivalent proximal sensed correlations (t-test, P = 0.01). When separated into groups, there was a significant difference between the UAV and proximal derived correlations with yield and biomass for the following groups: CT/thermal index (t-test, P = 0.01), yield (t-test, P = 0.05), biomass (t-test, P = 0.05), and NDVI (t-test, P = 0.1).

The phenotypic correlations were separated into the three environments (OPT, DRT, and HOT) and a multi-factor ANOVA was performed. For the OPT environment, a significant interaction (P = 0.06) was observed between the proximal and UAV phenotypic correlations between yield and biomass, probably associated with the greater biomass correlations, particularly those of the UAV (see **Table 3**). For the HOT trials, there was a significant difference (P = 0.08) between the proximal and UAV phenotypic correlations between yield and biomass for both NDVI and the thermal index/CT. This can be attributed to the higher correlations for the UAV observations. No interaction was observed for the DRT environment; this can be partly explained by un-adapted material in the genetic resource trial, as explained above.

#### Satellite Imagery

**Figure 2C** shows the pan-sharpened WV-2 extracted image of Elite OPT. It can be seen that the satellite image provides sufficient resolution for multiple (∼20) pixels for each plot and hence the NDVI index was able to be calculated.

**Table 3** compares the NDVI calculated from the three methods: space-borne collected WV-2 imagery, low level airborne collected imagery via the UAV, and proximal measurements. The proximal and UAV measurements were chosen to be as close as possible to the satellite imagery collection date. It can be seen that the NDVI values derived from all methods are well correlated with each other. The correlation between the NDVI from the satellite image and the NDVI from the other two methods gives confidence to the calculation of NDVI from high resolution satellite imagery for plots of the size of those in the Elite OPT trial (8.5 × 2.4 m). Also compared in **Table 3** is the relationship between each of the NDVI indices and the dry biomass weight and yield measured at maturity for Elite OPT. The NDVI derived from the satellite provides the best correlation with biomass and yield, and proximal NDVI the lowest.

An attempt was made to retrieve NDVI from trials of smaller sized plots. **Figure 2D** shows an extract of the pan-sharpened WV-2 imagery from an OPT trial with plot size at 2 m × 0.8 m. The resolution of the image prevented the separation of plots due to pixel mixing; hence it was not possible to distinguish between plots.

#### DISCUSSION

The results of the current study demonstrate the advantage of airborne remote sensing as a tool to estimate a range of physiological and agronomic traits on a large scale in experimental plots. Proximal measurements have already been proven to predict yield and biomass in wheat (Reynolds et al., 1994, 1998; Aparicio et al., 2000) and are beginning to be used routinely in breeding (Pask et al., 2014). The generally strong correlations presented here between airborne indices and equivalent ground-based CT and NDVI, as well as significant correlations between the airborne indices and yield/biomass, that were generally greater than the equivalent correlations with ground-based measurements, suggest that increased precision results from the use of the indices derived from imagery, particularly in the stressed environments. This is a promising result given the impacts of changing climate and its implications for food security.

Most published work that attempts to thoroughly validate multispectral indices is based on proximal measurements. Errors may be introduced when moving from proximal to aerial measurements at a spatially larger scale. For example, atmospheric scattering may alter the absorption features of light by pigments, and further errors may arise from effects related to canopy architecture (angle and area of leaves), water vapor in the atmosphere, background noise, and measurement geometry (Suarez et al., 2008; Garbulsky et al., 2011). However, the results presented here demonstrate the potential of low level UAV measurements to indirectly measure yield and biomass in field conditions. The relative precision of airborne measurements can be associated with two main factors. The first is related to reduced errors linked to the ability to remove non-vegetation pixels and other statistical outliers during image analysis (**Figure 3**). The second is through limiting confounding effects caused by environmental drift, such as changes in temperature, sun angle etc., typically associated with the time taken to make ground based measurements on large trials (**Figure 2**).

Given that the operation of UAVs is less labor intensive than proximal readings, as well as being free from restrictions associated with access to plots (due to irrigation or application of pesticides, for example), the approach lends itself well to routine measurements, including for growth analysis, to measure the evolution of stress, and the application of regression (e.g., Lopes and Reynolds, 2012) or spline (Hurtado et al., 2011) models over time, from which additional parameters can be derived to compare treatments and genotypes.

Despite the promising results presented here for NDVI derived from the satellite measurements, satellite imagery is probably not the most effective tool for this application. While satellite imagery has the advantage of covering vast areas, resolution restricts its application to measurements in which target objects are of a larger scale than the small plots typical of genotypic screening. Furthermore, it is difficult to obtain satellite imagery at frequent time intervals and the option to adjust timing of measurement to avoid cloud cover or other inclement weather conditions is absent.

The fact that the estimates of CT and NDVI were generally better associated with performance traits when measured by UAVs compared to proximal data, under both heat and drought stressed conditions, and in advanced lines as well as unimproved genetic backgrounds, confirms the value of the UAV approach in breeding for climate change, where a new generation of breeding lines must be developed based on extensive screening of plant genetic resources.

#### AUTHOR CONTRIBUTIONS

MT collected data, carried out the data analysis, drafted the article and carried out revisions. MR created the experimental design, made suggestions for the data analysis, helped with the draft and carried out critical revisions. SC made suggestions for the data analysis, helped with the draft and carried out critical revisions. All authors gave final approval for publication.

#### ACKNOWLEDGMENTS

The authors would like to thank Dr. Gemma Molero, Dr. Mariano Cossani, and Dr. Marc Ellis for sharing data for this work. The research was supported by funding from the Ministry of Agriculture in Mexico, Secretaria de Agricultura, Ganaderia, Desarrollo Rural, Pesca y Alimentacion (SAGARPA), the Sustainable Modernization of Traditional Agriculture (MasAgro) project and the International Wheat Yield Partnership (IWYP).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Tattaris, Reynolds and Chapman. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# A Novel Remote Sensing Approach for Prediction of Maize Yield Under Different Conditions of Nitrogen Fertilization

Omar Vergara-Díaz<sup>1</sup>, Mainassara A. Zaman-Allah<sup>2</sup>, Benhildah Masuka<sup>2</sup>, Alberto Hornero<sup>3</sup>, Pablo Zarco-Tejada<sup>3</sup>, Boddupalli M. Prasanna<sup>2</sup>, Jill E. Cairns<sup>2</sup> and José L. Araus<sup>1</sup>\*

<sup>1</sup> Integrative Crop Ecophysiology Group, Plant Physiology Section, Faculty of Biology, University of Barcelona, Barcelona, Spain, <sup>2</sup> International Maize and Wheat Improvement Center, CIMMYT Southern Africa Regional Office, Harare, Zimbabwe, <sup>3</sup> Laboratory for Research Methods in Quantitative Remote Sensing, QuantaLab, Institute for Sustainable Agriculture, National Research Council, Cordoba, Spain

#### Edited by:

Susana Araújo, ITQB-Universidade Nova de Lisboa, Portugal

#### Reviewed by:

Cristina Cruz, University of Lisbon, Portugal Jan F. Humplík, Palacký University Olomouc, Czech Republic Jean-Luc Regnard, Montpellier SupAgro, France

> \*Correspondence: José L. Araus jaraus@ub.edu

#### Specialty section:

This article was submitted to Crop Science and Horticulture, a section of the journal Frontiers in Plant Science

Received: 18 November 2015 Accepted: 01 May 2016 Published: 18 May 2016

#### Citation:

Vergara-Díaz O, Zaman-Allah MA, Masuka B, Hornero A, Zarco-Tejada P, Prasanna BM, Cairns JE and Araus JL (2016) A Novel Remote Sensing Approach for Prediction of Maize Yield Under Different Conditions of Nitrogen Fertilization. Front. Plant Sci. 7:666. doi: 10.3389/fpls.2016.00666

Maize crop production is constrained worldwide by nitrogen (N) availability and particularly in poor tropical and subtropical soils. The development of affordable high-throughput crop monitoring and phenotyping techniques is key to improving maize cultivation under low-N fertilization. In this study several vegetation indices (VIs) derived from Red-Green-Blue (RGB) digital images at the leaf and canopy levels are proposed as low-cost tools for plant breeding and fertilization management. They were compared with the performance of the normalized difference vegetation index (NDVI) measured at ground level and from an aerial platform, as well as with leaf chlorophyll content (LCC) and other leaf composition and structural parameters at flowering stage. A set of 10 hybrids grown under five different nitrogen regimes and adequate water conditions were tested at the CIMMYT station of Harare (Zimbabwe). Grain yield and leaf N concentration across N fertilization levels were strongly predicted by most of these RGB indices (with R<sup>2</sup> ∼ 0.7), outperforming the prediction power of the NDVI and LCC. RGB indices also outperformed the NDVI when assessing genotypic differences in grain yield and leaf N concentration within a given level of N fertilization. The best predictor of leaf N concentration across the five N regimes was LCC but its performance within N treatments was inefficient. The leaf traits evaluated also seemed inefficient as phenotyping parameters. It is concluded that the adoption of RGB-based phenotyping techniques may significantly contribute to the progress of plant breeding and the appropriate management of fertilization.

Keywords: breeding, crop management, field phenotyping, maize, nitrogen fertilization, NDVI, RGB indices

## INTRODUCTION

Low soil fertility, alongside drought and heat, is a major stress factor limiting crop productivity on a world scale (Stewart et al., 2005). In the case of sub-Saharan Africa, the lack of nitrogen (N) is the main constraint on cereal yields in areas with more than 400 mm average annual rainfall (Buerkert et al., 2001). Therefore, an optimization of N use is critical for increased grain production, especially in the low productive regions. On the other hand, on the basis of environmental and economic sustainability, a more restricted and reasonable use of fertilizers is necessary. Plant scientists, especially breeders and agronomists, face the challenge of solving these limitations while taking into account the additional implications of climate change on food security (Cairns et al., 2012, 2013).

Maize is the second most cultivated cereal worldwide and the most commonly cultivated cereal in Africa in terms of land area and production (FAO, 2013). In particular, agricultural productivity of sub-Saharan Africa remains the lowest in the world partly due to low soil fertility (Cairns et al., 2013; Fischer et al., 2014). Therefore, improving tolerance of maize to low N will increase yields and impact positively on livelihoods and food security (Masuka et al., 2012).

In this sense, two strategies are considered paramount for crop scientists: (i) breeding to improve varieties toward higher nutrient use efficiency and tolerance to nutrient deficiency, and (ii) appropriate fertilization management (Wezel et al., 2014), including precision agriculture (PA; Hatfield, 2000; Chen et al., 2014). The implementation of such improvements may increase farmers' profits by maintaining crop yield and reducing the use of resources while preventing further degradation of the environment (Hergert et al., 1996; Delgado et al., 2001; Roberts et al., 2001; Wang et al., 2003). Accordingly, technologies for crop monitoring and breeding must be high performing, broad-use, and affordable, particularly (but not only) when national agricultural systems, seed companies, or small farmers from developing countries are the targets. Moreover, in the case of breeding, improvements are needed to overcome the field phenotyping bottleneck that limits breeding and advances in PA (Araus et al., 2008; Furbank and Tester, 2011; Araus and Cairns, 2014).

Remote proximal sensing technologies are currently being used for precise management of crops, whereas their potential application for field high throughput phenotyping has gathered increasing interest in recent years (Araus and Cairns, 2014; Liebisch et al., 2015). The classical approach has involved the use of multispectral sensors and the development of numerous vegetation indices associated with vegetation parameters such as above-ground biomass, water and nutrient deficiency, and crop yield (Petropoulos and Kalaitzidisz, 2012). Among the indices, the Normalized Difference Vegetation Index (NDVI) is the most widely used. Concerning crop N performance, several studies have shown that it is possible to quantify it satisfactorily using multispectral data at both the aerial and ground levels (Barnes et al., 2000; Boegh et al., 2002). However, multispectral and hyperspectral imagers are relatively expensive and complex from the operational point of view.

As a low-cost alternative, vegetation indices derived from Red-Green-Blue (RGB) cameras have been employed for remote sensing assessment in field conditions, providing a wide range of phenomic data about genotypic performance under different stress conditions and species, including water stress and foliar diseases in bread wheat, durum wheat, tritordeum, and triticale (Casadesus et al., 2007; Casadesús and Villegas, 2014; Vergara-Diaz et al., 2015; Zhou et al., 2015). Moreover, digital sensors have been successfully integrated on board unmanned aerial vehicles (UAV) to assess crop vigor, vegetation coverage, and greenness (White et al., 2012; Andrade-Sanchez et al., 2014; Svensgaard et al., 2014). For example, digital indices derived from RGB images have been proposed for grain yield (GY) assessment in water limiting conditions (Casadesus et al., 2007) and for quantifying leaf N concentration (Rorie et al., 2011). However, the use of RGB images to assess genotypic performance in terms of yield and crop N accumulation in response to different levels of soil fertility has not yet been assessed. RGB images may represent a suitable alternative to spectroradiometric approaches at different levels: at the whole trial level from aerial platforms, at the plot level from ground-based measurements, or even at the single leaf level, replacing leaf chlorophyll meters.

Information derived from plant samples may also be relevant for crop monitoring and phenotyping (Araus and Cairns, 2014). For example, the stable isotope composition in plant matter constitutes an integrative selection criterion because it can describe the behavior of the crop under stress (Masuka et al., 2012). Nitrogen isotope composition (δ<sup>15</sup>N) can be employed to characterize the efficiency in using N fertilizers (Evans, 2001; Serret et al., 2008). The usefulness of carbon isotope composition (δ<sup>13</sup>C) for assessing genotypic differences in maize is unclear owing to the C4 photosynthetic metabolism of this species, although it still appears responsive to differences in growing conditions (Monneveux et al., 2007; Araus et al., 2010). Finally, some other morphological and compositional traits such as the specific leaf area (SLA), N concentration, N per unit leaf area (N/LA), and carbon to nitrogen ratio (C/N), which are in turn related to nitrogen use efficiency, leaf construction, and primary metabolism (Poorter and Evans, 1998; Feng et al., 2008), have the potential to be useful for breeding, but knowledge about their association with crop yield is scarce.

The main goal of this study is to develop affordable, easy-to-use new phenotyping tools that increase selection efficiency for grain yield and leaf N concentration under different N fertilization conditions in maize. To accomplish this objective, we compared the accuracy of field-spectroradiometer data vs. RGB-derived vegetation indices in assessing GY and leaf N concentration in a set of ten maize hybrids grown in the field under five N-fertilizer levels. Firstly, we assessed the performance of these parameters for all the N-treatments together, and subsequently we dissected the correlations within each N-level for further discussion of phenotyping. Additionally, simple regression models were built for GY prediction and these models were tested and validated against the experimental yield of another trial. The performance of the leaf parameters N/LA, C/N, SLA, δ<sup>13</sup>C, and δ<sup>15</sup>N was also studied with the aim of relating these structural and compositional leaf properties with crop performance and phenotyping data. All RGB and UAV imagery was obtained at flowering stage in order to integrate the differences in crop performance from plant emergence to flowering stage, when the number of kernels per ear is determined.

## MATERIALS AND METHODS

## Experimental Design and Growing Conditions

Field trials were carried out at the Southern Africa regional station of CIMMYT (International Maize and Wheat Improvement Center) located in Harare (17°43′32″S, 31°00′59″E) where two field experiments were studied. Before sowing, soil pH, total soluble salts (TSS), nitrogen as nitrate (NO<sub>3</sub><sup>−</sup>) and phosphorus (P<sub>2</sub>O<sub>5</sub>) were analyzed in three soil depth ranges (0–30, 30–60, and 60–90 cm) and six replicates for each depth range were produced. Mean values for the full soil profile were pH = 5.8, TSS = 240.9 ppm, NO<sub>3</sub><sup>−</sup> = 4.12 ppm, and P<sub>2</sub>O<sub>5</sub> = 18.93 ppm.

Ten maize hybrids were sown: three were commercial hybrids (PAN7M-81, SC635, SC537) and the other seven were maize hybrids developed at CIMMYT (TH11894, TH127591, TH127053, TH127618, TH13466, CZHH1155, TH127004). These maize hybrids cover a wide range of agronomical sensitivity to low nitrogen conditions. A split-plot arrangement in a randomized block design was set up and five nitrogen fertilization levels (0, 10, 20, 80, and 160 kg·ha<sup>−1</sup> NH<sub>4</sub>NO<sub>3</sub>) were applied in both trials. Two and three replicates were set for the first and second trials, giving 100 and 150 plots, respectively (trials S and P, respectively). A two-row border was sown between fertilization treatments and on the edges of the trial to prevent spatial variability.

Seeds were sown during the wet season, on December 23rd, 2013, in two rows per plot; rows were 4 m long and 75 cm apart (6 m<sup>2</sup>/plot), with 17 planting points per row and 25 cm between plants within a row. All trials were homogeneously fertilized with 400 kg·ha<sup>−1</sup> of super-phosphate and potassium oxide fertilizer (P<sub>2</sub>O<sub>5</sub> 14% and K<sub>2</sub>O 7%). Weather conditions throughout the growing season were recorded with a weather station. The mean temperature was 18.9°C, mean humidity 81.2 and total rainfall during the crop period was 563.1 mm, thereby preventing water deficit in these rainfed conditions.

The trials were harvested on May 20th, 2014. The central 3.5 m of each row was harvested, discarding 2 plants at each end; thus the collected weight corresponded to 5.25 m<sup>2</sup> (2 rows × 0.75 m apart × 3.5 m long). The cobs were threshed and the grains dried until they reached around 12% moisture, and then the grain from each plot was weighed. Grain yield (GY, Mg·ha<sup>−1</sup>) was calculated as (X kg plot<sup>−1</sup> × 10)/5.25, where X is the grain weight per plot.
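The unit conversion behind the grain yield formula can be sketched in Python as follows. The function name and the example plot weight of 2.1 kg are hypothetical; only the harvested area (5.25 m<sup>2</sup>) and the factor of 10 come from the text.

```python
def grain_yield_mg_per_ha(grain_kg_per_plot, harvested_area_m2=5.25):
    """Convert plot grain weight (kg) into grain yield (Mg per hectare)."""
    if harvested_area_m2 <= 0:
        raise ValueError("harvested area must be positive")
    # kg/m2 -> kg/ha (x 10,000) -> Mg/ha (/ 1,000): a net factor of 10
    return grain_kg_per_plot * 10.0 / harvested_area_m2


# Harvested area: row spacing (m) x number of rows x harvested row length (m)
harvested_area = 0.75 * 2 * 3.5

# Hypothetical plot weight of 2.1 kg, used only to illustrate the conversion
example_yield = grain_yield_mg_per_ha(2.1, harvested_area)
print(harvested_area, example_yield)
```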

#### NDVI Calculation

The normalized difference vegetation index (NDVI) was calculated using the equation:

$$\mathrm{NDVI} = \frac{\mathrm{NIR} - \mathrm{R}}{\mathrm{NIR} + \mathrm{R}}$$

where R is the reflectance in the red band and NIR is the reflectance in the near-infrared band. NDVI was obtained around the flowering stage by using two different approaches: using ground measurements and from aerial multispectral images (**Figure 1**).
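As a minimal sketch of this calculation, the following Python functions compute NDVI from red and near-infrared reflectance. Averaging per-pixel NDVI over a plot is an assumption made here for illustration; the text does not state how plot values were aggregated, and the function names are hypothetical.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    total = nir + red
    if total == 0:
        return float("nan")  # undefined when both bands are zero
    return (nir - red) / total


def plot_mean_ndvi(nir_pixels, red_pixels):
    """Mean of per-pixel NDVI values for one plot (assumed aggregation)."""
    values = [ndvi(n, r) for n, r in zip(nir_pixels, red_pixels)]
    valid = [v for v in values if v == v]  # NaN is the only value != itself
    return sum(valid) / len(valid) if valid else float("nan")


# Hypothetical reflectance values for two pixels of a single plot
print(plot_mean_ndvi([0.5, 0.6], [0.1, 0.2]))
```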

The NDVI of individual plots at ground level (NDVIground) was determined with a ground-based portable spectroradiometer with an active sensor (GreenSeeker handheld crop sensor, Trimble, USA). This equipment uses the spectral wavelengths 650–670 nm as the red band and 765–795 nm as the near infrared. The distance between the sensor and the plots was kept constant using a ladder, around 0.5–0.6 m above and perpendicular to the canopy. The whole areas of the two trials were measured from 12 to 14 h on March 3rd and 4th, 2014.

Multispectral false-color image at the aerial level showing near infrared (800 nm) as red, green (550 nm) as blue and red (670 nm) as green, spatial resolution of 10 cm/px; (C) RGB digital image from the high-nitrogen fertilization treatment at the canopy level; (D) and its resulting processed image with BreedPix; (E) RGB image from the low-nitrogen treatment and (F) its respective processed image.

The aerial NDVI index (NDVIaerial) was obtained using a UAV-based remote sensing platform developed by Airelectronics (Montegancedo campus, Spain) in collaboration with the Crop Breeding Institute-Zimbabwe, CIMMYT, QuantaLab at the Institute for Sustainable Agriculture (IAS-CSIC, Spain) and the University of Barcelona. This aerial platform was equipped with a multispectral camera (ADC-Lite, Tetracam, Inc., Chatsworth, CA, USA), which provides spectral images in the green, red, and near-infrared bands, with a final ground resolution of 10 cm per pixel when flying at an object distance of 150 m. These bands are approximately equal to the Landsat Thematic Mapper (TM) bands TM2, TM3, and TM4, respectively, so that the spectral wavelengths from 630 to 690 nm represent the red band and 760 to 920 nm the near infrared band. The flight was conducted at an altitude of 150 m at midday on a sunny day when crops were around the flowering stage. The collected images covered 220 out of the total 250 plots, completely covering the block S trial (100 plots) and partially covering block P (120 of the total of 150 plots). Aerial images were subsequently corrected and calibrated with ImapQ (QuantaLab-IAS-CSIC, Cordoba, Spain), which converts images to radiance. Mosaicking and rectifying processes were applied with Autopano (Kolor SARL, Francin, France) by applying the image stitching technique (SIFT algorithm) in addition to a manual orthorectification based on several selected checkpoints. NDVI values were finally extracted from the images using ENVI software (Exelis Visual Information Solutions, Boulder, Colorado, USA).

#### RGB Indices

Vegetation indices derived from red-green-blue (RGB) images were evaluated at the plot and the single leaf level (RGBcanopy and RGBleaf indices, respectively; **Figure 1**). In the case of RGBcanopy, one digital RGB picture was taken per plot by holding the camera about 0.8–1.0 m above the canopy, in a zenithal plane and focusing near the center of each plot. Plot images were taken on the same days as the measurements with the ground spectroradiometer using a Nikon COOLPIX S8000 digital compact camera without flash and with a focal length of 54 mm, and were saved in JPEG format at 4288 × 2848 pixels. Later, six leaves per plot were taken from the S trial (100 plots) and were subsequently scanned with a Dell 2155cdn multifunction color printer (Round Rock, TX, USA). Finally, scanned images were saved in the same format with a resolution of 2338 × 1653 pixels and RGBleaf indices were calculated as described below.

Subsequently, images were analyzed with the open source Breedpix 0.2 software (Casadesus et al., 2007) designed to process digital images. This software enables calculation of several RGB vegetation indices based on the different properties of color inherent in RGB images. RGB VIs were obtained either from the average color of the whole image or from the hue histogram in each image. BreedPix produces several automatic conversions of the original RGB image to other color spaces (i.e., each model that numerically represents the color in terms of different coordinates). Four VIs (a<sup>∗</sup>, b<sup>∗</sup>, u<sup>∗</sup>, and v<sup>∗</sup>) belonging to CIE (from the French abbreviation of International Commission on Illumination) color spaces were calculated and used in this study. The software requires the use of Java Advanced Imaging (JAI) for the conversion of RGB color space to CIE-XYZ color space, and the resulting coordinates are subsequently converted to other color spaces. First, the VIs a<sup>∗</sup> and b<sup>∗</sup> belong to the CIE-Lab color space, with L<sup>∗</sup> being the lightness dimension and a<sup>∗</sup> and b<sup>∗</sup> the color-opponent coordinates. Red/green opponent colors are represented along the a<sup>∗</sup> axis, whereas the b<sup>∗</sup> axis represents the yellow/blue opponent colors. Similarly, the u<sup>∗</sup> and v<sup>∗</sup> indices represent the axes in the chromaticity diagram of the CIE-Luv color space. The software thereby obtains the average values of these color components for each of the processed images. The hue component is calculated using the JAI functions, which employ the formulae described in Seul et al. (2000), whereas the components of the CIE-Lab and CIE-Luv color spaces are calculated as described in Trussell et al. (2005).

The relative green area (GA) and the relative "greener area" (GGA) are based on the sum of frequencies of the histogram classes included in a certain range of hue in the image. GA is the percentage of pixels in the image in the hue range from 60° to 180°, that is, from yellow to bluish green. GGA is somewhat more restrictive, since the range of hue considered by this index is from 80° to 180°, excluding yellowish-green tones; it therefore more accurately describes the amount of photosynthetically active biomass and leaf senescence.
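The hue thresholds for GA and GGA can be illustrated with a short Python sketch. This is not the Breedpix implementation: it uses the standard-library `colorsys` HSV conversion rather than the JAI functions, treats the range limits as inclusive (an assumption), and the function names and sample pixels are hypothetical.

```python
import colorsys


def hue_degrees(r, g, b):
    """Hue angle (0-360 degrees) of an 8-bit RGB pixel via HSV conversion."""
    h, _s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return h * 360.0


def green_fractions(pixels):
    """Return (GA, GGA) as fractions of pixels within each hue range."""
    hues = [hue_degrees(r, g, b) for r, g, b in pixels]
    n = len(hues)
    if n == 0:
        raise ValueError("no pixels supplied")
    ga = sum(1 for h in hues if 60.0 <= h <= 180.0) / n   # yellow to bluish green
    gga = sum(1 for h in hues if 80.0 <= h <= 180.0) / n  # excludes yellowish green
    return ga, gga


# Hypothetical pixels: pure green, yellowish green, brown soil, dark green
sample_pixels = [(0, 255, 0), (200, 255, 0), (120, 80, 40), (30, 120, 40)]
ga, gga = green_fractions(sample_pixels)
print(ga, gga)
```

Multiplying the returned fractions by 100 gives the percentages described in the text. In this sketch, gray pixels have a hue of 0° under `colorsys` and are therefore not counted as green.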

#### Analysis of Leaf Parameters

The leaf portions used for the RGBleaf indices were also used for the subsequent measurements. Firstly, immediately before being scanned, a handheld spectroradiometer developed for leaf chlorophyll measurements (Minolta SPAD-502, Spectrum Technologies Inc, Plainfield, IL, USA) was used to measure the index related to leaf chlorophyll content (LCC). Four measurements were made for each leaf segment. Secondly, the leaves were oven dried at 70°C for 24 h and the dry weight was measured. Then the specific leaf area (SLA) was calculated using the equation

$$\mathrm{SLA} = \mathrm{LA} / \mathrm{DW}$$

where LA is the total leaf area (m<sup>2</sup>) measured previously from the scanned images using the open-source Java-based software ImageJ (http://rsb.info.nih.gov.sire.ub.edu/ij/) and DW is the corresponding dry weight (kg).

Finally, dry leaves were ground to a fine powder and 0.7–0.9 mg of leaf dry matter from each plot was weighed and sealed into tin capsules. Stable carbon (<sup>13</sup>C/<sup>12</sup>C) and nitrogen (<sup>15</sup>N/<sup>14</sup>N) isotope ratios as well as the leaf N and C concentrations (%) were measured using an elemental analyser (Flash 1112 EA; Thermo Finnigan, Bremen, Germany) coupled with an isotope ratio mass spectrometer (Delta C IRMS, Thermo Finnigan) operating in a continuous flow mode. Samples were loaded into a sampler and analyzed. Measurements were conducted at the Scientific Facilities of the University of Barcelona. Isotopic values were expressed in composition notation (δ) as follows:

$$\delta^{13}\mathrm{C}\ (\text{‰}) = \left[\frac{(^{13}\mathrm{C}/^{12}\mathrm{C})_{\text{sample}}}{(^{13}\mathrm{C}/^{12}\mathrm{C})_{\text{standard}}} - 1\right] \times 1000$$

where "sample" refers to plant material and "standard" to international secondary standards of known <sup>13</sup>C/<sup>12</sup>C ratios (IAEA CH7 polyethylene foil, IAEA CH6 sucrose and USGS 40 L-glutamic acid) calibrated against Vienna Pee Dee Belemnite calcium carbonate with an analytical precision (standard deviation) of 0.15‰. The same δ notation was used for the <sup>15</sup>N/<sup>14</sup>N ratio expression but with the standard referring to air. For nitrogen, international isotope secondary standards IAEA N1, IAEA N2, IAEA NO3, and USGS 40 were used with a precision of 0.3‰. Further, the C/N ratio was obtained from these analyses and total nitrogen concentration per unit leaf area (N/LA) was calculated with the formula:

$$N \,\mathrm{/}LA = \left(\frac{DW}{LA}\right) \times N$$

where LA is the total leaf area (m<sup>2</sup> ), DW is the corresponding dry weight (g) and N is its nitrogen concentration (in % dry matter).
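The leaf-level calculations above can be sketched in a few lines of Python. This is a minimal illustration rather than the procedure used in the study: the function names and example values are hypothetical, the N concentration is converted from % to a mass fraction (an assumption, so that N/LA comes out in g N per m<sup>2</sup>), and `delta_permil` assumes the conventional δ notation, δ = (R<sub>sample</sub>/R<sub>standard</sub> − 1) × 1000.

```python
def specific_leaf_area(leaf_area_m2, dry_weight_kg):
    """SLA = LA / DW, in m^2 of leaf per kg of dry matter."""
    return leaf_area_m2 / dry_weight_kg


def nitrogen_per_leaf_area(dry_weight_g, leaf_area_m2, n_percent):
    """N/LA = (DW / LA) x N, in g N per m^2 of leaf.

    Assumption: N is given in % of dry matter and is divided by 100 here.
    """
    return (dry_weight_g / leaf_area_m2) * (n_percent / 100.0)


def delta_permil(ratio_sample, ratio_standard):
    """Isotope composition in per mil, assuming the conventional delta notation."""
    return (ratio_sample / ratio_standard - 1.0) * 1000.0


if __name__ == "__main__":
    # Hypothetical leaf segment: 50 cm^2 of leaf area, 0.25 g dry weight, 2% N.
    print(specific_leaf_area(0.005, 0.00025))        # m^2 per kg
    print(nitrogen_per_leaf_area(0.25, 0.005, 2.0))  # g N per m^2
    # Hypothetical sample whose heavy-to-light ratio is 1% below the standard.
    print(delta_permil(0.0099, 0.0100))              # per mil
```

Note that LA and DW must be expressed in the units stated for each formula (kg for SLA, g for N/LA), which is why the two functions take differently named weight arguments.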

#### Statistical Analysis

Statistical analyses were conducted using SPSS 21 (IBM SPSS Statistics 21, Inc., Chicago, IL, USA). Multivariate analyses of variance, Duncan's multiple-comparison post-hoc test and bivariate correlations were performed. The presented leaf parameters (LCC, N, δ<sup>15</sup>N, N/LA, SLA, δ<sup>13</sup>C, C/N) and RGBleaf indices from scanned leaves were only analyzed for the S trial, whereas the NDVIground, NDVIaerial, and RGBcanopy indices were studied for both trials. The determination coefficients of the linear relationships of GY and leaf N concentration with the vegetation indices NDVI and RGB were calculated for the entire trials and within each N fertilization treatment. All graphs were produced with SigmaPlot 10.0 (Systat Software Inc., San Jose, CA, USA).
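As an illustration of the last analysis step, the determination coefficient of a simple linear relationship can be computed for a whole trial and then separately within each N level. The sketch below is not the SPSS procedure used in the study; the function names and the toy records are hypothetical and serve only to show the calculation.

```python
from collections import defaultdict


def r_squared(x, y):
    """Determination coefficient of the least-squares line y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # For simple linear regression, R^2 equals the squared Pearson correlation.
    return (sxy * sxy) / (sxx * syy)


def r_squared_by_group(records):
    """records: iterable of (group, x, y) tuples; returns {group: R^2}."""
    groups = defaultdict(lambda: ([], []))
    for group, x, y in records:
        groups[group][0].append(x)
        groups[group][1].append(y)
    return {g: r_squared(xs, ys) for g, (xs, ys) in groups.items()}


if __name__ == "__main__":
    # Toy (N level, vegetation index, grain yield) records, not measured data.
    records = [
        ("0N", 0.40, 3.9), ("0N", 0.45, 4.6), ("0N", 0.50, 4.8),
        ("160N", 0.70, 10.2), ("160N", 0.75, 11.5), ("160N", 0.80, 11.9),
    ]
    xs = [r[1] for r in records]
    ys = [r[2] for r in records]
    print("whole trial:", round(r_squared(xs, ys), 3))
    for level, value in r_squared_by_group(records).items():
        print(level, round(value, 3))
```

Computing the coefficient both ways matters because a strong relationship across N levels can coexist with a weak one within a single level, where only genotypic variation remains.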

#### RESULTS

Significant differences in GY between genotypes and nitrogen fertilization levels were observed in this study (**Table 1**), with GY increasing in response to N fertilization (**Table 2**). Differences among nitrogen-input levels were also detected with both (ground and aerial) NDVI approaches and with all RGBcanopy indices except a<sup>∗</sup>. Genotypic differences were detected by the RGBcanopy indices GGA, GA, a<sup>∗</sup>, u<sup>∗</sup>, and hue, whereas among the spectroradiometric indices only the NDVI at the ground level detected genotypic differences.

Leaf N concentration varied significantly between genotypes and the effect of N-fertilization levels was highly significant (**Table 1**), with values increasing as N fertilization increased (**Table 2**). All the RGBleaf indices detected highly significant differences between nitrogen treatments, and genotypic differences were also found with GA (**Table 1**). LCC also indicated highly significant differences between N fertilization levels but not genotypic differences.

TABLE 1 | P-values from multivariate analysis of variance with two fixed factors, genotype and nitrogen level, and their interaction (GxN).

Dependent variables: grain yield (GY), leaf nitrogen concentration (%N), NDVI at aerial and ground levels, RGB indices from canopy images and scanned leaves, leaf chlorophyll content (LCC), nitrogen per unit area (N/area), the stable carbon (δ<sup>13</sup>C) and nitrogen (δ<sup>15</sup>N) isotope composition, carbon to nitrogen ratio (C/N) and specific leaf area (SLA).

All the analyzed leaf parameters (N, N/LA, SLA, δ <sup>15</sup>N, δ <sup>13</sup>C, C/N) were highly sensitive to variations in N fertilizer levels (**Table 1**). In contrast, apart from leaf N concentration, genotypic differences were only detected for δ <sup>13</sup>C. Increasing N fertilization caused significant increases in leaf N, N/LA and SLA while δ <sup>15</sup>N and the C/N ratio decreased (**Table 2**). Additionally, differences among N fertilization levels were also found in δ <sup>13</sup>C but its trend was somewhat different: in the low-N levels δ <sup>13</sup>C was quite steady and then it decreased at 80–160N, whereas leaf-N concentration increased.

Additionally, the effect of changing light in outdoor conditions on the RGB indices obtained from canopy images was evaluated (Table S1). For this purpose, 57 plots were photographed twice on nearly consecutive days, first on a sunny day and then on a partly cloudy day. All indices were strongly correlated between replicates (p < 0.001), particularly the indices GA, GGA, u<sup>∗</sup>, and a<sup>∗</sup> (R<sup>2</sup> > 0.72).

#### Grain Yield Assessment across Nitrogen Regimes and Genotypes

All vegetation indices (ground and aerial NDVI, RGBcanopy, and LCC) were strongly correlated with GY variation across the whole set of plots of the two trials. The best results were obtained by using the RGB indices GA and GGA at the canopy level, which followed an exponential regression model and explained 70–72% of GY variability (**Figure 2**). Meanwhile, the RGBcanopy indices u<sup>∗</sup> and a<sup>∗</sup> evolved inversely with increasing GY and demonstrated lower accuracy (R<sup>2</sup> = 0.326 and R<sup>2</sup> = 0.302, respectively, data not shown). In contrast, LCC evolved linearly with increases in GY and explained 69% of GY variation (**Figure 2**). Finally, both NDVI approaches followed a power regression model and their determination coefficients were moderate and similar (NDVIground in **Figure 2**; NDVIaerial R<sup>2</sup> = 0.293, data not shown).

Additionally, simple regression models from the P trial that explained GY across the different N fertilization levels were obtained by using the different VIs and validated for their accuracy in estimating the GY of the S trial (**Table 3**). The estimated GY from all VIs always fitted satisfactorily with the experimental GY for the entire trial. The determination coefficients increased further when six hybrids contrasting in their grain yield were selected, three of them high-yielding and the remaining three low-yielding. Genotypic differences were found in the estimated GY from GGA, NDVIaerial and NDVIground between the six selected hybrids, and the experimental GY was also significantly different. Moreover, in all models, differences between N fertilization levels were always detected by the estimated GY.


TABLE 2 | Means of grain yield (GY) (Mg·ha<sup>−1</sup>) from the two trials, leaf nitrogen concentration (%N), nitrogen per unit leaf area (N/LA), specific leaf area (SLA), the stable carbon (δ<sup>13</sup>C) and nitrogen (δ<sup>15</sup>N) isotope composition and the leaf C/N ratio for the ten hybrids and the five nitrogen levels.

Means followed by different letters are significantly different according to Duncan's multiple range test (P < 0.05).

## Grain Yield Assessment across Genotypes within Each N Regime

To further assess the accuracy of these indices, the determination coefficients for GY prediction within each N-input level across genotypic means were calculated (**Table 4**). The GGA, GA, u<sup>∗</sup>, and a<sup>∗</sup> indices were significantly correlated with GY variation within all N levels, whereas both NDVI approaches were significantly correlated with GY only for some of the studied N levels. By contrast, LCC did not correlate with GY across plots within any of the N levels.

## Leaf Nitrogen Assessment across N Regimes and Genotypes

LCC was the best predictor of leaf N concentration across the entire trial, explaining more than 80% of N variability and moderately surpassing the fitting accuracy of the RGBleaf indices (**Figure 3**). The RGBleaf index a<sup>∗</sup> explained about 69% of leaf N variation across N fertilization treatments (**Figure 3**), and u<sup>∗</sup>, b<sup>∗</sup>, and v<sup>∗</sup> were quite similar (R<sup>2</sup> = 0.682, R<sup>2</sup> = 0.643, and R<sup>2</sup> = 0.621, respectively, data not shown). For its part, NDVIaerial was also a good predictor of leaf N (**Figure 3**), whereas NDVIground was less accurate in its prediction (R<sup>2</sup> = 0.116, data not shown). Finally, the RGB index v<sup>∗</sup> at the canopy level was more closely related to leaf N than to GY, and it proved to be a reasonably good predictor of leaf N across the whole trial (**Figure 3**).

#### Leaf Nitrogen Assessment across Genotypes within Each N Regime

The determination coefficients of the RGBleaf indices, NDVIaerial, NDVIground, and LCC against leaf N across genotypic means within each of the N fertilization levels are presented in **Table 5**. In the low-N treatments (0N to 20N) the best determination coefficients were provided by the RGBleaf indices b<sup>∗</sup>, v<sup>∗</sup>, u<sup>∗</sup>, and a<sup>∗</sup>. In addition, most of the RGBleaf indices were also sensitive to leaf N variation at the 80N level, but none of them was significantly related at the 160N level. For its part, LCC showed an accuracy quite similar to that of the RGBleaf indices in predicting leaf N within the low-N levels, but it was not significantly correlated in the high-N fertilization levels (**Table 5**). Finally, NDVIaerial was especially sensitive to leaf N variations in the high-N and 0N treatments, whereas NDVIground was generally unrelated to leaf N within each N treatment.

### Leaf Parameters Performance and Relationships with VIs and Yield

Leaf N was strongly negatively correlated across N levels with δ <sup>15</sup>N and the C/N ratio and to a lesser extent with δ <sup>13</sup>C and SLA (Table S2). Correlations of these traits with GY were also negative but weaker, except for SLA which did not correlate.

Most of the RGB indices (both at the leaf and canopy scales), the LCC and the NDVI correlated with N/LA across N regimes, but always more moderately than they correlated with leaf N concentration (Table S2). The association of δ<sup>15</sup>N with NDVI, LCC, and RGB indices (at both scales) was highly significant and in some cases their correlation coefficients were higher than the respective coefficients between δ<sup>15</sup>N and leaf N. Similarly, δ<sup>13</sup>C was fairly well correlated with most of the RGB indices (especially at the leaf scale) and LCC. Regarding the C/N ratio, LCC was the best predictor but this correlation was weaker than with leaf N concentration. However, most of the RGBleaf indices (a<sup>∗</sup>, b<sup>∗</sup>, u<sup>∗</sup>, v<sup>∗</sup>, GA), the RGBcanopy indices (hue, u<sup>∗</sup>, GA, GGA) as well as NDVIground and NDVIaerial correlated more strongly with the leaf C/N ratio than they did with leaf N (Table S2). Finally, SLA correlated strongly with the RGBleaf indices GA and GGA, and slightly with both NDVIs.

The relationships between leaf N, N/LA, C/N, δ <sup>13</sup>C, δ <sup>15</sup>N, and SLA with GY across genotypes within N fertilization treatments were almost all non-significant except for leaf N in the 160N treatment (**Table 4**). Regarding the genotypic correlations within each N fertilization level of these leaf traits with leaf N, only the leaf N derived parameters (C/N and N/LA) were significantly correlated (**Table 5**).

## DISCUSSION

#### Crop Monitoring and Phenotyping Parameters for GY Estimation

As previously found in other studies in wheat grown under different stress conditions (Casadesus et al., 2007; Morgounov et al., 2014; Vergara-Diaz et al., 2015), the RGBcanopy indices (from BreedPix software) measured at flowering were strongly correlated with GY. RGB-based indices may perform far better than NDVI for GY prediction, as has recently been described under water and biotic stresses in wheat (Elazab et al., 2015; Vergara-Diaz et al., 2015; Zhou et al., 2015). The lower accuracy of NDVI in comparison to digital RGB indices can be explained in several ways. On the one hand, **Figure 2** clearly shows that the variability in the canopy NDVI values at ground level is small, with more than 90% of values being in the range 0.5–0.8 and with the NDVI values in the low N treatments being already relatively high (e.g., average NDVIground = 0.57 in the 0N treatment). Therefore, the NDVI values remained almost unchanged as GY increased from 4 to 13 Mg ha<sup>−1</sup>. These results support the previously reported saturation of reflectance spectra in the red and near-infrared regions, such that increasing leaf area does not involve a parallel increase in NDVI values (Hobbs, 1995; Elazab et al., 2015). Thus, the relationship between NDVI and aerial biomass saturates as canopies become denser (i.e., LAI > 4) and, as a consequence, the relationship between the NDVI and GY also worsened as GY increased. Moreover, near-infrared reflectance is sensitive to canopy architecture variations (Gitelson et al., 2002), which surely affected NDVI measurements in maize canopies. The use of multi-angular spectral data may solve these problems by capturing the scattering of sunlight by vegetation, which enables the assessment of three-dimensional vegetation structures (Hasegawa et al., 2010). Whereas this approach may improve the estimation of NDVI (and other spectral indices) for phenotyping, the increasing complexity (i.e., more time and resources needed) of the method makes it less feasible as a low-cost alternative.
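The saturation argument can be made concrete with the standard NDVI definition, NDVI = (NIR − Red)/(NIR + Red). In the sketch below the reflectance values are hypothetical and chosen only to illustrate the arithmetic: once red reflectance is already low, a further decrease changes the index very little.

```python
def ndvi(nir, red):
    """Normalized difference vegetation index from NIR and red reflectance."""
    return (nir - red) / (nir + red)


if __name__ == "__main__":
    # Hypothetical canopy reflectances (fractions between 0 and 1).
    for label, nir, red in [
        ("sparse canopy", 0.30, 0.12),
        ("closed canopy", 0.45, 0.05),
        ("denser canopy", 0.50, 0.04),
    ]:
        print(f"{label}: NDVI = {ndvi(nir, red):.2f}")
```

With these illustrative numbers the index rises from about 0.43 to 0.80 between the first two cases but only to about 0.85 for the third, which mirrors the narrow range of NDVI values discussed above.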

For its part, the range of variability in the RGBcanopy index GA was much wider (only 63% of values were in the range 0.5–0.8) and GA values in the low N treatments were somewhat smaller (average GA = 0.46 in the 0N treatment) than those of the NDVI, and in fact GY correlated much better with GA than with the NDVI. Even so, the RGBcanopy indices also seemed to saturate for high GY, but to a lesser extent than the NDVI, because they mainly depend on changes in pigment concentration and the canopy LAI is less affected in the visible region than in the NIR region (Casadesus et al., 2007; Elazab et al., 2015).

TABLE 3 | Simple regression models obtained with different Vegetation Indices (the spectroradiometric indices NDVIaerial, NDVIground, and the RGBcanopy indices GA and GGA) in the P trial, explaining Grain Yield (GY) variation across nitrogen fertilization levels, were used for GY estimation in the S trial.

The fit of the estimated Grain Yield (GY est.) to the experimental Grain Yield (GY exp.) was tested with the determination coefficient (R<sup>2</sup>) for the entire trial and for six yield-contrasting hybrids. P-values were analyzed for all estimated GYs and for the experimental GY using the six selected hybrids. \*\*P < 0.001.

TABLE 4 | Determination coefficients (R<sup>2</sup>) of RGB indices from canopy images (RGBcanopy), aerial NDVI, ground NDVI, leaf chlorophyll content (LCC), the leaf nitrogen concentration on a dry matter basis (Leaf %N), the nitrogen concentration on a leaf area basis (N/LA), the ratio of carbon to nitrogen concentration (C/N), the stable carbon (δ<sup>13</sup>C) and nitrogen (δ<sup>15</sup>N) isotope composition and the specific leaf area (SLA) predicting grain yield in the five N levels separately (0, 10, 20, 80, and 160 kg·ha<sup>−1</sup> NH<sub>4</sub>NO<sub>3</sub>) following linear regression models.

\*P < 0.05; \*\*P < 0.001; ns, non-significant.

In the case of the airborne NDVI data, the correlation with GY was also much lower than that obtained with GA measured on individual plots. In fact, the images from the ADC multispectral camera have around four-fold less resolution than current digital camera technology (3.2 vs. 12 MP in our study, respectively). Although many ADC images were employed to obtain mosaics of the entire field trials, the resolution obtained at the flight altitude generated pixels that mixed pure vegetation, shadow and soil components. Such effects were successfully separated in the imagery collected at the near-canopy level with the RGB camera due to the higher resolution obtained. Altogether, the NDVIaerial provides a much lower amount of information than the GA and other VIs derived from RGB images taken at the plot level.
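To illustrate what the RGB indices capture at the pixel level, the sketch below computes the fraction of pixels whose hue falls within a green range. It assumes the usual definitions of GA (hue between 60° and 120°) and GGA (hue between 80° and 120°), which are not restated in this section; it is not the BreedPix implementation, and the function name and the four example pixels are hypothetical.

```python
import colorsys


def green_fraction(pixels, hue_min_deg, hue_max_deg):
    """Fraction of RGB pixels (0-255 per channel) with hue inside the range."""
    count = 0
    for r, g, b in pixels:
        hue, _saturation, _value = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
        # colorsys returns hue in [0, 1); convert it to degrees.
        if hue_min_deg <= hue * 360.0 <= hue_max_deg:
            count += 1
    return count / len(pixels)


if __name__ == "__main__":
    # Hypothetical pixels: green leaf, yellowish-green leaf, brown soil, gray shadow.
    pixels = [(60, 180, 40), (170, 200, 40), (120, 80, 50), (128, 128, 128)]
    ga = green_fraction(pixels, 60.0, 120.0)   # assumed GA hue range
    gga = green_fraction(pixels, 80.0, 120.0)  # assumed GGA hue range
    print(ga, gga)
```

Because each pixel is classified individually, a high-resolution image can separate vegetation from soil and shadow, whereas a coarse pixel that mixes these components yields a single blended value.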

The LCC correlated strongly and linearly with grain yield across fertilization levels (**Figure 2**). Leaf chlorophyll meters calculate a spectral ratio of leaf transmittance in the near-infrared and red bands, and they were primarily developed to assess N fertilization levels (Fox et al., 1994; Markwell et al., 1995). LCC indirectly predicts GY when a wide range of N conditions is considered, and this is probably due to the relationship between chlorophyll content, leaf N and yield (Argenta et al., 2004).

Concerning the applications in breeding, the determination coefficients within N levels across genotypic means (**Table 4**) support the strength of RGBcanopy indices as phenotyping parameters. These indices were able to indicate the most efficient genotypes in terms of grain yield within each N fertilization level, whereas the NDVI performed much worse as a phenotyping parameter. Although genetic variability in maize hybrids in response to low N doses is high (Wang et al., 1999; Zaman-Allah et al., 2015), it has been scarcely exploited by breeding programs since they mainly focus on breeding for maize performance under favorable conditions (Machado and Fernandes, 2001). In this sense, the phenotyping parameters proposed herein, based on the use of RGB images, can significantly contribute to the selection of maize hybrids that are resilient to low N as well as more responsive to increases in N fertilization. For its part, LCC was unrelated to genotypic GY variation at any of the N levels tested (**Table 4**), and this is in agreement with previous reports in maize that have noted LCC as not always being significantly correlated with genotypic differences in GY (Gallais and Coque, 2005).

### Crop Monitoring and Phenotyping Parameters for Leaf N Assessment

The importance of leaf N concentration for N management and breeding lies not only in its potential contribution to grain N (Gallais and Coque, 2005) but is also due to it being a component of the nitrogen uptake efficiency (Serret et al., 2008). Moreover, leaf N is an indicator of leaf photosynthetic capacity contributing to grain yield (Richards, 2000) as well as a key fodder trait (Van der Wal et al., 2000). Therefore, the estimation of leaf N concentration within a given N fertilization treatment may provide valuable information about the genotypic efficiency for the uptake of N.

Our study highlights the potential of RGB indices for precise crop N management and for phenotyping genotypic performance under a wide range of N conditions. As widely reported, LCC proved to be a very good indicator of leaf N concentration across nitrogen fertilization levels, therefore enabling monitoring of N application (Hirel et al., 2007). However, LCC failed to be effective as a phenotyping parameter, especially at high N-fertilization levels (**Table 5**). In contrast, the RGBleaf indices were the best genotypic predictors of leaf N concentration in the 0 to 80 kg·ha<sup>−1</sup> N range. Thus, RGB indices at the leaf level have the potential to inform breeding programs about tolerance to N-deficiency stress in maize. This is a helpful insight because selection experiments have shown that the maximum genetic advance for low N is achieved when selecting in such N conditions (Gallais and Coque, 2005).

By contrast, in the highest N-fertilization level (160 kg ha<sup>−1</sup>) the RGBleaf indices and LCC were probably saturated because they did not correlate with variations in leaf N concentration. For its part, the NDVIaerial had an irregular trend as it was significantly correlated with changes in leaf N concentration at three of the five N fertilization levels (0, 80, and 160 kg ha<sup>−1</sup>) and these correlations were especially strong in the high N levels. As discussed above, besides some plot variability and soil exposure, the poorer performance of the NDVIaerial may be mainly explained by the relatively poor spectral resolution at the single plot level of the multispectral aerial images. Even so, according to our results this approach seems efficient for its implementation in aerial platforms.

TABLE 5 | Determination coefficients (R<sup>2</sup>) of RGB indices from scanned leaves (RGBleaf), leaf chlorophyll content (LCC), ground NDVI, aerial NDVI, the nitrogen concentration on a leaf area basis (N/LA), the ratio of carbon to nitrogen (C/N), leaf stable carbon (δ<sup>13</sup>C) and nitrogen (δ<sup>15</sup>N) isotopic composition and specific leaf area (SLA) predicting leaf nitrogen concentration on a dry matter basis separately in the five N fertilization levels (0, 10, 20, 80, and 160 kg·ha<sup>−1</sup> NH<sub>4</sub>NO<sub>3</sub>).

\*P < 0.05; \*\*P < 0.001; ns, non-significant.

#### Use of Leaf Analytical Parameters for Crop Management and Phenotyping

Besides the leaf N concentration discussed above, other leaf N parameters such as the N concentration on an area basis (N/LA) and the C/N ratio were strongly associated with GY across N fertilization levels. In the case of the leaf δ<sup>15</sup>N, its value gradually decreased as the N application rate increased. This trend is due to the absorption of N from chemical fertilizers that are highly depleted in <sup>15</sup>N, whereas in the low N treatments plants absorb the N available in the soil, which is usually <sup>15</sup>N-enriched (Bateman et al., 2005; Masuka et al., 2012). However, the genotypic effect was not significant for δ<sup>15</sup>N, which does not support the use of this isotopic signature for maize phenotyping under low N stress. These results disagree with previous studies in wheat where genotypic differences were found under N stress conditions (Araus et al., 2013).

In agreement with previous studies (Dercon et al., 2006), low N induced higher δ <sup>13</sup>C in maize, whereas it decreased in the high N fertilization treatments. This pattern of response appears related to the occurrence of some degree of water stress associated with a larger transpiring area due to nitrogen fertilization. In agreement with previous studies in maize, genotypic differences in leaf δ <sup>13</sup>C may be attributed to differences in transpiration efficiency, but the variation in δ <sup>13</sup>C was unrelated to GY within treatments (Cabrera-Bosquet et al., 2009).

Previous studies noted the relevance of SLA for the compositional and ecophysiological characterization of plants (Reich et al., 1998; Nautiyal et al., 2002). Several authors (Poorter and Evans, 1998; Meziane and Shipley, 2001) have reported a positive relationship between leaf N and SLA (**Table 2**). In turn, changes in SLA may be due to variations in leaf thickness and/or leaf density (Witkowski and Lamont, 1991). Increasing leaf density in low N conditions may be attributed to the increased synthesis of dense tissues such as sclerenchyma and vascular tissues that are rich in nitrogen-free substances (Garnier et al., 1997), whereas leaf thickness seems to have a minor role (Arendonk and Poorter, 1994). However, concerning its phenotyping use, SLA was shown to be homogeneous among the studied maize hybrids and unrelated to GY, as well as within a given N fertilization level, which excludes SLA as a phenotyping trait.

Regarding the relationship between VIs and the C/N ratio, most of the RGB indices (at the canopy and leaf levels) and both NDVI approaches were demonstrated as being even better correlated to the leaf C/N ratio than to leaf N concentration (Table S2). This finding may have considerable economic implications as the C/N ratio informs not only about the crop N status but also about the aerial biomass quality, including digestibility and nutritional quality (Van der Wal et al., 2000). Finally, all VIs and the LCC were better at capturing the differences in leaf N concentration than the amount of N concentration per unit leaf area (Table S2), thus avoiding the effect of leaf thickness or density. This evidence is enhanced by the weak relationship of the digital and spectral indices to SLA. This finding is particularly interesting in the case of LCC (SPAD readings), which has been previously positively correlated with leaf thickness and negatively correlated with SLA in other species (Marenco et al., 2009).

## Implications for Breeding and Crop Management

The tested vegetation indices based on RGB images, and to a lesser extent the NDVI, demonstrated high-throughput capacity for the accurate prediction of several traits that are highly valuable for maize breeders and agronomists, such as grain yield, leaf N concentration and the ratio of carbon to nitrogen, under a wide range of N fertilization levels. Proper N fertilization management may be assisted considerably by using these parameters as decision criteria controlling the expected production and the uptake of N by the above-ground biomass. Beyond this, maize breeding programs may benefit from these findings through their application during the characterization of genotypic performance within N fertilization levels. In this way the selection of the most efficient genotypes in terms of grain production and/or N uptake may respond to the need for maize varieties tolerant to low N stress.

Vegetation indices derived from RGB images proved to be broadly applicable, having previously been employed satisfactorily in other crops under biotic and water stress conditions (Casadesus et al., 2007; Vergara-Diaz et al., 2015). Therefore, since this technique has proven its efficiency for the evaluation of plant growth and leaf color, it is probably applicable to a wide range of biotic and abiotic stresses and crop species. Moreover, our study also supports the use of this technique to assess genotypic differences in grain yield under good agronomical conditions.

Although the RGB indices (obtained from JPEG images) performed well in this study, future research may address the possibility of further improving their accuracy by using input images saved in a lossless format such as TIFF or PNG. Despite the storage inconvenience, their larger capacity (16 bit per pixel instead of 8 bit) may maintain higher quality detail from the visible spectrum. Another important consideration is the effect of changing light conditions when making these outdoor measurements. Despite the good strength and repeatability of the results (Table S1), fluctuating ambient lighting should be considered as a possible source of error. Further research should also be targeted toward the implementation and evaluation of similar RGB phenotyping methods in remotely piloted aerial platforms (Elazab et al., 2016; Rasmussen et al., 2016).

## AUTHOR CONTRIBUTIONS

BP, JC, MZ, and BM managed and directed the maize programme in the Southern Africa regional office of CIMMYT in Harare, Zimbabwe. MZ, PZ, and AH carried out the UAV flights to obtain the aerial measurements. JA, BM, JC, MZ, and OV conducted the field measurements and the collection of samples. AH and PZ processed the aerial images. OV analyzed the samples and other data and wrote the paper under the supervision of JA and with contributions from all the other authors.

### ACKNOWLEDGMENTS

This article was supported by grants from the MAIZE CGIAR Research Program and the Project AGL2013-44147-R from the Ministerio de Economía y Competitividad of the Spanish Government. OV is a recipient of a research grant (APIF) sponsored by the University of Barcelona. We thank the personnel from the CIMMYT Southern Africa Regional Office at Harare for their support during the field measurements and sampling. The trials were planted under the Bill and Melinda Gates funded project Improved Maize for Africa Soils. Finally we thank Dr. Jaume Casadesús for providing the BreedPix software.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2016.00666




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2016 Vergara-Díaz, Zaman-Allah, Masuka, Hornero, Zarco-Tejada, Prasanna, Cairns and Araus. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.