EDITED BY : Jose C. Jimenez-Lopez, Karam B. Singh, Alfonso Clemente, Matthew Nicholas Nelson, Sergio J. Ochatt and Penelope Mary Collina Smith

PUBLISHED IN : Frontiers in Plant Science and Frontiers in Sustainable Food Systems

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-935-9 DOI 10.3389/978-2-88963-935-9

# LEGUMES FOR GLOBAL FOOD SECURITY

Topic Editors:

Jose C. Jimenez-Lopez, Consejo Superior de Investigaciones Científicas (CSIC), Spain

Karam B. Singh, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

Alfonso Clemente, Consejo Superior de Investigaciones Científicas (CSIC), Spain

Matthew Nicholas Nelson, Agriculture and Food, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

Sergio J. Ochatt, INRA UMR1347 Agroécologie, France

Penelope Mary Collina Smith, La Trobe University, Australia

Citation: Jimenez-Lopez, J. C., Singh, K. B., Clemente, A., Nelson, M. N., Ochatt, S. J., Smith, P. M. C., eds. (2020). Legumes for Global Food Security. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-935-9

# Table of Contents

*Editorial: Legumes for Global Food Security*

Jose C. Jimenez-Lopez, Karam B. Singh, Alfonso Clemente, Matthew N. Nelson, Sergio Ochatt and Penelope M. C. Smith

*Development of a Sequence-Based Reference Physical Map of Pea (*Pisum sativum *L.)*

Krishna Kishore Gali, Bunyamin Tar'an, Mohammed-Amin Madoui, Edwin van der Vossen, Jan van Oeveren, Karine Labadie, Helene Berges, Abdelhafid Bendahmane, Reddy V. B. Lachagari, Judith Burstin and Tom Warkentin

Ziwei Zhou, Ido Bar, Prabhakaran Thanjavur Sambasivam and Rebecca Ford

Edelín Roque, Concepción Gómez-Mena, Rim Hamza, José Pío Beltrán and Luis A. Cañas

*Research Progress in Membrane Lipid Metabolism and Molecular Mechanism in Peanut Cold Tolerance*

He Zhang, Jiale Dong, Xinhua Zhao, Yumei Zhang, Jingyao Ren, Liting Xing, Chunji Jiang, Xiaoguang Wang, Jing Wang, Shuli Zhao and Haiqiu Yu

*Altered Expression of an* FT *Cluster Underlies a Major Locus Controlling Domestication-Related Changes to Chickpea Phenology and Growth Habit*

Raul Ortega, Valerie F. G. Hecht, Jules S. Freeman, Josefa Rubio, Noelia Carrasquilla-Garcia, Reyazul Rouf Mir, R. Varma Penmetsa, Douglas R. Cook, Teresa Millan and James L. Weller

*Deciphering Genotype-by-Environment Interaction for Targeting Test Environments and Rust Resistant Genotypes in Field Pea (*Pisum sativum *L.)*

Arpita Das, Ashok K. Parihar, Deepa Saxena, Deepak Singh, K. D. Singha, K. P. S. Kushwaha, Ramesh Chand, R. S. Bal, Subhash Chandra and Sanjeev Gupta

*Resistance to Plant-Parasitic Nematodes in Chickpea: Current Status and Future Perspectives*

Rebecca S. Zwart, Mahendar Thudi, Sonal Channale, Praveen K. Manchikatla, Rajeev K. Varshney and John P. Thompson

Lorenzo Raggi, Leonardo Caproni, Andrea Carboni and Valeria Negri

Lacey-Anne Sanderson, Carolyn T. Caron, Reynold Tan, Yichao Shen, Ruobin Liu and Kirstin E. Bett

*Transcriptional Reprogramming of Pea Leaves at Early Reproductive Stages*

Karine Gallardo, Alicia Besson, Anthony Klein, Christine Le Signor, Grégoire Aubert, Charlotte Henriet, Morgane Térézol, Stéphanie Pateyron, Myriam Sanchez, Jacques Trouverie, Jean-Christophe Avice, Annabelle Larmure, Christophe Salon, Sandrine Balzergue and Judith Burstin

Maria Pazos-Navarro and Parwinder Kaur

*Evaluation of Protein and Micronutrient Levels in Edible Cowpea (*Vigna Unguiculata *L. Walp.) Leaves and Seeds*

Felix D. Dakora and Alphonsus K. Belane

*Utilization of Interspecific High-Density Genetic Map of RIL Population for the QTL Detection and Candidate Gene Mining for 100-Seed Weight in Soybean*

Benjamin Karikari, Shixuan Chen, Yuntao Xiao, Fangguo Chang, Yilan Zhou, Jiejie Kong, Javaid Akhter Bhat and Tuanjie Zhao

*Exploring the Genetic Cipher of Chickpea (*Cicer arietinum *L.) Through Identification and Multi-environment Validation of Resistant Sources Against Fusarium Wilt (*Fusarium oxysporum *f. sp.* ciceris*)*

Mamta Sharma, Raju Ghosh, Avijit Tarafdar, Abhishek Rathore, Devashish R. Chobe, Anil V. Kumar, Pooran M. Gaur, Srinivasan Samineni, Om Gupta, Narendra Pratap Singh, D. R. Saxena, M. Saifulla, M. S. Pithia, P. H. Ghante, Deyanand M. Mahalinga, J. B. Upadhyay and P. N. Harer

*Expression Patterns of Key Hormones Related to Pea (*Pisum sativum *L.) Embryo Physiological Maturity Shift in Response to Accelerated Growth Conditions*

Federico M. Ribalta, Maria Pazos-Navarro, Kylie Edwards, John J. Ross, Janine S. Croser and Sergio J. Ochatt

*Sustainability Dimensions of a North American Lentil System in a Changing World*

Teresa Warne, Selena Ahmed, Carmen Byker Shanks and Perry Miller

*Evaluation and Identification of Promising Introgression Lines Derived From Wild* Cajanus *Species for Broadening the Genetic Base of Cultivated Pigeonpea [*Cajanus cajan *(L.) Millsp.]*

Shivali Sharma, Pronob J. Paul, C.V. Sameer Kumar, P. Jaganmohan Rao, L. Prashanti, S. Muniswamy and Mamta Sharma

*Seed Coat Pattern QTL and Development in Cowpea (*Vigna unguiculata *[L.] Walp.)*

Ira A. Herniter, Ryan Lo, María Muñoz-Amatriaín, Sassoum Lo, Yi-Ning Guo, Bao-Lam Huynh, Mitchell Lucas, Zhenyu Jia, Philip A. Roberts, Stefano Lonardi and Timothy J. Close

*Biotic and Abiotic Constraints in Mungbean Production—Progress in Genetic Improvement*

Ramakrishnan M. Nair, Abhay K. Pandey, Abdul R. War, Bindumadhava Hanumantharao, Tun Shwe, AKMM Alam, Aditya Pratap, Shahid R. Malik, Rael Karimi, Emmanuel K. Mbeyagala, Colin A. Douglas, Jagadish Rane and Roland Schafleitner

*Genomics of Plant Disease Resistance in Legumes*

Prasanna Kankanala, Raja Sekhar Nandety and Kirankumar S. Mysore

*Genome Wide Association Study and Genomic Selection of Amino Acid Concentrations in Soybean Seeds*

Jun Qin, Ainong Shi, Qijian Song, Song Li, Fengmin Wang, Yinghao Cao, Waltram Ravelombola, Qi Song, Chunyan Yang and Mengchen Zhang

*Genome-Wide Association Mapping for Agronomic and Seed Quality Traits of Field Pea (*Pisum sativum *L.)*

Krishna Kishore Gali, Alison Sackville, Endale G. Tafesse, V.B. Reddy Lachagari, Kevin McPhee, Mick Hybl, Alexander Mikić, Petr Smýkal, Rebecca McGee, Judith Burstin, Claire Domoney, T.H. Noel Ellis, Bunyamin Tar'an and Thomas D. Warkentin

*Physiological Traits for Shortening Crop Duration and Improving Productivity of Greengram (*Vigna radiata *L. Wilczek) Under High Temperature*

Partha Sarathi Basu, Aditya Pratap, Sanjeev Gupta, Kusum Sharma, Rakhi Tomar and Narendra Pratap Singh

*Genotype × Environment Studies on Resistance to Late Leaf Spot and Rust in Genomic Selection Training Population of Peanut (*Arachis hypogaea *L.)*

Sunil Chaudhari, Dhirendra Khare, Sudam C. Patil, Subramaniam Sundravadana, Murali T. Variath, Hari K. Sudini, Surendra S. Manohar, Ramesh S. Bhat and Janila Pasupuleti

George Vandemark, Samadhi Thavarajah, Niroshan Siva and Dil Thavarajah

Anju Rani, Poonam Devi, Uday Chand Jha, Kamal Dev Sharma, Kadambot H. M. Siddique and Harsh Nayyar

# Editorial: Legumes for Global Food Security

Jose C. Jimenez-Lopez<sup>1</sup>\*, Karam B. Singh<sup>2</sup>, Alfonso Clemente<sup>3</sup>, Matthew N. Nelson<sup>2</sup>, Sergio Ochatt<sup>4</sup> and Penelope M. C. Smith<sup>5</sup>

<sup>1</sup> Department of Biochemistry, Cell & Molecular Biology of Plants, Estación Experimental del Zaidín, Spanish National Research Council (CSIC), Granada, Spain, <sup>2</sup> Agriculture and Food, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Perth, WA, Australia, <sup>3</sup> Department of Physiology and Biochemistry of Animal Nutrition, Estación Experimental del Zaidín, Spanish National Research Council (CSIC), Granada, Spain, <sup>4</sup> Agroécologie, AgroSup Dijon, INRAE, Université de Bourgogne, Dijon, France, <sup>5</sup> Legumes for Sustainable Agriculture, School of Life Sciences, La Trobe University, Melbourne, VIC, Australia

Keywords: legumes, food security, legume breeding, sustainable agriculture, climate resilience, genetic resources, environmental stresses and physiology

#### **Editorial on the Research Topic**

#### **Legumes for Global Food Security**

#### Edited by:

Susana Araújo, New University of Lisbon, Portugal

#### Reviewed by:

Eric Von Wettberg, University of Vermont, United States

#### \*Correspondence:

Jose C. Jimenez-Lopez josecarlos.jimenez@eez.csic.es

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 07 May 2020 Accepted: 05 June 2020 Published: 07 July 2020

#### Citation:

Jimenez-Lopez JC, Singh KB, Clemente A, Nelson MN, Ochatt S and Smith PMC (2020) Editorial: Legumes for Global Food Security. Front. Plant Sci. 11:926. doi: 10.3389/fpls.2020.00926

Global climatic change combined with population growth is imposing huge pressure on the demand for agronomic resources. These effects are major threats to food security, having a great impact on the agroecosystem and generating various abiotic and biotic stresses that, in turn, trigger many physiological and metabolic disorders in plants. These stresses reduce crop yields at precisely the time when they need to increase to meet the demands of the growing population. One of the major scientific and agronomic challenges of this century is to understand and, when possible, withstand stress so that yields are maintained, even under stressful conditions. This special issue brings together a range of scholarly review and research articles focused on legume crops, key components of healthy diets and productive crop rotations. Here we summarize some of the highlights derived from the 36 articles published in this special issue.

Ribalta et al. provided novel information on the impact of growing conditions on the progress of seed development and maturation, and also analyzed the endogenous hormone accumulation across diverse pea genotypes, thereby providing further insights into the mechanism of hormonal regulation of legume seed development and in vitro precocious germination.

Rani et al. have reviewed the literature relevant to the development of climate-resilient chickpea through the exploitation of biotechnological and molecular approaches for the generation of novel genotypes with an improved resistance to extreme temperatures and drought.

Basu et al. took a physiological approach to explore heat tolerance and grain filling in Vigna radiata. They measured response to heat stress during the sensitive reproductive phase over 3 years in two field locations in a panel of 116 accessions. They focused on a subset of 17 contrasting accessions to perform heat stress experiments in controlled glasshouse conditions. The most promising accessions could be distinguished using a set of 11 PCR-based markers. Further work will be required to explore the genetics of heat tolerance during reproduction in this species.

Nair et al. have comprehensively reviewed the abiotic and biotic stresses that affect Vigna radiata, many of which will be relevant under climate change, and addressed the challenges of breeding more resilient lines. Further breeding utilizing the available molecular technologies will be essential to make the most of the advantages of this legume, which is an important source of protein for human nutrition.

Working with peanut, Tian et al. examined the effect on salinity tolerance of priming 4-week-old seedlings with the green volatile (Z)-3-hexen-1-yl acetate (Z-3-HAC), comparing one salt-sensitive and one salt-tolerant peanut genotype. Z-3-HAC-primed seedlings exhibited increased relative water content, net photosynthetic rate, maximal photochemical efficiency of photosystem II, activities of the antioxidant enzymes and osmolyte accumulation under salt stress, coupled with significantly reduced reactive oxygen species, electrolyte leakage, and malondialdehyde content compared to non-primed plants. Z-3-HAC also increased the total length, surface area, and volume of roots under salt conditions. Thus, Z-3-HAC generated a priming-induced modification of the photosynthetic apparatus, antioxidant systems, osmoregulation, and root morphology, protecting the peanut seedlings from salinity.

Working with M. truncatula and tobacco cell suspensions, Elmaghrabi et al. observed that under high NaCl levels both cell and nuclear size decreased but were not useful markers of cell survival under salt stress. In contrast, nuclear marginalization, observed for the first time concomitant with salinity in plant cells, could be a novel and helpful morphological indicator for acquisition of salinity tolerance and may be a common response across eukaryotes.

Legume crops are valued in crop rotations in part due to their ability to raise soil N levels through symbiotic nitrogen fixation (SNF) with rhizobial species. Although extreme temperatures may have detrimental effects on growth and development, alfalfa is a legume crop known for its climate resilience. Liu et al. studied the role of symbiosis with rhizobia in the plant's performance under low-temperature stress conditions by comparing plants with active nodules, inactive nodules or no nodules at all. They found that plant survival was higher in those with active nodules. Irrespective of whether nodules were active or not, nodulated plants accumulated more soluble proteins and sugars, compared to plants without nodules, which exhibited a greater activity of oxidation protective enzymes; rhizobia nodulation enhanced the tolerance of plants to low temperatures through an alteration of the expression of regulatory and metabolism-associated genes.

The common use of N fertilizers in modern agriculture has raised the N content of many soils and may have led to weakened selection for SNF efficiency in modern legume breeding. This hypothesis was tested in common bean by Wilker et al., who compared the SNF efficiency and agronomic performance under low soil N levels of 19 modern cultivars (bred under high soil N conditions) with those of 25 heirloom varieties (bred under lower soil N conditions). There was wide genetic variation for SNF efficiency but, on average, heirloom bean varieties were no more SNF efficient than modern cultivars, although the best performer was an heirloom variety. The authors advocated the incorporation of heirloom varieties into modern bean breeding programs.

Phenological adaptation is a key aspect of crop productivity and is highly relevant to food security in the light of changing climates. Furthermore, flowering time is a key trait in breeding and crop evolution, due to its importance for adaptation to different environments and for yield. The molecular control of flowering is now well understood in the model plant Arabidopsis. However, despite the importance of legumes for food security, there are large gaps in our understanding of how phenology is controlled at the molecular level in legumes. Zhang L. et al. used a transgenic approach to investigate the mode of action of a homologue of the Arabidopsis photoperiod response gene CDF in Medicago. Rather than acting to suppress expression of the floral integrator gene FT via the photoperiod gene CO (as is the case in Arabidopsis), CDF appears to directly suppress FT independently of CO in Medicago.

Ortega et al. analyzed two different inbred populations to examine the genetic control of domestication-related differences in flowering time and growth habit between domesticated chickpea and its wild progenitor Cicer reticulatum. A single major quantitative trait locus for flowering time under short-day conditions [Days To Flower (DTF)3A] was mapped to a 59-gene interval on chromosome three containing a cluster of three FT genes, which collectively showed upregulated expression in domesticated relative to wild parent lines. They point to derepression of this specific gene cluster as a conserved mechanism for achieving adaptive early phenology in temperate legumes.

The exploitation of hybrid vigor is common across many grain and vegetable crops, yet remains under-exploited in legume crops. Hybrid vigor can increase grain yield, broaden adaptation and improve weed competitiveness. A major hindrance to the development of hybrid legume varieties is the lack of male-sterility systems for hybrid seed production. In this regard, the production of engineered male sterile plants by expression of a ribonuclease gene under the control of an anther-specific promoter, i.e., ENDOTHECIUM 1 (PsEND1), or a pollen-specific promoter has proven to be an efficient way to generate pollen-free elite cultivars. Roque et al. studied the genetic control of flower development in legumes and several genes that are specifically expressed in a determinate floral organ. Using genetic constructs carrying the PsEND1 promoter fused to the uidA reporter gene and to the barnase gene produces full anther ablation at early developmental stages, preventing the production of mature pollen grains in all plant species tested. Additional effects with interesting biotechnological applications include the redirection of resources to increase vegetative growth, the reduction of the need for deadheading to extend the flowering period and the elimination of pollen allergens in ornamental plants. The PsEND1::barnase-barstar construct could also be useful to generate parental lines in hybrid breeding approaches to produce new cultivars in different legume species.

Heat stress during flowering has a detrimental effect on legume seed yield, mainly due to irreversible loss of seed number. In this regard, Liu et al. provided an overview of the developmental and physiological basis of controlling seed setting in response to heat stress, and showed that the entire seed setting process in legume crops, including male and female gametophyte development, fertilization and early seed/fruit development, is sensitive to heat stress, particularly male reproductive development.

In pea seeds, an important source of protein for food and feed, N partitioning is a key component for seed quality and yield. Larmure and Munier-Jolain investigated the effect of temperature on N partitioning during seed filling. High temperatures significantly reduce the amount of N in mature seeds. This appears to be a result of reduced sink strength for N and reduced duration of seed filling. N seems not to be efficiently remobilized from leaves; this was particularly obvious for nitrate-fertilized plants, in which more nitrate was assimilated at high temperatures but was not mobilized into the seeds.

Nutrient remobilization was addressed in another study on pea by Gallardo et al. They combined investigation of N remobilization using <sup>15</sup>N labeling and analysis of transcriptional changes occurring as seed filling progresses to identify possible transcriptional regulators of the process. The authors showed a dynamic remobilization of N from leaves at reproductive and vegetative nodes and later from all organs. Their parallel analysis of the same processes in M. truncatula identified regulatory steps that may be shared by both plants.

Cold damage has become the key limiting factor of early sowing. Zhang H. et al. reviewed membrane lipid metabolism and its molecular mechanism, as well as lipid signal transduction in peanut (Arachis hypogaea L.) under cold stress to build a foundation for explaining lipid metabolism regulation patterns and physiological and molecular response mechanisms during cold stress and to promote the genetic improvement of peanut cold tolerance.

The multidimensional nature of plant-pathogen interactions and the production of disease-resistant crop plants that are resilient to climate change are major agricultural challenges currently under thorough investigation. The manuscript by Kankanala et al. reviewed how genomic approaches are increasing our understanding of plant-pathogen interactions in legumes. This is an important and timely review given the major losses most legume crops face annually due to disease issues. They comprehensively covered a range of topics in terms of legume crops and diverse pathogens, with a major focus on transcriptomic studies. These studies have greatly expanded in recent years due to the increasing affordability of next generation sequencing approaches and the production of reference genomes for many legume crops, helping to identify a number of potentially key genes for both resistant and susceptible interactions. The review also looked at how genomic approaches will facilitate breeding for resistance to pathogens in legumes by describing some of the molecular tools to incorporate defense related traits into breeding programs.

The manuscript by Nay et al. analyzed disease resistance in common bean to angular leaf spot, an important disease worldwide that is caused by the fungal pathogen Pseudocercospora griseola. They looked at 316 common bean lines representing a diversity set, under both glasshouse and field conditions, with the latter taking place at multiple sites in South America and Africa. They used genotyping by sequencing and genome-wide association mapping to study the response of the common bean lines to different races of the pathogen. In contrast to an earlier work, which had identified 5 significant resistance loci, this comprehensive study found only 2 to be important, Phg-2 and Phg-4, with Phg-2 being effective against multiple races.

Das et al. assessed the magnitude of environmental and genotype-by-environment interaction effects on rust resistance in field pea genotypes, a trait notably influenced by environmental factors. They identified several "ideal" genotypes, such as IPF-2014-16, KPMR-936 and IPF-2014-13, which can be recommended for release and exploited in a resistance breeding program for the region confronting field pea rust.

Chaudhari et al. screened a set of 340 diverse peanut genotypes for late leaf spot (LLS) and rust resistance and yield traits across three locations in India under natural and artificial disease epiphytotic conditions. The study revealed significant variation among the genotypes for LLS and rust resistance in different environments. These data revealed significant environment (E) and genotype × environment (G×E) interactions for both diseases, indicating differential response of genotypes in different environments. The increase in pod yield as a consequence of resistance to foliar fungal diseases suggests the possibility of considering "foliar fungal disease resistance" as a must-have trait in all the peanut cultivars that will be released for cultivation in rainfed ecologies in Asia and Africa.

Zhou et al. investigated differential expression of 10 Resistance Gene Analogs (RGAs), which are key factors in the recognition of plant pathogens and the signaling of inducible defenses, among cultivated chickpea varieties which are resistant or susceptible to the foliar disease Ascochyta blight caused by the fungus Ascochyta rabiei (syn. Phoma rabiei). They found significant differential expression of four RGAs that were consistently upregulated in the most resistant genotype, ICC3996, immediately following inoculation, when spore germination began and ahead of penetration into the plant's epidermal tissues. These represent clear targets for future functional validation and potential for selective resistance breeding for introgression into elite cultivars.

Nair et al. reviewed the progress and potential for genetic improvement of mung bean for resistance to biotic stresses, including fungal and bacterial pathogens, viruses and insects, and, as in their analysis of abiotic stress, discussed the constraints to breeding to overcome these pests and pathogens.

The manuscript by Zwart et al. provided a detailed review of resistance to nematodes in chickpea. A range of nematode pests are major problems for chickpea with combined annual yield losses of around 14% from root-knot, cyst and root-lesion nematodes. Resistance to these nematode species in cultivated chickpea (Cicer arietinum) is limited due to the narrow genetic diversity but, as detailed in this comprehensive review, good levels of resistance exist in a number of wild chickpea species. However, barriers to interspecific hybridization hinder the use of some of these wild species as sources of nematode resistance, although others such as C. reticulatum and C. echinospermum have been valuable sources of nematode resistance genes as well. The review also discusses the use and potential of genome-assisted breeding strategies to improve nematode resistance in chickpea.

Genetic and genomic resources of grain legumes are strategic and valuable tools at the forefront of research worldwide, bringing the knowledge and opportunity to facilitate the identification of specific germplasm, trait mapping and allele mining to more effectively develop biotic- and abiotic-stress-resistant and high-quality grains for food and feed.

One of the main yield-determining traits under stress conditions is seed weight. In this sense, Karikari et al. have studied the genetic basis of 100-seed weight for the development of new improved soybean cultivars. They evaluated a recombinant inbred line population (NJIR4P) in four different environments using a high-density interspecific linkage map, which allowed them to detect 19 stable QTLs distributed on 12 chromosomes in all individual environments plus combined environments, seven of which were minor (R<sup>2</sup> < 10%) but novel, while eight were stably identified in more than one environment. Of the 12 QTLs detected in this study that co-localized with previously reported ones having narrow genomic regions, only 2 were major (R<sup>2</sup> > 10%). Beneficial alleles of all identified QTLs were derived from the cultivated soybean parent (Nannong4931). Based on PANTHER (Protein ANalysis through Evolutionary Relationships), gene annotation information, and literature searches, 29 genes within 5 stable QTLs were predicted to be possible candidate genes regulating seed weight/size in soybean. Although their role in seed development needs further validation, this work underlined the considerable scope still available for the genetic improvement of 100-seed weight in soybean using candidate gene mining and subsequent marker-assisted breeding.

Pea has been studied as a genetic model since the eighteenth century, with key contributions to genetics and the development of the basic principles of heredity. The pea genome is characterized by its large size (∼4.45 gigabases), of which ∼85% is composed of highly repetitive sequences. In this topic, Gali, Tar'an et al. report the construction of a sequence-based physical map of the pea genome using whole-genome profiling (WGP). This study provides a very valuable dataset that will serve as a framework to obtain a reference pea genome sequence to further explore the genes governing major traits, including those influencing seed yield and seed quality.

Raggi et al. focused on the genetic control of phenology in common bean. They recorded flowering date in a panel of 192 inbred lines developed from diverse European landraces at two sites over two seasons. They genotyped the panel using a RADseq approach and performed a genome wide association study (GWAS) to identify seven candidate genes that could potentially be used as selective markers to finely control flowering in bean breeding programs.

Gali, Sackville et al. reported a genome-wide association study (GWAS) in field pea. They analyzed 135 pea accessions from a range of countries across all continents. The focus was on agronomic and seed-related traits, and the accessions were first characterized using genotyping-by-sequencing (GBS), from which a final set of 16,877 high-quality SNPs was selected for marker-trait association analysis. This led to the identification of many SNPs with significant association with specific traits. In some cases, these mapped to QTLs previously associated with the specific trait. Overall, this large study generated resources that have potential use in marker-assisted selection for accelerating pea cultivar improvements.

Sanderson et al. described their online legume genetic and genomic resource, KnowPulse, which they have developed to serve legume breeding and genetic research communities. The database hosts phenotypic, genotypic and genomic data for chickpea, common bean, field pea, faba bean and lentil, which can be queried by a range of visualization and data exploration tools. Built on an open-source platform, it is amenable to community collaboration, which will help to ensure its ongoing relevance and usefulness. This kind of resource is vital for linking independent studies and deriving maximum value from costly datasets generated globally.

Legumes play an important role in the sustainability of agricultural and food systems, contributing to soil fertility and environmental protection, as well as to food safety and nutrition. Under this framework, the perception of Lens culinaris producers and consumers of North America has been evaluated by Warne et al., following agronomic, economic, and nutritional criteria. In a survey of producers, the main agroeconomic reason to introduce lentil in production systems was to diversify crop rotation in order to capitalize on dryland production and serve as a cash crop. Diversifying crop rotation improves agricultural system robustness, increasing system resistance to biotic stresses and resilience to abiotic disturbances favoring the constancy of crop productivity. According to consumers' perception, the main reasons to include lentil in their eating habits are to improve nutrition, the satiety feeling after intake and to support a plant-based diet. In agreement with that, lentils like other legumes are considered good sources of proteins, starch, fiber, vitamins, and minerals. Scientific evidence has demonstrated that carbohydrates resistant to digestion are the major factors responsible for both low glycaemic index of legume foods and consumers' feeling of satiety. Finally, lentils might take part as a component of a plant-based diet being an inexpensive and rich source of high-quality proteins to assure a balanced and healthy diet. Interestingly, the growing interest of consumers and nonconsumers to increase lentil consumption seems to be based on environmental, economic and nutritional reasons. Suitable policy actions might help to address emerging challenges and concepts and open future opportunities in order to promote cultivation and increase lentil consumption.

The review manuscript by Ojiewo et al. looked at advances in research for nutritional quality and health benefits of groundnut (Arachis hypogaea L.). Groundnut is an important global crop both as a food and as a valuable source of oil. This substantial review focused on breeding and genetic engineering approaches to improve various traits in groundnut, including aflatoxin resistance, allergen issues and increased oleic acid levels. The review also discussed important social approaches that are needed in this area and current progress, including the ongoing efforts to improve distribution of good quality seed to smallholder farmers in many parts of the world.

Consumers of pulse crops in many markets highly value the appearance of the grain, with seed coat color and patterning being key traits. Herniter et al. investigated the genetics of seed coat patterning in cowpea using quantitative trait locus (QTL) and candidate gene approaches. They identified three loci with candidate genes (basic helix–loop–helix (bHLH), WD-repeat and E3 ubiquitin ligase genes) and developed a model to show how they interact to give the observed seed coat patterning.

Furthermore, Dakora and Belane suggest that identifying cowpea genotypes that can enhance protein accumulation and micronutrient density in edible leaves and seed through breeding has the potential to overcome protein-calorie malnutrition and trace element deficiency in rural Africa.

Taken together, the 36 articles reported in this special issue represent a substantial contribution to the advancement in our understanding and breeding of climate-resilient legumes, and we hope will lead to improved global food security in the longer term.

# AUTHOR CONTRIBUTIONS

JJ-L, KS, AC, MN, SO, and PS have written, reviewed and edited the original draft. All authors have approved the final manuscript.

# FUNDING

This work was supported by MINECO — Spanish Government grant ref.: BFU2016-77243-P, Ramón y Cajal RYC-2014-16536 to JJ-L and by European Research Program MARIE CURIE (FP7-PEOPLE-2011-IOF) grant ref.: PIOF-GA-2011-301550 to JJ-L and KS; by the Grains Research and Development Corporation, including current project #9176622 to KS; by the MINECO-AEI (ERDF co-financed grant AGL2017-83772-R) to AC; and by the grant ARC Industrial Transformation Research HUB IH140100013 Legumes for Sustainable Agriculture to PS.

**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer EW declared a past co-authorship with one of the authors MN to the handling editor.

Copyright © 2020 Jimenez-Lopez, Singh, Clemente, Nelson, Ochatt and Smith. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

fpls-10-00323 March 13, 2019 Time: 18:18 # 1

# Development of a Sequence-Based Reference Physical Map of Pea (Pisum sativum L.)

Krishna Kishore Gali<sup>1</sup> , Bunyamin Tar'an<sup>1</sup> , Mohammed-Amin Madoui<sup>2</sup> , Edwin van der Vossen<sup>3</sup> , Jan van Oeveren<sup>3</sup> , Karine Labadie<sup>3</sup> , Helene Berges<sup>4</sup> , Abdelhafid Bendahmane<sup>5</sup> , Reddy V. B. Lachagari<sup>6</sup> , Judith Burstin<sup>7</sup> and Tom Warkentin<sup>1</sup> \*

<sup>1</sup> Crop Development Centre, University of Saskatchewan, Saskatoon, SK, Canada, <sup>2</sup> Atomic Energy and Alternative Energies Commission (CEA), Genomics Institute (IG), Évry, France, <sup>3</sup> Keygene N.V., Wageningen, Netherlands, <sup>4</sup> INRA-CNRGV, Castanet-Tolosan, France, <sup>5</sup> INRA/CNRS – URGV, Évry, France, <sup>6</sup> AgriGenome Labs Pvt. Ltd., BTIC, MN iHub, Shamirpet, India, <sup>7</sup> J. Burstin, INRA, UMRLEG, Dijon, France

#### Edited by:

Alfonso Clemente, Consejo Superior de Investigaciones Científicas (CSIC) Granada, Spain

#### Reviewed by:

Steven B. Cannon, Agricultural Research Service (USDA), United States Martin Mascher, Leibniz-Institut für Pflanzengenetik und Kulturpflanzenforschung (IPK), Germany

> \*Correspondence: Tom Warkentin tom.warkentin@usask.ca

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 24 January 2019 Accepted: 28 February 2019 Published: 15 March 2019

#### Citation:

Gali KK, Tar'an B, Madoui M-A, van der Vossen E, van Oeveren J, Labadie K, Berges H, Bendahmane A, Lachagari RVB, Burstin J and Warkentin T (2019) Development of a Sequence-Based Reference Physical Map of Pea (Pisum sativum L.). Front. Plant Sci. 10:323. doi: 10.3389/fpls.2019.00323

Whole genome profiling (WGP) is a sequence-based physical mapping technology that uses sequence tags generated by next generation sequencing to construct bacterial artificial chromosome (BAC) contigs of complex genomes. The physical map provides a framework for assembly of the genome sequence and information for localization of genes that are difficult to find through positional cloning. To address the challenges of accurate assembly of the pea genome (∼4.2 Gb, of which approximately 85% is repetitive sequence), we have adopted the WGP technology for assembly of a pea BAC library. Multi-dimensional pooling of 295,680 BAC clones and sequencing the ends of restriction fragments of pooled DNA generated 1,814 million high quality reads, of which 825 million were deconvolutable to 1.11 million unique WGP sequence tags. These WGP tags were used to assemble 220,013 BACs into contigs. Assembly of the BAC clones using the modified Fingerprinted Contigs (FPC) program resulted in 13,040 contigs, consisting of 213,719 BACs, and 6,294 singleton BACs. The average contig size is 0.33 Mbp and the N<sub>50</sub> contig size is 0.62 Mbp. WGP™ technology has provided a robust physical map of the pea genome, which would have been difficult to assemble using traditional restriction digestion-based methods. This sequence-based physical map will be useful to assemble the genome sequence of pea. Additionally, the 1.1 million WGP tags will support efficient assignment of sequence scaffolds to the BAC clones, and thus efficient sequencing of BAC pools covering targeted genome regions of interest.

Keywords: bacterial artificial chromosome, fingerprinted contigs, Pisum sativum, sequence-based physical map, whole genome profiling

# INTRODUCTION

Field pea (Pisum sativum L.) is an important grain legume crop, which was domesticated ∼7000 years ago (Ambrose, 1995; Abbo et al., 2010). The crop is valuable both for human nutrition and as animal feed. Gregor Mendel, the father of genetics, used pea as a model plant to uncover the fundamental principles of inheritance mainly because of the easily observable phenotypes and genotypes. However, understanding of quantitative traits and use of genomic tools for breeding is partly restricted by the large expected genome size of 3,947 to 4,397 Mbp/1C (Arumuganathan and Earle, 1991) and the occurrence of highly repetitive sequences in the pea genome. It is estimated that ∼85% of the pea genome consists of repetitive sequences (Murray et al., 1978). The majority of pea repetitive DNA is made of LTR retrotransposons, which alone were estimated to contribute 20–33% of the genome (Macas et al., 2007). In the current study, we have undertaken construction of a sequence-based physical map of pea to address the challenge in the assembly of these repetitive sequences and overcome the shortcomings of traditional restriction digestion-based physical maps.

Whole genome profiling (WGP) is a sequence-based physical mapping technology for construction of bacterial artificial chromosome (BAC) contigs of complex genomes (van Oeveren et al., 2011). WGP technology is based on generation of short sequence tags from terminal ends of restriction fragments of individual BAC clones, followed by assembly of BAC clones into contigs based on shared regions containing identical sequence tags. WGP uses sequence tags generated by next generation sequencing (NGS), is a powerful alternative to traditional DNA fingerprinting-based physical mapping technologies, and simultaneously generates a partial genome sequence. Two-dimensional or multi-dimensional BAC clone pooling is an effective strategy for DNA preparation and sequencing to reduce the costs of sample preparation. The sequence-based physical map also provides information for localization of genes that are difficult to find through positional cloning. WGP was initially tested in Arabidopsis thaliana by using ∼6,100 BAC clones and the assembly order of BAC contigs was verified with the genome sequence, wherein 98% of the BAC clones were assembled correctly (van Oeveren et al., 2011). Following this validation, WGP was used to generate sequence-based physical maps and genome assemblies of ∼30 crop species (Ariyadasa and Stein, 2012; Sierro et al., 2013). WGP has been used for generation of physical maps of some individual wheat chromosomes, whose sequences are highly complex and repetitive (Philippe et al., 2012; Poursarebani et al., 2014). Recently, WGP technology was adopted by the International Wheat Genome Sequencing Consortium to generate new sequence information that will improve the quality and utility of physical maps for 15 chromosomes<sup>1</sup>.

To address the challenges of accurately assembling the massive and complex pea genome, we, as part of the international pea genome sequencing consortium, adopted WGP technology in the current study to assemble pea BAC clones into a physical map.

# MATERIALS AND METHODS

#### BAC Libraries

A total of 295,680 BAC clones derived from pea cv. Cameor, available at the CNRGV, Toulouse, France, with an average insert size of 95 kb and approximately 6.7-fold genome coverage, were used to construct a sequence-based physical map<sup>2</sup>.

<sup>1</sup>www.wheatgenome.org

#### Whole Genome Profiling Generation of BAC Sequence Tags

The BAC clones were subjected to WGP as described by van Oeveren et al. (2011). Pooling of BAC clones and DNA extraction was done by Amplicon Express (Pullman, WA, United States). BAC clones stored in 384-well plates were pooled in a three-dimensional format, into row, column, and split-box pools, with each pool type consisting of 48, 48 and 64 clones, respectively. Illumina grade BAC DNA (high concentration and low E. coli) was extracted from the pooled BAC clones using an optimized alkaline lysis method. The DNA was digested with HindIII and MseI restriction enzymes, ligated with Illumina adaptor sequences containing barcode sequences as sample identification tags, and PCR amplified. The PCR products were pooled and cluster amplified, and the amplicons were then sequenced from the HindIII restriction site end using the Illumina HiSeq2000 with 100 nt read length. The reads were processed for identification of barcodes and assigned to BAC pools, followed by deconvolution, a process that assigns sequence reads as WGP tags to individual BAC clones. Deconvolution was successful when the WGP tag was detected in exactly one pool in each of the three dimensions of the BAC pools. WGP tags were filtered for sequencing quality and used for contig analysis.
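
The deconvolution rule described above can be illustrated with a minimal Python sketch. It assumes a toy 2 × 2 × 2 pooling grid rather than the actual 48/48/64-clone pools, and the names `deconvolve`, `clone_at`, and `tag_hits` are hypothetical, introduced only for illustration; this is not the pipeline used in the study.

```python
from collections import defaultdict

DIMENSIONS = ("row", "column", "box")


def deconvolve(tag_hits, clone_at):
    """Assign a tag to a clone only if it was seen in exactly one pool per dimension.

    tag_hits: dict mapping tag -> set of (dimension, pool_index) in which it was detected
    clone_at: dict mapping (row, column, box) pool indices -> clone name (assumed layout)
    """
    assigned = {}
    for tag, hits in tag_hits.items():
        by_dim = defaultdict(set)
        for dim, index in hits:
            by_dim[dim].add(index)
        # Exactly one pool in each of the three dimensions, otherwise the tag is skipped.
        if all(len(by_dim[d]) == 1 for d in DIMENSIONS):
            coords = tuple(next(iter(by_dim[d])) for d in DIMENSIONS)
            clone = clone_at.get(coords)
            if clone is not None:
                assigned[tag] = clone
    return assigned


# Toy layout: eight clones addressed by (row, column, box) pool indices.
clone_at = {
    (r, c, b): f"BAC_{r}{c}{b}"
    for r in range(2) for c in range(2) for b in range(2)
}
tag_hits = {
    "TAG_A": {("row", 0), ("column", 1), ("box", 0)},               # unambiguous
    "TAG_B": {("row", 0), ("row", 1), ("column", 1), ("box", 0)},   # two row pools
    "TAG_C": {("row", 1), ("column", 0)},                           # no box pool
}
result = deconvolve(tag_hits, clone_at)
print(result)  # {'TAG_A': 'BAC_010'}
```

Tags that occur in several clones (for example, tags from repeated sequence) hit more than one pool in a dimension and are discarded by this rule, which is one plausible reason why only part of the reads are deconvolutable.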

#### Physical Map Construction

A total of 825 million sequence tags were generated by WGP, of which 1.11 million tags were unique (**Supplementary Table S1**) and corresponded to 220,013 BACs (**Supplementary Table S2**). The unique sequence tags were used for construction of the physical map. These sequence-tagged BACs were used to generate SuperBACs, by grouping all individual BACs with 75% or more similarity, using an improved version of the Fingerprinted Contigs software (FPC; Keygene™). FPC was initially developed for analyzing BAC restriction fragment-based fingerprint data (Soderlund et al., 1997), and the improved version is capable of processing sequence-based BAC fingerprint data. WGP tags from all the grouped BACs were assigned to the SuperBACs. WGP tags were converted into numbers to yield pseudo restriction fragment sizes for analysis using FPC to generate contigs based on BAC clone overlap. The genome coverage of BAC clones, mean contig size, and N<sub>50</sub> contig size were calculated in million base pairs (Mbp) by multiplying FPC band units by the mean distance between two WGP tags.
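
The size calculation at the end of this paragraph can be sketched in Python as follows. The sketch uses the usual definition of N50 and converts FPC band units to Mbp by multiplication; the function names, the toy contig lengths, and the value of `ASSUMED_TAG_DISTANCE_BP` are illustrative assumptions and are not taken from the study.

```python
def n50(sizes):
    """Return the size S such that contigs of size >= S hold at least half of the total."""
    ordered = sorted(sizes, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for size in ordered:
        running += size
        if running >= half:
            return size
    raise ValueError("sizes must be non-empty")


def band_units_to_mbp(band_units, mean_tag_distance_bp):
    """Convert FPC band units to Mbp using the mean distance between two tags."""
    return band_units * mean_tag_distance_bp / 1_000_000


# Hypothetical contig lengths in band units (not data from the study).
toy_contigs_cb = [1532, 900, 600, 300, 100]
# Hypothetical mean distance between two WGP tags, in base pairs.
ASSUMED_TAG_DISTANCE_BP = 3_000

n50_cb = n50(toy_contigs_cb)
n50_mbp = band_units_to_mbp(n50_cb, ASSUMED_TAG_DISTANCE_BP)
print(n50_cb, n50_mbp)
```

Because every size estimate is scaled by the same mean tag distance, any error in that single value propagates proportionally to the coverage, mean contig size, and N50 figures.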

# RESULTS

#### WGP Tag Generation

Multi-dimensional pooling of the 295,680 BAC clones and sequencing the ends of restriction fragments of pooled DNA generated 825 million deconvolutable reads, which constituted 45.5% of the total number of 1814 million high quality reads sequenced (**Table 1**). The deconvolutable reads yielded 1.11 million unique WGP tags and the average number of reads per tag was 96.6. The first 51 nucleotides of each unique sequence tag are presented in **Supplementary Table S1**. These WGP tags were assigned to 220,013 BACs (**Supplementary Table S2**) with an average of 28.7 tags generated per BAC.

<sup>2</sup>http://cnrgv.toulouse.inra.fr/layout/set/print/Library/Pea

TABLE 1 | Summary of whole genome profiling (WGP) input parameters and sequence data processing.

#### Physical Map Construction

The WGP tag data of 1.11 million tags assigned to 220,013 BAC clones were used to assemble individual BAC clones into contigs and SuperBACs using the modified FPC software (Keygene N.V.), which processes sequence-based BAC fingerprint data instead of the fragment mobility information used in the original FPC software (Soderlund et al., 1997). A cut-off value of 1e−50 was used initially to assemble the contigs. The cut-off stringency was reduced step by step, and a final cut-off value of 1e−01 resulted in 13,040 BAC contigs and 6294 BAC singletons. The number of BACs in each of the 13,040 contigs is listed in **Supplementary Table S3** and the BACs in each contig are listed in **Supplementary Table S4**. The estimated N<sub>50</sub> contig size was 42 BACs and the average contig size was 0.329 Mbp. As an example, **Figure 1** shows the largest contig in the assembly, selected based on number of BACs and tags. The BACs are ordered by their position in the contig. Horizontal lines indicate relative BAC length, and positioning of the lines indicates relative position and degree of overlap between BACs. In **Figure 1** (A) only non-buried BACs are shown, i.e., BACs that largely overlap with another BAC in the contig are not displayed, while **Figure 1** (B) shows the same contig with all the buried BACs included. The FPC output file is included as **Supplementary File S1**, which can be opened in the FPC program available at http://www.agcol.arizona.edu/software/fpc/ to view the diagrammatic representation of each contig, including its BACs and their sequence overlaps.

FIGURE 1 | Part of the largest contig in the assembly (Ctg 2178) based on number of BACs and tags. The BACs are ordered by their position in the contig. Horizontal lines indicate relative BAC length and positioning of the lines indicates relative position and degree of overlap between BACs. The scale at the bottom represents the consensus band (CB) scale units. (A) Only non-buried BACs are shown, i.e., a semi-minimal tiling path, meaning that BACs which overlap largely with another BAC in the contig are not displayed. BACs marked with <sup>∗</sup> indicate the presence of one or more buried BACs at this position. (B) Part of the same contig in the assembly showing all the buried BACs. Buried BACs are marked with = or ∼, where = means identical and ∼ means nearly identical. The figures are in CB units; the length of the entire contig is 1532 CB units.

The estimated span of the BAC physical map was 4294 Mbp, which is the same as the total estimated size of the pea genome (**Table 2**). After the deconvolution and filtering of the WGP tags, 27.7% of the BAC clones sequenced were not represented in the contig assembly. The parameters of physical map assembly are presented in **Table 2**.
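
The summary figures reported in this article can be cross-checked with simple arithmetic. The Python sketch below only recomputes values from numbers stated in the text; the variable names are illustrative, and the ∼4.2 Gb genome size from the abstract is used for the fold-coverage estimate.

```python
# Values stated in the text.
total_clones = 295_680        # BAC clones pooled and sequenced
insert_kb = 95                # average insert size (kb)
genome_mbp = 4_200            # ~4.2 Gb genome size quoted in the abstract
bacs_in_contigs = 213_719
singletons = 6_294
contigs = 13_040
mean_contig_mbp = 0.329
map_span_mbp = 4_294
total_reads_million = 1_814
deconvolutable_reads_million = 825

# Derived quantities.
tagged_bacs = bacs_in_contigs + singletons
fold_coverage = total_clones * insert_kb / 1_000 / genome_mbp
pct_in_contigs = 100 * bacs_in_contigs / total_clones
pct_not_in_contigs = 100 - pct_in_contigs
mean_bacs_per_contig = bacs_in_contigs / contigs
span_from_mean_mbp = contigs * mean_contig_mbp
pct_deconvolutable = 100 * deconvolutable_reads_million / total_reads_million
contigs_per_mbp = contigs / map_span_mbp

print(f"tagged BACs:              {tagged_bacs}")
print(f"fold genome coverage:     {fold_coverage:.1f}")
print(f"BACs in contigs (%):      {pct_in_contigs:.1f}")
print(f"BACs not in contigs (%):  {pct_not_in_contigs:.1f}")
print(f"mean BACs per contig:     {mean_bacs_per_contig:.1f}")
print(f"span from mean size (Mbp): {span_from_mean_mbp:.0f}")
print(f"deconvolutable reads (%): {pct_deconvolutable:.1f}")
print(f"contigs per Mbp:          {contigs_per_mbp:.1f}")
```

Under these inputs the recomputed values agree with the reported 220,013 tagged BACs, 6.7-fold coverage, 72.3% and 27.7% of clones inside and outside contigs, 16.4 BACs per contig, 45.5% deconvolutable reads, and about three contigs per Mbp; the span obtained from the rounded mean contig size (about 4290 Mbp) is close to the reported 4294 Mbp.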

# DISCUSSION

The two major steps involved in traditional physical map construction, restriction digestion-based fingerprinting of several-fold genome equivalents of BAC clones and their assembly into contigs, are highly labor-intensive and error prone for a genome as large as that of pea. Several improvements have been made in BAC fingerprinting techniques (Luo et al., 2003) and contig assembly (Frenkel et al., 2010). The introduction of sequence-based WGP technology for physical map construction has made it possible to tag a large number of BAC clones based on short reads generated on NGS platforms and to increase the accuracy of contig assembly (van Oeveren et al., 2011). This technology is particularly useful for large genomes with an abundance of repetitive DNA. Comparison of WGP sequence tags may also provide important biological information such as determination of the ancestral origin of polyploids (Sierro et al., 2013).

TABLE 2 | Whole genome profiling (WGP) metrics for the pea physical map construction using a 50 nt tag length and standard stringency.

<sup>1</sup>This is the mean number of BACs per contig. <sup>2</sup>This number indicates that more than 50% of the contig coverage comprises contigs with at least this number of BACs. <sup>3</sup>This number is the mean contig size in million base pairs. <sup>4</sup>This number indicates that more than 50% of the contig coverage comprises at least this number of million base pairs. <sup>5</sup>The coverage estimate is based upon multiplication of FPC band units of all contigs with the estimated average distance between two tags. Due to this multiplication, the accuracy of the estimated average distance between two tags has a large impact on the result.

The parameters of the pea physical map assembly developed here are comparable to WGP-based physical maps of other crops, i.e., the average number of WGP tags per BAC clone (28.7) generated in this study and the percent of BAC clones represented in the contig assembly (72.3%) were comparable with WGP profiling of other complex genomes such as wheat (Poursarebani et al., 2014). Three contigs per Mbp were detected in the current physical map, in comparison to 2.2, 2.6 and 3.1 contigs per Mbp reported in tobacco (Sierro et al., 2013), tomato, and potato (De Boer et al., 2011), respectively. In the pea physical map assembly, the average number of BACs per contig is 16.4 and the average contig size is 0.33 Mbp in comparison to 34 BACs and 0.46 Mbp in tobacco (Sierro et al., 2013).

The size of the current WGP-based physical map assembly corresponded with the estimated genome size of pea. The significance of this research includes the use of a large number of BAC clones, ∼220,000, in WGP assembly and building a contig assembly near the estimated genome size of 4.2 Gb, considering the high proportion of repetitive sequences. It is to be noted that the span of the physical map is similar to the estimated size of the pea genome even though 27.7% of the BAC clones sequenced were not represented in the contig assembly. This could be because of the physical gaps between the FPC contigs, which will subsequently be verified by comparison with genetic linkage maps and the genome sequence. It is also possible that the vast majority of the unassembled 27.7% of BAC clones were chimeric BACs and are represented by the BACs in contig assemblies in various proportions.

In this research, we have constructed a high quality physical map of pea based on WGP with assembly parameters comparable to WGP assemblies of other crops. Since the map is based on sequenced DNA tags, the physical map provides the skeleton framework for anchoring the genome sequence to obtain a high quality reference genome sequence to explore the genes governing traits and to study the genome features. The recent improvements of optical mapping of genomes in nanochannel arrays (Bionano) (Lam et al., 2012) and the "Chicago" method based on in vitro reconstituted chromatin (Putnam et al., 2016) are further advancements to support physical mapping and sequence assembly in complex genomes and provide substantial improvement in the N<sub>50</sub> contig size. Using the Bionano approach, Staňková et al. (2016) obtained contigs of the short arm of chromosome 7D (7DS; 381 Mb) of bread wheat, with an N<sub>50</sub> value of 1.3 Mb, and identified an ∼800 kb array of tandem repeats.

We have provided information of all the WGP tags in **Supplementary Table S1** and the BACs corresponding to these tags are shown in **Supplementary Table S2**. The map is accessible through the .FPC file (**Supplementary File S1**), and users can view it in FPC output format, by using FPC software. This information will assist users to navigate and identify the BAC clones of their interest. The international consortium for pea genome sequencing is using the WGP-based physical map in conjunction with Bionano optical mapping to anchor and improve the complex genome sequence of pea.

#### DATA AVAILABILITY


The datasets generated for this study can be found in bioRxiv, doi: 10.1101/518563.

#### AUTHOR CONTRIBUTIONS

TW, BT, JB, and EvdV designed the study. JvO and KL performed the sequence and FPC analysis. HB and AB provided the BACs. KG drafted the manuscript. RL contributed to data analysis. All authors contributed to the manuscript review.

#### FUNDING

The study was funded by Saskatchewan Pulse Growers (SPG).

#### REFERENCES


## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00323/full#supplementary-material

TABLE S1 | Unique sequence tags identified by sequencing ends of restriction fragments of 295,680 BAC clones.

TABLE S2 | BAC clones corresponding to the unique sequence tags identified by sequencing ends of restriction fragments of 295,680 BAC clones.

TABLE S3 | Number of BAC clones in each contig built based on the sequence similarities of unique sequence tags.

TABLE S4 | Distribution of BAC clones in contigs built based on the sequence similarities of unique sequence tags.

FILE S1 | Fingerprinted Contig output file to visualize all the BAC contigs and overlap of each BAC in the reported contigs.

mapping and sequencing in the highly complex and repetitive wheat genome. BMC Genomics 13:47. doi: 10.1186/1471-2164-13-47


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Gali, Tar'an, Madoui, van der Vossen, van Oeveren, Labadie, Berges, Bendahmane, Lachagari, Burstin and Warkentin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Effect of Rhizobium Symbiosis on Low-Temperature Tolerance and Antioxidant Response in Alfalfa (Medicago sativa L.)

Yu-Shi Liu<sup>1</sup> , Jin-Cai Geng<sup>2</sup> , Xu-Yang Sha<sup>1</sup> , Yi-Xin Zhao<sup>1</sup> , Tian-Ming Hu<sup>1</sup> \* and Pei-Zhi Yang<sup>1</sup> \*

<sup>1</sup> Department of Grassland Science, College of Animal Science and Technology, Northwest A&F University, Yangling, China, <sup>2</sup> Shaanxi Grassland Workstation, Xi'an, China

#### Edited by:

Sergio J. Ochatt, INRA UMR1347 Agroécologie, France

#### Reviewed by:

Abdelali Hannoufa, Agriculture and Agri-Food Canada (AAFC), Canada Yongzhen Pang, Institute of Animal Science (CAAS), China

#### \*Correspondence:

Pei-Zhi Yang yangpeizhi@126.com Tian-Ming Hu hutianming@126.com

#### Specialty section:

This article was submitted to Plant Physiology, a section of the journal Frontiers in Plant Science

Received: 15 January 2019 Accepted: 08 April 2019 Published: 30 April 2019

#### Citation:

Liu Y-S, Geng J-C, Sha X-Y, Zhao Y-X, Hu T-M and Yang P-Z (2019) Effect of Rhizobium Symbiosis on Low-Temperature Tolerance and Antioxidant Response in Alfalfa (Medicago sativa L.). Front. Plant Sci. 10:538. doi: 10.3389/fpls.2019.00538

Low temperature-induced stress is a major environmental factor limiting the growth and development of plants. Alfalfa (Medicago sativa L.) is a legume well known for its tolerance of extreme environments. In this study, we sought to experimentally investigate the role of rhizobium symbiosis in alfalfa's performance under a low-temperature stress condition. To do this, alfalfa "Ladak+" plants carrying active nodules (AN), inactive nodules (IN), or no nodules (NN) were exposed to an imposed low temperature stress and their survivorship calculated. The antioxidant defense responses, the accumulation of osmotic regulation substances, the cell membrane damage, and the expression of low temperature stress-related genes were determined in both the roots and the shoots of alfalfa plants. We found that more plants with AN survived than those with IN or NN under the same low temperature-stress condition. Greater activity of oxidation protective enzymes was observed in the AN and IN groups, conferring higher tolerance to low temperature in these plants. In addition, rhizobia nodulation also enhanced alfalfa's ability to tolerate low temperature by altering the expression of regulatory and metabolism-associated genes, which resulted in the accumulation of soluble proteins and sugars in the nodulated plants. Taken together, the findings of this study indicate that rhizobium inoculation offers a practical way to promote the persistence and growth potential of alfalfa "Ladak+" in cold areas.

Keywords: alfalfa, low-temperature tolerance, antioxidant response, rhizobium symbiosis, low temperature regulated genes

# INTRODUCTION

Alfalfa (Medicago sativa L.) is a widely cultivated forage crop of substantial economic value that possesses excellent agricultural traits, in that its roots can fix atmospheric nitrogen molecules with the help of symbiotic rhizobia (Hou et al., 2013; Quan et al., 2016; Zhang W. et al., 2017). In turn, the endogenous nitrogen pool accumulated in the root system may enhance the cold-tolerance ability of this plant (Dhont et al., 2006). Yet, little research has actually investigated the effect of rhizobium symbiosis on alfalfa's tolerance of low temperatures. Earlier work demonstrated that inoculated rhizobia improved the productivity and survival of legumes under low temperature conditions (Prévost et al., 1999, 2003). In addition, elevated CO<sub>2</sub> has been shown to stimulate the growth of rhizobium-inoculated alfalfa and to reduce this plant's freezing tolerance (Bertrand et al., 2007).

**Abbreviations:** Cas, the cold acclimation specific protein; CAT, catalase; CBF2, a transcription factor related to the cold tolerance; CorF, the coding gene for galactinol synthase; MDA, malondialdehyde; POD, peroxidase; ProDH, the coding gene for proline dehydrogenase; qRT-PCR, quantitative real-time PCR; ROS, reactive oxygen species; SOD, superoxide dismutase.

Low temperatures adversely affect crop survival, growth, and productivity (Dimkpa et al., 2009; Chinnusamy et al., 2010), typically prompting various physiological and biochemical changes in plants, including alterations to membrane permeability and enzyme activities (Hu et al., 2016). A low temperature elicits the generation of reactive oxygen species (ROS), such as hydrogen peroxide (H<sub>2</sub>O<sub>2</sub>), superoxide radical (O<sub>2</sub><sup>•−</sup>), and hydroxyl radical (OH<sup>•</sup>), causing severe oxidative stress. Oxidative damage not only results in the oxidation of cellular components, which leads to protein dysfunction and DNA damage, but it also harms the cell membrane's lipids. As a key negative product in the signaling network of plants' stress responses, ROS can disrupt the plant membrane structure and then generate malondialdehyde (MDA), a by-product that is highly reactive and able to cause secondary oxidative damage (Gill and Tuteja, 2010; Erdal et al., 2015; Chen et al., 2016; Jin et al., 2016).

To alleviate this oxidative damage, plants have evolved protective enzymatic defense systems to detoxify ROS and to reduce oxidative stress, such as those relying on peroxidase (POD, EC 1.11.1.7), superoxide dismutase (SOD, EC 1.15.1.1), and catalase (CAT, EC 1.11.1.6). These traditional antioxidant enzymes work together to detoxify ROS, but mounting evidence also suggests proline can function as a non-enzymatic antioxidant by regulating osmosis and detoxifying ROS (Trovato et al., 2008). Moreover, the accumulation of osmotic adjustment substances, namely soluble proteins and sugars, can further contribute to the tolerance of low temperature-induced stress in plants (Castonguay et al., 1995).

Under stressful environmental conditions, crop plants activate their stress-responsive genes involved in ROS homeostasis regulation. Examples include ProDH, encoding proline dehydrogenase, which is downregulated when a plant experiences drought or low temperature (Xin and Browse, 1998); the CorF gene, encoding galactinol synthase, which regulates osmosis and maintains membrane integrity by controlling the synthesis of soluble sugars (Liu et al., 2016); Cas, a member of the dehydrin protein family, produced in response to cold or drought stress (Monroy et al., 1993; Wolfraim and Dhindsa, 1993); and CBF2, a transcription factor related to the low-temperature tolerance of plants (Thomashow, 2010; Shu et al., 2017).

Because of its low level of autumn dormancy, the alfalfa "Ladak+" cultivar is often planted in northwestern China, especially in the deserts of Xinjiang. But the air temperatures in these deserts vary greatly from day to night, often reaching highs of 35 °C and lows of −6 °C; this clearly imposes a stress upon these crop plants. We hypothesized that rhizobium symbiosis may improve the low-temperature tolerance of alfalfa "Ladak+" by affecting its physiological and biochemical processes. To test this, we evaluated the cold tolerance of alfalfa "Ladak+" plants with active nodules (AN), inactive nodules (IN), and no nodules (NN), by comparing their respective survival and electrical conductivity under 0 and −6 °C, and their activity of antioxidative enzymes, alterations in osmolyte adjustment, and expression profiles of low temperature-related genes at 0 °C.

# MATERIALS AND METHODS

# Plants, Their Growing Conditions, and Nodulation Treatments

The seeds of alfalfa (Medicago sativa) (Ladak+, United States) were first rinsed with 70% ethanol for 30 s, then with a 0.5% NaClO solution for 15 min, and finally thrice with sterile water. All seeds were germinated on wet filter paper in Petri dishes for 5 days in a plant growth chamber at 25 °C and 70% relative humidity, with a 16 h photoperiod.

Six-day-old seedlings were then individually transplanted into plastic conical pots containing silica sand (100 mesh) that had been sterilized with a 0.5% NaClO solution and rinsed three times with running water. The seedlings were cultivated under a normal day (30 ± 5 °C) and night (20 ± 5 °C) cycle, with a relative humidity that ranged between 55 ± 5% and 70 ± 5%, in the greenhouse of the Grassland Science Department of Northwest A&F University.

When the seedlings reached a height of 10 cm, they were randomly divided into three groups: (I) AN: alfalfa plants inoculated with the Rhizobium meliloti strain Dormal to form AN, supplemented with 1/4 strength nitrogen-free Hoagland solution (Hoagland and Arnon, 1950) daily; (II) IN: alfalfa inoculated with the same Rhizobium meliloti strain Dormal as the AN group, but watered daily with 1/4 strength Hoagland solution (Wang et al., 2016); (III) NN: alfalfa without rhizobia inoculation, watered daily with 1/4 strength Hoagland nutrient solution.

Plant shoots of all three groups were cut at the base of the stem and their biomass weighed on days 60 and 90 of the experiment. After this shoot removal and subsequent plant regrowth, the intended root nodule inoculation treatments were achieved at 120 days: pink nodules were present in the AN roots, white nodules were observed in the IN roots, and no nodules were observed in the NN roots (**Figure 1**). These 120-day-old seedlings were then subjected to the low temperature treatments, as described below.

#### Low Temperature Treatments

The 120-day-old alfalfa seedlings of the AN, IN, and NN groups were exposed to low temperature treatments of 0 and –6 °C in a reconstructed refrigerator. To ensure whole plants (shoot and root parts) were treated at the same given temperature, a temperature controller reduced the ambient air temperature in the refrigerator at a rate of –2.5 °C per hour. Hence it took 10 or 16 h to reach 0 or –6 °C, respectively, after which plants remained at their target temperature for another 8 h. For their physiological index determinations, the shoots and roots of plants were harvested separately at 0, 2, 4, 6, and 8 h after imposing the low temperature stress treatment. The samples were washed with chilled distilled water to remove any sand, then dried with paper towels, and immediately frozen in liquid nitrogen for storage at –80 °C.

FIGURE 1 | Roots of AN, IN, and NN groups. (AN, alfalfa with active nodules; IN, alfalfa with inactive nodules; NN, alfalfa with no nodules.) Yellow arrows show the pink nodules in the roots of the AN group and the white nodules in the roots of the IN group.

#### Survivorship

fpls-10-00538 April 27, 2019 Time: 15:32 # 4

At each sampling time point, 50 treated plants per group were removed at random from the refrigerator. The temperature of plants was restored to room temperature at a rate of +2.5 °C per hour, after which the plants were irrigated with their respective nutrient solutions for another 2 weeks. During the recovery period, those plants which had maintained or regained green coloring on their aboveground parts, or developed new green shoots, were considered to have survived the low temperature stress treatments.
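The survival criterion above reduces to a simple proportion per group and sampling time. A minimal sketch follows, assuming survival is expressed as the percentage of the 50 sampled plants that recovered; the function name and the example count are hypothetical and are not taken from the study's data.

```python
def survival_percent(survivors: int, total: int = 50) -> float:
    """Percentage of sampled plants scored as surviving after the recovery period.

    `total` defaults to 50, the number of plants removed per group at each
    sampling time point; `survivors` is the count that stayed or turned green,
    or produced new green shoots.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= survivors <= total:
        raise ValueError("survivors must lie between 0 and total")
    return 100.0 * survivors / total


# Hypothetical count, for illustration only
print(survival_percent(20))
```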

#### Biochemical Analyses

Fresh alfalfa leaves were used for electrical conductivity testing, whereas frozen samples were used for all other biochemical assays. All spectrophotometric analyses were conducted on a HITACHI spectrophotometer (UV-3900, Japan). Electrical conductivity was determined as previously described by Song et al. (2006) with minor modifications. Briefly, fresh alfalfa leaves after low temperature stress were gathered and soaked in distilled water for 2 h at 4 °C; next, the conductivity value was read as L1 by a conductivity meter DDS-307 (Leici Corporation, China). The mixture was then heated in a boiling water bath for 20 min and the conductivity value L2 was collected once it had cooled down to room temperature. Relative electrical conductivity was calculated as (L1/L2) × 100%. Malondialdehyde (MDA) was measured by using the thiobarbituric acid (TBA) reaction (Guo et al., 2010). Briefly, about 0.5 g of alfalfa tissue was homogenized in 5 ml of a cooled potassium phosphate buffer (pH = 7.8). After centrifuging at 4000 × g for 10 min, 2 ml of supernatant was mixed with 2 ml of 0.6% TBA and then incubated in a boiling water bath for 20 min. This mixture was chilled rapidly and then centrifuged again at 4000 × g for 10 min to remove debris. The absorbances at 532, 600, and 450 nm were measured on a spectrophotometer.
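As an illustration of the relative electrical conductivity formula given above, (L1/L2) × 100%, the following sketch computes the value from the two meter readings. The function name and the example readings are hypothetical; only the formula itself comes from the text.

```python
def relative_conductivity(l1: float, l2: float) -> float:
    """Relative electrical conductivity (%) = (L1 / L2) * 100.

    l1: conductivity of the soaking solution before boiling.
    l2: conductivity of the same solution after boiling and cooling.
    Both readings must be in the same unit.
    """
    if l2 <= 0:
        raise ValueError("L2 must be positive")
    if l1 < 0:
        raise ValueError("L1 must be non-negative")
    return l1 / l2 * 100.0


# Hypothetical readings (same unit for both), for illustration only
rec = relative_conductivity(42.0, 120.0)
print(f"{rec:.1f}%")
```

A value near 100% would mean that nearly all electrolytes had already leaked out before boiling, which is how the index reflects membrane damage.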

Peroxidase activity (POD) was determined using a guaiacol (C7H8O2) substrate, as described in Xu et al. (2011). Frozen alfalfa plant tissue (approximately 0.2 g) was homogenized in 10 ml of a 50 mM potassium phosphate buffer containing 1% polyvinylpyrrolidone and 1 mM EDTA. The homogenate was centrifuged at 15,000 × g for 15 min at 4◦C. The supernatant was used to quantify POD activity by measuring the oxidation of guaiacol; the reaction mixture contained 8 mM C7H8O2, 50 mM potassium phosphate buffer, and 2.75 mM H2O2, with the increase of absorbance at 470 nm measured.

Superoxide dismutase activity (SOD) was determined by using nitroblue tetrazolium (NBT), as described by Giannopolitis and Ries (1977). Specifically, SOD is measured by the reaction mixture's ability to inhibit the photochemical reduction of NBT. The plant material was homogenized in a 50 mM phosphate buffer (pH 7.8) containing 100 µM EDTA, and 1% (w/v) polyvinyl pyrrolidone (PVP-40), on an ice bath. The homogenate was centrifuged at 12,000 × g for 15 min and the ensuing supernatant transferred to a mixture containing 50 mM of phosphate buffer, 130 mM of methionine, 750 µM of NBT, and 20 µM of riboflavin (pH 7.8). The absorbance at 560 nm was monitored.
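To make the inhibition principle concrete, the sketch below converts two absorbance readings at 560 nm into SOD units. It assumes the convention commonly associated with the Giannopolitis and Ries (1977) assay, in which one unit is the amount of enzyme causing 50% inhibition of NBT photoreduction; the text above does not state how units were computed, so this definition, the function name, and the example readings are assumptions.

```python
def sod_units(a560_control: float, a560_sample: float) -> float:
    """SOD units in the assayed aliquot.

    a560_control: absorbance of the illuminated mixture without enzyme extract
                  (maximum NBT reduction).
    a560_sample:  absorbance of the illuminated mixture with enzyme extract.
    Assumption: one unit corresponds to 50% inhibition of NBT reduction.
    """
    if a560_control <= 0:
        raise ValueError("control absorbance must be positive")
    inhibition = (a560_control - a560_sample) / a560_control  # fraction inhibited
    return inhibition / 0.5


# Hypothetical absorbances, for illustration only
print(sod_units(0.8, 0.4))
```

Expressing the result per gram of fresh weight or per milligram of protein would additionally require the extract volume and tissue mass, which are left out of this sketch.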

Catalase activity (CAT) was determined as earlier described by Chance and Maehly (1955). This approach used a CAT reaction solution that consisted of 100 mM phosphate buffer (pH 7.0) and 100 mM H2O2. The consumption of H2O2 was then inferred from the decrease in optical density recorded at 240 nm.

Proline content determination followed the acidic ninhydrin reagent method (Bates et al., 1973). Approximately 0.5 g of alfalfa tissue material was homogenized in 10 ml of 3% aqueous sulfosalicylic acid, and then centrifuged at 1000 × g. Two ml of the supernatant was reacted with 2 ml of ninhydrin reagent and 2 ml of glacial acetic acid in a boiling water bath for 1 h. The reaction mixture was extracted with 4 ml of toluene, after which the absorbance of the supernatant was read at 520 nm.

Soluble protein concentration was measured according to the method of Bradford (1976). Specifically, approximately 0.5 g of alfalfa plant tissue was homogenized in a phosphate buffer (pH = 7.8) and centrifuged at 6000 × g and 4 °C. Then the supernatant was mixed with the Bradford reagent, and the absorbance at 595 nm was measured. Soluble sugar content was assayed by using the anthrone reagent (Dreywood, 1946). The alfalfa plant tissue was first mixed with absolute ethanol, and this mixture was heated at 80 °C for 0.5 h and then centrifuged at 4000 × g for 10 min. Two ml of the supernatant was mixed with 5 ml of anthrone-sulfuric acid reagent and incubated in a boiling water bath for 10 min. Finally, the absorbance at 625 nm was measured.

### Plant RNA Extraction, Reverse Transcription, and qRT-PCR Analysis

Total RNA was extracted from the shoots and roots of alfalfa plants exposed to the low temperature treatment (0 °C) for 0, 2, 4, 6, and 8 h, by using the Eastep total RNA extraction kit (Promega, China). First strand cDNA was synthesized with the SuperScript II reverse transcriptase (Invitrogen, United States) and the qRT-PCR was performed using the primers listed in **Supplementary Table S1**. The β-Actin gene was used as an internal reference (Zhang et al., 2016), and relative expression levels were calculated using the standard 2<sup>−ΔΔCt</sup> algorithm (Livak and Schmittgen, 2001). The qRT-PCR was carried out on the Roche LightCycler 4800II Real-time PCR system, with a SYBR Green-based PCR assay used (ABM EvaGreen 2 × qPCR MasterMix–No Dye). Every qRT-PCR sample contained 2 µl of cDNA, 10 µl of SYBR Green, and 2 µl of primer. The qRT-PCR cycle parameters were as follows: 10 min at 95 °C, and then 40 cycles of 15 s at 95 °C and 1 min at 60 °C.
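The 2<sup>−ΔΔCt</sup> calculation of Livak and Schmittgen (2001) can be sketched as follows. The reference gene corresponds to β-Actin as stated above; the choice of calibrator sample (for example, the 0 h sample of the same group) is an assumption, since the text does not specify it, and the function name and Ct values are hypothetical.

```python
def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_calibrator: float, ct_ref_calibrator: float) -> float:
    """Fold change of a target gene by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed per sample.
    ddCt = dCt(sample) - dCt(calibrator).
    The method assumes roughly equal, near-100% amplification efficiencies
    for the target and the reference gene.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)


# Hypothetical Ct values, for illustration only:
# target Ct drops from 26 to 24 while the reference gene stays at 18
fold = relative_expression(24.0, 18.0, 26.0, 18.0)
print(fold)
```

In practice the Ct values fed into such a function would be means of technical replicates, and the fold changes would be averaged over the three biological replicates.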

#### Statistical Analyses

The experiment consisted of three nodulation treatments (AN, NN, and IN), two low temperature treatments (0 and –6 °C), and five sampling time points (0, 2, 4, 6, and 8 h of low temperature stress). One-way ANOVAs were used to determine whether there was a significant difference among treatment means (ANOVA tables are attached in the **Supplementary Material**), with pairwise mean differences compared by the least significant difference test (LSD) at an alpha level of 0.05. In this study, 54 plants (3 nodulation treatments × 3 biological replicates × 6 individuals) were used for the biomass determination. For the survival determination, we set two temperature conditions (0 and –6 °C), and under each temperature, 2250 plants (3 nodulation treatments × 3 biological replicates × 5 sampling times × 50 individuals) were used. To measure the relative electrical conductivities under the two low temperatures, 135 plants were used for each temperature (3 nodulation treatments × 3 biological replicates × 5 sampling times × 3 individuals). However, only those plants treated at 0 °C were selected to analyze the other plant physiological indices. Finally, 135 plants (3 nodulation treatments × 3 biological replicates × 5 sampling times × 3 individuals) were used for the determination of expression profiles of low temperature-related genes under 0 °C. All data were analyzed using SPSS v19.0 software (SPSS IBM, United States) and the figures drawn in GraphPad Prism 5 (San Diego, CA, United States).
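The analysis itself was run in SPSS; purely as an illustration of the logic of a one-way ANOVA followed by Fisher's LSD comparisons, a dependency-free sketch is given below. The data values are hypothetical, the function names are not from the study, and the critical t value (2.447 for 6 error degrees of freedom, two-sided alpha of 0.05) is taken from a standard t table rather than computed.

```python
from itertools import combinations
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within, ms_within) for a dict of name -> values."""
    values = [v for g in groups.values() for v in g]
    grand_mean = mean(values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups.values())
    ss_within = sum((v - mean(g)) ** 2 for g in groups.values() for v in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    ms_within = ss_within / df_within
    f_stat = (ss_between / df_between) / ms_within
    return f_stat, df_between, df_within, ms_within


def lsd_pairs(groups, ms_within, t_crit):
    """Flag each pair of groups whose mean difference exceeds Fisher's LSD."""
    flags = {}
    for a, b in combinations(groups, 2):
        n_a, n_b = len(groups[a]), len(groups[b])
        lsd = t_crit * (ms_within * (1 / n_a + 1 / n_b)) ** 0.5
        flags[(a, b)] = abs(mean(groups[a]) - mean(groups[b])) > lsd
    return flags


# Hypothetical measurements: three biological replicates per nodulation group
data = {
    "AN": [10.0, 11.0, 12.0],
    "IN": [9.5, 10.5, 11.5],
    "NN": [5.0, 6.0, 7.0],
}
f_stat, df_b, df_w, ms_w = one_way_anova(data)
flags = lsd_pairs(data, ms_w, t_crit=2.447)  # table value for df = 6, alpha = 0.05
print(f_stat, df_b, df_w, flags)
```

The F statistic still has to be compared against the F distribution with (df_between, df_within) degrees of freedom to obtain a P value; a statistics package would normally handle that step together with the critical t value.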

# RESULTS

#### Rhizobium Nodules Enhanced the Low-Temperature Tolerance of Alfalfa

To confirm that plants were acquiring the same level of nitrogen within a treatment, we measured their harvested aboveground biomass at days 60, 90, and 120 of the experiment. No significant difference in aboveground biomass was observed among the three alfalfa groups (**Supplementary Figure S2**). All three groups had 100% survival at 0 °C, but after 4 h of exposure to –6 °C the alfalfa seedlings' survival was significantly reduced (**Figure 2**). After 6 h of exposure, survival of the AN and IN groups was significantly higher than that of the NN group, with more AN than IN plants surviving the low temperature stress. After 8 h of exposure, only a portion of the AN group had survived, whereas this prolonged low temperature stress killed all plants of both the IN and NN groups.

# Effect of Rhizobium Symbiosis on Cell Membrane Damage in Alfalfa

Throughout the 8 h exposure of plants to the 0 °C low temperature, the relative electrical conductivity of the AN, IN, and NN groups did not significantly increase (**Supplementary Figure S3**). This indicated that the damage to cell integrity caused by low temperature stress at 0 °C was insufficient to cause leakage of intercellular fluid. Comparatively, under –6 °C, the electrical conductivity of the AN, IN, and NN groups was significantly elevated over time (**Supplementary Figure S3**), yet no significant difference was observed among them at any time point within this temperature treatment. More importantly, the observed higher relative electrical conductivity under –6 °C indicated a greater level of internal disorder, and it exceeded the determination limits (**Supplementary Figure S3**). Therefore, the other physiological indicators were tested only in the 0 °C-treated plants.

Under 0 °C, the MDA concentration in the shoots of all three groups of alfalfa increased (**Figure 3A**), but the AN group apparently impeded the generation of MDA in its shoots more efficiently than the NN group did. In the roots, the concentrations of MDA in all plants were elevated by the low temperature stress at 0 °C (**Figure 3B**). However, the MDA concentration in the NN group significantly exceeded that in the AN and IN groups by 1.90 times and 2.16 times, respectively.

#### Effect of Rhizobium Symbiosis on Alfalfa's Antioxidant Defenses

POD activity in rhizobium-treated or non-treated alfalfa shoots did not show the same patterns under 0 °C. The IN group had significantly higher POD activity than AN or NN at 2 and 6 h of exposure (**Figure 4A**); however, all three groups' POD activities were similar at 8 h. In the roots, POD activity of the AN group was significantly higher than that of NN or IN at 0, 2, and 8 h (**Figure 4B**). As the low temperature treatment continued, the SOD activity in the shoots of the AN and NN groups was maintained at a constant high level (**Figure 4C**). SOD activity was significantly higher in the AN group's roots than in those of the NN or IN groups (**Figure 4D**). In the shoots, CAT activity in the AN group gradually increased under 0 °C, whereas in the NN group it initially increased faster but decreased under prolonged exposure, leaving AN with the highest CAT activity after 8 h. The IN group had the highest CAT activity at 0 h but this declined rapidly soon afterward, so that it was on par with the NN group at 8 h (**Figure 4E**). By contrast, in the roots, no significant difference in CAT activity was observed among the AN, IN, and NN groups after 8 h (**Figure 4F**).

# Effect of Rhizobium Symbiosis on Soluble Substances in Alfalfa

means ± SE, n = 3. Different letters indicate a significant difference between means (P < 0.05).

The proline content of shoots in the NN group continually decreased under a low temperature treatment of 0 °C (**Figure 5A**), whereas rhizobium-inoculated alfalfa plants were capable of sustaining a constant proline level in their shoots in response to the low temperature. In their roots, the AN group accumulated proline through 8 h of low temperature exposure (**Figure 5B**). As shown in **Figures 5C,D**, active nodules in the AN group assisted protein accumulation in the host plant cells, resulting in a greater level of soluble proteins in both shoots and roots compared with the NN and IN groups. Under prolonged low temperature stress at 0 °C, the soluble sugar contents of shoots in the three nodulation groups presented similar increasing trends (**Figure 5E**). However, compared with NN and IN, the roots of the AN group accumulated more soluble sugar during the whole 8 h period (**Figure 5F**).

#### Effect of Rhizobium Symbiosis on Low Temperature-Regulated Genes

The expression of several low temperature-related genes was also profiled to determine how their differences were influenced by rhizobium symbiosis in alfalfa. In most plants, CBF2 functions as a cold-response transcriptional regulator. Despite differing in their rhizobium nodulation, all plant groups expressed the CBF2 gene when subjected to the 0 °C low temperature treatment. Nevertheless, the AN group had the highest transcript level for the CBF2 gene in roots from 4 to 6 h, and after 8 h its transcription was significantly higher in the AN and IN groups than in plants without nodulation. In the shoots, the AN group displayed a higher transcript level of the CBF2 gene earlier on, during 2–4 h of exposure to low temperature; yet after 6 h, its expression did not significantly differ among the three groups (**Figures 6A,B**).

Cas is a cold acclimation-specific protein that allows the plant to maintain functioning under a low temperature. In roots, rhizobium nodulation may help the host plant to maintain a higher level of Cas15a gene expression under a low temperature treatment. During the interval of 2–4 h, significantly higher Cas15a gene expression was observed in the AN group compared with the IN and NN groups; hence, activated rhizobium symbiosis could have helped host alfalfa plants respond to low temperature stress more rapidly. In the shoots, however, no significant difference was detected among the three groups during the first 4 h of low temperature, but later on, during the 6–8 h interval, Cas15a gene expression in plants with AN was greater than in those with IN or NN (**Figures 6C,D**).

As **Figures 6E,F** show, ProDH gene expression in the shoots of the AN group was sustained at a constant level for the entire time period of the experiment, falling below that of the IN and NN groups after 6 h of exposure to the 0 °C low temperature. Likewise, in the roots the expression of the ProDH gene remained constant and similar among the three groups from 0 to 6 h, but at 8 h the AN group had a lower transcript level of the ProDH gene.

The CorF gene contributes to soluble sugar accumulation. As shown in **Figures 7A,B**, nodulation in alfalfa significantly increased the expression of this gene in the shoots after 6 h of the 0 °C low temperature treatment. In the roots, the AN group showed significantly higher CorF levels than the other two groups at 2, 6, and 8 h.

The expression levels of SOD and CAT are indicative of the antioxidative abilities of plants. The transcript level of SOD gradually increased in the shoots of the three groups during the first 4 h of the 0 °C low temperature treatment. At 6 h, both the AN and IN groups had significantly higher SOD levels than the NN group. Moreover, an activated rhizobium induced a greater level of SOD in the AN group at 8 h. In roots, only the AN group showed higher SOD levels at 2, 6, and 8 h (**Figures 7C,D**).

When the low temperature treatment began, the IN group displayed a greater level of CAT than AN or NN did in the shoots of alfalfa. This difference, however, disappeared during the interval of 2–6 h. At 8 h the AN group had the highest level of CAT in the shoots. In the roots, although no such difference in CAT was observed in the first 2 h, at 4 h it was significantly greater in the IN group than in AN or NN, becoming highest in the AN group after 6 and 8 h (**Figures 7E,F**).

### DISCUSSION

Low temperature is a major abiotic factor that limits the growth, development, survival, and productivity of plants (Zhou et al., 2018). Accumulating evidence shows that rhizobium symbiosis with plants plays an important role in their various abiotic stress tolerance mechanisms (Larrainzar and Wienkoop, 2017). In this experimental study, we demonstrated that rhizobium symbiosis could improve low-temperature tolerance in alfalfa. Following the induction of rhizobium nodules in the roots, the activity of antioxidant enzymes, osmotic adjustment, and the expression of low temperature-related genes of host plants were all significantly altered, consequently attenuating the oxidative stress, which led to higher survival under low temperature conditions.

Altered environmental factors may significantly change how symbionts behave when host plants must adjust to new circumstances (Paracer and Ahmadjian, 2000). In this study, two nutrient solutions were used to irrigate the alfalfa plants inoculated with or without rhizobia (**Supplementary Figure S1**). One nutrient solution provided plants with inorganic nitrogen, while another contained no nitrogen supply. Without nitrogen intake from nutrient solution, the symbiotic rhizobia were activated, forming the characteristic pink AN (AN group), which provide organic nitrogen in the form of amino acids to the host plant (Silvente et al., 2002). In contrast, plants irrigated with the nitrogen-containing nutrient solution could assimilate the inorganic nitrogen directly from the nutrient solution (Taule et al., 2012), leaving symbiotic rhizobia inactivated, thus generating the white IN (IN group). Compared with IN, more plants of the AN group survived under low temperature stress and exhibited stronger physiological responses, suggesting that an activated rhizobium nodule converted more nutrients for the host alfalfa to mitigate against the low temperature environment.

Both IN and NN groups were irrigated with the same total nitrogen nutrient solution. Although the nodules of the rhizobia-inoculated IN plants were left inactive, a higher proportion of IN than NN plants survived under low temperature stress. This result indicates that rhizobia symbiosis, even with inactive nodules, may have contributed to the low-temperature tolerance of host plants. This may be because rhizobia can enhance the systemic resistance of their host plants by inducing the expression of many defensive genes in different legume species, like Stylosanthes (Stylosanthes guianensis cv. Reyan II) and peanut (Arachis hypogaea L.) (Dong et al., 2017; Furlan et al., 2017).

Alfalfa, moreover, is an autotetraploid legume crop (Brouwer et al., 2000). As such, there may be much natural variation in growth and physiological parameters among individual plants of the same cultivar. Therefore, for our experiment, a great many plants were needed to carry out a robust statistical analysis, with a total of 4500 alfalfa individuals (2250 plants for each temperature × 2 temperatures) used to quantify survival.

To explore the mechanisms underpinning plant responses to low temperature stress, we investigated key physiological variables responsible for low-temperature tolerance. Zhang et al. (2011) had determined the survivorship of Medicago truncatula cv. Jemalong A17 and Medicago falcata cv. Humeng at −10 °C for 5 h. By contrast, in our study we ensured that aboveground and belowground plant tissues experienced the same low temperature. Under –6 °C, alfalfa with AN survived best, which indicated the rhizobium interaction improved this plant's low-temperature tolerance.

It has been reported that low temperatures induce cell membrane damage (Aghdam et al., 2019), and MDA is widely adopted as an indicator of oxidative stress and membrane integrity in plants when they respond to stressors (Sato et al., 2011). In our study, under a low temperature, compared with both IN and NN groups, the AN group clearly produced less MDA in both their shoot and root parts (**Figure 3**). Recently, a similar difference in the accumulation of MDA was reported between two genotypes of bermudagrass [Cynodon dactylon (L.) Pers.] that differed in their tolerance to low temperature stress (Huang et al., 2017). Higher plants have developed many complex strategies to respond to low temperature-induced stress, and research has suggested that plants exhibit higher POD and SOD activities under low temperature stress and thereby incur less oxidative stress (Radwan et al., 2010). Furthermore, enhanced POD activity can indicate a higher capacity for the decomposition of H2O2 that is generated by SOD (Wu et al., 2014). Our results revealed that among the three alfalfa groups, POD (in roots) and SOD activities were greatest in the AN group after 8 h at 0 °C (**Figures 4B–D**), and we also found the IN group's POD activity in shoots exceeded that of NN (**Figure 4A**), which together suggest that POD accumulation is a key factor promoting alfalfa's survival under low temperature. Furthermore, the findings point to an inactivated nodule facilitating higher cold tolerance for host plants. This may indicate that the rhizosphere composition was changed after the rhizobium inoculation, so that the stress tolerance of the plant was increased. Concurrently, there was markedly higher SOD activity in the roots with AN than in those with IN or NN (**Figure 4D**), indicating that alfalfa plants with functional nodules had a stronger tolerance to low temperature. Further, the lower oxidative stress experienced by the AN group may be due to rhizobium symbiosis stimulating host plants to produce additional antioxidants. Our results are consistent with those of Zhang R.X. et al. (2017) and Kakar et al. (2016), who found that cold-resistant plants experienced less oxidative stress and were capable of higher antioxidant enzyme activity. It is known that antioxidant metabolism can protect cells from oxidative damage caused by ROS, and that CAT can decrease oxidative damage (Meng et al., 2017). The lower MDA accumulation in the AN group may explain, in part, the milder oxidative damage in this group, which also had higher CAT activity. The greater up-regulation of SOD and CAT biosynthesis genes in the AN group corroborates the higher activity of both antioxidant enzymes we found. Our study is in line with the view of Mutlu et al. (2013), who pointed out that cold-tolerant plant cultivars should have higher CAT activity when faced with cold stress. To sum up, the greater MDA accumulation and lower antioxidant enzyme activity in the IN and NN groups indicated that those plants suffered more severe oxidative damage under low temperature stress. However, with the aid of its rhizobia symbionts the AN group incurred less damage.

Through a variety of metabolic pathways, the plant cell releases soluble organic matter or compounds to reduce its water potential and to adjust itself to the surrounding stressful environment (Cunningham et al., 2003). We also determined the accumulation of soluble substances such as proline, soluble sugar, and soluble protein in alfalfa plants. High-level accumulation of soluble substances should enhance a plant's freezing tolerance (Cao et al., 2012), and proline can also function as a protein-compatible hydrotrope (Srinivas and Balasubramanian, 1995), which may positively affect soluble proteins and promote their accumulation. It was reported that cold treatment could increase accumulation of soluble protein (Castonguay et al., 1995). Bao et al. (2017) pointed out that an increased protein content of alfalfa (M. sativa cv. Dongmu) prevents damage from cold stress, and soluble sugar accumulation was associated with enhanced freezing tolerance in curly kale (Brassica oleracea L. var acephala; Steindal et al., 2015). Compared with the IN group, the AN group can obtain more soluble substances from the active rhizobia symbiotic nodules to counteract an environmental stress (Erdal, 2012). After 8 h of low temperature stress, the AN group had accumulated more proline than the IN and NN groups, and similar results were found for soluble protein and sugar. This clearly shows that the plant-nodule interaction is vital for improving the low-temperature tolerance of alfalfa.

Numerous molecular processes were also altered when the alfalfa plants faced low temperature stress. According to Ito et al. (2006), the CBF gene could be critical for plants' cold tolerance, and its overexpression can improve plant cold tolerance. Furthermore, that study pointed out that the accumulation of proline and soluble sugar in rice under abiotic stress could have been due to the overexpression of its CBF gene (Ito et al., 2006). Similarly, our results showed that the AN group, which expressed the CBF gene more strongly, also accumulated more proline and soluble sugar than the IN and NN groups. Upregulated expression of the Cas gene in AN may have led to more dehydrins, which protected the structure of cells and maintained the stability of intracellular proteins as well as the activity of intracellular macromolecules, thereby enhancing the low-temperature tolerance of host plants (Monroy et al., 1993; Pennycooke et al., 2008; Hara, 2010). In addition, Cas may regulate the process of proline synthesis as well as sugar content (Zuther et al., 2015). Under low temperature, elevated proline may protect the stressed plant from dehydration and stabilize its subcellular structure and, more importantly, scavenge free radicals (Ashraf and Foolad, 2007). In short, proline accumulation fosters the low-temperature tolerance of host plants. Nevertheless, proline dehydrogenase (ProDH), a key enzyme in proline degradation, constitutively consumes extra proline in plant cells. In our study, alfalfa plants subjected to a low temperature treatment responded with more ProDH gene expression, which led to a reduced proline concentration; however, rhizobium inoculation was able to change this gene's expression profile. During the low temperature treatment, transcription of the ProDH gene was sustained at a constant level in shoots of the AN group, and even decreased in its roots (**Figures 5A,B**, **6E,F**). It has been suggested that ProDH is expressed in all plant tissues in model plants (Liu et al., 2012); thus, the effect of rhizobium on host plants may first act in the roots and then be transferred to the aboveground parts. Moreover, CorF encodes a key enzyme in the formation of raffinose family oligosaccharides (RFOs) (Pembleton and Sathish, 2014). Work by Cunningham et al. (2003) indicated that RFO synthesis strengthens the overwintering ability of plants. In our study, compared with IN and NN, the AN group had higher transcript levels of the CorF gene. This may partially explain why shoots, as well as roots, of the AN group also showed a higher soluble sugar content. We speculate that the changed gene expression induced by low temperature stress may be regulated by the symbiotic relationship between rhizobia and their host plants.

In a nutshell, based on our study's results, we conclude that rhizobium inoculation effectively protected the alfalfa's membrane system and assisted host plants to accumulate more proline, soluble protein, and soluble sugar; induced cold stress-related genes to counter the low temperature stress; and activated the interaction between nodules and alfalfa plants, which together provided a better protective effect against low temperature stress.

### AUTHOR CONTRIBUTIONS

T-MH and P-ZY conceived and designed the project. Y-SL performed the experiments, analyzed the data, and wrote the manuscript. J-CG, X-YS, and Y-XZ performed the experiments. All authors contributed to the manuscript revision, and read and approved the submitted version.

#### FUNDING

This work was supported by grants from the Project of National Natural Science Foundation of China (31572456, 31772660) and the Technical System of National Forage Industry (CARS-34).

#### ACKNOWLEDGMENTS

We thank Dr. Yajun Wu from South Dakota State University for his technical advice on this manuscript, as well as the reviewers for their thoughtful critique and suggestions.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00538/full#supplementary-material

# REFERENCES



contribution to biological nitrogen fixation. Plant Sci. 263, 12–22. doi: 10.1016/j.plantsci.2017.06.009


falcata cold-acclimation-specific genes. Plant Physiol. 146, 1242–1254. doi: 10.1104/pp.107.108779


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Liu, Geng, Sha, Zhao, Hu and Yang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Determination of the Key Resistance Gene Analogs Involved in Ascochyta rabiei Recognition in Chickpea

Ziwei Zhou, Ido Bar, Prabhakaran Thanjavur Sambasivam and Rebecca Ford\*

Environmental Futures Research Institute, School of Environment and Science, Griffith University, Nathan, QLD, Australia

#### Edited by:

Jose C. Jimenez-Lopez, Consejo Superior de Investigaciones Científicas (CSIC) Granada, Spain

#### Reviewed by:

Maryam Nasr Esfahani, Lorestan University, Iran; Tom Warkentin, University of Saskatchewan, Canada; Eva Madrid, Max Planck Institute for Plant Breeding Research, Germany

\*Correspondence: Rebecca Ford, Rebecca.ford@griffith.edu.au

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 14 December 2018; Accepted: 29 April 2019; Published: 17 May 2019

#### Citation:

Zhou Z, Bar I, Sambasivam PT and Ford R (2019) Determination of the Key Resistance Gene Analogs Involved in Ascochyta rabiei Recognition in Chickpea. Front. Plant Sci. 10:644. doi: 10.3389/fpls.2019.00644

Chickpea (Cicer arietinum L.) is an important cool season food legume; however, its production is severely constrained by the foliar disease Ascochyta blight caused by the fungus Ascochyta rabiei (syn. Phoma rabiei). Several disease management options have been developed to control the pathogen, including breeding for host plant resistance. However, the pathogen population is evolving to produce more aggressive isolates. For host resistance to be effective, the plant must quickly recognize the pathogen and instigate initial defense mechanisms, optimally at the point of contact. Given that the most resistant host genotypes display rapid pathogen recognition and response, the approach taken was to assess the type, speed and pattern of recognition via Resistance Gene Analog (RGA) transcription among resistant and susceptible cultivated chickpea varieties. RGAs are key factors in the recognition of plant pathogens and the signaling of inducible defenses. In this study, a suite of RGA loci was chosen for further investigation from both published literature and newly mined homologous sequences within the National Center for Biotechnology Information (NCBI) database. Following their validation in the chickpea genome, 10 target RGAs were selected for differential expression analysis in response to A. rabiei infection. This was performed in a set of four chickpea varieties including two resistant cultivars (ICC3996 and PBA Seamer), one moderately resistant cultivar (PBA HatTrick) and one susceptible cultivar (Kyabra). Gene expression at each RGA locus was assessed via qPCR at 2, 6, and 24 h after A. rabiei inoculation with a previously characterized highly aggressive isolate.
As a result, all loci were differentially transcribed in response to pathogen infection in at least one genotype and at least one time point after inoculation. Among these, the differential expression of four RGAs was significant and consistently increased in the most resistant genotype ICC3996 immediately following inoculation, when spore germination began and ahead of penetration into the plant's epidermal tissues. Further in silico analyses indicated that the differentially transcribed RGAs function through ADP-binding within the pathogen recognition pathway. These represent clear targets for future functional validation and potential for selective resistance breeding for introgression into elite cultivars.

Keywords: Resistance Gene Analogs, ascochyta blight, chickpea, host resistance, expression profiling

# INTRODUCTION


Chickpea (Cicer arietinum L.) is a staple cool season food legume, important in the Indian sub-continent, West Asia, and North Africa, and grown as a high-return cash export crop in Australia and North America (Du et al., 2012). However, production is seriously constrained by the fungal disease Ascochyta blight, which is the most frequent and devastating disease of chickpea crops worldwide (Sagi et al., 2017). The fungus Ascochyta rabiei (syn. Phoma rabiei) can infect all above-ground parts of the plant at any growth stage (Sharma and Ghosh, 2016).

Australia is the second largest global producer and exporter of chickpea (ABARE report from February 2, 2016), while India is the largest chickpea producer, whose production dwarfs that of all other countries. The first recorded A. rabiei epidemic in Australia occurred in 1998 (Du et al., 2012). With growing market demand and cash return, production in northern New South Wales and southern and central Queensland has recently increased. This has led to significantly increased risk from A. rabiei due to complacency in disease management best practice from novice growers and the potential for wetter winters than in southern growing regions. During the 2012–2014 seasons, the high rainfall in these northern regions led to widespread A. rabiei epidemics; and highly aggressive clonal isolates destroyed crops of the most resistant cultivars despite repeated fungicide applications (Moore et al., 2015b). Despite the presence of the teleomorph elsewhere, the Australian population is asexual, reliant on mutational events for favorable selection and potential adaptation (Leo et al., 2015). The emergence of growing numbers of highly aggressive isolates across the growing regions indicated sufficient genetic diversity within the clonal population to select for ability to overcome the fungicides and host resistance genes employed (Mehmood et al., 2017).

Since there appears to be a growing potential for A. rabiei to evolve new pathotypes with high aggressiveness (Mehmood et al., 2017), it is important for breeders to be able to select germplasm with the best and most stable resistance. This may in part be informed by understanding the functional pathogen recognition mechanisms, in which RGAs play a key role. RGAs are responsible for the onward signaling and activation of plant defense responses and have been shown to be involved in many plant pathosystems (Grant et al., 1998).

Resistance Gene Analogs (RGAs) are a large gene family with conserved domains and structural features that enable classification into either nucleotide binding site leucine rich repeat (NBS-LRR) or transmembrane leucine rich repeat (TM-LRR) sub-families. They function mainly as intracellular receptors that perceive the presence of pathogen effectors, either by directly binding the pathogen effector proteins or by monitoring the modification of host proteins after association with the pathogen, and then activate multiple defense signal transduction pathways that restrict pathogen growth (Sagi et al., 2017). Emerging evidence indicates that an intermediate vesicle-type exosomal body is involved in delivering the molecules that initiate chickpea defense signaling against necrotrophic fungi (Boevink, 2017). In the chickpea – A. rabiei pathosystem, RGAs are predicted to recognize the fungus and then induce signaling of defense molecules previously identified by Coram and Pang (2006), leading to resistance in several commonly grown chickpea cultivars (e.g., PBA Seamer).

Subsequent plant defense responses are complex and diverse at the genomic level; the expression of transcription factors and protein kinases, as well as the increase in cytosolic calcium, are all involved in defense signaling (Grant and Mansfield, 1999). Moreover, the speed and coordination of the host's perception of the pathogen, signal transduction and transcriptional activation are also vital to successful defense. In the study by Coram and Pang (2006), 13.6% of chickpea complementary DNAs (cDNAs) evaluated by microarray were differentially expressed in response to A. rabiei. Further, the kinetics of differential expression after inoculation with A. rabiei highlighted the differential timing of pathogen recognition and subsequent transcriptional changes associated with the A. rabiei defense response (Coram and Pang, 2005a,b; Leo et al., 2016).

Although the earlier studies identified some key defense-related mechanisms, the underlying pathogen recognition factors were not elucidated. In addition, the defense of chickpea against ascochyta blight is multigenic and governed by resistance-quantitative trait loci (R-QTL), with many QTLs for A. rabiei resistance identified on multiple linkage groups (Santra et al., 2000; Leo et al., 2016; Sagi et al., 2017). According to Sagi et al. (2017), 121 NBS-LRR genes are associated with R-QTL for A. rabiei. Subsequent assessment of their expression levels at 12, 24, 48, and 72 hpi revealed several RGAs that are deemed functional in early pathogen recognition. However, together with those previously identified by Leo et al. (2015), they represent only a subset of the possible recognition factors, and their activities at earlier and crucial time points are still unknown. Characterization and functional assessment of a wider range of RGAs at the "pre-penetration" and "during penetration" stages will provide essential information for future targeted breeding of varieties able to quickly recognize and respond to this devastating pathogen. Therefore, the aims of this study were to: (1) Identify RGA candidates present in the chickpea genome through published literature searches and sequence analyses; (2) Validate the presence of RGA candidates within key resistant chickpea genotypes; (3) Assess the putative function of the RGA candidates via transcription in response to an aggressive isolate of A. rabiei at biologically important early interaction stages; and (4) Further characterize the putative function of the most responsive RGA candidates through predictive in silico analyses.

# MATERIALS AND METHODS

#### Target RGA Loci and Development of PCR Markers

Five sequences, representative of three RGA classes which were previously characterized and considered putatively functional in resistance to fusarium wilt, rust, and ascochyta blight (Palomino et al., 2009), were initially chosen for further assessment. These included RGAs of class 01, previously detected in faba bean, and RGAs of classes 02 and 03, previously detected in chickpea (Palomino et al., 2009). Additionally, four chickpea NBS-LRR RGA loci were chosen from Leo et al. (2016). Finally, three RGA sequences, reported to be upregulated in response to A. rabiei, were chosen from Sagi et al. (2017). Simultaneously, thirteen RGA sequences were sought from chickpea sequences deposited in the NCBI database<sup>1</sup>. The 13 sequences were chosen because they represented the breadth of the RGA families and were unambiguously identifiable from the existing database. Putative RGAs were sought and assigned using known motifs for specific RGA classes (NBS-LRR family) with a within-class identity threshold of 99%; the motif information was taken from Sekhwal et al. (2015).

PCR primers flanking the selected RGA loci were designed using Primer3web (version 4.0.0<sup>2</sup>) with the following criteria: melting temperature (Tm) of 59 ± 3 °C, PCR amplicon size of 150–300 base pairs (bp), primer length of 18–23 nucleotides and GC content of 40–60%. Primers were synthesized by SIGMA-ALDRICH.
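The design criteria above can be expressed as a simple filter. The following Python sketch is illustrative only and is not part of the study's pipeline: the function names and the example primer sequence are hypothetical, and the melting temperature is assumed to be supplied by the design tool rather than recalculated here.

```python
def gc_content(primer: str) -> float:
    """Return the GC content of a primer as a percentage."""
    primer = primer.upper()
    return 100.0 * (primer.count("G") + primer.count("C")) / len(primer)


def passes_design_criteria(primer: str, tm_celsius: float, amplicon_bp: int) -> bool:
    """Check a primer against the stated criteria.

    Tm 59 +/- 3 degrees C, amplicon 150-300 bp, length 18-23 nt, GC 40-60%.
    tm_celsius is assumed to come from the primer design software.
    """
    return (
        18 <= len(primer) <= 23
        and 40.0 <= gc_content(primer) <= 60.0
        and 56.0 <= tm_celsius <= 62.0
        and 150 <= amplicon_bp <= 300
    )


# Hypothetical 20-nt primer, used only to exercise the filter.
example_primer = "ATGCGTACCTGAGGATCCAT"
print(gc_content(example_primer))
print(passes_design_criteria(example_primer, tm_celsius=59.5, amplicon_bp=200))
```

A filter of this kind is only a sanity check on the final primer list; the thermodynamic Tm model itself is left to the design software.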

#### Plant Material and Fungal Isolates

Four chickpea genotypes with known differential disease reactions to A. rabiei were used: ICC3996, PBA Seamer, PBA HatTrick, and Kyabra (**Table 1**). It is worth noting that even the resistant varieties evaluated show substantial disease symptoms under many typical field epidemic situations. Seed was obtained from the National Chickpea Breeding Program, Tamworth, NSW, Australia. Seedlings were grown in 15 cm diameter pots containing commercial grade potting mix (Richgro premium mix), with five seeds per pot/replication (six replicates per host genotype and isolate). Plants were grown in a controlled growing environment (CGE) maintained at 22 ± 1 °C with a 16/8 h (light/dark) photoperiod for 14 days until inoculation. The A. rabiei isolate FT13092-1 used in this experiment was collected in 2013 from Kingsford, South Australia (by Dr. Jenny Davidson of the South Australian Research and Development Institute). Isolate FT13092-1 is highly aggressive on PBA HatTrick and Kyabra, and moderately aggressive on ICC3996 (Grains Research and Development Corporation annual report for project #UM00052; R. Ford pers. comm.). The single-spored isolate was cultured on V8 juice agar and maintained in the incubator for 14 days at 22 ± 2 °C with a 12/12 h near-UV light irradiation (350–400 nm)/dark photoperiod.

<sup>1</sup>https://www.ncbi.nlm.nih.gov

<sup>2</sup>http://bioinfo.ut.ee/primer3-0.4.0/

#### Preparation of Inoculum and Bioassay

Inoculum was prepared by adding 10 mL of sterile distilled water to the cultured plates and scraping the pycnidia with a sterile bent glass rod to release pycnidiospores. The spore suspension was then filtered through muslin cloth and the final spore concentration was adjusted to 10<sup>5</sup> spores·mL<sup>−1</sup>. Since three replications are sufficient to show significant consistency, three replicates (three pots) of 14-day-old seedlings were sprayed using an air-pressured hand-held sprayer with a fine mist of prepared inoculum until run-off and labeled as treated groups. Another three replicates were sprayed with sterile water and labeled as untreated groups. Tween 20 (0.02% v/v) was added to the inoculum and water as a surfactant. All plants were covered with inverted plastic cups immediately after inoculation, according to the minidome technique (Chen et al., 2005), to ensure maximum humidity and darkness and so induce optimum spore germination (Sambasivam et al., 2017), and were maintained in a CGE at 22 ± 1 °C. The main stems and young leaf tissues from treated and untreated groups were collected at 2, 6, and 24 hpi into 25 mL Falcon tubes, snap frozen in liquid N<sub>2</sub>, and stored at −80 °C until processing. Following collection of foliar tissue for transcript analyses at each of the time points from individual plants, the remaining plant was left under the bioassay conditions to develop disease symptoms, confirming that a viable infection had occurred.
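Adjusting a spore suspension to the target concentration is a standard C1·V1 = C2·V2 dilution. The Python sketch below illustrates the arithmetic; the stock concentration and final volume are hypothetical values chosen for the example and are not reported in this study.

```python
def dilution_volumes(stock_conc: float, target_conc: float, final_volume_ml: float):
    """Return (stock_ml, diluent_ml) needed to reach target_conc.

    Uses C1 * V1 = C2 * V2; concentrations are in spores per mL.
    """
    if stock_conc < target_conc:
        raise ValueError("stock is more dilute than the target concentration")
    stock_ml = target_conc * final_volume_ml / stock_conc
    return stock_ml, final_volume_ml - stock_ml


# Hypothetical stock count of 2.5e6 spores/mL, diluted to 1e5 spores/mL in 50 mL.
stock_ml, water_ml = dilution_volumes(2.5e6, 1e5, 50.0)
print(f"{stock_ml:.1f} mL stock + {water_ml:.1f} mL sterile water")
```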

#### RNA Extraction, cDNA Preparation, and Differential Expression via RT-qPCR

TABLE 1 | Chickpea genotypes and disease ratings to A. rabiei in Australia.

RNA was extracted from the leaf and stem tissues of inoculated and uninoculated samples using a NucleoSpin® RNA Plant kit (Macherey-Nagel, Germany) according to the manufacturer's instructions. The RNA sample purity was assessed by reading the OD<sub>260</sub>/OD<sub>280</sub> absorption ratio using a NanoDrop spectrophotometer (ND-1000). Total RNA (1 µg) of each sample was used for genomic DNA (gDNA) elimination and reverse transcription using a PrimeScript™ RT reagent Kit with gDNA Eraser (Perfect Real Time; Takara Bio, United States). The quality of cDNA and absence of gDNA were evaluated through PCR by using the primer pair used to amplify the chickpea reference gene (CAC) from Reddy et al. (2016), which produced an amplicon that spanned intron-exon boundaries. The expected amplification product size was 110 bp and this was validated by electrophoresis. The cDNA samples were then diluted (1:50) with DNase/RNase free water for RT-qPCR. Each primer pair was assessed for PCR amplification on gDNA and cDNA samples. In addition, three reference genes (ABCT, UCP, and CAC) were selected from Reddy et al. (2016) and used as Inter-Run Calibrators (IRC), since they were previously shown to be stably expressed across many chickpea varieties. All primer sequences designed are listed in **Supplementary Figure 1** and **Supplementary Table 1**. The PCR efficiency of each primer pair was evaluated by using serially diluted cDNA samples (10<sup>0</sup>, 10<sup>−1</sup>, 10−10, 10−100, 10−1000). Bio-Rad CFX Manager 3.1 software (Bio-Rad, CA, United States) and a custom R script were used to calculate the correlation coefficient (R<sup>2</sup>), slope value, and PCR amplification efficiency (E) of each primer pair combination.
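For readers unfamiliar with the standard-curve calculation, the Python sketch below shows how the slope, R<sup>2</sup> and amplification efficiency are commonly derived from a dilution series: Cq is regressed on log<sub>10</sub> of the relative template amount, and the per-cycle amplification factor is taken as E = 10<sup>−1/slope</sup> (E = 2 corresponds to 100% efficiency). This is an illustration and not the study's R script; the ten-fold series and Cq values are invented for the example.

```python
def standard_curve(log10_amounts, cq_values):
    """Least-squares fit of Cq against log10(template amount).

    Returns (slope, r_squared, amplification_factor), where
    amplification_factor = 10 ** (-1 / slope).
    """
    n = len(log10_amounts)
    mean_x = sum(log10_amounts) / n
    mean_y = sum(cq_values) / n
    sxx = sum((x - mean_x) ** 2 for x in log10_amounts)
    syy = sum((y - mean_y) ** 2 for y in cq_values)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log10_amounts, cq_values))
    slope = sxy / sxx
    r_squared = sxy ** 2 / (sxx * syy)
    amplification_factor = 10 ** (-1.0 / slope)
    return slope, r_squared, amplification_factor


# Hypothetical ten-fold dilution series (log10 of relative amount) and Cq values.
log10_amounts = [0, -1, -2, -3, -4]
cq_values = [20.00, 23.32, 26.64, 29.96, 33.28]

slope, r2, e = standard_curve(log10_amounts, cq_values)
print(f"slope={slope:.2f}, R2={r2:.4f}, E={e:.3f}, efficiency={(e - 1) * 100:.1f}%")
```

A slope close to −3.32 indicates near-perfect doubling per cycle; primer pairs with a poor linear fit or an efficiency far from 100% would typically be excluded.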

A SYBR® Premix Ex Taq™ II (Tli RNaseH Plus) kit was used for assessing target gene expression using optical 96 well plates on a BIO-RAD CFX96 real-time PCR detection system (Bio-Rad laboratories) and reactions were prepared according to the manufacturer's instructions. The PCR reactions were performed in a total volume of 25 µL containing 12.5 µL of 2x SYBR® Premix Ex Taq™ II (Tli RNaseH Plus), 0.4 µM of each primer, and 2 µL of diluted cDNA template. The reaction conditions were set as 30 s at 95 °C (initial denaturation), followed by 40 cycles of 95 °C for 5 s and 60 °C for 30 s (fluorescence reading), and then a melt curve analysis at 65–95 °C in 0.5 °C increments of 10 s each. All reactions were carried out in technical duplicates. If variations between duplicates were significant, a triplicate was performed, and the two closest data points were taken. IRC were used in every single plate, because all samples in this experiment could not be analyzed in the same run. A Non Template Control (NTC) was included for each primer combination, to detect any potential contamination from gDNA and/or primer dimer (Leo et al., 2016).
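The rule of keeping the two closest points from a triplicate can be written down in a few lines. The Python sketch below is one possible reading of that rule, with invented Cq values; how "significant" variation between duplicates was judged is not specified here, so no threshold is encoded.

```python
from itertools import combinations


def two_closest(cq_triplicate):
    """Return the pair of Cq values with the smallest absolute difference."""
    return min(combinations(cq_triplicate, 2), key=lambda pair: abs(pair[0] - pair[1]))


def technical_mean(cq_triplicate):
    """Mean Cq of the two closest technical replicates."""
    a, b = two_closest(cq_triplicate)
    return (a + b) / 2.0


# Hypothetical triplicate in which one well is an outlier.
print(two_closest([24.1, 24.3, 25.6]))
print(round(technical_mean([24.1, 24.3, 25.6]), 2))
```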

#### RT-qPCR Data Analysis

Cq data of all RGA that were differentially expressed between chickpea genotypes and treatments were imported into LinRegPCR software version 2017.1 (Ruijter et al., 2015) for further analyses. Samples that did not amplify or produced a low, high or inconsistent Cq value (under 5 or over 40 cycles) were removed. The raw Cq values of the expression of each RGA locus were then corrected according to their respective PCR efficiencies, and the mean values of the biological triplicates were calculated. The Delta-Delta-Cq (ddCq) algorithm was used to determine relative and differential expressions among varieties and treatments (Pfaffl, 2001). An R script was then used to generate the differential expression plots of each RGA locus. Relative expression data (ddCq) above 0 meant that the RGA gene at this time point/genotype was up-regulated in the treated compared to the control group, whereas negative ddCq indicated that the RGA gene was down-regulated at that point.
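As a worked illustration of the efficiency-corrected calculation, the Python sketch below follows the general form of the Pfaffl (2001) ratio, in which the target and reference Cq differences (control minus treated) are each raised to the power of their own amplification factor. It is a simplified sketch and not the study's R script: it uses a single reference gene, it assumes the plotted values are on a log<sub>2</sub> scale so that values above 0 indicate up-regulation, and all Cq values are invented.

```python
import math


def log2_relative_expression(
    cq_target_control: float,
    cq_target_treated: float,
    cq_ref_control: float,
    cq_ref_treated: float,
    e_target: float = 2.0,
    e_ref: float = 2.0,
) -> float:
    """Efficiency-corrected relative expression on a log2 scale.

    ratio = E_target ** (Cq_control - Cq_treated)_target
            / E_ref ** (Cq_control - Cq_treated)_ref
    Positive values mean up-regulation in the treated sample.
    """
    target_term = e_target ** (cq_target_control - cq_target_treated)
    ref_term = e_ref ** (cq_ref_control - cq_ref_treated)
    return math.log2(target_term / ref_term)


# Hypothetical example: the target amplifies two cycles earlier after inoculation
# while the reference gene is unchanged, i.e. a four-fold increase (log2 = 2).
print(log2_relative_expression(28.0, 26.0, 20.0, 20.0))
```

With several reference genes, as used in this study, a common extension is to replace the single reference term with the geometric mean of the per-gene terms.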

A heatmap was constructed and displayed using R software based on the calculated mean fold-change in expression values among genotypes and time-points after normalization with the reference genes and untreated samples. Several statistical tests were then performed to provide evidence for real differences in RGA expression levels among genotypes and following inoculation. Firstly, a Levene test was performed to verify the homogeneity of variances, followed by a Shapiro–Wilk test to assess normality. If both conditions were met, an ANOVA test was applied to compare the significance of expression differences between treated and untreated groups; otherwise, a non-parametric Kruskal–Wallis test was used to compare the groups. If the result was significant, pairwise comparisons among all sample groups were undertaken using a Tukey test to determine which group(s) differed from the others. All statistical analyses were carried out in the R Language and Environment for Statistical Computing (R Core Team, 2017). All R scripts developed for this study can be found at https://github.com/ziwei-zhou/Thesis\_R\_scripts. A p-value of 0.05 was used as the significance threshold in all statistical tests.
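To make the non-parametric branch of this workflow concrete, the Python sketch below computes the Kruskal–Wallis H statistic with the usual tie correction. It is a standard-library illustration and not the study's R code; the three groups of values are invented, and in practice a statistics package would also supply the p-value.

```python
from collections import Counter


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H statistic with tie correction.

    groups: list of lists of observations. Tied values receive average ranks.
    """
    pooled = sorted(value for group in groups for value in group)
    n = len(pooled)

    # Assign each distinct value the mean of the ranks it occupies.
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2.0  # mean of ranks i+1 .. j
        i = j

    rank_term = sum(
        sum(rank_of[value] for value in group) ** 2 / len(group) for group in groups
    )
    h = 12.0 / (n * (n + 1)) * rank_term - 3.0 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n).
    tie_counts = Counter(pooled).values()
    correction = 1.0 - sum(t ** 3 - t for t in tie_counts) / (n ** 3 - n)
    return h / correction


# Hypothetical expression values for three groups of three replicates.
h = kruskal_wallis_h([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]])
print(round(h, 3))
```

Under the chi-square approximation with k − 1 degrees of freedom (here 2), H is compared with the 0.05 critical value of 5.991; with only three replicates per group this approximation is rough, which is one reason to treat such results cautiously.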

#### Analysis of RGA Protein Sequences

Bioinformatics and predictive in silico tools were used to further characterize RGAs. The predicted amino acid sequence of each RGA candidate was obtained from the NCBI database and imported into InterPro 5<sup>3</sup> (Jones et al., 2014) and KOBAS 3.0 software<sup>4</sup> (Xie et al., 2011), which were used to classify the predicted proteins into families and to predict domains and important (i.e., binding) sites. The RGA that responded with the highest transcriptional response to the pathogen was chosen for secondary structure prediction using the Position Specific Iterated – BLAST based secondary structure prediction (PSIPRED) method<sup>5</sup> (Jones, 1999). Three-dimensional atomic models of this RGA and its potential binding sites were predicted through RaptorX software<sup>6</sup> (Källberg et al., 2012).

FIGURE 2 | A heatmap representing the fold-change differences in expression among the 10 RGA target loci at 2, 6, and 24 hpi in four chickpea cultivars (PBA Seamer, PBA HatTrick, Kyabra, ICC 3996; e.g., I\_6 = ICC 3996 at 6 hpi, and likewise for the other labels). Green represents up-regulation, black represents no change and red represents down-regulation; color intensity indicates fold-change. No detectable expression is represented in white. The mean fold-change expression values for each treatment and genotype were normalized with the two mentioned reference genes and untreated samples.

TABLE 3 | Homologous super family predictions of the four chickpea target RGA sequences and their reference sequences definitions.

aa, Predicted amino acid sequence length.

<sup>3</sup>https://www.ebi.ac.uk/interpro/

<sup>4</sup>http://kobas.cbi.pku.edu.cn/annotate.php

<sup>5</sup>http://bioinf.cs.ucl.ac.uk/psipred/

# RESULTS

#### RGA Locus Identification and Validation

In total, 25 RGA loci were identified from previous publications and based on known RGA motifs from within the chickpea sequences within the NCBI database. These were labeled from RGA 1 to 25. PCR products of the expected sizes were successfully amplified from 23 of the targeted putative loci across all four chickpea varieties assessed (**Table 2**). After primer efficiency testing, 10 RGAs produced a reliable and consistent linear amplification, based on their R<sup>2</sup> result and E value (RGAs 4, 6, 8, 9, 10, 11, 12, 15, 21, and 23).

#### Quantitative Real-Time Expression Profiling of the RGA Genes

Differences in the transcription levels of the selected RGAs over time, after inoculation with isolate FT13092-1, were observed among the four chickpea genotypes assessed (**Figures 1A–J**). Interestingly, RGA 8 and 10 were both significantly up-regulated at the earliest time point assessed (2 hpi), and only in the resistant PBA Seamer and ICC3996 genotypes (**Figures 1C,E**). These then remained up-regulated for the duration of the experiment, potentially indicating their ability to recognize the pathogen prior to invasion. This may indicate that they provide sustained signaling, leading to downstream defense being instigated much faster in these genotypes than in the more susceptible ones. RGA 21 and 23 were down-regulated in ICC 3996 at the beginning of the experiment, and then sharply increased to become up-regulated at 6 hpi (**Figures 1I,J**). Meanwhile, RGA 4, 9, and 15 were initially down-regulated with a subsequent sharp increase in most chickpea genotypes, potentially indicating an overall ability of these RGAs to recognize the pathogen following invasion, possibly too late for effective defense signaling (**Figures 1A,D,H**). In contrast, the expression profiles of RGA 6, 11, and 12 showed no clearly significant changes in the plots (**Figures 1B,F,G**).

<sup>6</sup>http://raptorx.uchicago.edu/

The relationships among the differential mean fold-changes in expression of the 10 RGAs during the time-course were observed in the heatmap (**Figure 2**). Cluster 1 comprised RGAs 4, 6, and 9. These were either down-regulated or unchanged for all genotypes (except in PBA HatTrick) at all time points assessed. Cluster 2 comprised RGAs 8, 10, 21, and 23. These were up-regulated at 6 and 24 hpi and, as stated above, RGA 8 and 10 were also up-regulated at 2 hpi in ICC3996, the commonly used A. rabiei resistance source in the Australian breeding program (Mehmood et al., 2017).

#### Prediction of RGA Functional Groups

RGAs 8, 10, 21, and 23 were further assessed through in silico analyses to predict functional involvement in A. rabiei recognition. Their homologous super families and amino acid sequences were predicted (**Table 3** and **Supplementary Table 2**, respectively) and NCBI reference sequences (RefSeq), gene and protein IDs were retrieved (**Table 3**). Domains and motifs were also predicted (**Table 4**). Whilst none of the four interrogated RGAs were able to be fully annotated, potentially indicating novelty, all were highly homologous (90–99% identity) with SUMM2 (KEGG orthology number K20599; Zhang et al., 2012). SUMM2 is an NB-LRR protein known to function in plant mitogen-activated protein kinase (MAPK) signaling pathways (**Figure 3**).

RGA 8 responded with the highest and earliest transcriptional response to the pathogen and so was chosen for further secondary structure prediction that revealed eight α-helices and four β-strands (**Figure 4**). The top predicted binding site domains for potential external sequences were identified with predicted binding residues at positions G1, G2, V3, G4, K5, T6, T7, L8, R112, M131, L139, K143, P169, and L170, and their collective predicted ligands were magnesium ion (Mg<sup>2+</sup>), adenosine diphosphate (ADP) and exchanging adenosine triphosphate (ATP) (**Figure 5**).

# DISCUSSION

Plants have their own effective innate immune systems that they use to recognize pathogens when they come into contact or begin to invade and cause infection (Höhl et al., 1990; Ilarslan and Dolar, 2002; Jayakumar et al., 2005). Most necrotrophic pathogen-plant pathosystems utilize R-gene families otherwise known as RGAs as the receptors for initial pathogen perception (Sekhwal et al., 2015). For the chickpea-A. rabiei pathosystem, this study has assessed several existing and newly identified RGAs for their involvement in this perception process, which is proposed to lead to downstream signaling of biochemical and physical defense mechanisms (Palomino et al., 2009; Leo et al., 2016; Mehmood et al., 2017; Sagi et al., 2017).

The timing of RGA expression is thus crucial for a plant to be able to recognize a pathogen fast enough to incite effective defense responses. In this study, we found that a cluster of RGAs (Cluster 2) was up-regulated by 2–6 h following inoculation with a highly aggressive A. rabiei isolate and that this was consistent with the timing of spore growth (germ tube elongation) and penetration (appressoria development) (Sambasivam et al., 2008). If a plant can recognize and initiate defense responses faster, it may be able to contain the fungus long enough for more systemic resistance responses to occur, including hormone signaling, structural rearrangement and production of pathogenesis proteins. These alert the whole plant to the presence of the pathogen and direct a concerted attack at the site of invasion. This was proposed to be the case in the lentil – Ascochyta lentis pathosystem, whereby the resistant host genotype was able to recognize and defend itself against the pathogen faster than the slower, susceptible genotype, inciting production of toxic phenolic compounds in a hypersensitive response and strengthening the cell wall around the invading hyphae (Sambasivam et al., 2017; Khorramdelazad et al., 2018). The fast recognition of the pathogen by several RGAs assessed in the chickpea – A. rabiei pathosystem agrees with the observations of Leo et al. (2016) and Sagi et al. (2017), who also observed up-regulation as early as 2–6 hpi.

Since ICC3996 is the most widely used resistance source in breeding new resistant chickpea cultivars in Australia (Mehmood et al., 2017), it was important to determine which of the responsive RGAs are present in this genetic background. In Cluster 2 of the heatmap (**Figure 2**), RGA 8, 10, 21, and 23 were up-regulated at 2–24 hpi in ICC3996. The homologous super family predictions indicated a common evolutionary origin among these four RGAs, as evidenced by the nucleoside triphosphate hydrolase domain (P-loop NTPase) (Leipe et al., 2003). P-loop NTPase is the most prevalent nucleotide-binding protein domain, catalyzing the hydrolysis of the beta-gamma phosphate bond of a bound nucleoside triphosphate (NTP) (Arya and Acharya, 2017). It is possible that these responsive RGAs in chickpea are Signal Transduction ATPases with Numerous Domains (STAND) P-loop NTPases and may function through ATP binding to initiate effector-triggered immunity (ETI) signaling.

RGA 8 was up-regulated in ICC3996, PBA Seamer, and PBA HatTrick at all times assessed, indicating that this locus is robust in its response to the pathogen. Also, since PBA Seamer and PBA HatTrick are progeny of crosses containing ICC3996 as the resistance donor parent (Dr. Kristy Hobson, Australian Chickpea Breeder, pers. comm.), this highlights that RGA 8 is heritable and may be selected for as a major contributor to the resistance response. The region containing the "GGVGK" domain in RGA 8 was proposed as a magnesium ion binding site, believed to induce phospho-transfer reactions (Li et al., 2001). This region has previously been associated with resistance in tobacco after tobacco mosaic virus (TMV) infection (Les Erickson et al., 1999), and in response to Synchytrium endobioticum in potato (Hehl et al., 1999). Further, as mentioned, the secondary structure prediction for RGA 8 revealed eight α-helices and four β-sheets (**Figure 4**), which is similar to the predicted plant disease resistance gene product reported by Rigden et al. (2000), found to function in His-Asp phospho-transfer pathways. Therefore, the function of RGA 8 in defense against A. rabiei in chickpea may logically be predicted as a receptor that triggers the phospho-transfer signaling pathway through the activation of MAPK cascades.

Interestingly, RGA 10 was up-regulated in ICC 3996 and PBA Seamer but not in PBA HatTrick. The disease rating of PBA HatTrick was revised from "moderately resistant" to "moderately susceptible" in February 2017 by Pulse Breeding Australia, due to a substantial increase in aggressiveness within the isolate population (Mehmood et al., 2017). Meanwhile, both ICC 3996 and PBA Seamer remained "resistant" at the time. RGA 10 contains domains homologous to Arabidopsis broad-spectrum mildew resistance protein RPW8 and a putative transposon-transfer assisting protein (TTRAP) (Xiao et al., 2001; Pulavarti et al., 2013). RPW8 is involved in resistance to a broad range of powdery mildew pathogens and TTRAP is associated with a family of small bacterial proteins largely derived from Clostridium difficile (Pulavarti et al., 2013). One could postulate that the functionality of the chickpea RGA 10 may have been lost in PBA HatTrick when exposed to a new highly aggressive isolate such as the one used in this study. This highlights the evolutionary risk of relying on one or a few RGAs (R-genes) for sustained resistance, as has been shown repeatedly in other crops, such as cereals in the race to breed for resistance against rust pathogens and canola against the blackleg pathogen (Burdon et al., 2014; Moore et al., 2015a; Periyannan et al., 2017; Bousset et al., 2018; Zhang and Fernando, 2018).

RGA 21 was also up-regulated at 6 and 24 hpi in ICC 3996; it was also up-regulated in the susceptible Kyabra at 6 hpi. As shown in **Table 4**, RGA 21 contains an NB-ARC domain, an AAA ATPase domain, and a Leucine Rich Repeat, all belonging to the NBS-LRR family. Meanwhile, RGA 23 contained homologs of ArgK and PhoH-like proteins. ArgK is a member of the P-loop GTPases, involved in the transport of positively charged amino acids (lysine, arginine, and ornithine), and has arginine kinase activity (Leipe et al., 2002). Previously, this was only found to exist in eukaryotic Caenorhabditis and Leishmania species. Similarly, the PhoH-like protein is a cytoplasmic protein, which has been shown to act in phosphate regulation in Escherichia coli (Kim et al., 1993). Further analyses will determine if the chickpea genes are complete and potentially functional.

Finally, the predicted proteins of all four RGAs share high similarities with the NB-LRR protein SUMM2 (**Figure 3**). SUMM2 is proposed to be activated with the MEKK1-MKK1/MKK2-MPK4 cascade when the MAPK signaling pathway is disrupted by pathogen effector binding, leading to the responses that cause localized cell death (Zhang et al., 2012). This indicates the potential for these RGA candidates to activate the well characterized defense responses to A. rabiei in chickpea when the MAPK signaling pathway is potentially suppressed by A. rabiei leading to apoptosis and the observed hypersensitive response (Leo et al., 2016; Mehmood et al., 2017).

#### CONCLUSION

fpls-10-00644 May 17, 2019 Time: 15:11 # 11

Although many studies have been devoted to improving chickpea resistance to A. rabiei, sustained success may in part have been limited due to a lack of accurate knowledge of the pathogen recognition mechanism and how it may lead to subsequent instigated defense mechanisms. This is despite a great deal of effort in genetic mapping and characterization of multiple contributory defense-related QTLs, and their identification in diverse genetic backgrounds (Coram and Pang, 2005a,b; Palomino et al., 2009; Sagi et al., 2017). Although the physical locations of several genes underpinning the resistance responses have been uncovered, few studies have contributed to discovering the structures and functions of the actual resistance proteins. Fortunately, a great deal of knowledge exists on resistance proteins structure and function, as well as the molecular mechanisms of defense signaling proteins in Solanaceous plants (summarized by van Ooijen et al., 2007), which provides a guiding model for exploring the classes and functions of resistance proteins in other plant species. In this research, several existing and newly identified RGAs in chickpea were classified into previously described classes and assessed for their involvement in the A. rabiei perception process, which is proposed to lead to downstream signaling of biochemical and physical defense mechanisms (Palomino et al., 2009; Leo et al., 2016; Sagi et al., 2017). In conclusion, the future directions of this study should be focused on unraveling the protein functions of the selected RGAs that were differentially expressed in the resistant chickpea varieties after A. rabiei infection. This will provide further evidence for the selection of key RGAs in resistance breeding approaches.

#### AUTHOR CONTRIBUTIONS

ZZ designed and conducted the experiments, analyzed the data, and wrote the manuscript. RF directed the project, co-designed the experiments, and edited the manuscript. IB and PS assisted with the experiments and data analyses.

#### FUNDING

This work was supported by the Environmental Futures Research Institute and the School of Environment and Science, Griffith University, Australia.

### ACKNOWLEDGMENTS

Drs. Yasir Mehmood and Audrey Leo are gratefully acknowledged for providing unpublished information on transcript and histopathology studies of the chickpea – A. rabiei interactions.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00644/full#supplementary-material

# REFERENCES

potential in some varieties. Plant Cell Environ. 39, 1858–1869. doi: 10.1111/pce.12757


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Zhou, Bar, Sambasivam and Ford. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# Nuclear Migration: An Indicator of Plant Salinity Tolerance in vitro

#### Adel M. Elmaghrabi<sup>1,2†</sup>, Dennis Francis<sup>1</sup>, Hilary J. Rogers<sup>1</sup>\* and Sergio J. Ochatt<sup>2</sup>\*

<sup>1</sup> School of Biosciences, Cardiff University, Cardiff, United Kingdom, <sup>2</sup> Agroécologie, AgroSup Dijon, INRA, Université Bourgogne Franche-Comté, Dijon, France

In order to understand the mechanisms underlying acquisition of tolerance to salinity, we recently produced callus tissues of tobacco and Medicago truncatula resistant to NaCl-induced salt stress following application of a step-up recurrent selection method. The effects of salinity on cell size are known, but those on cell morphometry, including cell and nuclear surface area and the position of nuclei within salt stress resistant cells, have never been studied before. This work fills that gap, using suspension cultured cells of M. truncatula A17 initiated from callus, and the Nicotiana tabacum BY-2 cell line, resistant to increasing NaCl concentrations up to 150 mM NaCl. The surface area of salinity resistant cells of M. truncatula A17 and N. tabacum BY-2 and their nuclei, produced by step-up recurrent selection, was reduced, and cells elongated as NaCl increased, but these parameters proved to be unreliable in explaining cell survival and growth at high NaCl. Conversely, nuclei of resistant cells migrated from the center to the periphery of the cytoplasm close to the walls. Nuclear marginalization was for the first time observed as a result of salt stress in plant cells, and could be a novel helpful morphological marker of acquisition of salinity tolerance.

Keywords: abiotic stress, cell morphometry, cell suspensions, Medicago truncatula, Nicotiana tabacum, nucleus position, salinity tolerance

#### Edited by:

Alma Balestrazzi, University of Pavia, Italy

#### Reviewed by:

Photini V. Mylona, Hellenic Agricultural Organisation (HAO), Greece; Ahmad Arzani, Isfahan University of Technology, Iran

#### \*Correspondence:

Hilary J. Rogers, RogersHJ@cardiff.ac.uk; Sergio J. Ochatt, Sergio.ochatt@inra.fr

#### †Present address:

Adel M. Elmaghrabi, Plant Tissue Culture Department, Biotechnology Research Center (BTRC), Tripoli, Libya

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 05 February 2019; Accepted: 29 May 2019; Published: 12 June 2019

#### Citation:

Elmaghrabi AM, Francis D, Rogers HJ and Ochatt SJ (2019) Nuclear Migration: An Indicator of Plant Salinity Tolerance in vitro. Front. Plant Sci. 10:783. doi: 10.3389/fpls.2019.00783

# INTRODUCTION

Increased soil salinity is a world-wide problem, hence there is a need to develop more salinity-resistant crop cultivars (Ochatt, 2015). The mechanisms of plant salt tolerance in vivo have been investigated at the molecular, cellular, and whole plant levels (Munns and Tester, 2008). In vitro selection for salt tolerance has focused on cellular (Davenport et al., 2003) and genetic (Elmaghrabi et al., 2013) mechanisms involved in salt tolerance using selected NaCl-tolerant cell lines, while gene transfer has also been successfully exploited very recently to generate salt (Confalonieri et al., 2019) and water stress (Confalonieri et al., 2014; Alcântara et al., 2015; Duque et al., 2016) tolerance in M. truncatula. Alternative methods to exploit in vitro stress to characterize the biology and genetic diversity of early stage seedling growth (Parida and Das, 2005; Elmaghrabi et al., 2013), as well as effects of salinity stress on plant morphology, have also been extensively studied (Parida and Das, 2005; Claeys et al., 2014; Golkar et al., 2017; Negrão et al., 2017). However, only effects of salinity on cell size have been examined to date (Kurth et al., 1986) while those on cell morphometry have not.

Nuclear positioning is important during cell division, mediated by the three cytoskeletal filament systems, F-actin, intermediate filaments (IF), and microtubules (Ingber, 2003). In a recent review, Gundersen and Worman (2013) examined the sparse knowledge and understanding of the reasons and effects of cell movement and of the position of their nuclei within the cytoplasm. Given that nuclear positioning has been reported to reflect an interference with the proteins involved in nuclear movement (Maniotis et al., 1997; Folker et al., 2011), they hypothesized that this may then inhibit a number of cellular activities. These include an effect on the organization and mechanical properties of the cytoplasm with a concomitant impact on cytoplasmic signaling and on the accessibility of the nucleus to the associated signaling pathways (Dahl et al., 2004; Gundersen and Worman, 2013). However, to our knowledge, this hypothesis that nuclear movement may regulate cellular signaling pathways and responses to stress (Gundersen and Worman, 2013) has never been directly tested to date, be it with animal or plant cells.

Recently we developed a step-up selection method in M. truncatula for obtaining embryogenic calli under increasing salt stress. Within 5 months, different developmental patterns of callus, ranging from embryogenic to non-regenerative, were observed, correlated with a differential nuclear DNA content and biochemical profile (Elmaghrabi and Ochatt, 2006; Elmaghrabi et al., 2013). Callus growth was significantly impaired at ≥100 mM NaCl, but green callus was observed up to 100–150 mM NaCl, coincident with healthy growth despite the high salinity. However, 250 and 350 mM NaCl were lethal to most cells, and only small clusters of cells survived. To assess how this step-up approach affected cellular morphology, it was adapted and applied to M. truncatula and N. tabacum cells in suspension cultures. As well as monitoring cell and nuclear area, we have now filled the gap in morphometric analysis, showing here that nuclear positioning is affected by the NaCl treatments.

#### MATERIALS AND METHODS

Calli from leaves of Medicago truncatula cv. Jemalong line A17 were subcultured monthly on MS medium (Murashige and Skoog, 1962) supplemented with 2.0 mg/l NAA (1-naphthaleneacetic acid), 0.5 mg/l BAP (6-benzylaminopurine) and 3% (w/v) sucrose; pH was adjusted to 5.8 before addition of 0.9% (w/v) agar (MANA medium). Media were autoclaved for 20 min at 121°C/1 bar. Cultures were kept at 24/22°C with a 16/8 h (light/dark) photoperiod of 90 µE m<sup>−2</sup> s<sup>−1</sup> from warm white fluorescent tubes, as reported previously (Elmaghrabi and Ochatt, 2006; Elmaghrabi et al., 2013, 2017).

After 5 months of callus induction on MANA medium, 0.5 g fresh weight pieces of callus were transferred into 250 ml Erlenmeyer flasks containing 100 ml of BY-2 liquid medium and used to establish cell suspensions. BY-2 liquid medium (Nagata et al., 1992) consists of MS (Murashige and Skoog, 1962) medium modified with 0.2 mg/l 2,4-D, 1 mg/l thiamine-HCl and 100 mg/l myo-inositol, and enriched with 200 mg/l KH2PO4. Cell suspensions were subcultured every 2 weeks. After four subcultures, once proliferation of suspension cells had stabilized, cells were subcultured into the same medium with low concentrations of NaCl (0, 35, 50, and 70 mM) for gradual acclimation to salt stress. One month later, these concentrations were changed to 0, 50, 100, and 150 mM NaCl and suspension cultures were maintained as above. Tobacco (Nicotiana tabacum) BY-2 cell cultures were analyzed as a comparison to the M. truncatula suspension cultures, by adding the same NaCl concentrations to BY-2 liquid medium. Cell suspension cultures of both species were shaken (130 rpm) and were subcultured every 14 days (Elmaghrabi and Ochatt, 2006).
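As an illustration of the step-up schedule described above, the following minimal sketch converts each NaCl concentration into the mass of salt per flask. It assumes that the salt treatments were prepared in the same 100 ml of medium per flask used to establish the suspensions, which the text does not state explicitly; the function names are hypothetical and the molar mass of NaCl (58.44 g/mol) is the standard value.

```python
# Illustrative sketch only: NaCl mass per flask for the step-up schedule.
# Assumption: each flask holds 100 ml of BY-2 liquid medium during treatment.
NACL_MOLAR_MASS_G_PER_MOL = 58.44  # standard molar mass of NaCl

ACCLIMATION_MM = [0, 35, 50, 70]   # first month (mM NaCl), from the text
TREATMENT_MM = [0, 50, 100, 150]   # subsequent treatments (mM NaCl), from the text
FLASK_VOLUME_ML = 100              # assumed working volume per flask


def nacl_grams(conc_mm: float, volume_ml: float) -> float:
    """Mass of NaCl (g) that gives conc_mm millimolar in volume_ml millilitres."""
    moles = (conc_mm / 1000.0) * (volume_ml / 1000.0)
    return moles * NACL_MOLAR_MASS_G_PER_MOL


def schedule_table(concs_mm, volume_ml=FLASK_VOLUME_ML):
    """Pair each concentration (mM) with the NaCl mass (g, 3 decimals) per flask."""
    return [(c, round(nacl_grams(c, volume_ml), 3)) for c in concs_mm]


if __name__ == "__main__":
    for label, concs in (("acclimation", ACCLIMATION_MM), ("treatment", TREATMENT_MM)):
        for conc, grams in schedule_table(concs):
            print(f"{label}: {conc} mM -> {grams} g NaCl per {FLASK_VOLUME_ML} ml")
```

Under these assumptions, the highest treatment (150 mM) corresponds to roughly 0.88 g of NaCl per 100 ml flask.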

The viability of the cell suspension cultures was tested by dual propidium iodide (PI) and fluorescein diacetate (FDA) viability staining. The dual stain contained PI (0.24 mg ml<sup>−1</sup>), FDA (0.04%) and sucrose (w/v, 2%). The cell suspension (75 µl) was added to 75 µl of dual staining solution and incubated on ice for 20 min. Percentage cell mortality was determined using an Olympus BH-2 fluorescent microscope at 20× magnification. Approximately 300 cells were scored as either living (green) or dead (red). Culture viability was also assessed by measuring cell density using a spectrophotometer at 600 nm and visually, from the increasing density of the culture during the 2-week subculture period.
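The scoring described above reduces to simple arithmetic, sketched below. The cell counts in the usage example are hypothetical and are not data from this study; the helper names are likewise illustrative. The second function shows that mixing equal volumes of cell suspension and stain halves the working concentration of each stain component.

```python
# Illustrative sketch only: viability arithmetic for PI/FDA dual staining.
# All counts used in the example below are hypothetical, not study data.


def percent_mortality(living: int, dead: int) -> float:
    """Percentage of scored cells that were dead (red, PI-positive)."""
    total = living + dead
    if total == 0:
        raise ValueError("no cells were scored")
    return 100.0 * dead / total


def final_concentration(stock: float, stain_volume_ul: float,
                        sample_volume_ul: float) -> float:
    """Concentration of a stain component after mixing stain and sample."""
    return stock * stain_volume_ul / (stain_volume_ul + sample_volume_ul)


if __name__ == "__main__":
    # Hypothetical field of about 300 cells: 255 green (living), 45 red (dead).
    print(percent_mortality(living=255, dead=45))  # 15.0
    # PI at 0.24 mg/ml in the stain, mixed 75 ul + 75 ul with the suspension.
    print(round(final_concentration(0.24, 75, 75), 3))
```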

Hoechst staining (1 µL of a 10 mg ml<sup>−1</sup> stock of Bisbenzimide H, 2 µL Triton X-100 and 97 µL sterile distilled H2O) was used to assess cell morphology, using an Olympus BH-2 compound microscope equipped with UV epi-fluorescence. Following 60 days of acclimation to increased salinity, cell and nuclear size were measured using Sigmascan-pro (objective: DPlan Apo 20 UV, 0.70, 160/0.17). Position of nuclei within cells was determined using ArchimedPro and Histolab software (Microvision, France) by six measurements 60° apart for each cell, n = 13–19 cells (**Supplementary Figure 1**).
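One possible way to condense the six measurements per cell into a single number is sketched below. This index is a hypothetical illustration and is not the metric computed by ArchimedPro or Histolab, nor necessarily the one used by the authors. It assumes that each measurement is the distance from the nuclear envelope to the cell wall along one of six directions 60° apart: a centered nucleus gives similar distances in all directions, whereas a marginalized nucleus gives at least one very short distance.

```python
# Illustrative sketch only: a hypothetical off-center index for nuclear position.
# Input: six distances (e.g., in micrometres) from nuclear envelope to cell wall,
# measured along directions 60 degrees apart. Example values are hypothetical.
from statistics import mean


def offcentre_index(distances):
    """Return 1 - min/mean: 0 for equal distances, 1 when the nucleus touches the wall."""
    if len(distances) != 6:
        raise ValueError("expected six measurements taken 60 degrees apart")
    if any(d < 0 for d in distances):
        raise ValueError("distances must be non-negative")
    average = mean(distances)
    if average == 0:
        raise ValueError("all distances are zero")
    return 1.0 - min(distances) / average


if __name__ == "__main__":
    centred = [10, 10, 10, 10, 10, 10]   # hypothetical control-like cell
    marginal = [0, 5, 15, 20, 15, 5]     # hypothetical cell with nucleus at the wall
    print(offcentre_index(centred))      # 0.0
    print(offcentre_index(marginal))     # 1.0
```

Per-cell values of such an index could then be compared across NaCl treatments with the statistical tests described in the next paragraph.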

Data were analyzed using R software (R version 3.3.2, R Foundation for Statistical Computing). ANOVA followed by Tukey's test, or the non-parametric Kruskal–Wallis test followed by Dunn's test, was applied to determine differences across multiple samples.
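For readers unfamiliar with the non-parametric test named above, the sketch below computes the Kruskal–Wallis H statistic, with the usual correction for ties, in plain Python. It is not the authors' R code, the input values are hypothetical, and Dunn's post hoc test is not implemented. The critical value 5.991 is the chi-square threshold for 2 degrees of freedom at P = 0.05, which applies when three groups are compared and is only approximate for small samples.

```python
# Illustrative sketch only: Kruskal-Wallis H statistic with tie correction.
# Not the authors' analysis; example data are hypothetical.
from itertools import chain

CHI2_CRITICAL_DF2_P05 = 5.991  # chi-square critical value, df = 2, alpha = 0.05


def average_ranks(values):
    """Return 1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks


def kruskal_wallis_h(groups):
    """Kruskal-Wallis H for a list of groups (each a non-empty list of numbers)."""
    pooled = list(chain.from_iterable(groups))
    n = len(pooled)
    ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    h = 12.0 / (n * (n + 1)) * weighted - 3.0 * (n + 1)
    # Tie correction: divide by 1 - sum(t^3 - t) / (n^3 - n) over tied groups.
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    correction = 1.0 - tie_term / (n ** 3 - n)
    if correction == 0:
        raise ValueError("all observations are identical")
    return h / correction


if __name__ == "__main__":
    # Hypothetical measurements for three treatments.
    groups = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    h = kruskal_wallis_h(groups)
    print(round(h, 3), h > CHI2_CRITICAL_DF2_P05)  # 7.2 True
```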

## RESULTS

In M. truncatula cell suspensions, the initial trend was an increase in cell and nuclear area following 60 days exposure to 50 mM NaCl although these increases were not significant (P > 0.05). Likewise, in the tobacco cultures cell area remained stable up to 50 mM NaCl, although nuclear area was already significantly lower than the control (**Figure 1**). When the NaCl concentration was raised further, to 100 or 150 mM, both nuclear and cell area decreased in both species, perhaps as a function of plasmolysis. Cell area was significantly lower than the control at 150 mM in both species, and nuclear area was significantly lower than the control in both the 100 and 150 mM NaCl treatments (P < 0.05; **Figure 1**). This decline in cell and nuclear size across treatments, despite the presence of viable cells even at the highest NaCl concentration (**Supplementary Table 1**), and the lack of response at 50 mM NaCl suggests that these traits are not a reliable criterion to assess cell growth of M. truncatula or tobacco in response to NaCl stress over a longer period. Similar trends were noted for both cell and nuclear area when suspensions were cultured under the same conditions for up to 4 months (**Supplementary Figures 2, 3**). Moreover, they showed the opposite trend to those observed for osmotic stress-resistant cells of M. truncatula where osmotic stress provoked an increase in cell and nuclear area concomitant with endoreduplication (Elmaghrabi et al., 2017).

We observed mitoses in the M. truncatula cell suspensions established from callus for monitoring cellular behavior under salt stress (0, 50, 100, or 150 mM NaCl; **Figure 2A**). Our aim was to identify a characteristic cellular/nuclear phenotype as a consistent marker of fast growing or salt tolerant callus, for use as a diagnostic criterion of acquisition of in vitro salt tolerance in M. truncatula, as recently observed under osmotic stress (Elmaghrabi et al., 2017).

Interestingly, subjecting the cell suspension cultures to the 50, 100, and 150 mM NaCl treatments for 2 months consistently resulted in the migration of nuclei from the center toward the periphery of cells (**Figure 2C** and **Supplementary Table 2**). To test whether this was a distinctive feature of M. truncatula or a more general response to salt stress by plant cells, the tobacco cells were also analyzed, and the same effect was seen (**Figure 2B**). This repositioning was never observed for the control cells of either species when grown under stress-free conditions, where the nucleus maintained a central position within the cytoplasm, equidistant from the wall (**Figures 2A,B**, first panel). It is also noteworthy that these observations were made on cell suspensions that had already undergone the repeated cycles with and without NaCl during the step-up protocol through which they were produced, which suggests that the phenomenon of nucleus repositioning is correlated with the acquisition of salt tolerance.

# DISCUSSION

In addition to being a model species, M. truncatula (barrel medic) can fix atmospheric nitrogen, has high protein content (Young and Udvardi, 2009) and includes cultivars with relatively high salinity tolerance (Merchan et al., 2003). We developed a method for induction of new accessions of M. truncatula tolerant to salinity induced by NaCl (Elmaghrabi et al., 2013) and also to osmotic stress provoked by PEG 6000 (Elmaghrabi et al., 2017), in both cases through in vitro selection via a step-up recurrent strategy. A gradual exposure to successively higher NaCl concentrations has led to long-term acclimation of cells to salinity in other plant species, and embryogenic and organogenic callus produced by this method have enabled selection of salt resistant cultures (Miki et al., 2001; Merchan et al., 2003). Long-term culture under salt stress conditions may simultaneously induce physiological adaptation in cells (Naliwajski and Skłodowska, 2014) and the generation of acclimated and truly tolerant somaclones (Arzani, 2008), which are then capable of growth at NaCl concentrations that are lethal to non-acclimated ones. In this respect, the continuous assessment of cell viability with time in culture and following our step-up recurrent strategy resulted in a gradual enrichment in truly tolerant cells in the population selected under NaCl stress and the concomitant death of those cells that were only physiologically adapted to the stress imposed (Elmaghrabi et al., 2013).

measurements from nuclear envelope to cell wall for each cell + SE; n = 13–19 cells (see Supplementary Figure 1). Different letters indicate significantly different means (Kruskal–Wallis followed by a Dunn's test, P < 0.05).

Here, using cell suspension cultures, we explored how exposure to increasing salinity over long culture periods affected cell morphology, and demonstrate that an alteration of nuclear position was more sensitive to low NaCl concentrations than changes in nuclear or cell area. Here we only assessed a single variety of M. truncatula, and it will be interesting for future studies to assess different genotypes and species of Medicago known for their differential responses to salinity stress, as shown with M. sativa (Ehsanpour and Fatahian, 2003; Quan et al., 2016). However, nuclear repositioning in response to NaCl was shown here to be consistent across two very different species, M. truncatula and N. tabacum, suggesting that it may be a widespread plant cellular response.

Understanding of the cellular significance of nuclear position within cells in terms of both their metabolism and physiology is still in its infancy, and all studies on the movement of cells and positioning of their nuclei thus far have been restricted to human and animal cells (Gundersen and Worman, 2013). Among them, in cancerous cells, nuclear positioning was shown to alter their ability to respond to the pathways regulating transcription and mRNA transport and localization (Calvo et al., 2010). It was also speculated that the distance the nuclei traveled depended on various cytoplasmic stimulatory and inhibitory factors, whereby their change of position relative to the origin of an external signal may modulate the nuclear response, particularly when signaling is asymmetrical. However, only one study in zebrafish has examined the relationship between nuclear position and asymmetrical signaling (Del Bene et al., 2008). Our results with salt-tolerant plant cells are in line with the studies above. In zebrafish, gradients of external stress-inducing factors during development resulted in a repositioning of the nucleus within the cytoplasm, so that its responsiveness to stress might be improved by placing the nucleus in close proximity to the stress signal.

Such migration of nuclei from the center of cells toward the outside is a type of perturbation not shown before in plant cells to our knowledge, and appears to be a major effect of salt at the cellular level perhaps related to negative growth responses to the increasing internal concentrations of NaCl of cells in culture. Nuclear migration will be concomitant with the typical cell responses to osmotic stress, including changes in cell wall thickness, vacuole volume and plastid rearrangements as observed in Arabidopsis (Gobert et al., 2007), but also in the surface area of cells and their nuclei as recently reported in osmotic stress resistant cells of M. truncatula (Elmaghrabi et al., 2017).

Exposure to high salt stress can induce rapid nuclear deformation (Katsuhara and Kawasaki, 1996) leading to programmed cell death. Moreover, growth of barley at 192 mM NaCl resulted in chromatin condensation (Werker et al., 1983). Nuclear marginalization has also been associated with cell death (O'Brien et al., 1998), suggesting that some of the cells under salt stress in this study are preparing to undergo cell death. On the other hand, M. truncatula cell suspensions were shown to respond to stress like whole plants (Elmaghrabi et al., 2013; Araújo et al., 2016) and, in this context, nuclear marginalization as observed would be part of a eustress cellular mechanism to cope with the induced stress. In this respect, eustress is an activating, stimulating stress, which is a positive element in plant development, and is also referred to as good stress or constructive stress that can promote plant defense secondary metabolisms for improving tolerance to further stress (Kranner et al., 2010; Hideg et al., 2013).

# CONCLUSION

In conclusion, cell and nuclear size decreased at high NaCl, consistent with signs of plasmolysis, but were not useful traits in explaining cell survival and growth at high NaCl concentrations. Conversely, nuclear marginalization was for the first time observed as a result of salt stress in plant cells, and could be a novel and helpful morphological indicator for acquisition of salinity tolerance. Importantly, our results strongly suggest that the repositioning of the nucleus within the cytoplasm is neither passive nor random. Indeed, it results from the onset under stress of a mechanism that may be a common response across eukaryotes.

#### AUTHOR CONTRIBUTIONS

DF, HR, and SO designed the project and experiments. AE performed the experiments. SO and HR wrote and revised the manuscript. All authors analyzed the data, read, and approved the manuscript.

### FUNDING

In this work, AE was financially supported by the Biotechnology Research Center, Tripoli.

## ACKNOWLEDGMENTS

We thank Catherine Conreux and Mike O'Reilly for their technical assistance.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00783/full#supplementary-material

#### REFERENCES

truncatula cell suspension and increases the expression of key genes involved in the antioxidant response and genome stability. Plant Cell Tiss. Organ Cult. 127, 675–680. doi: 10.1007/s11240-016-1075-5

Ingber, D. E. (2003). Tensegrity I. Cell structure and hierarchical systems biology. J. Cell Sci. 116, 1157–1173. doi: 10.1242/jcs.00359

Katsuhara, M., and Kawasaki, T. (1996). Salt stress induced nuclear and DNA degradation in meristematic cells of barley roots. Plant Cell Physiol. 37, 169–173. doi: 10.1093/oxfordjournals.pcp.a028928


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Elmaghrabi, Francis, Rogers and Ochatt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Priming With the Green Leaf Volatile (Z)-3-Hexeny-1-yl Acetate Enhances Salinity Stress Tolerance in Peanut (*Arachis hypogaea* L.) Seedlings

*Shufei Tian<sup>1†</sup>, Runze Guo<sup>1†</sup>, Xiaoxia Zou<sup>1</sup>, Xiaojun Zhang<sup>1</sup>, Xiaona Yu<sup>1</sup>, Yuan Zhan<sup>1</sup>, Dunwei Ci<sup>2</sup>, Minglun Wang<sup>1</sup>, Yuefu Wang<sup>1</sup>\* and Tong Si<sup>1</sup>\**

*<sup>1</sup> Shandong Provincial Key Laboratory of Dryland Farming Technology, College of Agronomy, Qingdao Agricultural University, Qingdao, China, <sup>2</sup> Shandong Peanut Research Institute, Qingdao, China*

#### *Edited by:*

*Sergio J. Ochatt, INRA UMR 1347 Agroécologie, France*

#### *Reviewed by:*

*Petronia Carillo, Università degli Studi della Campania Luigi Vanvitelli Caserta, Italy Arafat Abdel Hamed Abdel Latef, South Valley University, Egypt*

#### *\*Correspondence:*

*Yuefu Wang wangyuefu01@163.com Tong Si tongsi@qau.edu.cn; nmst12@163.com*

*† These authors have contributed equally to this work*

#### *Specialty section:*

*This article was submitted to Plant Physiology, a section of the journal Frontiers in Plant Science*

*Received: 10 April 2019 Accepted: 29 May 2019 Published: 20 June 2019*

#### *Citation:*

*Tian S, Guo R, Zou X, Zhang X, Yu X, Zhan Y, Ci D, Wang M, Wang Y and Si T (2019) Priming With the Green Leaf Volatile (Z)-3-Hexeny-1-yl Acetate Enhances Salinity Stress Tolerance in Peanut (Arachis hypogaea L.) Seedlings. Front. Plant Sci. 10:785. doi: 10.3389/fpls.2019.00785*

Green leaf volatiles play vital roles in plant biotic stress; however, their functions in plant responses to abiotic stress have not been determined. The aim of this study was to investigate the possible role of (Z)-3-hexeny-1-yl acetate (Z-3-HAC), a green leaf volatile, in alleviating the salinity stress of peanut (*Arachis hypogaea* L.) seedlings and the underlying physiological mechanisms governing this effect. One salt-sensitive and one salt-tolerant peanut genotype were primed with 200 μM Z-3-HAC at the 4-week-old stage before they were exposed to salinity stress. Physiological measurements showed that the primed seedlings possessed higher relative water content, net photosynthetic rate, maximal photochemical efficiency of photosystem II, activities of the antioxidant enzymes, and osmolyte accumulation under salinity conditions. Furthermore, the reactive oxygen species, electrolyte leakage, and malondialdehyde content in the third fully expanded leaves were significantly lower than in nonprimed plants. Additionally, we found that application of Z-3-HAC increased the total length, surface area, and volume of the peanut roots under salinity stress. These results indicated that the green leaf volatile Z-3-HAC protects peanut seedlings against damage from salinity stress through priming for modifications of photosynthetic apparatus, antioxidant systems, osmoregulation, and root morphology.

Keywords: green leaf volatiles, Z-3-HAC, priming, salinity stress tolerance, peanut

### INTRODUCTION

As an important cash and oilseed crop, peanut (*Arachis hypogaea* L.) is widely cultivated in most tropical, subtropical, and temperate regions worldwide (Sharma et al., 2016; Cui et al., 2018). Peanut is also a great source of many nutrients for humans, such as protein, fatty acids, and vitamins (King et al., 2008; Aninbon et al., 2016). Soil salinity is one of the key environmental factors that affects plant growth and reduces crop productivity worldwide (Tanji, 2002; Hasegawa, 2013). More than 800 million hectares of agricultural land have been impaired by salinity (Rengasamy, 2010). Among all types of salinity, the most soluble and widespread salt is sodium chloride (NaCl). Similar to many other leguminous crop species, peanut is moderately sensitive to salinity, especially NaCl stress (Greenway and Munns, 1980). Salinity stress has a severe impact on the growth and morphogenesis of peanut, decreasing seed germination and dry matter accumulation, affecting the establishment of seedling morphology, and inducing damage to the photosynthetic apparatus (Mäser et al., 2002; Deinlein et al., 2014; Yi et al., 2015; Meena et al., 2016).

Plants employ ubiquitous mechanisms to cope with salinity and minimize salt toxicity. The plant responses to salinity stress include the induction of phytohormones and antioxidant systems, vacuole compartmentalization of toxic ions, and synthesis and accumulation of compatible compounds to osmotically balance the cytosol with vacuoles (Cheeseman, 1988; Zhu, 2002; Munns and Tester, 2008; Garma et al., 2015; Ferchichi et al., 2018; Abdel Latef et al., 2019). In the past several decades, plant growth-regulating substances have been widely used to confer salinity stress tolerance in many crop species, including sodium selenate (Subramanyam et al., 2019), melatonin (Li et al., 2017; Chen et al., 2018), hydrogen peroxide (Li et al., 2011), brassinosteroid (Divi et al., 2010; Zhu et al., 2015), nitric oxide (Sun et al., 2014; Ahmad et al., 2016), and glycine betaine (Nawaz and Ashraf, 2010; Nusrat et al., 2014; Kreslavski et al., 2017; Annunziata et al., 2019). In addition, traditional breeding and genetic engineering have also been promising approaches for the acquisition of salinity stress tolerance of crops (Hanin et al., 2016; Ismail and Horie, 2017). Although these strategies are well accepted by farmers, more eco-friendly plant growth-regulating substances that confer crop salinity tolerance are required to achieve the goal of agricultural sustainability.

Biogenic volatile organic compounds (VOCs) mainly consist of terpenes, fatty acid-derived products, and products of the shikimic acid pathway, which are emitted by plants under stress (Dudareva et al., 2006; Heil and Silva Bueno, 2007; Nguyen et al., 2016). VOCs can act as an alarm signal when plants are under attack from insect herbivores. Green leaf volatiles (GLVs) are an important group of VOCs for priming plant defenses against insect herbivore attacks, which were first reported by Engelberth et al. (2004). Typically, GLVs are released by plants after mechanical wounding or herbivore attack and could induce defense-related genes to alert the undamaged tissues in plant biotic stress responses (Pare and Tumlinson, 1997; Arimura et al., 2002; Yan and Wang, 2006). However, the role that GLVs play in plant abiotic stress remains an open question. Previous studies, including our research, documented the importance of wounding- or herbivore-induced phytohormones, such as ethylene (ETH) and jasmonic acid (JA), and signaling molecules, such as hydrogen peroxide (H2O2) and nitric oxide (NO), in response to plant abiotic stress (León et al., 2001; Schilmiller and Howe, 2005; Chauvin et al., 2012; Ahmad et al., 2016; Si et al., 2017, 2018); thus, GLVs might also be a crucial molecule in plant abiotic stress.

GLVs are synthesized *via* the lipoxygenase pathway, where (Z)-3-hexen-1-al (Z-3-HAL), (Z)-3-hexen-1-ol (Z-3-HOL), and (Z)-3-hexen-1-yl acetate (Z-3-HAC) are all major components. Extensive studies have demonstrated that Z-3-HAC plays a pivotal role in plant defenses against insect herbivore attack (Matsui et al., 2012; Ameye et al., 2015). However, the literature regarding priming by Z-3-HAC in response to plant abiotic stress remains scarce. More recently, Cofer et al. (2018) reported that exogenous Z-3-HAC treatment resulted in increased growth and reduced damage under cold stress in maize (*Zea mays*) seedlings. This report was the first to describe the priming effects of Z-3-HAC in plant abiotic stress. Given these findings, we speculate that Z-3-HAC could also play a role in other plant abiotic stresses, such as salinity stress. To date, Z-3-HAC has been tested only on maize and not on other species, whether monocots or dicots. Therefore, the present study was designed to further our understanding of the role that Z-3-HAC plays in plant abiotic stress. It was hypothesized that exogenous application of Z-3-HAC could enhance salinity stress tolerance in peanut seedlings. Improving salt tolerance in peanut would reduce the yield losses caused by salinity stress and allow greater output from salinized agricultural land worldwide.

# MATERIALS AND METHODS

#### Plant Materials

Two peanut cultivars, Huayu 20 (abbreviated here as "HY20") and Huayu 22 (abbreviated here as "HY22"), which are classified as salt-sensitive and salt-tolerant genotypes, respectively, were used as the experimental materials in this study. The seeds were surface sterilized with 2% (v/v) sodium hypochlorite, rinsed twice with tap water, and soaked in tap water overnight. Then, seeds of uniform size were germinated in vermiculite in the dark at 28°C for 2 days before transfer to pots (inner diameter of 9 cm and height of 8 cm with small holes at the bottom, one seedling/pot) filled with 200 g of garden soil each. The seedlings were then transferred to an artificial climate-controlled chamber with an air temperature of 25°C, a light/dark cycle of 16/8 h, a humidity of 60%, and a photosynthetic photon flux density (PPFD) of 1,200 μmol m−2 s−1. Each pot was watered with 200 ml of distilled water every other day. Four-week-old seedlings of uniform size were selected for the subsequent experiments.

#### Experimental Design

(Z)-3-Hexen-1-yl acetate (Z-3-HAC) (≥98%, Sigma-Aldrich, Inc., USA) has the CAS number 3681-71-8, the linear formula CH3CO2CH2CH2CH═CHC2H5, and a molecular weight of 142.20. All selected seedlings were randomly divided into two batches. One batch received two foliar applications of 200 μM Z-3-HAC (dissolved in 95% (v/v) ethanol as a stock solution) at a 3-day interval. At the same time, the other batch was treated with distilled water containing the equivalent amount of ethanol. A relatively moderate concentration of Z-3-HAC at 200 μM was most effective according to our previous experiments (data not shown). Seven days after pretreatment, half of the seedlings from each batch were exposed to NaCl stress. Each of these pots was watered with 200 ml of 300 mM NaCl solution three times at 2-day intervals, while the rest of the seedlings were watered with distilled water at the same times. The final salt content of the NaCl-treated soil was 0.35% (w/w), which could be classified as severely saline soil. In total, four treatments were established: control (water + water without NaCl), Z-3-HAC (Z-3-HAC + water without NaCl), NaCl (water + water with NaCl), and Z-3-HAC + NaCl (Z-3-HAC + water with NaCl). Physiological and biochemical parameters were determined at 7 days after the onset of salinity stress treatment. One representative pot was selected from at least 10 similar-looking plants for each treatment, and pictures were taken. For all the measurements, the third fully expanded leaves from the plant tops were selected. Three independent biological replicates were performed for each treatment.
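The working concentrations above can be translated into masses with simple molarity arithmetic. The sketch below is illustrative only: the helper name `mass_needed_g` is hypothetical, the molar mass of NaCl (58.44 g/mol) is the standard value rather than one given in the text, and the result for NaCl is the amount delivered per watering, not the amount retained by the soil.

```python
# Molar masses in g/mol. The Z-3-HAC value is from the text;
# the NaCl value is the standard one and is an assumption here.
MW_Z3HAC = 142.20
MW_NACL = 58.44


def mass_needed_g(molarity_mol_per_l, volume_l, molar_mass_g_per_mol):
    """Mass of solute (g) needed for a given molarity and volume."""
    return molarity_mol_per_l * volume_l * molar_mass_g_per_mol


# 200 uM Z-3-HAC, expressed as mg of compound per litre of spray solution.
z3hac_mg_per_l = mass_needed_g(200e-6, 1.0, MW_Z3HAC) * 1000

# 300 mM NaCl in a single 200 ml watering, expressed in grams.
nacl_g_per_watering = mass_needed_g(0.300, 0.200, MW_NACL)

print(f"Z-3-HAC: {z3hac_mg_per_l:.2f} mg per litre")
print(f"NaCl: {nacl_g_per_watering:.3f} g per watering")
```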

#### Measurement of Shoot Weight and Root Morphology

The seedlings were washed twice with distilled water, the surface moisture was removed, and then the fresh weights of the dissected shoots were measured immediately. To obtain the dry weights, the dissected shoots were oven-dried at 105°C for 15 min to deactivate enzymes and then dried at 85°C until constant weights were recorded. Meanwhile, the fresh roots were also dissected, carefully washed twice with distilled water, and then scanned using a dual lens scanning system (V700, SEIKO EPSON CORP., Japan) according to the method of Jiang et al. (2017). The data obtained were analyzed using the WinRHIZO Pro software (Version 2012b, Regent Instruments Inc., Canada). There were three independent biological replicates for each treatment and one representative picture is shown.

#### Determination of Gas Exchange Parameters, Chlorophyll Fluorescence and Total Chlorophyll Content

Determination of gas exchange parameters was conducted between 9:00 am and 11:00 am using the portable photosynthesis system (Li-COR 6800, Lincoln, NE, USA). The net photosynthetic rate (Pn), stomatal conductance (Gs), intercellular CO2 concentration (Ci), and transpiration rate (Tr) were measured based on the following conditions in the leaf chamber: air temperature of 25°C, air relative humidity of 80%, CO2 concentration of 400 μmol mol−1, and PPFD of 1,000 μmol m−2 s−1.

Chlorophyll fluorescence was measured after a 30-min dark adaptation period with an imaging pulse amplitude modulated (PAM) fluorimeter (IMAG-MAXI; Heinz Walz, Effeltrich, Germany), as described in detail by Ahammed et al. (2013). The minimum fluorescence emission signal (Fo), maximal fluorescence (Fm), steady-state fluorescence yield (Fs), and light-adapted maximum fluorescence (Fm′) were recorded as the area of interest in the compound leaves. Then, the maximal photochemical efficiency of photosystem II (PSII) (Fv/Fm), the quantum efficiency of PSII photochemistry (ΦPSII), the photochemical activity of PSII (Fv′/Fm′), and the non-photochemical quenching (*NPQ*) were calculated according to the formulas as described by Kramer et al. (2004). The images of Fv/Fm were also exported, and the representative leaf for each treatment is shown.
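As a rough illustration of how the derived parameters follow from the recorded signals, the sketch below uses the widely used textbook definitions. The text cites Kramer et al. (2004) without listing the equations, so these exact forms are an assumption. Because Fo′ is not among the recorded signals, the sketch estimates it with the Oxborough and Baker approximation, which is also an assumption; all function names are hypothetical.

```python
def fv_fm(fo, fm):
    """Maximal photochemical efficiency of PSII: (Fm - Fo) / Fm."""
    return (fm - fo) / fm


def phi_psii(fs, fm_prime):
    """Quantum efficiency of PSII photochemistry: (Fm' - Fs) / Fm'."""
    return (fm_prime - fs) / fm_prime


def npq(fm, fm_prime):
    """Non-photochemical quenching (Stern-Volmer form): (Fm - Fm') / Fm'."""
    return (fm - fm_prime) / fm_prime


def fv_prime_fm_prime(fo, fm, fm_prime):
    """Fv'/Fm' with Fo' estimated as Fo / (Fv/Fm + Fo/Fm') (assumed)."""
    fo_prime = fo / (fv_fm(fo, fm) + fo / fm_prime)
    return (fm_prime - fo_prime) / fm_prime


# Illustrative signal values only, not data from this study.
fo, fm, fs, fm_p = 0.2, 1.0, 0.3, 0.5
print(fv_fm(fo, fm), phi_psii(fs, fm_p), npq(fm, fm_p),
      fv_prime_fm_prime(fo, fm, fm_p))
```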

For the assay of the total chlorophyll content, 0.1 g of fresh leaf was extracted in 25 ml of anhydrous ethanol and acetone (1:1, v/v) solution and incubated for 12 h in the dark at room temperature. Then, the total chlorophyll content (mg g−1 FW) was determined colorimetrically at 647 and 663 nm and calculated as originally described by Lichtenthaler and Wellburn (1983).

#### Measurement of Relative Water Content, Electrolyte Leakage, and Lipid Peroxidation

The leaf relative water content (RWC) was measured based on the method of Jensen et al. (2000) with some modifications. In brief, the leaves were excised and the fresh weight (FW) was measured. Then, the leaves were soaked in tubes with 5 ml of deionized water for 4 h at room temperature before the turgid weight (TW) was recorded. Dry weight (DW) was further measured after the leaves were oven-dried for 24 h at 90°C. RWC was calculated as RWC (%) = [(FW − DW)/(TW − DW)] × 100.
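The RWC formula can be sketched as a small function. The function name and the example weights are hypothetical and serve only to show the arithmetic.

```python
def relative_water_content(fw, tw, dw):
    """RWC (%) = (FW - DW) / (TW - DW) * 100, all weights in the same unit."""
    if tw <= dw:
        raise ValueError("turgid weight must exceed dry weight")
    return (fw - dw) / (tw - dw) * 100


# Illustrative weights in grams, not data from this study.
rwc = relative_water_content(fw=0.80, tw=1.00, dw=0.20)
print(f"RWC = {rwc:.1f}%")
```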

The measurement of relative electrolyte conductivity (REC) was conducted using the method of Griffith and McIntyre (1993). The leaf samples were excised immediately, rinsed briefly with deionized water, and soaked in 10 ml of deionized water at room temperature for 12 h. The conductivity (C1) was then measured with a conductivity bridge (DDS-307A, LEX Instruments Co., Ltd., China). Then the solution was boiled for 30 min, and the conductivity (C2) was further recorded after cooling. REC was calculated as REC (%) = C1/C2 × 100.
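The REC calculation is a simple ratio of the two conductivity readings. A minimal sketch follows; the function name and the example readings are hypothetical.

```python
def relative_electrolyte_conductivity(c1, c2):
    """REC (%) = C1 / C2 * 100, where C1 is the conductivity before boiling
    and C2 is the conductivity after boiling (same units)."""
    if c2 <= 0:
        raise ValueError("C2 must be positive")
    return c1 / c2 * 100


# Illustrative readings, not data from this study.
rec = relative_electrolyte_conductivity(c1=30.0, c2=120.0)
print(f"REC = {rec:.1f}%")
```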

The lipid peroxidation level was determined by quantifying the equivalents of malondialdehyde (MDA). The 2-thiobarbituric acid (TBA) reaction was used in this assay, and the absorbance values of the red adduct at 450, 532, and 600 nm were recorded to calculate the MDA equivalents as described previously (Hodges et al., 1999). All spectrophotometric assessments in this paper were carried out using a UV–Vis spectrophotometer (UV3200, Mapada Instruments Co., Ltd., China).

#### Histochemical Staining and Quantitative Assay of H2O2 and O2 −

Hydrogen peroxide (H2O2) in leaves was visually detected by histochemical staining according to the method of Thordal-Christensen et al. (1997) with minor modifications. The leaves were excised from the plants and immediately submerged in 3,3-diaminobenzidine (DAB) solution (1 mg ml−1, pH 3.8). Then, the leaves were incubated for 12 h under light with a PPFD of 1,200 μmol m−2 s−1 at room temperature, after which the leaves were bleached in 95% (v/v) boiling ethanol for approximately 15 min until the brown spots were clearly visualized. Then, the leaves were carefully transferred to fresh 95% (v/v) ethanol, and pictures were taken after cooling. The H2O2 concentration was determined by measuring the absorbance of the titanium peroxide complex at 410 nm according to the method of Willekens et al. (1997) with minor modifications.

Superoxide anion (O2 − ) was also visually detected according to the method originally described by Jabs et al. (1996). In brief, the leaves were excised from the seedlings and soaked in nitro blue tetrazolium (NBT) solution (1 mg ml−1, pH 6.1). Then, the leaves were incubated at room temperature in the dark for 6 h before they were completely bleached in 95% (v/v) boiling ethanol. After cooling, the leaves were transferred to fresh ethanol, and pictures were taken immediately. The O2 − production rate was also quantified according to the previous method of Elstner and Heupel (1976) by monitoring the nitrite formation from hydroxylamine in the presence of O2 − at an absorbance of 530 nm.

#### Extraction and Analysis of Activity of Antioxidant Enzymes

For the determination of antioxidant enzymes, the leaves were frozen immediately in liquid nitrogen and stored at −80°C prior to analysis. In brief, 0.5 g of frozen leaf samples was ground with 5 ml of ice-cold phosphate buffer (50 mM, pH 7.8) containing 20% (v/v) glycerol, 0.2 mM ethylenediaminetetraacetic acid (EDTA), 5 mM MgCl2, and 1 mM dithiothreitol (DTT). The homogenates were centrifuged at 4°C for 20 min at 12,000 *g*, and the resulting supernatants were then collected for the determination of enzymatic activity. The total protein content was first analyzed using a Coomassie Brilliant Blue reaction at 595 nm following the method of Bradford (1976). Superoxide dismutase (SOD) activity was assessed by determining its ability to inhibit the photochemical reduction of NBT at 560 nm (Stewart and Bewley, 1980). Guaiacol peroxidase (G-POD) activity was assayed using guaiacol as a substrate at 470 nm as originally described by Cakmak and Marschner (1992). Catalase (CAT) activity was assayed based on the oxidation of H2O2 and measured as a decline at 240 nm following the method of Patra et al. (1978). Ascorbate peroxidase (APX) activity was determined based on the oxidation of ascorbate and measured as a decline at 290 nm according to the method of Nakano and Asada (1981).

#### Contents of Total Soluble Sugars, Sucrose, and Free Amino Acids

Oven-dried (15 min at 105°C and then 85°C for 3 days) leaf samples were powdered with a high-speed ball mill (MM400, Retsch GmbH, Haan, Germany) and mixed thoroughly. A total of 0.1 g of the powder was extracted with 8 ml of 80% (v/v) ethanol in a 10-ml plastic tube at 80°C and centrifuged at 3,000 *g* for 30 min. The supernatant was then collected in a 25-ml glass tube. The extraction was then repeated twice, and the same ethanol was added to the glass tube to a final volume of 25 ml. After mixing thoroughly, the extract was used to determine the contents of total soluble sugars, sucrose, and free amino acids. The anthrone method was adopted, and the absorbance at 620 nm was recorded to calculate the total soluble sugar content according to the method of Buysse and Merckx (1993). For the sucrose content, the resorcinol method was used, which was modified by the method of Buysse and Merckx (1993), and the sucrose content was determined colorimetrically at 480 nm. The content of free amino acids was assessed by the ninhydrin reaction at 570 nm according to the method of Moore and Stein (1954).

#### Statistical Analysis

All data collected were statistically analyzed using one-way ANOVA with the SPSS statistical software package (Version 22.0, SPSS Inc., Chicago, IL, USA). Duncan's test (*p* < 0.05) was performed to evaluate the differences among treatments. Principal component analysis (PCA) was carried out according to the method of Sun et al. (2018). Each treatment value is the average of three independent biological replicates unless otherwise stated.
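For readers unfamiliar with one-way ANOVA, the sketch below computes the F statistic by hand in plain Python. It is not the SPSS procedure used in this study and does not implement Duncan's test or the p-value; the function name and the example values are hypothetical.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Between-group sum of squares: group sizes times squared mean offsets.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Within-group sum of squares: squared deviations from each group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Three hypothetical treatments with three replicates each (n = 3).
f_stat, df_b, df_w = one_way_anova_f([[1, 2, 3], [2, 3, 4], [3, 4, 5]])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The resulting F value would then be compared against the F distribution with the returned degrees of freedom, which statistical packages handle automatically.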

FIGURE 2 | Effects of Z-3-HAC on (A) relative electrolyte conductivity (REC) and (B) relative water content (RWC) of the third fully expanded leaves in peanut seedlings under salinity stress. The seedlings were primed with distilled water or 200 μM Z-3-HAC twice. After priming, the seedlings were exposed to NaCl stress. At 7 days after the onset of salinity stress treatment, the leaves were excised and the REC and RWC were determined. Bars are the standard deviations (SD) of three independent replicates (*n* = 3). Error bars labeled with different letters indicate significant differences at *p* < 0.05 between treatments according to Duncan's test.

# RESULTS

#### Effects of Exogenous Z-3-HAC on Plant Growth, Relative Electrolyte Conductivity, and Relative Water Content under Salinity Stress

The first objective was to test the effects of exogenous Z-3-HAC on plant growth. The peanut seedlings were primed with distilled water or 200 μM Z-3-HAC. Then the seedlings were exposed to NaCl stress (no NaCl shock occurred). At 7 days after the onset of salinity stress treatment, the Z-3-HAC-treated seedlings showed a clear apical dominance compared to water-treated seedlings under normal growth conditions in both HY20 and HY22 (**Figure 1C**). However, no significant difference in the shoot dry weight and fresh weight was observed between these treatments (**Figures 1A,B**). Exposure of plants to salinity conditions stunted the growth of peanut plants, as indicated by the significant decreases in shoot dry weight and fresh weight by 63.39 and 56.94% in HY20 and 19.18 and 32.34% in HY22, respectively. Strikingly, priming with Z-3-HAC improved the growth of HY20 under salinity conditions, as indicated by the significant increases in shoot dry weight and fresh weight by 55.23 and 64.78%, respectively, compared with the salinity control. In HY22, Z-3-HAC pretreatment also increased the shoot dry weight and fresh weight by 18.28 and 25.48%, respectively, under salinity conditions compared with the salinity control, although the difference was not significant (**Figures 1A,B**).

Consistent with the phenotypic changes of the peanut seedlings, exogenous application of Z-3-HAC had no effect on the relative electrolyte conductivity (REC) and relative water content (RWC) of both HY20 and HY22 under normal growth conditions. As expected, salinity stress significantly increased REC by 247.90 and 128.83% in HY20 and HY22, respectively, while decreasing RWC by 15.14 and 18.28% in HY20 and HY22, respectively, compared with the control (**Figure 2**). Notably, priming with Z-3-HAC decreased REC by 36.15 and 34.52% while increasing RWC by 5.5 and 4.3% under salinity stress in severe saline soil compared with their salinity control in HY20 and HY22, respectively.
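The percentages reported throughout the Results are relative changes with respect to a reference treatment (the control or the salinity control). A minimal sketch of that calculation follows, assuming the conventional definition; the function name and the example means are hypothetical and are not values from this study.

```python
def percent_change(reference, treatment):
    """Relative change of a treatment mean versus a reference mean, in %.

    Positive values are increases; negative values are decreases.
    """
    if reference == 0:
        raise ValueError("reference mean must be non-zero")
    return (treatment - reference) / reference * 100


# Illustrative means only.
print(f"{percent_change(20.0, 30.0):+.2f}%")  # an increase
print(f"{percent_change(80.0, 68.0):+.2f}%")  # a decrease
```

Note that an increase of more than 100%, such as the REC response in HY20, simply means that the treatment mean is more than double the reference mean.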

#### Effects of Exogenous Z-3-HAC on Gas Exchange and Chlorophyll Fluorescence Parameters Under Salinity Stress

Plants treated with only salinity stress displayed significant decreases of 50.00 and 47.64% in the net photosynthetic rate (Pn), significant decreases of 37.14 and 50.13% in the stomatal conductance (Gs), and significant decreases of 52.17 and 45.16% in the transpiration rate (Tr) in HY20 and HY22, respectively (**Figures 3A,C,D**), while exhibiting significant increases in the intercellular CO2 concentration (Ci) by 144.03 and 61.61%, respectively, in HY20 and HY22 compared with the control (**Figure 3B**). In contrast, exogenous Z-3-HAC significantly reversed the deleterious effects of salinity stress, as indicated by an increase of Pn by 72.52% in HY20 and a significant increase of Pn by 28.83% in HY22, an increase of Gs by 31.03% in HY20 and a significant increase of Gs by 61.77% in HY22, a significant increase of Tr by 109.09 and 35.29%, respectively, in HY20 and HY22, and a significant reduction of Ci by 71.39 and 14.38%, respectively, in HY20 and HY22. The application of exogenous Z-3-HAC alone did not affect Pn, Gs, or Tr in either genotype, whereas Ci was significantly increased by 16.98 and 20.71% in HY20 and HY22, respectively.

Exogenous Z-3-HAC had no significant effects on the maximal photochemical efficiency of photosystem II (PSII) (Fv/Fm) in both genotypes. Salinity stress significantly decreased Fv/Fm by 86.57 and 14.46% in HY20 and HY22, respectively. Again, Fv/Fm was significantly increased by 59.72% in HY20 and 7.85% in HY22 when the seedlings were primed with Z-3-HAC under salinity stress (**Figure 4A**). Fv/Fm status in different treatments was indicated by pseudo color images of the leaves. Similarly, the other chlorophyll fluorescence parameters, such as the photochemical activity of PSII (Fv′/Fm′), the non-photochemical quenching (*NPQ*), and the quantum efficiency of PSII photochemistry (ΦPSII), displayed similar changes compared with Fv/Fm with a few exceptions where Z-3-HAC failed to increase *NPQ* and ΦPSII under salinity stress in HY20 (**Figures 4B,C,E**). The leaf chlorophyll content was significantly decreased by 44.84% in HY20 and 39.00% in HY22 under salinity conditions. In contrast, the application of Z-3-HAC showed an insignificant increase in the chlorophyll content by 35.85% in HY20 and 16.78% in HY22 following exposure to salt treatment (**Figure 4D**).

#### Effects of Exogenous Z-3-HAC on ROS Accumulation and Lipid Peroxidation Under Salinity Stress

The accumulation of two representative reactive oxygen species (ROS), H2O2 and O2 − , was detected using histochemical staining methods. H2O2 and O2 − accumulated slightly following the application of Z-3-HAC under normal conditions. The accumulation of H2O2 and O2 − was induced to higher levels under salinity stress but was largely reduced by the exogenous Z-3-HAC in HY20 and HY22 (**Figures 5A,B**). In keeping with this result, the quantitative data further demonstrated that both H2O2 and O2 − were significantly induced by Z-3-HAC and salinity stress in HY20 and HY22. The exogenous application of Z-3-HAC significantly reduced H2O2 by 11.18% in HY20 and 27.65% in HY22 and significantly reduced O2 − by 31.20% in HY20 and 13.10% in HY22 under salinity conditions (**Figures 5C,E**). It is worth noting that the accumulation of H2O2 and O2 − was more pronounced in the salt-sensitive genotype HY20 than in the salt-tolerant genotype HY22.

The lipid peroxidation of peanut seedlings was examined according to the accumulation of MDA. Salinity stress significantly increased MDA content by 73.18% in HY20 and 70.32% in HY22. In line with the effect of Z-3-HAC on ROS accumulation, exogenous Z-3-HAC significantly reduced MDA content by 30.39% in HY20 and insignificantly reduced MDA content by 17.51% in HY22 under salinity conditions (**Figure 5D**). In contrast, priming with Z-3-HAC alone did not affect the MDA content in HY20 but significantly increased the MDA content in HY22 by 16.78% compared with the control.

#### Effects of Exogenous Z-3-HAC on Antioxidant Metabolism and Osmolytes Accumulation Under Salinity Stress

Exogenous application of Z-3-HAC significantly increased the activity of superoxide dismutase (SOD) by 18.86% in HY20 and the activity of guaiacol peroxidase (G-POD) by 25.99% in HY20 and 36.45% in HY22 (**Figures 6A,B**). However, the activities of catalase (CAT) and ascorbate peroxidase (APX) were only slightly affected by sole application of Z-3-HAC in both genotypes (**Figures 6C,D**). As outlined above, application of Z-3-HAC significantly inhibited the accumulation of MDA during salinity stress. In keeping with these results, exogenous Z-3-HAC resulted in a significant increase in SOD activity by 10.95% in HY20 and 23.65% in HY22, G-POD activity by 35.20% in HY20 and 57.82% in HY22, CAT activity by 26.64% in HY22, and APX activity by 18.99% in HY20 and 16.12% in HY22 under salinity stress compared to the salt treatment control (**Figure 6**), suggesting that Z-3-HAC-treated seedlings had stronger oxidation resistance under salinity conditions.

Low-molecular weight organic compounds, such as total soluble sugars (TSS), sucrose, and free amino acids (FAA), are major components of plant osmolytes. Both salinity stress and exogenous application of Z-3-HAC significantly increased the concentrations of total soluble sugars, sucrose, and free amino acids in both genotypes. Two exceptions came from the data where salinity stress insignificantly increased the TSS content in HY20, and application of Z-3-HAC failed to increase FAA content in HY22 (**Figure 7**). In particular, the treatment of "Z-3-HAC + NaCl" had significantly higher concentrations of these osmolytes compared to the salt treatment control, where the total soluble sugar content was increased by 33.41 and 27.17%, sucrose content was increased by 35.36 and 27.63%, and free amino acid content was increased by 24.85 and 32.74% in HY20 and HY22, respectively.

#### Effects of Exogenous Z-3-HAC on Root Morphology Under Salinity Stress

To further our understanding of the effects of Z-3-HAC on the underground part of peanut seedlings, the root morphology parameters were determined. Using a dual lens scanning system, we were able to examine the root morphological characteristics of various treatments. From the morphological point of view, salinity stress reduced the total root volume and total root length compared with non-salinity stressed treatments (**Figure 8A**). The quantitative data further demonstrated that exogenous application of Z-3-HAC did not affect the total root volume, total root length, root average diameter, or the total root surface area in both genotypes compared with the non-salinity stressed control, with only one exception where the total root length was significantly decreased by 10.03% in HY22 (**Figure 8C**). For the salt-sensitive genotype HY20, the total root volume, total root length, and total root surface area were significantly decreased by 53.97, 66.90, and 57.44%, respectively, under salinity stress, whereas the magnitude of the reduction was less for the salt-tolerant genotype HY22 than for HY20 (**Figures 8B,C,E**). The application of Z-3-HAC before salinity stress significantly increased the total root volume by 78.37 and 51.11%, significantly increased the total root length by 116.43 and 56.11%, and increased the total root surface area by 53.23 and 81.39% in HY20 and HY22, respectively, compared to the salt treatment control. However, no significant difference was observed between treatments in root average diameter (**Figure 8D**).

#### Principal Component Analysis

A principal component analysis (PCA) integrating all the information of four treatments (including two cultivars, HY20 and HY22) was performed. The two components of PCA collectively explained 84.81% of data variability. The first PC (PC1) accounted for 69.12% of the total qualitative variation and had REC, SOD, FAA, and APX with high positive loadings. The second PC (PC2) accounted for 15.69% of the total qualitative variation and had TSS, G-POD, CAT, and Fv/Fm with high positive loadings (**Figure 9**). TSS, G-POD, CAT, Fv/Fm, FAA, SOD, APX, and REC were located toward the positive end of the PC1 axis in the first quadrant. In conclusion, Fv/Fm and the antioxidant system, including the activities of G-POD, SOD, CAT, and APX, were the most important factors in response to Z-3-HAC under salinity stress according to the plot of PC1, PC2, and the treatments in **Figure 9A**.
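The share of variability attributed to each principal component is its eigenvalue divided by the sum of all eigenvalues, and the cumulative share is a running sum. The sketch below shows this bookkeeping with hypothetical eigenvalues (the eigenvalues of this study are not reported in the text) and then checks that the two reported shares add up to the reported total.

```python
def explained_variance_percent(eigenvalues):
    """Per-component and cumulative explained variance, both in %."""
    total = sum(eigenvalues)
    shares = [ev / total * 100 for ev in eigenvalues]
    cumulative = []
    running = 0.0
    for share in shares:
        running += share
        cumulative.append(running)
    return shares, cumulative


# Hypothetical eigenvalues, sorted from largest to smallest.
shares, cumulative = explained_variance_percent([6.0, 3.0, 1.0])
print(shares, cumulative)

# Consistency check on the values reported in the text: PC1 + PC2.
reported_total = round(69.12 + 15.69, 2)
print(reported_total)
```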

# DISCUSSION

It is well accepted that salinity stress markedly inhibits plant growth and adversely affects crop production (Cheeseman, 1988; Munns and Tester, 2008; Deinlein et al., 2014; Niu et al., 2018). In the past decade, plant growth-regulating substances have been widely adopted by research groups to minimize the pernicious effects of salinity stress on crop species, such as silicon (Zhu et al., 2016), melatonin (Arora and Bhatla, 2017; Chen et al., 2018), and epibrassinolide (Wani et al., 2019). Nevertheless, identifying more effective and eco-friendly plant growth-regulating substances is warranted.

A growing body of literature indicates that GLVs are rapidly emitted by plants after wounding to cope with plant biotic stress (Yan and Wang, 2006; Heil, 2014; Tanaka et al., 2018). For a long time, however, scant information was available on the role that GLVs play in the plant abiotic stress response. Recently, Cofer et al. (2018) reported that priming with physiological concentrations of the GLV Z-3-HAC alleviated cold stress in maize seedlings. In the present study, the ameliorative effect of Z-3-HAC under salinity stress in severe saline soil was further investigated using two peanut genotypes. To the best of our knowledge, this is the first time a pivotal role for Z-3-HAC in the plant salinity stress response has been proposed. This mechanism could be of paramount importance to enhance plant salinity stress tolerance and thereby achieve higher crop productivity.

In previous work, Cofer et al. (2018) reported that priming with Z-3-HAC has a positive effect on maize seedling growth under cold stress. Similarly, the inhibition of growth was clearly relieved by Z-3-HAC application, as indicated by plant dry weight and fresh weight, when the peanut seedlings were exposed to salinity stress in this study (**Figure 1**). Notably, the application of Z-3-HAC alone neither increased nor decreased the growth of the peanut seedlings in the absence of salinity, suggesting that a moderate concentration of Z-3-HAC could help rescue the seedlings from adverse environmental conditions.

Leaf REC and RWC are vital indicators of plant damage under abiotic stress. The REC increased, while RWC decreased, when plants were suffering from salinity stress (Yi et al., 2015; Zarza et al., 2016; Niu et al., 2018). In keeping with these findings, our results indicated that salinity stress led to an increased level of REC and a decline of RWC in both genotypes. Furthermore, exogenous application of Z-3-HAC could help to maintain the integrity of the plant cell plasma membrane, as evidenced by the decreased REC and increased RWC (**Figure 2**). In support of the RWC data, the accumulation of osmolytes, such as soluble sugars and free amino acids, was also observed in the "Z-3-HAC + NaCl" treatment (**Figure 7**). The presence of these substances might also contribute to the higher water content in plant leaves, as previously reported (Koffler et al., 2014; Chen et al., 2018; Wang et al., 2018). Together, these findings point to an important role for Z-3-HAC in the plant salt response.

Next, we aimed to explore the physiological mechanism of Z-3-HAC in greater detail. The gas exchange parameters indicated that Z-3-HAC effectively attenuated the damage to the photosystem caused by salinity stress in peanut seedlings. An increase in Pn was observed in Z-3-HAC-treated leaves when the seedlings were exposed to salinity stress (**Figure 3A**). Nevertheless, a considerably steeper reduction of Ci was detected in the "Z-3-HAC + NaCl" treatment compared with salinity stress alone. Thus, the reverse tendency of change in Ci compared to Gs indicated that stomatal limitations were not the rate-limiting factors of Pn when peanut seedlings were exposed to salinity stress (**Figures 3B,C**). Generally, diffusive (reduction of mesophyll conductance) and metabolic (limitations of photochemistry and related enzymes) processes are involved in nonstomatal limitations (Galmés et al., 2007; Varone et al., 2012). In this paper, the contents of leaf total soluble sugars and sucrose were significantly increased under salinity stress combined with Z-3-HAC treatment (**Figures 7A,B**), making it likely that the enhanced photosynthesis by Z-3-HAC could be attributed to the acceleration of carbon metabolites (Paul and Pellny, 2003). In addition, soluble sugars and sucrose together with free amino acids are major components of the osmoregulation system, which are interdependently associated with plant salt tolerance (Rai, 2002; Wang et al., 2013; Puniran-Hartley et al., 2014; Gao et al., 2019). The accumulation of these osmolytes was observed in Z-3-HAC-treated seedlings which, in principle, could help to decrease the membrane permeability under salinity conditions (**Figure 7**). Consequently, the improvement of photosynthetic performance and osmotic accumulation by Z-3-HAC could further increase the plant dry weight, fresh weight, and plant growth (**Figure 1**), thereby ultimately enhancing salt tolerance in peanut seedlings.

FIGURE 7 | Effects of Z-3-HAC on concentrations of total soluble sugars, sucrose, and free amino acids of the third fully expanded leaves in peanut seedlings under salinity stress. The seedlings were primed with distilled water or 200 μM Z-3-HAC twice. After priming, the seedlings were exposed to NaCl stress. At 7 days after the onset of salinity stress treatment, the leaves were collected, and the concentrations of (A) total soluble sugars, (B) sucrose, and (C) free amino acids were determined. Bars are the standard deviations (SD) of three independent replicates (*n* = 3). Error bars labeled with different letters indicate significant differences at *p* < 0.05 between treatments according to Duncan's test.

Leaf chlorophyll fluorescence is widely considered an important criterion to evaluate potential injury to the photosynthetic apparatus (Xia et al., 2009; Ivanov and Bernards, 2015). The levels of Fv/Fm and Fv′/Fm′ were significantly improved in the "Z-3-HAC + NaCl" treatment, suggesting that Z-3-HAC could reduce the damage to the photosystem under salinity stress in both genotypes (**Figures 4A,B**). Notably, the induction of Fv/Fm led to an increase in ΦPSII and *NPQ* only in HY22 but not in HY20. The change in ΦPSII could be mainly attributed to the increase in Fv′/Fm′, suggesting that Z-3-HAC could help to accommodate both the lower demand for NADPH and the excessive accumulation of ROS (**Figure 4E**). The higher level of *NPQ* in the "Z-3-HAC + NaCl" treatment indicated that Z-3-HAC plays an indispensable role in the dissipation of light energy (**Figure 4C**). The same trend of change in chlorophyll content was observed in both genotypes, indicating that application of Z-3-HAC helps to minimize the effects of salinity stress on peanut photosynthetic pigments (**Figure 4D**). These results help to elucidate the profound role of Z-3-HAC in protecting the photosynthetic apparatus against salinity stress.

The accumulation of ROS has been proven to be a double-edged sword. An accumulating body of evidence documented that the excessive accumulation of ROS could harm the photosystem and plasma membrane, whereas moderate induction of ROS by biotic stress or abiotic stress might be a crucial signal to alert the plants for further response (Neill et al., 2002; Mittler et al., 2004; Miller et al., 2008; Baxter et al., 2014; Qi et al., 2017; Waszczak et al., 2018). We therefore determined the accumulation of two representative ROS, H2O2 and O2 − , using both histochemical staining and chemical quantitative analysis methods. The accumulation of H2O2 and O2 − was detected in both genotypes under salinity conditions, whereas application of Z-3-HAC largely reduced the ROS level. In addition, the reduction of ROS level was accompanied by the lowered MDA content, indicating that Z-3-HAC enhanced the ROS scavenging capacity in peanut leaves (**Figure 5**). Interestingly, H2O2 and O2 − were also observed after the application of Z-3-HAC under normal growth conditions. The MDA content was barely affected in HY20 but was slightly increased in HY22; however, the increase was not sufficient to cause any damage to the seedlings according to the data in this paper. Thus, we deduce that the ROS induced by Z-3-HAC is more likely to be a signal, rather than a harmful substance, in response to salinity stress. In fact, H2O2 induced by plant growth-regulating substances has been frequently reported to be involved in plant abiotic signaling responses (Zhou et al., 2014; Xia et al., 2015; Dietz et al., 2016; Choudhury et al., 2017). Therefore, further research is required to elucidate the detailed mechanisms of Z-3-HAC signal transduction.

It is well accepted that SOD catalyzes the dismutation of the superoxide anion and produces H2O2 (Li et al., 2015). We observed that the salt-sensitive peanut genotype HY20 had higher SOD activity than the salt-tolerant genotype after application of Z-3-HAC under normal growth conditions (**Figure 6A**). In this respect, the greater accumulation of H2O2 in HY20 might be the result of activated SOD. The alleviating effect of exogenous Z-3-HAC on leaf oxidative stress was further confirmed by the enhanced activities of G-POD, CAT, and APX: the "Z-3-HAC + NaCl" treatment showed higher activities of these antioxidant enzymes than the other treatments in both genotypes (**Figures 6B**–**D**). These results are consistent with the ROS data and support the idea that Z-3-HAC could alleviate leaf oxidative stress by modifying the antioxidant system.

To explore the mechanisms underlying the ameliorating effect of Z-3-HAC on salinity stress-induced root growth inhibition, the root morphology was further characterized. As expected, salinity stress suppressed root growth and reduced the total root volume, total root length, and root surface area. However, the average root diameter was barely affected by salinity stress (**Figure 8**). Exogenous application of Z-3-HAC significantly increased the total root volume, total root length, and root surface area in both genotypes compared with the salinity stress treatment alone, providing evidence that the green leaf volatile Z-3-HAC could protect both the aboveground and the underground portions of the seedlings against damage from salinity stress.

In conclusion, our results showed that priming with the green leaf volatile Z-3-HAC attenuated salinity stress-induced photoinhibition and growth inhibition in both salt-sensitive and salt-tolerant peanut seedlings. Exogenous application of Z-3-HAC alleviated the oxidative stress under salinity conditions by enhancing the antioxidant systems, resulting in lower ROS levels compared to the nonprimed seedlings. Additionally, modulation of osmolytes, such as total soluble sugars, sucrose, and free amino acid contents, and modification of root morphology were found to be closely related to the above physiological responses. This study promotes a more comprehensive understanding of the ameliorating functions of green leaf volatiles under salinity stress. Future studies using molecular and proteomic approaches are still required to fully elucidate the role of Z-3-HAC in the plant salinity stress response, as well as the signaling events involved.

#### DATA AVAILABILITY

All datasets generated for this study are included in the manuscript.

#### AUTHOR CONTRIBUTIONS

TS, ST, and RG conceived and designed the experiments. XXZ, XJZ, XY, and YZ performed the experiments and analyzed the data. DC, MW, and YW performed the analyses. TS, ST, and RG contributed to the writing of the manuscript, and performed the final editing of the manuscript.

#### FUNDING

This work was financially supported by the National Key R&D Program of China (2018YFD0201007), the National Natural Science Foundation of China (Project No. 31771732), the China Agriculture Research System (CARS-14), the Shandong Provincial Modern Agriculture Industrial Technology (SDAIT-04-05), and the Opening Foundation of Shandong Provincial Crop Varieties Improvement (2017LZGC003).

#### REFERENCES

Abdel Latef, A. A. H., Mostofa, M. G., Rahman, M. M., Abdel-Farid, I. B., and Tran, L.-S. P. (2019). Extracts from yeast and carrot roots enhance maize performance under seawater-induced salt stress by altering physio-biochemical characteristics of stressed plants. *J. Plant Growth Regul.* doi: 10.1007/s00344-018-9906-8

Ahammed, G. J., Ruan, Y., Zhou, J., Xia, X., Shi, K., Zhou, Y., et al. (2013). Brassinosteroid alleviates polychlorinated biphenyls-induced oxidative stress by enhancing antioxidant enzymes activity in tomato. *Chemosphere* 90, 2645–2653. doi: 10.1016/j.chemosphere.2012.11.041


energy fluxes. *Photosynth. Res.* 79:209. doi: 10.1023/B:PRES.0000015391.99477.0d


by acting on antioxidant system in mustard. *Plant Physiol. Biochem.* 135, 385–394. doi: 10.1016/j.plaphy.2019.01.002


trigger metabolic and transcriptional reprogramming and promote salt stress tolerance. *Plant Cell Environ.* 40, 527–542. doi: 10.1111/pce.12714


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Tian, Guo, Zou, Zhang, Yu, Zhan, Ci, Wang, Wang and Si. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

fpls-10-00819 June 21, 2019 Time: 16:38 # 1

# Engineered Male Sterility by Early Anther Ablation Using the Pea Anther-Specific Promoter PsEND1

Edelín Roque, Concepción Gómez-Mena, Rim Hamza, José Pío Beltrán\* and Luis A. Cañas\*

Department of Plant Development and Hormone Action, Biology and Biotechnology of Reproductive Development, Instituto de Biología Molecular y Celular de Plantas, CSIC-UPV, Valencia, Spain

#### Edited by:

Jose C. Jimenez-Lopez, Consejo Superior de Investigaciones Científicas (CSIC), Spain

#### Reviewed by:

Juan De Dios Alché, Spanish National Research Council (CSIC), Spain Shuping Wang, Yangtze University, China Vijay Abarao Dalvi, Maharashtra Hybrid Seeds Company Private Limited, Aurangabad, India

\*Correspondence:

José Pío Beltrán jbeltran@ibmcp.upv.es Luis A. Cañas lcanas@ibmcp.upv.es

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 21 March 2019 Accepted: 06 June 2019 Published: 25 June 2019

#### Citation:

Roque E, Gómez-Mena C, Hamza R, Beltrán JP and Cañas LA (2019) Engineered Male Sterility by Early Anther Ablation Using the Pea Anther-Specific Promoter PsEND1. Front. Plant Sci. 10:819. doi: 10.3389/fpls.2019.00819

Genetically engineered male sterility has different applications, ranging from hybrid seed production to bioconfinement of transgenes in genetically modified crops. The impact of this technology is currently evident in a wide range of crops, including legumes, which has helped to deal with the challenges of global food security. Production of engineered male sterile plants by expression of a ribonuclease gene under the control of an anther- or pollen-specific promoter has proven to be an efficient way to generate pollen-free elite cultivars. In recent years, we have been studying the genetic control of flower development in legumes, and several genes that are specifically expressed in a determinate floral organ were identified. Pisum sativum ENDOTHECIUM 1 (PsEND1) is a pea anther-specific gene displaying very early expression in the anther primordium cells. This expression pattern has been assessed in both model plants and crops (tomato, tobacco, oilseed rape, rice, wheat) using genetic constructs carrying the PsEND1 promoter fused to the uidA reporter gene. This promoter fused to the barnase gene produces full anther ablation at early developmental stages, preventing the production of mature pollen grains in all plant species tested. Additional effects produced by the early anther ablation in the PsEND1::barnase-barstar plants, with interesting biotechnological applications, have also been described, such as redirection of resources to increase vegetative growth, reduction of the need for deadheading to extend the flowering period, or elimination of pollen allergens in ornamental plants (Kalanchoe, Pelargonium). Moreover, early anther ablation in transgenic PsEND1::barnase-barstar tomato plants promotes the development of the ovaries into parthenocarpic fruits due to the absence of signals generated during the fertilization process and can be considered an efficient tool to promote fruit set and to produce seedless fruits. In legumes, the production of new hybrid cultivars will contribute to enhancing yield and productivity by exploiting the hybrid vigor generated. The PsEND1::barnase-barstar construct could also be useful to generate parental lines in hybrid breeding approaches to produce new cultivars in different legume species.

Keywords: barnase, hybrid seeds, male sterility, parthenocarpy, Pisum sativum, pollen allergens, PsEND1 promoter, transgene bioconfinement

## INTRODUCTION

Male sterility has been used by plant breeders to realize breakthroughs in the yield of different crops through the development of hybrid cultivars. The impact of this technology is currently evident in some crops, including legumes (Saxena and Hingane, 2015), which has helped to deal with the challenges of global food security. Genes that are specifically expressed in the male reproductive organs could be used to obtain genetically engineered male sterile plants with potential applications in the production of hybrid seed, the elimination of pollen allergens, or the avoidance of undesirable horizontal gene transfer in genetically modified (GM) crops.

Genetic cell ablation has been previously used to investigate male gametogenesis and as a biotechnological tool to generate engineered male sterile plants using anther- or pollen-specific promoters fused to a cytotoxic gene (Koltunow et al., 1990; Mariani et al., 1990, 1992; Nasrallah et al., 1991; Paul et al., 1992; Dennis et al., 1993; Hird et al., 1993; Roberts et al., 1995; Zhan et al., 1996; Beals and Goldberg, 1997; De Block et al., 1997; Rosellini et al., 2001; Lee et al., 2003; Huang et al., 2016; Millwood et al., 2016; Yue et al., 2017). Production of engineered male sterile plants by expression of the ribonuclease barnase gene (Hartley, 1988), under the control of anther- or pollen-specific gene promoters, has proven to be a good approach to generate pollen-free elite cultivars without adversely affecting the respective phenotypes (reviewed in Dutt et al., 2014; Mishra and Kumari, 2018). Moreover, male fertility can be restored in plants showing barnase-induced sterility by crossing with a transgenic line harboring the barstar gene, which encodes a powerful inhibitor of barnase (Mariani et al., 1992).

Genetic and molecular studies have revealed several important regulators of anther developmental processes such as tapetum function, anther cell differentiation, and microspore development (Ma, 2005). Unfortunately, most of these genes are also expressed in other floral or vegetative organs (Schiefthaler et al., 1999; Yang et al., 1999; Canales et al., 2002; Nonomura et al., 2003). In contrast, Pisum sativum ENDOTHECIUM 1 (PsEND1) is a pea anther-specific gene displaying very early expression in the anther primordium and throughout anther development. The expression of this gene was not detected in other floral organs or vegetative tissues (Gómez et al., 2004). Therefore, owing to its specific temporal and spatial expression pattern, the PsEND1 promoter was considered a useful tool to produce male sterile plants (Roque et al., 2007).

## PsEND1, AN EARLY-EXPRESSED ANTHER-SPECIFIC GENE OF UNKNOWN FUNCTION

The PsEND1 protein was identified by our group several years ago following an immunosubtractive approach (Cañas et al., 2002). We were able to produce a series of monoclonal antibodies that specifically recognize proteins present only in a determinate floral organ. One of these antibodies recognized a protein of 25.7 kDa that was detected in stamen extracts but not in the other floral organs, seeds, or vegetative tissues. The PsEND1-sequenced peptide showed 79.3% identity with the N-terminus of the pea albumin PA2 (M17147; UniProtKB-P08688), which is only detected in the cytosol of cotyledonary cells (Harris and Croy, 1985; Higgins et al., 1987; Vigeoles et al., 2008). The similarity between the PsEND1 and PA2 proteins was very useful for isolating the PsEND1 gene (GenBank AY091466) (Gómez et al., 2004).

The anther-specific expression of PsEND1 was established by means of Northern blot and RNA in situ hybridization analyses (Gómez et al., 2004). The PsEND1 expression pattern throughout stamen development demonstrated that this gene is active in the anthers from very early stages until 1 day before anthesis (d-1). In situ hybridization assays showed that PsEND1 expression begins in the stamen primordium, at the moment when the common primordia (Benlloch et al., 2003) differentiate into petal and stamen primordia (**Figure 1A**). At late stages, PsEND1 expression was detected in the epidermis, connective, middle layer, and endothecium, but not in the tapetum or microspores (**Figures 1B–D**). The PsEND1 protein was detected by immunolocalization in the same anther tissues (**Figure 1E**) and localized in the cytosol (Gómez et al., 2004). Due to the lack of efficient protocols for pea transformation, the function of PsEND1 remains unknown. The PsEND1 protein shows four copies of a hemopexin-type conserved repeat (Beltrán et al., 2007). Therefore, PsEND1 is structurally related to a group of mammalian regulatory proteins that includes vitronectin (Jenne, 1991). The biological function of PA2 is still unclear because it does not present the classic features of a storage protein: PA2 lacks a signal peptide and is not degraded during germination (Higgins et al., 1987). PA2 could play a role in controlling biological processes as a regulatory protein, dependent on ligand availability (Pedroche et al., 2005; Vigeoles et al., 2008).

## THE PEA PsEND1 PROMOTER IS FUNCTIONAL IN A WIDE NUMBER OF DICOT AND MONOCOT SPECIES

The specific and early expression pattern of PsEND1 suggested that the isolation of its promoter region would be of significant relevance to produce engineered male sterility. After screening a pea genomic DNA library and sequencing, a fragment of 2,946 bp was subcloned (GenBank AY324651). To assess whether the isolated PsEND1 promoter sequence can specifically direct the expression of a foreign gene to the anthers of plants other than pea, we transformed Arabidopsis, tobacco, oilseed rape, and tomato plants with a 2,731 bp fragment of the promoter sequence fused to the coding sequence of the uidA reporter gene (Gómez et al., 2004). The expression of the reporter gene was subsequently observed by histochemical analyses of GUS activity in seedlings, stems, leaves, roots, and flowers of kanamycin-resistant plants. Our results showed that the PsEND1 promoter sequence was fully functional in all plant species tested. GUS activity was only observed in anthers, in the same tissues as in pea, from very early stages of development to dehiscence (**Figures 1F–K**).

FIGURE 1 | PsEND1 expression in pea and other plant species. (A) RNA in situ hybridization in sections of two pea floral buds using digoxigenin-labeled antisense PsEND1 RNA probes. Purple color indicates the localization of PsEND1 expression. No expression was detected in the common primordia (CP) to petals and stamens (white arrows). The expression of PsEND1 begins to be detected in the stamen primordia (St) of floral buds at day 12 before anthesis (d-12). (B) In flowers at d-10, PsEND1 expression is detected in the upper part of the stamen primordia where the anther locules will develop. (C) In flowers at d-8, PsEND1 expression is only detected in those tissues that will be involved in anther architecture both in antesepalous and antepetalous stamens (Sts, Stp). (D) Close-up view of a flower at d-6 showing anthers with strong hybridization signal in the epidermis (Ep), endothecium (En), middle layer, and connective (Co). No expression was detected in the anther filament (F), tapetum (Tp), and microspores (M). (E) Immunolocalization (anti-IgG-FITC) of the PsEND1 protein in paraffin sections of a pea stamen. The protein is localized (green fluorescence) in the same anther tissues as the RNA. (F) PsEND1::uidA expression in transgenic Arabidopsis thaliana flowers. GUS activity (blue) was only detected in the anther but not in the filament. (G) Transgenic A. thaliana anther section showing GUS activity in the structural tissues of the pollen sacs. (H) Young PsEND1::uidA Nicotiana tabacum flower showing GUS activity only in the stamen (St) primordia. (I) Transgenic N. tabacum anther showing GUS activity in the structural tissues of the pollen sacs but not in the pollen grains or tapetum. (J) Solanum lycopersicum PsEND1::uidA flower showing specific GUS activity in the anthers. (K) Transgenic tomato flower section showing GUS activity in the tissues involved in the architecture of the pollen sacs but not in the tapetum or in the pollen grains. (L) Expression of the PsEND1::uidA construct in the anthers of an Oryza sativa floret. (M) Section of a rice floret showing GUS activity in the expected anther tissues. (N) Expression of the uidA gene in the anthers of transgenic Triticum aestivum plants carrying the PsEND1::uidA construct. (O) Mature pollen adhering to the stigma showing GUS activity in a transgenic wheat flower. (P) Close-up view of a germinating pollen grain, with pollen tube (arrows) growing in the style. Ca, carpel; Co, connective; En, endothecium; Ep, epidermis; Pe, petals; Po, pollen; Se, sepals; St, stamens; Tp, tapetum. Scale bars represent 100 µm in A, B, C, D, E, G, I, K, and M; 2.0 mm in F, H, J, and L; 0.5 mm in N; and 200 µm in O and P. Adapted from Gómez et al. (2004), Roque et al. (2007), Beltrán et al. (2007), and Pistón et al. (2008).

We have also assayed the PsEND1::uidA construct in two monocots: rice and wheat (Beltrán et al., 2007; Pistón et al., 2008). In transgenic rice (Oryza sativa) carrying this construct, GUS activity was detected in the same anther tissues in which PsEND1 expression has been previously described and, additionally, in the floret receptacle (**Figures 1L,M**). In transgenic wheat (Triticum aestivum) lines, GUS activity was first observed during pollen development, in the microspores at the binucleate stage. uidA gene expression was also detected in mature pollen grains after anthesis. After pollen grain germination, uidA expression was seen from early (stigma attachment) to advanced stages (style progression) of pollen tube development (**Figures 1N–P**). No further GUS activity was detected after fertilization or during seed development (Pistón et al., 2008).

## ENGINEERED MALE STERILITY IN MODEL AND CROP PLANTS USING THE PsEND1::BARNASE-BARSTAR SYSTEM

A chimeric construct was generated joining the 2,731 bp fragment of the PsEND1 promoter sequence to the barnase gene, which encodes a non-specific and very active ribonuclease. To prevent the undesirable effects of a possible ectopic expression of this gene, Gardner et al. (2009) proposed its use in combination with the barstar gene, thus protecting against the inappropriate expression of this active ribonuclease.

The PsEND1::barnase-barstar chimeric construct provided efficient male sterility by early anther ablation in two Brassicaceae: Arabidopsis thaliana and Brassica napus (Roque et al., 2007). A. thaliana plants were transformed by floral dip. The anther development was arrested in the transgenic plants at early stages and hook-shaped structures at the end of a short filament were formed instead of normal locules (**Figures 2A,B**). The formation of short filaments is commonly associated with male sterility or reduced fertility as a consequence of incomplete anther development (Mariani et al., 1990). All the transgenic lines obtained failed to produce siliques and seeds. Transgenic Arabidopsis plants harboring only the PsEND1::barstar chimeric gene were also generated to check the reversibility of the system to restore fertility. After crossing with the male sterile plants previously generated, fertile plants showing restored anthers were obtained (Roque et al., 2007).

New hybrid plant varieties with increased yield have been obtained by breeders in the last decades. Hybridization of self-pollinating crops (e.g., oilseed rape and tomato) has been performed traditionally by manual emasculation followed by fertilization with pollen of the selected donor. Nevertheless, this practice is a tedious and time-consuming process and full sterility is not guaranteed. Therefore, engineered male sterility is a suitable alternative to prevent self-pollination in both plant species.

Oilseed rape is 30% allogamous and 70% autogamous; therefore, producing hybrid lines requires an efficient system for the control of pollination. For this purpose, we genetically transformed B. napus cv. Drakkar plants with the PsEND1::barnase-barstar construct to find out whether the pea PsEND1 promoter could be functional and produce male sterility in a distantly related crop (Roque et al., 2007). Primary transformants showed collapsed anthers with short filaments (**Figures 2C,D**). The absence of pollen grains in the transgenic locules was confirmed by light microscopy. The unpollinated transgenic carpels did not produce fruits or seeds, while the carpels of untransformed control plants were fertilized and formed normal fruits and seeds.

The PsEND1::barnase-barstar construct also resulted in efficient male sterility in two Solanaceae: Nicotiana tabacum and Solanum lycopersicum (Roque et al., 2007). In transgenic N. tabacum plants, the flowers presented collapsed anthers (arrowhead shape) with no pollen grains at the end of a short filament (**Figures 2E,F**).

Tomato is grown all over the world, and different systems have been developed to generate male sterility in this crop. However, these systems are of limited use at the commercial level because pure male sterile lines are difficult to maintain. The PsEND1::barnase-barstar construct also showed high efficiency in the generation of male sterile lines of two tomato cultivars: Micro-Tom and Moneymaker (Roque et al., 2007; Medina et al., 2013). In comparison with the non-transformed control plants, the flowers were male sterile, showing collapsed anthers with necrotic tissues and without pollen grains (**Figures 2G,H**). Unlike in the wild-type flowers, the carpel was not covered by the anthers forming the staminal cone. The ploidy level of all the tomato transgenic lines obtained was checked and only the diploid ones were retained to avoid misleading results. Backcrosses of all the transformed lines using pollen from wild-type (WT) plants produced normal tomato plants bearing fruits with seeds, indicating that female fertility was not affected in the transgenic PsEND1::barnase-barstar plants. Segregation analyses indicated that, in the next generation, the inheritance and stability of the incorporated transgenes were fully conserved.

FIGURE 2 | Engineered anther ablation in model plants and crops. Red box (A. thaliana). (A) Left: wild-type (WT) A. thaliana flower showing normal anthers (arrow). Center and right: WT A. thaliana stamen observed by scanning electron microscopy (SEM). The black arrow indicates the cell types (toothed edges) present in the anther epidermis and the white one those of the filament (lengthened). (B) Left: transgenic A. thaliana PsEND1::barnase-barstar flower (two sepals and two petals were detached). Anther ablation is evident and no pollen sacs were formed (white arrows). The anther filament is short because it does not undergo the lengthening process. Center and right: PsEND1::barnase-barstar stamen observed by SEM. The hook-shaped structures (white arrows) shown are cellular types usually present in the filament but not those present in the epidermis of WT pollen sacs. Green box (B. napus). (C) Left: WT oilseed rape (Brassica napus) cv. Drakkar flower showing normal stamens. Right: the same, but with detached sepals and petals to observe the normal anthers and filaments (white arrow). (D) Left: male sterile flower of a PsEND1::barnase-barstar oilseed rape plant showing the absence of developed stamens. Right: the same, but with detached sepals and petals to show the ablated anthers and the reduction of the filament length (white arrow). Blue box (N. tabacum). (E) Left: WT tobacco (N. tabacum) cv. Petite Havana SR1 flower after anthesis showing normal anthers and full-length filaments. Center: WT tobacco anther with its characteristic four locules fully developed observed by SEM. Right: section of a WT tobacco pollen sac showing mature pollen grains. (F) Left: PsEND1::barnase-barstar tobacco flower after anthesis showing collapsed lobes and reduced filaments. Center: tobacco PsEND1::barnase-barstar anther showing an arrowhead shape with collapsed locules and increased number of trichomes (white arrow). Right: section of a PsEND1::barnase-barstar pollen sac; no pollen grains can be observed in the collapsed locules. Orange box (S. lycopersicum). (G) Left: WT tomato (S. lycopersicum) cv. Micro-Tom flower at anthesis showing the staminal cone (black arrow) formed by the fully developed stamens in the center. Right: isolated WT staminal cone covering the carpel. (H) Left: tomato PsEND1::barnase-barstar flower at anthesis. Right: anther ablation in the PsEND1::barnase-barstar flowers made visible the style and ovary of the carpel. (I) Flowers from a Kalanchoe blossfeldiana cv. "Tenorio" WT plant (center) and two male sterile lines (left and right) 1 day prior to anthesis. The WT flowers show anthers with fully developed locules, whereas the transgenic ones show collapsed structures at the end of a short filament instead of a four-lobed anther (black arrows). (J) Flowers from a K. blossfeldiana cv. "Hillary" WT plant (left) and a male sterile line (right) 1 day prior to anthesis with ablated anthers (white arrow). (K) WT anther from a "Tenorio" plant showing the normal four-lobed shape. (L) Close-up view of a PsEND1::barnase-barstar "Tenorio" ablated anther with a short filament. (M) Close-up view of a PsEND1::barnase-barstar "Hillary" ablated anther showing necrotic tissues and a short filament. (N) Pelargonium zonale stamens from WT flowers 1 day prior to anthesis showing fully developed locules and filaments. (O) P. zonale transgenic PsEND1::barnase-barstar stamens showing collapsed and necrotic anthers at the end of a short filament instead of a normal four-lobed anther with a fully expanded filament. (P) A. thaliana WT plant (left) showing fruits (siliques, white arrow) after flower fertilization compared with a transgenic male sterile PsEND1::barnase-barstar plant showing more branches and flowers and the absence of siliques (right). (Q) Comparative panel showing how the flowering branches of WT tobacco plants were fertilized normally and produced capsules (left arrowhead), while the branches of transgenic plants do not show the formation of capsules and continue growing to produce more unfertilized flowers, which finally senesce (right arrowhead). (R) WT Micro-Tom tomato fruit showing the presence of seeds (upper arrowhead) compared with a seedless PsEND1::barnase-barstar parthenocarpic fruit (bottom arrowhead). Scale bars represent 2.0 mm in A and B; 100 µm in A and B center; 200 µm in A and B right; 0.5 cm in C, D, E, F, G, and H; 0.2 cm in I, J, N, and O; 400 µm in K, L, and M; 5.0 cm in P and Q; and 1.0 cm in R. Adapted from Roque et al. (2007), García-Sogo et al. (2010, 2012), Beltrán et al. (2007), and Medina et al. (2013).

## GENERATION OF NON-ALLERGENIC POLLEN-FREE ORNAMENTAL PLANTS USING THE PsEND1::BARNASE SYSTEM

In the last decades, conventional breeding has been extensively used to introduce commercially interesting traits into different ornamental plants. At present, genetic engineering allows specific modifications of single traits, with potential interest for consumers, in already successful commercial varieties. Allergic responses to the pollen of several ornamental species have high incidence in the general atopic population and especially among gardeners and flower growers (Goldberg et al., 1998).

The PsEND1::barnase-barstar construct has been assayed in two of the most widely grown flowering plants in Europe: Kalanchoe and Pelargonium. In recent years, different traits of interest have been introduced into Kalanchoe blossfeldiana by genetic engineering, leading to the generation of dwarf genotypes, new floral colors, more compact phenotypes, root inducing (Ri)-lines, reduced sensitivity to ethylene, and marker-free transgenic varieties (Christensen et al., 2008; Sanikhani et al., 2008; Topp et al., 2008; Thirukkumaran et al., 2009). Therefore, implementing a reliable and efficient male sterility system in this ornamental species would be of interest to avoid allergic responses in potential consumers and to produce environmentally friendly plants by preventing gene flow between the existing genetically modified cultivars and related species.

Transgenic lines of two K. blossfeldiana cultivars ("Hillary" and "Tenorio") carrying the construct PsEND1::barnase-barstar were generated from leaf explants (García-Sogo et al., 2010). Transgenic "Tenorio" and "Hillary" stamens, compared with the non-transformed ones, showed dramatic differences in development (**Figures 2I–M**). In WT flowers at 1 day prior to anthesis, the anthers showed four locules with viable pollen grains (**Figure 2K**), while in the transgenic ones the anthers were replaced by collapsed and necrotic structures without pollen grains located at the end of a short filament (**Figures 2L,M**).

Similarly, engineered PsEND1::barnase-barstar Pelargonium zonale and P. peltatum (García-Sogo et al., 2012) male sterile flowers showed collapsed and necrotic anthers without pollen grains at the end of a short filament (**Figures 2N,O**). Cross-pollination of the Kalanchoe and Pelargonium male sterile lines using WT pollen resulted in the production of normal fruits and seeds, indicating that female fertility was not affected in the transgenic plants. Segregation studies in both transgenic plants indicated that the inheritance and stability of the transgenes were maintained in the progeny.
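
As an illustration of how a segregation result of this kind is commonly evaluated, the following minimal sketch applies a chi-square goodness-of-fit test to progeny counts. Everything in it is an assumption made for illustration: the counts are hypothetical, the function names are not taken from the cited studies, and the 1:1 expectation presumes a single hemizygous transgene insertion in a male sterile plant pollinated with WT pollen. The cited studies do not state which test or ratio was used.

```python
# Critical chi-square value for 1 degree of freedom at P = 0.05.
CHI2_CRITICAL_DF1_P05 = 3.841


def chi_square_gof(observed, expected_ratio):
    """Chi-square statistic for observed class counts against a Mendelian ratio."""
    if len(observed) != len(expected_ratio):
        raise ValueError("observed and expected_ratio must have the same length")
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    # Expected counts are the total progeny split according to the ratio.
    expected = [total * part / ratio_sum for part in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def fits_two_class_ratio(observed, expected_ratio):
    """True when two-class counts do not deviate significantly (P > 0.05)."""
    if len(observed) != 2:
        raise ValueError("this helper only handles two phenotypic classes (df = 1)")
    return chi_square_gof(observed, expected_ratio) < CHI2_CRITICAL_DF1_P05


# Hypothetical progeny: 52 transgenic (male sterile) and 48 non-transgenic plants.
counts = [52, 48]
print(chi_square_gof(counts, [1, 1]), fits_two_class_ratio(counts, [1, 1]))
```

A statistic below the critical value means the observed counts are compatible with the tested ratio, which is what stable inheritance of a single transgene locus would predict under the stated assumptions.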

## SIDE EFFECTS IN THE PsEND1::BARNASE-BARSTAR MALE STERILE PLANTS WITH INTERESTING BIOTECHNOLOGICAL APPLICATIONS

We have observed some side effects in different PsEND1::barnase-barstar male sterile plants that could be exploited from a biotechnological point of view. These include increased plant longevity, branching, and number of flowers, which suggest the redirection of resources usually allocated to the production of fruits and seeds. These phenomena could be explained in terms of sink demand: engineered male sterile plants use the sucrose that has not been consumed in the formation of fruits and seeds to produce more branches and flowers, and the final consequence is the prolongation of the plant's life (Beltrán et al., 2007).

In the PsEND1::barnase-barstar male sterile plants of Arabidopsis and tobacco we have observed increased branching and flower number, leading to a drastic change in plant architecture (Beltrán et al., 2007). The Arabidopsis control plants begin the senescence process after production of fruits. However, male sterile plants do not develop siliques but produce more branches of first, second, third, and fourth order in the axillary nodes of both rosette and cauline leaves. These branches also develop more flowers than WT plants (**Figure 2P**). Similarly, tobacco PsEND1::barnase-barstar plants continue producing flowers after WT plants have finished producing capsules and begun the senescence process (**Figure 2Q**). Both Arabidopsis and tobacco male sterile plants showed increased plant longevity, producing more branches and flowers. Therefore, our engineered male sterility system could be useful to reduce the need for deadheading to extend the flowering period or the invasive potential of some ornamental species, and also to increase biomass production in forest trees (Jain and Minocha, 2000).

In the engineered male sterile tomato plants, we have observed the production of seedless parthenocarpic fruits as a consequence of the early anther ablation (Roque et al., 2007; Medina et al., 2013; Rojas-Gracia et al., 2017). Tomato fruit set and development are strongly affected by changes in the environmental conditions; thus, autonomous fruit set independent of fertilization is a desirable trait in this crop species. We generated PsEND1::barnase-barstar male sterile transgenic plants producing parthenocarpic fruits in two tomato cultivars: Micro-Tom and Moneymaker. The ovaries of these plants were able to grow in the absence of fertilization and subsequently produce parthenocarpic fruits (**Figure 2R**). In this process, early ablation of the anthers is essential to activate the development of the transgenic ovaries into seedless fruits in the absence of signals produced during pollination and fertilization. PsEND1::barnase-barstar tomato plants of the commercial cultivar Moneymaker showed that the parthenocarpic development of the fruit is not detrimental to fruit quality. Several elite lines were identified and selected for their increased yield and quality performance. In fact, the changes detected in the metabolic profile of the ripe fruits from these lines indicated an improved organoleptic and nutritional quality. In addition, these male sterile plants could be used in hybrid breeding applications as very convenient parental lines. The transgenic lines generated could also be useful tools to investigate the molecular mechanisms accountable for the observed metabolic phenotypes, and also to understand the connection between impaired anther development and parthenocarpy (Medina et al., 2013; Rojas-Gracia et al., 2017, 2019).

#### CONCLUSION AND PERSPECTIVES

Future advances in crop species to produce more feed and food while contributing to sustainable agriculture will require synergy among several research fields, including traditional breeding, crop management, physiology, genetics, and biotechnology (Beltrán and Cañas, 2018). Natural male-sterile mutants have appeared in the germplasm of the most widely used cultivars; however, their economic value was not recognized and they were lost within a few generations. After the concept of heterosis was introduced (Shull, 1908), the benefits of using male sterility in hybrid seed production to increase crop yield were appreciated. Male sterility can result from different failures in microsporogenesis, release of pollen grains, or pollen germination that do not affect the female reproductive system; therefore, male sterile plants can produce viable seeds after manual pollination.

Engineered male sterility can be achieved by using anther- or pollen-specific promoters fused to a ribonuclease gene to produce ablation of specific cell types that are essential for proper anther development. The use of new anther-specific promoters showing very early expression, such as the PsEND1 promoter, could help to produce new high-yielding hybrid cultivars and environmentally friendly GM crops by preventing gene flow between genetically modified plants and compatible species. We have developed a simple and reliable system to produce engineered nuclear male sterile plants using the pea PsEND1 promoter, which specifically directs the expression of the barnase gene to different anther tissues involved in anther architecture in all plant species tested. The PsEND1 promoter is currently used by different research groups in a wide range of plant species to produce male sterility, including forest trees.

In legumes, the development of new hybrid cultivars will help to enhance yield and productivity by exploiting the hybrid vigor generated. Cytoplasmic nuclear male sterility has been widely used by breeders to achieve breakthroughs in the productivity of several crops, including legumes, generating hybrid lines. Among the high-protein legumes, the first high-yielding hybrid of pigeon pea, based on cytoplasmic nuclear male sterility and partial natural outcrossing, was recently released in India with a record 3–4 t/ha of grain yield and a 30–40% yield advantage over 3 years of testing in farmers' fields. Also, under high-input conditions and good management, yields of up to 4,000–5,000 kg/ha have been recorded by farmers (Saxena et al., 2013; Saxena and Hingane, 2015).

The genetically engineered male sterility approach described here, which uses an anther-specific promoter from a legume, provides new opportunities to the breeders for enforcing pollination control in hybrid seed production systems and might help to produce new hybrid cultivars in different legume species.

#### AUTHOR CONTRIBUTIONS

ER, RH, and CG-M performed the experiments. CG-M, JB, and LC conceived the experiments, analyzed the data, and wrote the grants that funded this work. LC wrote the manuscript. All authors read and approved the final version of this manuscript.

# FUNDING

This work was funded by grants BIO2000-0940, BIO2003-01171, BIO2006-09374, PTR95-0979-OP-03-01, RYC-2007-00627, AGL2009-13388-C03-01, AGL2009-07617, BIO2009-08134, AGL2015-64991-C3-3-R, and BIO2016-75485-R from the Spanish Ministry of Economy and Competitiveness (MINECO).

# ACKNOWLEDGMENTS

We acknowledge Springer Nature license for reprinted panels from Gómez et al. (2004) and Pistón et al. (2008) in **Figure 1** and from García-Sogo et al. (2010) in **Figure 2**. We also acknowledge support of the publication fee by the CSIC Open Access Publication Support Initiative through its Unit of Information Resources for Research (URICI).

#### REFERENCES




selection of marker-free transgenic plant of Kalanchoe blossfeldiana. Plant Cell Tissue Organ Cult. 97, 237–242. doi: 10.1007/s11240-009-9519-9


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Roque, Gómez-Mena, Hamza, Beltrán and Cañas. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Research Progress in Membrane Lipid Metabolism and Molecular Mechanism in Peanut Cold Tolerance

He Zhang<sup>1</sup>, Jiale Dong<sup>1</sup>, Xinhua Zhao<sup>1</sup>, Yumei Zhang<sup>2</sup>, Jingyao Ren<sup>1</sup>, Liting Xing<sup>1</sup>, Chunji Jiang<sup>1</sup>, Xiaoguang Wang<sup>1</sup>, Jing Wang<sup>1</sup>, Shuli Zhao<sup>1</sup> and Haiqiu Yu<sup>1</sup>\*

<sup>1</sup> Peanut Research Institute, College of Agronomy, Shenyang Agricultural University, Shenyang, China, <sup>2</sup> College of Agronomy, Qingdao Agricultural University, Qingdao, China

Early sowing has been extensively used in high-latitude areas to avoid drought stress during sowing; however, cold damage has become the key limiting factor of early sowing. To relieve cold stress, plants develop a series of physiological and biochemical changes and sophisticated molecular regulatory mechanisms. The biomembrane is the barrier that protects cells from injury as well as the primary place for sensing cold signals. Chilling tolerance is closely related to the composition, structure, and metabolic process of membrane lipids. This review focuses on membrane lipid metabolism and its molecular mechanism, as well as lipid signal transduction in peanut (Arachis hypogaea L.) under cold stress to build a foundation for explicating lipid metabolism regulation patterns and physiological and molecular response mechanisms during cold stress and to promote the genetic improvement of peanut cold tolerance.

Keywords: peanut, cold stress, membrane lipid metabolism, molecular mechanism, lipid signal transduction

#### Edited by:

Jose C. Jimenez-Lopez, Consejo Superior de Investigaciones Científicas (CSIC) Granada, Spain

#### Reviewed by:

Hassan Iqbal, Xinjiang Institute of Ecology and Geography (CAS), China; Renu Deswal, University of Delhi, India

> \*Correspondence: Haiqiu Yu yuhaiqiu@syau.edu.cn

#### Specialty section:

This article was submitted to Plant Physiology, a section of the journal Frontiers in Plant Science

Received: 11 April 2019 Accepted: 12 June 2019 Published: 27 June 2019

#### Citation:

Zhang H, Dong J, Zhao X, Zhang Y, Ren J, Xing L, Jiang C, Wang X, Wang J, Zhao S and Yu H (2019) Research Progress in Membrane Lipid Metabolism and Molecular Mechanism in Peanut Cold Tolerance. Front. Plant Sci. 10:838. doi: 10.3389/fpls.2019.00838

# INTRODUCTION

Peanut (Arachis hypogaea L.), one of the most important grain legumes as a source of edible oils and proteins, is cultivated in the semi-arid tropical and subtropical regions of the world (Katam et al., 2016). Recent statistics have shown that extreme weather events, particularly drought conditions caused by changes in the global climate and water cycle, have occurred at an increasing frequency and intensity in peanut-producing countries, such as China and India (Yu et al., 2014; Lesk et al., 2016). In recent years, peanut planting areas have rapidly developed in high-latitude areas such as Northeast China. However, these regions are subjected to severe water-deficient conditions and seasonal drought, particularly from early to mid-May, when the area covered by drought has exceeded 30% of the nation's crop (Yang et al., 2018; Zhang et al., 2018; **Figure 1A**). According to the statistics of the Ministry of Water Resources of the People's Republic of China (2018), the annual loss of industrial crops caused by drought in China amounts to 28.22 billion yuan, and peanuts account for about 20% (Li et al., 2014; Aninbon et al., 2016; Qin et al., 2017). Planting spring cultivars earlier is a feasible measure to circumvent spring sowing drought in peanut production, as well as to prolong the vegetative growth period and increase nutrient accumulation for crop propagation (Rana et al., 2017). However, as a thermophilic crop, peanut needs relatively high temperatures throughout the whole development process (Wang et al., 2003). The lowest temperature for peanut germination is 12–15°C, and the peanut plant shows maximum growth at 28°C but experiences severe metabolic perturbations below 12°C (Bell et al., 1994). Sowing spring peanut earlier in Northeast China can therefore have deleterious effects on seed germination. In addition, chilling injury events have frequently occurred in Northeast China in the past few years (Ma et al., 2017; Shen et al., 2019; **Figures 1B,C**), severely influencing peanut growth, development, bloom, and yield (Kakani et al., 2002; Wang et al., 2003; Chen et al., 2014; **Table 1**). Therefore, it is essential to optimize the comprehensive evaluation system of peanut cold tolerance and breed peanut germplasm with cold tolerance in Northeast China.

Research on cold stress in plants has been conducted since the early 1830s, a history of more than 180 years. Breeders have been trying to develop new varieties to resolve the problem of peanut chilling damage and have made certain progress, and a few cold-tolerant early-maturing cultivars with the ability to germinate in cooler soils have been released (Ntare et al., 2001; Gorbet and Shokes, 2002; Upadhyaya et al., 2003, 2006). However, cold tolerance in plants is an intricate quantitative trait that always occurs in combination or in succession and is not controlled by a single regulatory pathway or gene, making conventional breeding approaches for cold tolerance challenging (Kumar et al., 2015; Wang et al., 2017). With the development of biotechnology in agriculture, extensive and in-depth studies on the mechanism of cold tolerance in plants have been conducted at the morphological-anatomical, physiological, biochemical, and molecular levels. Lyons (1973) proposed that chilling damage initially occurs at the cellular and organ levels. The biomembrane system, including the cell membrane, nuclear membrane, and organelle membranes, is the initial site of injury, particularly in terms of its structure, function, stability, and enzyme activity, thereby resulting in substantial metabolic imbalance, especially involving respiration and photosynthesis. These changes in turn affect plant growth and development and eventually incur damage at the whole-plant level, leading to the occurrence of chilling damage. The biomembrane is also the main repository of lipid for peanut plants (Yu, 2008), and fatty acids, the main component of the biomembrane, have been used as the primary index to evaluate peanut quality. Recent studies have further shown that chilling tolerance in peanut is closely correlated with the composition and structure of the membrane lipids, particularly the saturation of membrane fatty acids (Tang, 2011). The complex physiological, biochemical, and molecular mechanisms linking membrane lipid metabolism and cold tolerance are being continuously explored to improve cold tolerance by means of high-throughput gene identification, gene editing, and transgenic technology.

In this review, we summarize the effects of cold stress on membrane lipid metabolism, including permeability, peroxidation, component change, and unsaturation, as well as its molecular mechanism and lipid signal transduction in peanut under cold stress, to lay a foundation for the elucidation of lipid metabolism regulatory patterns and physiological and molecular response mechanisms in cold stress, as well as to promote the genetic improvement of peanut cold tolerance.

# EFFECTS OF COLD STRESS ON MEMBRANE PERMEABILITY

The regulation of biomembrane fluidity is one of the principal mechanisms by which plants accommodate changes in temperature conditions (**Figure 2**), and it is affected by the distribution ratio of various lipids on the membrane and the unsaturation of the glycerol lipid group (Li et al., 2016; Barrero-Sicilia et al., 2017). When peanut plants are subjected to cold stress, membrane lipids change from the liquid crystal state to the gel state (Murata et al., 1982), which can cause the cessation of protoplast flow and an increase in membrane permeability, resulting in electrolyte leakage and loss of balance of intracellular ions (Huang et al., 2015). Chilling injury symptoms, including dehydration, wilt, chlorosis, and accelerated senescence, consequently occur (Upadhyaya et al., 2009). To maintain turgidity and the original metabolic process, various organic and inorganic substances, such as inorganic salt, proline, betaine, soluble sugars, and soluble proteins, accumulate in plant cells via osmotic regulation, which leads to an increase in the concentration of cell fluid and a decrease in osmotic potential (Kishor and Sreenivasulu, 2014). Generally, under cold stress, proline, soluble sugars, and soluble proteins accumulate in the cytosol of both sensitive and tolerant cultivars. Furthermore, the increase in these osmotic regulation substances is larger in varieties with stronger cold tolerance than in cold-sensitive varieties. However, the content of these substances decreases significantly in the cytosol when peanut plants are subjected to unbearable chilling (Bai D.M. et al., 2018; Kazemi-Shahandashti and Maali-Amiri, 2018).

Free proline accumulation is a heritable trait (Hanson et al., 1979) and can be used in screening genotypes for cold tolerance (Kim and Tai, 2011). Δ<sup>1</sup>-Pyrroline-5-carboxylate synthetase (P5CS) is a key enzyme in the glutamate pathway of proline biosynthesis. The overexpression of the P5CS gene in transgenic plants (introgressed with cDNA of the P5CS gene) results in increased cytoprotection and tolerance. The cDNA of the P5CS gene results in high levels of P5CS enzyme and a 10- to 18-fold increase in proline content, which contributes to both cold and drought tolerance through enhanced biomass production (Banavath et al., 2018). With the discovery of cDNA for the P5CS and P5CR genes, future research could be directed at introgression of these genes, and the effect thereof, on yield and quality attributes of peanut under cold stress conditions in Northeast China.

# EFFECTS OF COLD STRESS ON MEMBRANE LIPID PEROXIDATION

Damage to the plant biomembrane system at low temperature is also related to membrane lipid peroxidation and protein destruction induced by reactive oxygen species (ROS) (Grant et al., 2014). Membrane lipid peroxidation refers to a series of free radical reactions on the double bonds of unsaturated fatty acids in the membrane, initiated by the attack of oxygen free radicals (O<sub>2</sub><sup>·−</sup>, H<sub>2</sub>O<sub>2</sub>, ·OH) on unsaturated fatty acids in lipids (Thomas et al., 2016). Under cold stress, ROS are produced abundantly in vivo and accumulate rapidly, far beyond the scavenging ability of the antioxidant system, which breaks down the original equilibrium state of ROS. At this point, ROS begin to attack biological macromolecules, such as membrane lipids, nucleic acids, and proteins. The structure of the membrane system is also destroyed, resulting in a decrease in the photosynthetic rate, development of metabolic disorders, and massive accumulation of toxic substances in plants (Liu et al., 2013; Choudhury et al., 2016; Huang et al., 2016). Malondialdehyde (MDA) is the product of membrane lipid peroxidation and accumulates to higher concentrations in sensitive than in tolerant genotypes (Iqbal et al., 2018a,b; Zhong et al., 2018). Cold-tolerant cultivars can resist external environmental stress by relying on the antioxidant enzyme system, which scavenges ROS and superoxide anion free radicals produced in plant cells and mainly includes superoxide dismutase (SOD), ascorbate peroxidase (APX), catalase (CAT), glutathione peroxidase (GPX), monodehydroascorbate reductase (MDHAR), dehydroascorbate reductase (DHAR), glutathione reductase (GR), and glutathione S-transferase (GST) (Tian et al., 2015).

Antioxidant enzymes are located at different sites in plant cells and work together with ROS-generating pathways to maintain ROS homeostasis (Zhang et al., 2015; Nejat and Mantri, 2017; Iqbal et al., 2019). The transcription factor APETALA2/ethylene response factor (AP2/ERF) plays an important regulatory role in signal transduction of the plant responses to various stresses including low temperature (Sakuma et al., 2002), and it confers cold tolerance by promoting polyamine turnover, antioxidant protection, and proline accumulation. ERF1-overexpressing plants have higher antioxidant activities, which are attributable to higher expression of genes such as Cu, Zn-SOD, CAT1, CAT2, CAT3, and cpAPX, and accumulate more proline, which is associated with induced P5CS and reduced PROX2 transcription compared to the wild-type. These transgenic plants show reduced MDA content and reduced H<sub>2</sub>O<sub>2</sub> and ROS accumulation under cold stress, which contributes to alleviating oxidative damage to the biomembrane after cold stress treatment (Zhuo et al., 2018). To date, a variety of AP2/ERF transcription factors have been successfully identified and investigated in many plants, including Arabidopsis, rice (Nakano et al., 2006), wheat (Zhuang et al., 2011), soybean (Zhang et al., 2008) and rapeseed (Du et al., 2016). The peanut genome has eight ERFs, including AhERF1–6, AhERF008, and AhERF019. However, different expression patterns in relation to responses to abiotic stress have been described. For example, the expression of AhERF4 and AhERF6 is rapidly and substantially enhanced by abiotic stress, whereas the expression of AhERF1 and AhERF5 is only slightly enhanced under certain stress conditions (Chen et al., 2012; Wan et al., 2014). Interestingly, AhDREB1 can improve tolerance to cold stress via the ABA-dependent pathway in Arabidopsis, and histone acetylation can affect the expression of AhDREB1 under osmotic stress conditions, thereby improving plant cold tolerance (Bai H. et al., 2018).

#### TABLE 1 | Effects of cold stress on growth stages of peanut.

the electrolyte leakage caused by membrane bursts open will happen in cold sensitive plants and ultimately lead to cell and tissue death. When plants encounter a gradual cold, membrane lipid is in gel state, the permeability of membrane increases with the prolongation of cold time, resulting in the loss of intracellular water and physiological drought. Meanwhile, the increased activation energy of enzymes bound to membrane leads to metabolic disorder and toxic substance accumulation in plants.

# EFFECTS OF COLD STRESS ON MEMBRANE LIPID COMPONENT

The membrane lipids of peanut plants are mainly composed of phospholipids (PL), which include phosphatidyl choline (PC), phosphatidyl ethanolamine (PE), phosphatidyl inositol (PI), phosphatidyl glycerol (PG), and phosphatidic acid (PA); glycolipids (GLs), which consist of mono-galactose diglyceride (MGDG) and di-galactose diglyceride (DGDG); and a small amount of sulfolipids (SLs) and neutral lipids (NLs), such as cholesterol (Jouhet et al., 2004). The biomembrane is a dynamic equilibrium system that adaptively adjusts its internal composition based on changes in external temperature. Changes in lipid components are closely related to peanut abiotic stress, and the distribution ratio of lipids on the biomembrane of cultivars with different tolerance changes to different degrees under various stresses (Lauriano et al., 2000; Sui et al., 2018). Phospholipid content is positively correlated with cold tolerance in plants, and cold tolerance is weakened when PL synthesis is blocked (Saita et al., 2016). Phosphatidyl glycerol is the main factor determining the membrane lipid phase transition because it contains a large proportion of saturated fatty acids, although it only accounts for 3–5% of thylakoid membrane lipids. The percentage of high-melting point molecules (C16:0/16:0 + C16:0/16:1t + C18:0/16:0 + C18:0/16:1t) in total molecular species, or of saturated fatty acids (C16:0 + C16:1t + C18:0) in total fatty acids, in PG is significantly related to plant cold sensitivity and is higher in cold-sensitive cultivars (Eriksson et al., 2011). MGDG and DGDG are important components of thylakoid membrane lipids, which are closely related to photosynthesis, and their contents also change dynamically at low temperature (Kobayashi, 2016). Lipidomic analysis of maize leaves after cold treatment shows an increase in PA and DGDG but a decrease in PC and MGDG, resulting in enhanced turnover of PC to PA, which serves as a precursor for galactolipid synthesis under low temperature conditions (Gu et al., 2017).
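The PG saturation measure described above can be computed directly from a fatty acid profile. The following is a minimal sketch, assuming the profile is available as amounts per fatty acid (for example mol%); the function name `pg_high_melting_percent` and the two example profiles are hypothetical illustrations and are not data from any of the cited studies.

```python
# Fatty acids counted as "high-melting" in PG, as listed in the text:
# C16:0, trans-C16:1 (written 16:1t), and C18:0.
HIGH_MELTING = {"16:0", "16:1t", "18:0"}


def pg_high_melting_percent(profile):
    """Return the percentage of high-melting fatty acids in a PG profile.

    `profile` maps fatty acid shorthand (e.g. "16:0") to an amount in any
    consistent unit (mol% or nmol); the result is normalized to the total.
    """
    total = sum(profile.values())
    if total <= 0:
        raise ValueError("profile must have a positive total amount")
    high = sum(amount for name, amount in profile.items() if name in HIGH_MELTING)
    return 100.0 * high / total


# Hypothetical PG profiles (mol%), for illustration only.
sensitive = {"16:0": 40.0, "16:1t": 25.0, "18:0": 5.0,
             "18:1": 10.0, "18:2": 12.0, "18:3": 8.0}
tolerant = {"16:0": 25.0, "16:1t": 20.0, "18:0": 3.0,
            "18:1": 12.0, "18:2": 22.0, "18:3": 18.0}

print("sensitive:", pg_high_melting_percent(sensitive))
print("tolerant:", pg_high_melting_percent(tolerant))
```

With these made-up numbers the cold-sensitive profile gives the higher percentage, which matches the direction of the relationship reported in the text; real thresholds would have to come from measured PG profiles.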

Lipid transfer protein (LTP) acts as a carrier for lipid transfer among different cell membranes. Changes in LTP activity can lead to alterations in membrane lipid composition and affect cold tolerance (Sun et al., 2015). Choi and Hwang (2015) reported that BLT101-overexpressing transgenic wheat lines (BLT101ox) under cold stress lose less water and show decreased expression of genes induced by hormones (such as auxin and cytokinin) compared to non-transgenic (NT) plants. After prolonged cold treatment, BLT101ox leaves show normal phenotypes, whereas the NT plants display dehydrated and withered leaves. Non-specific LTPs (nsLTPs), small and abundant basic proteins, are responsible for the intermembrane transport of phospholipids by changing the composition of membrane lipids, participating in the biosynthesis of membranes, and transporting lipids among different organelles (Liu F. et al., 2015). There is also evidence that nsLTP is closely related to stress tolerance (Gangadhar et al., 2016). LTP3 is positively regulated by the transcription factor MYB96, which mediates freezing and drought stress (Guo et al., 2013).

# EFFECTS OF COLD STRESS ON MEMBRANE LIPID UNSATURATION

Plants can modulate the stability and fluidity of the membrane by changing the unsaturation of fatty acids in membrane lipids, which is of great significance for organisms to maintain normal photosynthesis and respiratory metabolism and resist cold stress (Mironov et al., 2012; Karabudak et al., 2014). In general, the content of unsaturated fatty acids in lipid membranes increases with decreasing temperature. In addition, compared to cold-sensitive cultivars, the content and the degree of unsaturation (number of double bonds) of unsaturated fatty acids in lipids are higher in cold-tolerant cultivars (Nejadsadeghi et al., 2015). The biomembrane of chilling-sensitive genotypes undergoes a phase transition from liquid crystal to gel even at room temperature due to high saturation of fatty acids, whereas cold-tolerant genotypes can keep the phase transition temperature lower than the cold treatment temperature, thus avoiding phase transition (Ianutsevich et al., 2016). The main fatty acids in various peanut cultivars are similar, including palmitic acid (16:0), stearic acid (18:0), oleic acid (18:1), linoleic acid (18:2), linolenic acid (18:3), and arachidic acid (20:0). However, their contents vary among cultivars after low temperature treatment, i.e., contents of 18:1, 18:2, and 18:3 rapidly increase, whereas those of 16:0 and 18:0 decrease (Tang, 2011).
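Both the content of unsaturated fatty acids and their degree of unsaturation can be summarized numerically from a fatty acid profile. The sketch below is illustrative only: it assumes a double bond index defined as the mol%-weighted mean number of double bonds per acyl chain, which is one common way of summarizing unsaturation and is not a formula given in this review. The function names and the two example profiles are hypothetical.

```python
def double_bonds(name):
    """Number of double bonds encoded in shorthand such as '18:2'."""
    return int(name.split(":")[1])


def double_bond_index(profile):
    """Weighted mean number of double bonds per acyl chain.

    `profile` maps shorthand names ('16:0', '18:1', ...) to mol%.
    """
    total = sum(profile.values())
    if total <= 0:
        raise ValueError("profile must have a positive total amount")
    weighted = sum(double_bonds(name) * amount for name, amount in profile.items())
    return weighted / total


def unsaturated_percent(profile):
    """Percentage of fatty acids carrying at least one double bond."""
    total = sum(profile.values())
    if total <= 0:
        raise ValueError("profile must have a positive total amount")
    unsat = sum(amount for name, amount in profile.items() if double_bonds(name) > 0)
    return 100.0 * unsat / total


# Hypothetical profiles (mol%) of the six main peanut fatty acids named above.
control = {"16:0": 25, "18:0": 5, "18:1": 30, "18:2": 30, "18:3": 8, "20:0": 2}
cold = {"16:0": 20, "18:0": 3, "18:1": 32, "18:2": 33, "18:3": 10, "20:0": 2}

for label, profile in (("control", control), ("cold", cold)):
    print(label, double_bond_index(profile), unsaturated_percent(profile))
```

In this made-up example the shift from 16:0 and 18:0 toward 18:1, 18:2, and 18:3 raises both numbers, mirroring the trend described for low temperature treatment.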

In plant cells, saturated fatty acids are synthesized by the type II fatty acid synthase system with the aid of an acyl carrier protein (ACP). The biosynthesis of unsaturated fatty acids proceeds through the desaturation of saturated fatty acids and depends on two kinds of acyltransferases, glycerol-3-phosphate acyltransferase (GPAT), which is responsible for acylation at the C-1 position of the glycerol skeleton, and monoacylglycerol-3-phosphate acyltransferase (MGAT), which is responsible for acylation at the C-2 position, as well as on various fatty acid desaturases (FAD) (Klempova et al., 2013). Acyl carrier protein is a small, acidic protein that plays an essential role in fatty acid synthesis by participating in the elongation of fatty acid chains. In peanut, AhACP1, AhmtACP3, AhACP4, and AhACP5 have been identified and have been shown to be closely linked with plant cold tolerance (Wei, 2012; Lei et al., 2014; Chi et al., 2017). The overexpression (OE) and antisense-inhibition (AT) of AhACP1 in transgenic tobacco could alter the content of total lipids and the composition of fatty acids in leaves, leading to a significant increase or decrease in the content of C18:2 and C18:3, so that the plants become more tolerant or more sensitive to cold stress, respectively. It has been suggested that AhACP1 bound to C18:1 might be the specific substrate of oleoyl-ACP thioesterase or GPAT and participate in membrane lipid synthesis (Yurchenko et al., 2014). GPAT is the first acyltransferase in PG biosynthesis and transfers the acyl group to the C-1 position of glycerol 3-phosphate (G-3-P) to synthesize 1-acyl-glycerol-3-phosphoric acid (GPA). Cui et al. (2017) indicated that GPATs from varieties with different chilling tolerance have different selectivities for acyl group substrates, i.e., cold-sensitive genotypes prefer C16:0, whereas chilling-tolerant genotypes have the same selectivity for C16:0 and C18:1. The expression of GPAT under cold stress is closely correlated with cold tolerance (Li et al., 2018). AhGPAT3 and AhGPAT5 are two genes that encode GPAT proteins, which play a prominent role in the synthesis of peanut fatty acids, although their function under cold stress remains unclear (Hao et al., 2018).

# FATTY ACIDS OF MEMBRANE LIPIDS AND GENETIC ENGINEERING OF COLD TOLERANCE

With the development of biotechnology, progress has been made in the genetic engineering of peanut cold tolerance. Several genes related to cold tolerance have been cloned and transferred to plants for functional studies (Cheng et al., 2013; Chen N. et al., 2016; Liu et al., 2016). The molecular regulation of fatty acid desaturation in membrane lipids under cold stress includes regulating the expression of FAD to change the amount of enzyme protein, regulating the activity of FAD at the post-translational level, and changing the available substrates to regulate the activity of FAD (Tovuu et al., 2016; Zhao et al., 2018). The fatty acid unsaturation of membrane lipids is mainly determined by the type and quantity of FAD, but few FADs in peanut have been functionally validated (Chi et al., 2011; **Figure 3**).

ω-3 FAD is considered the rate-limiting enzyme for the biosynthesis of triene fatty acids from diene fatty acids and is mainly responsible for catalyzing the introduction of the third double bond at the ω-3 position. According to differences in subcellular localization, ω-3 FAD in higher plants can be divided into three types: FAD3 in the endoplasmic reticulum and FAD7 and FAD8 in plastids (Zhang et al., 2014). In Arabidopsis, FAD3, FAD7, and FAD8 have been proven to mediate the synthesis of trienoic fatty acids from C18:2 and C16:2, and their expression enhances chilling tolerance (Chen et al., 2015; Roman et al., 2015). Interestingly, the structures of FAD7 and FAD8 are highly similar and they thus have the same functions, while their enzyme activities vary in terms of responses to low temperature. No significant changes in triene fatty acid content were observed in the leaves of the fad7 mutant after cold treatment, and the expression of most FAD7 genes is not affected by low temperature. Conversely, the FAD8 gene is hardly expressed at normal temperature but is induced by low temperature. It follows that FAD7 is involved in plant growth and development under normal temperature, whereas FAD8 participates in the plant response to low temperature (Tang, 2007; Liu et al., 2014). The overexpression of OsFAD8 substantially increases C16:3 and C18:3 content in leaves of transgenic rice lines, which alleviates the damage to plant survival at 2°C for 7 days. In OsFAD8-silenced rice lines, the content of triene fatty acids is reduced by 40.2% compared to that in the wild-type; chilling tolerance decreases, but the plants can further adapt to a high-temperature environment (Nair et al., 2009). Wu et al. (2015) cloned a ω-3 Δ15-fatty acid desaturase gene, AhFAD3A, which participates in the synthesis of α-C18:3, from cotyledons of germinated peanuts; the expression of AhFAD3A is positively correlated with the formation of α-linolenic acid in peanut kernels and may be related to peanut tolerance.

step is catalyzed by phospholipid:diacylglycerol acyltransferase (PDAT). In the latter, glycerol-3-phosphate (G3P) is catalyzed by glycerol-3-phosphate acyltransferase (GPAT), lysophosphatidic acid acyltransferase (LPAAT) and diacylglycerol acyltransferase (DGAT) in turn and added the aliphatic group. The red characters in figure indicate the key genes in the lipid synthesis that have been identified and proved to play a vital role in peanut abiotic stress.

The ω-6 FAD catalyzes the introduction of the second double bond at ω-6 in monoenoic fatty acids to form diene fatty acids, and includes FAD2 in the endoplasmic reticulum and FAD4 and FAD6 in plastids (Park et al., 2016). Most of the known ω-6 FAD genes have multiple copies, and various copies of the same gene in the same plant vary in terms of the coding region, intron, and the length of the 5′UTR and 3′UTR (Cheng et al., 2013). The FAD2 gene family is functionally responsible for the conversion of C18:1 to C18:2 in peanut, and six novel full-length cDNA sequences (AhFAD2-1, -2, -3, -4, -5, and -6) have been identified. In addition, the AhFAD2-1 gene is upregulated in developing seeds of peanut plants compared to the AhFAD2-2 gene, while the AhFAD2-2 gene is expressed most abundantly in the flowers, and they all play a major role in the conversion of oleic to linoleic acid (Wang et al., 2015; Wen et al., 2018). Accumulation of C16:1 and C18:1 in the fad6 mutant of Arabidopsis thaliana results in a decrease in the level of polyunsaturated fatty acids in chloroplast membrane lipids and the number of thylakoids under cold stress.

Acyl-ACP desaturase is the only soluble desaturase family in peanut, including Δ9 stearoyl-ACP desaturase (SAD) and Δ6 palmitoyl-ACP desaturase (PAD). The SAD catalyzes the conversion of stearoyl-ACP to oleoyl-ACP and determines the properties of most cellular glycerolipids (Liu H.L. et al., 2015). Moreover, the SAD gene is induced by cold stress and enhances cold tolerance by increasing the enzyme activity and the unsaturated fatty acid content (Luo et al., 2014). AhSAD3, AhSAD3A, and AhSAD3B have been identified as possible target genes for manipulation of fatty acid saturation in peanut (Florin et al., 2011). Transgenic plants that overexpress the SsSAD gene exhibit significantly higher linoleic (18:2) and linolenic acid (18:3) content and advanced freezing tolerance (Peng et al., 2018). Expression of the GhSAD2 gene in cotton plants was upregulated after cold treatment at all levels tested; expression is highest after 6 h and then gradually decreases, suggesting that the GhSAD2 gene may play a vital role in the synthesis of unsaturated fatty acids in cottonseed oil. At the same time, it also plays a certain physiological role in cold tolerance (Cai et al., 2017).

# THE SIGNAL TRANSDUCTION OF MEMBRANE LIPIDS UNDER COLD STRESS

In the currently accepted model for temperature sensing, cold stress causes a change in membrane fluidity, and rearrangement of the cytoskeleton, followed by an influx of calcium that triggers downstream responses to confer cold tolerance (Guo and Liu, 2018). When cold signals are sensed by the plasma membrane, a series of signal transduction processes of membrane lipids are activated, leading to downstream actions to deliver the cold signal (**Figure 4**).

Phosphatidic acid (PA) is the precursor of PL biosynthesis, acts as the main lipid signal in eukaryotes, binds to specific proteins, and activates the MAPK signal pathway, Ca<sup>2+</sup>-dependent protein kinase, NADPH oxidase, and ion channels (Testerink and Munnik, 2011). The biosynthesis of PA involves two different pathways: one is the direct hydrolysis of PL by phospholipase D (PLD), and the other is the hydrolysis of polyphosphoinositides (PPI) by phospholipase C (PLC), followed by the synthesis of PA from diacylglycerol (DAG) by diacylglycerol kinase (DGK) (Arisz et al., 2009). Extensive research suggests that PA is involved in the processes of plant growth, differentiation, reproduction, hormone response, and signal transduction under various biological and abiotic stresses (Hou et al., 2016; Meringer et al., 2016). Under cold stress, the two pathways of PLC/DGK and PLD are both responsive (Chen et al., 2015). Phospholipase D can catalyze the hydrolysis of the phosphodiester bond and produce inositol triphosphate (IP3), diacylglycerol (DAG), acetylcholine (Ach), and PA. As second messengers in cells, IP3, DAG, PA, and Ach can cause a series of secondary reactions by changing intracellular Ca<sup>2+</sup> and protein kinase K (PRK) levels, thus completing the process of cell response to cold signals (Hong et al., 2016). Moreover, the activity of PLD is closely related to the response of plants to low temperature (Muzi et al., 2016). Short-term chilling stress (0–180 min) causes rapid and transient increases in PLD activity in young leaves, while long-term chilling stress (24–36 h) causes significant decreases in PLD activity in young leaves and roots (Peppino et al., 2017). Furthermore, PLD is also involved in ABA signal transduction under cold stress. ABA influences the activity of mitochondrial membrane-binding PLD in alpine ion mustard leaves through the mediation of Ca<sup>2+</sup> under cold stress (He et al., 2017).
As the initial enzyme of PL degradation, PLD can accelerate the degradation of PL, resulting in PA accumulation in the membrane. Phosphatidic acid can regulate ABI1, a negative regulatory factor of the ABA signaling pathway, and PLDα1 and PA mediate ABA-induced upstream ROS accumulation and stomatal closure, thus contributing to chilling tolerance in plants (Guo et al., 2012).

MGDG and DGDG are vital components of chloroplast and thylakoid membrane lipids and are closely related to plant photosynthesis. They are synthesized by the catalysis of galactosylglycerol synthetase (MGD) and digalactoglycerol synthase (DGD), respectively (Rocha et al., 2018). Sensitive to freezing 2 (SFR2) is classified as a family I glycosyl hydrolase but has recently been shown to have galactosyltransferase (GAT) activity (Barnes et al., 2016). During freezing conditions, SFR2 transfers galactosyl from MGDG to another MGDG and produces oligogalactolipids, including galactosyl diacylglycerol (GDG) and trigalactosyl diacylglycerol (TGD), leaving diacylglycerol (DAG) as a by-product (Roston et al., 2014). The DAG is converted into triacylglycerol (TAG), and both TAG and oligogalactolipids derived from MGDG specifically increase in response to freezing (Vu et al., 2014). Therefore, the metabolic

FIGURE 4 | Model illustrating potential effects of cold stress on the membrane lipid pathway in peanut. Under cold stress, there is an increase in the content of PA, PI, DAG, and DGDG (red text), but a decrease in the content of PC, PE, PG, MGDG, and SQDG (blue text). The main pathways to rapid cold-induced PA formation include upregulation of GPAT activity in the de novo biosynthesis of phospholipids (1), enhanced hydrolysis of PC by PLD resulting from the increase in PLD activity (2), PLC-mediated generation of DAG from PPI leading to DAG accumulation, which might increase the content of PA through phosphorylation of DAG by DGK (3), or the inhibition of PAH/PAP activity by DAG (4). The activity of PECT is proposed to be downregulated by cold stress, and this would lead to reduced PE formation (5). As a second messenger, PA can inhibit the activity of PEAMT, thereby blocking the synthesis pathway of PC (6). During cold stress, the requirement for eukaryotic galactolipid biosynthesis is reduced and the activity of DGAT is upregulated; excess PC is converted to DAG and subsequently acylated to 18:1-, 18:2-, and 18:3-rich molecular species of TAG, which are contained in cytoplasmic oil bodies (o.b.). Simultaneously, turnover of MGDG in the chloroplast results in accumulation of low amounts of 16:3-containing, chloroplastic TAG (7). The red and blue boxes represent up-regulation and inhibition of enzyme activities, respectively; enzyme activities in orange boxes show no significant change after cold stress or their response is not yet clear.
AAPT, aminoalcohol phosphotransferase; CDP-ETA, cytidine diphosphate-ethanolamine; CDP-DAG, cytidine diphosphate-diacylglycerol; CDS, CDP-DAG synthase; Cho, choline; CK, choline kinase; CPT, phosphocholine cytidylyl transferase; EK, ethanolamine kinase; EPT, CDP-ethanolamine phosphotransferase; LPA, lysophosphatidic acid; LPAAT, lysophosphatidic acid acyltransferase; PAH, phosphatidate phosphohydrolase; PAP, phosphatidic acid phosphatase; PCho, phosphocholine; PEAMT, phosphoethanolamine methyltransferase; PECT, phosphoethanolamine cytidylyl transferase; PETA, phosphoethanolamine; PGP, phosphatidic glycerol phosphatase; PIP2, phosphatidylinositol-4,5-bisphosphate; PIS, phosphatidylinositol synthase; SQD, sulfoquinovosyl diacylglycerol synthase; SQDG, sulfoquinovosyl diacylglycerol.

pathway of TAG is closely related to plant cold tolerance (**Figures 3**, **4**). Diacylglycerol acyltransferase (DGAT) is a rate-limiting enzyme in the Kennedy pathway, one of the biosynthesis pathways of triacylglycerol (TAG) (Kennedy, 1963). In peanut, heterologous expression of AhDGAT1-1 and AhDGAT1-2 in a Saccharomyces cerevisiae TAG-deficient quadruple mutant could restore lipid body formation and TAG synthesis, and markedly increase the accumulation of fatty acids (Peng et al., 2013; Tang et al., 2013; Chi et al., 2014). Recent studies have shown that DGAT1 plays a role in adaptive responses to chilling injury in plants, modulating the production of TAG and PA in cooperation with DGK (Chi et al., 2014; Chen B.B. et al., 2016; Yan et al., 2018). The expression of DGAT1, DGK2, DGK3, and DGK5 in A. thaliana during cold stress can regulate the dynamic balance of DAG, TAG, and PA, thereby maintaining the integrity of the membrane system and the intracellular redox state (Tan et al., 2018). Furthermore, DGAT1 and SFR2 coexist in chloroplasts, the activity of DGAT1 may be necessary for the SFR2 pathway, and DGAT1 may improve cold tolerance through the SFR2-mediated pathway (Arisz et al., 2018).

#### CONCLUSION AND PROSPECTIVE

Cold damage has become the key limiting factor of early sowing conducted to alleviate the spring sowing drought in peanut production in Northeast China. To cope with cold stress, plants have developed a series of physiological and biochemical changes and sophisticated molecular regulatory mechanisms, which display similarities and differences in various plant species. However, knowledge about physiological and molecular regulation mechanisms of peanut under cold stress in recent years has not been systematically documented. The plasma membrane is the barrier that protects the cell from injury and is also the primary place that senses cold signal. In the present review, we summarized the information on membrane lipid metabolism and its molecular response mechanisms, as well as lipid signal transduction in peanut under cold stress. Despite progress in elucidating the mechanism of cold tolerance in peanut, further investigations are warranted.

The cold signal is transduced from the extracellular to intracellular regions after being sensed by the plasma membrane and causes a series of physiological and biochemical changes. The ability to tolerate cold in peanut is based on signal transduction by various factors in plant cells. However, how cold signals are perceived by plasma membranes and how they are transduced across membranes into the cell are poorly understood. The diversity in plasma membrane composition, structure, and function is determined by the membrane lipids and membrane proteins. The interaction between membrane lipids and membrane proteins with different structures leads to differences in plasma membrane function. Understanding the dynamic changes in plasma membrane structure and identifying the function of key proteins are therefore essential for analyzing cold signal transduction and elucidating the mechanism of cold tolerance in peanut.

The unsaturation of membrane lipids is closely related to cold tolerance in peanut. The proportion of unsaturated fatty acids has been regarded as an important index of cold tolerance. Changing the ratio of saturated to unsaturated fatty acids to improve cold tolerance in peanut has become a research direction in recent years. However, the fatty acid composition of various lipids is variable, and it is not enough to analyze the fatty acid composition of membrane lipids in isolation to understand the physiological mechanism of membrane lipids. The main phospholipid molecules that make up the cell membrane and the main glycolipid molecules forming the chloroplast thylakoid membrane are also important in studying the physical phase transition of the membrane system at low temperature.

The development of emerging biotechnological methods in recent years, including CRISPR/Cas9, as well as the integration of omics and multi-omics, has impacted the agricultural sector by allowing the analysis of changes in lipid metabolism intermediates in the plasma membrane, the identification of differentially expressed genes related to lipids, and the establishment of a regulatory network for lipid metabolism under cold stress. This will be of great significance for satisfying plant growth requirements under deteriorating living conditions to sustain or even improve crop production.

# AUTHOR CONTRIBUTIONS

HZ wrote the manuscript. JD, XZ, JR, LX, CJ, XW, JW, and SZ conceived the study. YZ provided valuable references and made great contributions to the later revision. HY revised the manuscript and gave final approval of the version to be published.

### FUNDING

Dr. Xu Quan provided the patient revision for this manuscript. This study was supported by the National Agricultural Research System of China (CARS-13).

#### REFERENCES

and hexaploid wheats. Mol. Biol. Rep. 42, 363–372. doi: 10.1007/s11033-014-3776-3

hypogaea L.) promotes the accumulation of oleic acid. Plant Mol. Biol. 97, 177–185. doi: 10.1007/s11103-018-0731-z


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Zhang, Dong, Zhao, Zhang, Ren, Xing, Jiang, Wang, Wang, Zhao and Yu. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Altered Expression of an FT Cluster Underlies a Major Locus Controlling Domestication-Related Changes to Chickpea Phenology and Growth Habit

Raul Ortega<sup>1</sup>, Valerie F. G. Hecht<sup>1</sup>, Jules S. Freeman<sup>1,2</sup>, Josefa Rubio<sup>3</sup>, Noelia Carrasquilla-Garcia<sup>4</sup>, Reyazul Rouf Mir<sup>4,5</sup>, R. Varma Penmetsa<sup>4,6</sup>, Douglas R. Cook<sup>4</sup>, Teresa Millan<sup>7</sup> and James L. Weller<sup>1</sup>\*

<sup>1</sup> School of Natural Sciences, University of Tasmania, Hobart, TAS, Australia, <sup>2</sup> Scion, Rotorua, New Zealand, <sup>3</sup> E. Genomica y Biotecnologia, Instituto Andaluz de Investigación y Formación Agraria y Pesquera (IFAPA), Córdoba, Spain, <sup>4</sup> Department of Plant Pathology, University of California, Davis, Davis, CA, United States, <sup>5</sup> Division of Genetics and Plant Breeding, Sher-e-Kashmir University of Agricultural Sciences and Technology of Kashmir, Srinagar, India, <sup>6</sup> Department of Plant Sciences, University of California, Davis, Davis, CA, United States, <sup>7</sup> Department of Genetics ETSIAM, University of Córdoba, Córdoba, Spain

#### Edited by:

Jose C. Jimenez-Lopez, Consejo Superior de Investigaciones Científicas (CSIC) Granada, Spain

#### Reviewed by:

Carlos Eduardo Vallejos, University of Florida, United States Zhen Wang, Chinese Academy of Agricultural Sciences, China

> \*Correspondence: James L. Weller Jim.Weller@utas.edu.au

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 05 December 2018 Accepted: 07 June 2019 Published: 03 July 2019

#### Citation:

Ortega R, Hecht VFG, Freeman JS, Rubio J, Carrasquilla-Garcia N, Mir RR, Penmetsa RV, Cook DR, Millan T and Weller JL (2019) Altered Expression of an FT Cluster Underlies a Major Locus Controlling Domestication-Related Changes to Chickpea Phenology and Growth Habit. Front. Plant Sci. 10:824. doi: 10.3389/fpls.2019.00824

Flowering time is a key trait in breeding and crop evolution, due to its importance for adaptation to different environments and for yield. In the particular case of chickpea, selection for early phenology was essential for the successful transition of this species from a winter to a summer crop. Here, we used genetic and expression analyses in two different inbred populations to examine the genetic control of domestication-related differences in flowering time and growth habit between domesticated chickpea and its wild progenitor Cicer reticulatum. A single major quantitative trait locus for flowering time under short-day conditions [Days To Flower (DTF)3A] was mapped to a 59-gene interval on chromosome three containing a cluster of three FT genes, which collectively showed upregulated expression in domesticated relative to wild parent lines. An equally strong association with growth habit suggests a pleiotropic effect of the region on both traits. These results indicate the likely molecular explanation for the characteristic early flowering of domesticated chickpea, and the previously described growth habit locus Hg. More generally, they point to de-repression of this specific gene cluster as a conserved mechanism for achieving adaptive early phenology in temperate legumes.

Keywords: chickpea, domestication, florigen, flowering, growth habit, legume, photoperiod, QTL

# INTRODUCTION

The timing of flowering is a critical trait for crop adaptation, and as such has significant implications for yield and economic output (Jung and Muller, 2009; Nelson et al., 2010). The wild forms of many crops have strong environmental requirements for flowering, ensuring that seed development occurs under favorable conditions. However, such requirements often constitute a physiological barrier for adaptation to wider agro-ecological ranges, and in general, domestication and subsequent diversification has involved selection of variants in which these requirements have been modified. A well-known example is wheat (Triticum aestivum L.), where relaxation of photoperiod and vernalization responses has allowed the development of spring cultivars (Trevaskis et al., 2003; Yan et al., 2003; Fu et al., 2005; Beales et al., 2007; Díaz et al., 2012; Kippes et al., 2015, 2016). Similar adaptations have been reported in many other species (Nakamichi, 2015), including legumes, where a loss-of-function mutation in the circadian clock gene ELF3 overcame the obligate LD requirement of pea (Pisum sativum L.), permitting its conversion from a winter to a spring crop at higher latitudes (Weller et al., 2012). Similarly, a mutation at the Ppd locus in the short-day species common bean (Phaseolus vulgaris L.) enabled summer cropping and broad global adaptation of this crop (Wallace et al., 1993; Weller et al., 2019).

Chickpea (Cicer arietinum L.) is a major grain legume, ranking third in global production after bean and pea (FAO, 2016). It is more drought-tolerant than other cool season legumes, and its relative importance is projected to increase in future due to global population growth and climate change (Bar-El Dadon et al., 2017; Muehlbauer and Sarker, 2017). Despite being domesticated in parallel with other long day vernalization-responsive legumes (pea, lentil) and cereals (wheat, barley) (Zohary and Hopf, 2000), the domestication history of chickpea is distinct from these other species (Abbo et al., 2003a). One key difference is the decline of chickpea in the archeological record between the Neolithic period, approximately 9000 years before present (ybp) and the early Bronze Age (approximately 5000 ybp) (Abbo et al., 2003b). A second key difference is that across its center of origin, chickpea has traditionally been grown as a summer crop (Abbo et al., 2003b), and varieties with the winter annual habit typical of wild chickpea are notably absent. This contrasts with other species domesticated in the Fertile Crescent region over the same period, such as barley and pea, in which a significant proportion of the domesticated germplasm retains the ancestral, wild phenology (Saisho et al., 2011; Weller et al., 2012).

The reasons for these two differences are not known, but it is thought that chickpea was neglected as a winter crop in favor of other pulses, as a result of its inherently greater susceptibility to Ascochyta blight, a fungal disease caused by Ascochyta rabiei. This disease can cause total crop failure, particularly during humid Mediterranean winter conditions (Siddique et al., 2000; Millan et al., 2003; Sharma and Ghosh, 2016) and its impact would likely have intensified as planting densities increased with cultivation. This pressure may have motivated attempts by early farmers to shift cultivation from an autumn-sown, over-winter crop (when most precipitation occurs in this region) to a spring-sown summer crop that matures in the predominantly drier summer season. In such a scenario the selection of earlier-flowering genotypes able to complete their life cycle prior to the onset of summer drought would likely have been essential (Kumar and Abbo, 2001), and the increase in the frequency of archaeobotanical remains of chickpea in the Bronze Age is suggested to reflect the success of this transition (Kumar and Abbo, 2001; Abbo et al., 2003a).

Early phenology continues to be important in present-day chickpea cultivation, as a large proportion of the global chickpea crop is grown in short-season environments exposed to end-of-season stresses that reduce productivity (Kumar and Abbo, 2001; Muehlbauer and Sarker, 2017). In Mediterranean and semi-arid environments, where chickpea is grown under rain-fed conditions and matures into summer, terminal drought is the most common cause of yield loss (Zhang et al., 2000; Turner et al., 2001; Siddique et al., 2003; Berger and Turner, 2007). In higher-latitude continental temperate environments like western Canada, the short growing season is instead limited by declining temperatures, delayed maturity and increased potential for frost damage at the sensitive phase of pod development (Croser et al., 2003; Berger J.D. et al., 2004; Clarke and Siddique, 2004; Anbessa et al., 2007). In both situations, early flowering and maturity is thus an important primary escape strategy (Siddique et al., 2003; Berger J.D. et al., 2004; Berger et al., 2006). Hence, genetic control of this trait has been a topic of increasing interest (e.g., Gaur et al., 2008; Ridge et al., 2017).

Several flowering time loci have been reported in chickpea from both classical and quantitative trait locus (QTL) analyses. These include four major loci; Photoperiod (Or et al., 1999), Early flowering 1 (Efl1), Efl3, and Efl4 (Kumar and Van Rheenen, 2000; Hegde, 2010; Gaur et al., 2014), and several QTL that appear recurrent in different populations. One prominent example is a "hot-spot" on linkage group (LG) four (Cobos et al., 2007; Varshney et al., 2014; Daba et al., 2016; Mallikarjuna et al., 2017). Another important genomic region is the central portion of chromosome 3 between markers TA6 and TA64, in which flowering time QTL have been reported from all wide crosses investigated for this trait (Cobos et al., 2009; Aryamanesh et al., 2010; Das et al., 2015; Samineni et al., 2015), as well as in several other intraspecific populations (Hossain et al., 2010; Hamwieh et al., 2013; Daba et al., 2016; Mallikarjuna et al., 2017).

In this study we aimed to elucidate the genetic basis of changes in flowering time that occurred early in chickpea crop evolution, through QTL analysis and candidate gene evaluation in two recombinant inbred populations between Cicer arietinum and its wild progenitor C. reticulatum. Our results point to a strong genetic association between the early flowering and erect growth habit typical of domesticated chickpea, and the elevated expression of a cluster of FT genes on chromosome 3. We conclude that a cis-acting genetic change leading to deregulated expression of this gene cluster may have played a key role in the prehistoric shift in phenology and farming practice integral to chickpea evolution under domestication.

#### MATERIALS AND METHODS

#### Plant Material

CRIL2 is a recombinant inbred line (RIL) population developed from an interspecific cross between C. arietinum (accession ICC4958) and C. reticulatum (PI489777) by Tekeoglu et al. (2000), Winter et al. (2000), Muehlbauer and Sarker (2017) at the United States Department of Agriculture (USDA), Agricultural Research Service and Washington State University, United States. ICC4958 is an early-flowering desi chickpea type with an erect growth habit, while the wild parent PI489777 is a Turkish accession with prostrate growth habit and late flowering typical of wild chickpea.

Three other recombinant inbred populations were used in this study, developed by the chickpea breeding group in IFAPA (Institute of Agricultural and Fisheries Research and Training, Centro Alameda del Obispo, Cordoba, Spain) and University of Córdoba, Spain. RIP12 is an interspecific population consisting of 88 F<sub>6:7</sub> RILs derived from a cross between the kabuli cultivar ICCL81001 and a C. reticulatum accession, as described in Cobos et al. (2009). RIP5 (102 RILs) and RIP8 (113 RILs) are two F<sub>6:8</sub> RIL populations derived from reciprocal crosses between the early flowering desi landrace WR315 and the late kabuli accession ILC3279 (Iruela et al., 2007; Ali et al., 2015).

### Growing Conditions and Phenotypic Evaluation

Four plants of each of the CRIL2 parents and 124 RILs were grown under long day (LD) or short day (SD) conditions in an automated phytotron at the University of Tasmania between December 2015 and April 2016. Plants under SD received 8 h (8 AM–4 PM) of natural daylight and were then moved to complete darkness inside the phytotron. Plants under LD received natural daylight, extended throughout the growing season with artificial light from high-pressure sodium lamps (50 µmol m<sup>−2</sup> s<sup>−1</sup>) to provide a total photoperiod of 18 h. Night temperature inside the phytotron was maintained at 16 °C. Flowering time was recorded as the number of days from seedling emergence to opening of the first flower (DTF) on each individual plant. Lines remaining vegetative at 130 days were assigned a nominal DTF value of 130 in subsequent analyses. Branching tendency was quantified at 3 weeks after emergence and expressed as the ratio of total branch length to main shoot length (branching index, BI) to normalize for differences in general vigor and stem elongation. Growth habit (GH) was scored using a four-category scale (values from 1 to 4), according to the angle of the branches from the vertical axis at harvest stage, as follows: (1) prostrate (branches 0–10° above horizontal), (2) semi-prostrate (10–45°), (3) semi-erect (45–70°), and (4) erect (>70°). For all three traits, the mean value from the four replicate plants was used for analysis.
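The scoring rules above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' analysis code: the function names are hypothetical, the treatment of angles that fall exactly on a category boundary (assigned to the lower category) is an assumption, since the text gives shared boundary values, and capping late-flowering plants at 130 days is one reading of the nominal value described.

```python
from statistics import mean

CENSOR_DAY = 130  # nominal DTF for lines still vegetative at 130 days (from the text)


def days_to_flower(emergence_day, first_flower_day):
    """DTF counted from seedling emergence; None means the plant never flowered."""
    if first_flower_day is None:
        return CENSOR_DAY
    # Assumption: a plant flowering after day 130 was vegetative at day 130,
    # so it also receives the nominal value.
    return min(first_flower_day - emergence_day, CENSOR_DAY)


def branching_index(total_branch_length, main_shoot_length):
    """Ratio of total branch length to main shoot length (BI)."""
    return total_branch_length / main_shoot_length


def growth_habit_score(angle_above_horizontal):
    """Four-category growth habit score from branch angle in degrees.

    Assumption: boundary angles (10, 45, 70) go to the lower category.
    """
    if angle_above_horizontal <= 10:
        return 1  # prostrate
    if angle_above_horizontal <= 45:
        return 2  # semi-prostrate
    if angle_above_horizontal <= 70:
        return 3  # semi-erect
    return 4      # erect


def line_mean(replicate_values):
    """Mean of the replicate plants (four per line in the text)."""
    return mean(replicate_values)
```

For example, a line with three plants flowering at 58 to 62 days and one plant that never flowered would be summarized by `line_mean` over the values 60, 62, 58 and 130, so the nominal value pulls the line mean upward.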

RIP12 was sown in March in the field at the IFAPA site in Cordoba (latitude/longitude/altitude: 37°53′N/4°47′W/117 m) over four different seasons (2001, 2004, 2008, and 2014). Plots consisted of 2 m-long rows set 0.5 m apart, each sown with 20 plants of each RIL. Every fifth row was sown with one of the parent lines as a check. In 2001, a greenhouse trial was also conducted to assess flowering time under natural short day conditions (Cobos et al., 2009). RIP5 was sown in the field in March 2003 at two different sites: the IFAPA site in Cordoba and the IFAPA Venta del Llano site (Mengibar, Jaen, Spain; latitude/longitude/altitude: 37°57′N/3°48′W/280 m). In this trial, RILs were randomly distributed in four blocks and parents were included as reference in each trial. The unit plot was two rows of 2 m, with 10 seeds/m and 0.7 m between rows (Ali et al., 2015). RIP8 was sown in the field in February 2003 at the IFAPA site in Cordoba with two replications, in which RILs were distributed randomly into four blocks with 20 lines per block. Four check lines were included in each block following a Latin square design to verify environmental homogeneity. The plot unit was three rows, 4 m long, with 0.5 m between rows and a density of 20 plants m<sup>−2</sup>. For these three populations, days from sowing to 50% flower was recorded (DTF). The data obtained from each of the two trials of RIP8 were analyzed separately. Information about the photoperiod experienced by RIP12, RIP5, and RIP8 during the different growing seasons can be found in **Supplementary Table 5**.

#### Molecular Markers

Both markers from previous linkage maps and new markers developed specifically for this study were used for map construction and QTL analysis. Polymorphisms in target genes across chickpea LG3 and LG4 were identified by sequencing of the parental accessions or from information available in previous reports (Saxena et al., 2014), and used to design 27 high-resolution melt (HRM) markers (**Supplementary Table 1**). These were added to the markers previously genotyped in the RIP12, RIP5, and RIP8 populations, as described in Iruela et al. (2007), Cobos et al. (2009), Ali et al. (2015), respectively. In the case of CRIL2, the HRM markers were combined with a subset of 210 molecular markers selected from a dense map incorporating 2956 markers (**Supplementary Figure 1**; von Wettberg et al., 2018), to provide an even distribution [approximately 1 marker/5 centiMorgan (cM)] of high-quality (minimal missing data) markers (**Supplementary Table 1**).

#### Genetic Mapping and QTL Analysis

Linkage analysis in each population was performed using JoinMap v4.0 (Van Ooijen, 2006). Markers were grouped with a minimum logarithm of odds (LOD) value of 3.0, and the regression algorithm was used for mapping, using default options and the Kosambi function for the estimation of genetic distances (Kosambi, 1943). The initial maps were reviewed and problematic markers were removed where necessary based on the following criteria: Chi-square goodness-of-fit threshold (>1); nearest neighbor fit; genotype probability function; and the level of segregation distortion compared to surrounding markers. Following the removal of problematic markers, the maps were recalculated and the process repeated where necessary, until maps with robust order were produced.
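For readers unfamiliar with the Kosambi function mentioned above, the following Python sketch shows the standard conversion between a recombination fraction and a map distance in centiMorgans. It illustrates the mapping function only and is not a reimplementation of JoinMap; the function names are hypothetical.

```python
import math


def kosambi_cm(r):
    """Kosambi map distance in cM from a recombination fraction r (0 <= r < 0.5).

    d (Morgans) = 1/4 * ln((1 + 2r) / (1 - 2r)), so d (cM) = 25 * ln(...).
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination fraction must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))


def kosambi_r(d_cm):
    """Inverse mapping: recombination fraction from a Kosambi distance in cM."""
    return 0.5 * math.tanh(2.0 * d_cm / 100.0)
```

At small recombination fractions the distance in cM is close to 100 times r (for instance, r = 0.1 gives roughly 10.1 cM), and the correction grows as r approaches 0.5.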

The numbering of the LGs followed the chickpea consensus genetic map (Millan et al., 2010), based on the presence of markers in common with the consensus map itself or other markers of known position, using the Cool Season Food Legume Database<sup>1</sup>.

<sup>1</sup>https://www.coolseasonfoodlegume.org

Quantitative trait locus analysis was performed using MapQTL6.0 software (Van Ooijen, 2009). First, interval mapping was carried out to detect putative QTL associated with the variation in each trait. For each putative QTL, the marker closest to the LOD peak and two markers either side of this were used in Automatic Cofactor Selection (ACS) to select the best cofactor for subsequent Multiple QTL Mapping (MQM) analysis. The MQM function was employed iteratively with each new cofactor selection until all QTLs for a specific trait were determined. In both interval and MQM mapping, putative QTL were declared at a chromosome-wide threshold (p < 0.05) based on permutation testing with 1000 permutations.
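The idea behind a permutation-based chromosome-wide threshold can be sketched as follows. This is a simplified illustration under stated assumptions, not the MapQTL procedure: it replaces interval mapping with a single-marker LOD score (a one-mean model compared with a per-genotype-class model), and all function and parameter names are hypothetical. Phenotypes are shuffled relative to genotypes, the maximum LOD across the markers of one chromosome is recorded for each permutation, and the (1 − α) quantile of those maxima is taken as the threshold.

```python
import math
import random


def lod_single_marker(phenotypes, genotypes):
    """LOD comparing a single-mean model with a per-genotype-class mean model."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for allele in set(genotypes):
        group = [y for y, g in zip(phenotypes, genotypes) if g == allele]
        group_mean = sum(group) / len(group)
        rss_marker += sum((y - group_mean) ** 2 for y in group)
    if rss_marker == 0.0:
        # Perfect fit: no residual variation left within genotype classes.
        return math.inf if rss_null > 0.0 else 0.0
    return (n / 2.0) * math.log10(rss_null / rss_marker)


def permutation_threshold(phenotypes, chromosome_markers,
                          n_perm=1000, alpha=0.05, seed=0):
    """(1 - alpha) quantile of the per-permutation maximum LOD on one chromosome.

    chromosome_markers is a list of genotype lists, one per marker, each in the
    same line order as phenotypes.
    """
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the genotype-phenotype association
        max_lods.append(max(lod_single_marker(shuffled, marker)
                            for marker in chromosome_markers))
    max_lods.sort()
    index = min(n_perm - 1, max(0, int(round((1.0 - alpha) * n_perm)) - 1))
    return max_lods[index]
```

Taking the maximum across all markers on the chromosome within each permutation is what makes the resulting threshold chromosome-wide rather than marker-wise.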

# RNA Extraction and qPCR

For the expression study, the six parental lines of the four populations (RIP5 and RIP8 share the same parental accessions, and therefore were represented only once) were grown in an automated phytotron at the University of Tasmania under SD (8 h) and LD (16 h) conditions. For quantitative reverse-transcriptase PCR (qRT-PCR), dissected apical buds and the uppermost fully expanded leaflets were harvested. Each sample consisted of pooled material from two plants, harvested at midday at 2–4 weeks after seedling emergence. RNA extraction, cDNA synthesis and gene expression determination were performed as described in Sussmilch et al. (2015) using the primers indicated in **Supplementary Table 2**. The expression level of tested genes was normalized against ACTIN using the ∆∆Ct method.
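As a reminder of what the ∆∆Ct normalization computes, a minimal Python sketch is given below. It assumes the common form of the method with 100% amplification efficiency (a doubling per cycle); the choice of calibrator sample and the function and argument names are illustrative assumptions, not details taken from this study.

```python
def relative_expression(ct_target, ct_reference,
                        ct_target_calibrator, ct_reference_calibrator):
    """Relative expression by the delta-delta-Ct method.

    Assumes perfect doubling per cycle. The reference gene would be ACTIN in
    the text; the calibrator is a hypothetical baseline sample.
    """
    delta_ct_sample = ct_target - ct_reference
    delta_ct_calibrator = ct_target_calibrator - ct_reference_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)
```

For example, a target gene with Ct 24 in a sample and Ct 26 in the calibrator, with the reference gene at Ct 18 in both, gives a relative expression of 4, that is, a fourfold higher transcript level in the sample.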

### Statistical Analysis

Statistical analysis was conducted using IBM SPSS Statistics (version 22), including box-plot and frequency distribution graphs. Correlation between traits was measured using Spearman's rank correlation coefficient, and statistical significance was tested by paired or independent t-test, according to the nature of the data.
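For readers unfamiliar with Spearman's rank correlation coefficient, the sketch below computes it as the Pearson correlation of average ranks, which handles tied values. The analysis in the text was run in SPSS; this standalone version with hypothetical function names is only meant to show what the statistic measures.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0.0 or vy == 0.0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(vx * vy)
```

Because only ranks enter the calculation, the coefficient is suited to traits such as days to flowering whose distributions may be skewed or bimodal.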

#### RESULTS

# A Major Locus Controls Flowering in the CRIL2 Interspecific Reference Population

We initially characterized flowering time in the CRIL2 reference population under controlled 8-h SD and 18-h LD conditions in an automated phytotron. Phenotypic values obtained are summarized in **Supplementary Figure 2**. Under LD, the difference in flowering time between the parental lines was not significant, with both flowering between 30 and 33 days after emergence. In contrast, under SD, ICC4958 flowered at around 60 days while PI489777 remained vegetative until the experiment was terminated 130 days after sowing. Thus, under these conditions, ICC4958 shows a moderate, quantitative response to photoperiod, whereas the wild line shows an obligate requirement for LD.

Among the RILs, the mean DTF under LD conditions was intermediate between the two parents while the range was substantially wider, with 12 days difference between the minimum and maximum values. Under SD, flowering time in the CRIL2 population showed a clear bimodal distribution, with a significant proportion of lines (68 out of 124) failing to initiate flowering by 130 days after sowing, like the wild parent. All RILs flowered considerably later under SD than under LD (p < 0.001) but, interestingly, phenotypic values for DTF in the two conditions were significantly correlated (with only 56 RILs able to flower in both LD and SD considered; rs[56] = 0.500, p < 0.001), indicating that part of the variation is independent of photoperiod. Transgressive segregation, particularly toward earliness, was observed under both photoperiods (**Supplementary Figure 2**), suggesting that alleles associated with early flowering have been contributed from both parents.
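One way to check whether the bimodal split under SD is compatible with a single segregating locus is a chi-square goodness-of-fit test against a 1:1 ratio. The sketch below uses the counts reported above (68 of 124 lines not flowering, hence 56 flowering) and assumes that a single locus in a RIL population is expected to give two equally frequent classes, ignoring residual heterozygosity and segregation distortion. It is not an analysis reported by the authors, and the function name is hypothetical.

```python
import math


def chi_square_1to1(count_a, count_b):
    """Chi-square statistic and p-value (1 d.f.) for a 1:1 expected ratio."""
    expected = (count_a + count_b) / 2.0
    chi2 = ((count_a - expected) ** 2 + (count_b - expected) ** 2) / expected
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Counts from the text: 68 non-flowering and 56 flowering RILs under SD
chi2, p_value = chi_square_1to1(68, 56)
```

With these counts the statistic is 72/62, roughly 1.16, which is below the 5% critical value of 3.84 for one degree of freedom, so under the stated assumptions the observed split does not depart significantly from 1:1.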

Consistent with the phenotypic homogeneity observed for flowering time in CRIL2 under LD, QTL analysis under these conditions revealed only one minor QTL, DTF3C (**Table 1**), located at the top of LG3 (**Figure 1**). In contrast, under SD conditions, a major-effect QTL, DTF3A, was found in the middle of LG3 (LOD 50.2, PVE 85.2%; **Table 1**). As the peak markers for these loci are separated by only around 10 cM, and the effective population size for the LD analysis is relatively small, the possibility that the loci may be the same cannot be excluded. However, as this is not trivial to demonstrate, we have adopted a conservative interpretation and assigned them distinct names.

Quantitative trait locus analysis was also performed using a subset of the population formed by those 56 RILs that were able to flower under both SD and LD. Interestingly, no significant QTL were found in this case, supporting the idea that only QTL DTF3A is acting in CRIL2 grown under SD. However, these results should be interpreted with caution, considering the small population size.

#### Mapping Identifies the FT Cluster as Strong Positional Candidates for DTF3A

Several previous studies have reported major flowering QTLs in the central region of chromosome 3 between markers TA6 and TA64 (summarized in **Supplementary Figure 3**), indicating this as a particularly important genomic region (Weller and Ortega, 2015). We scanned this region for genes similar to known flowering time genes in other species and added 18 additional markers to the CRIL2 linkage map, including 13 within the TA6-TA64 interval (**Supplementary Figure 3** and **Supplementary Table 1**). This confirmed the presence of DTF3A within this interval and narrowed its location to a smaller interval flanked by markers SUVH4 and CDF2d (**Figure 1**), that corresponds to a physical distance of 1.4 Mbp and contains 124 annotated genes, according to the reference genome. Many of the flowering-related genes annotated in this region lie outside of this interval and were thus considered to be unlikely candidates, including SOC1a (SUPPRESSOR OF CONSTANS OVEREXPRESSION 1), COLh (CONSTANS-LIKE h), AG (AGAMOUS)-like, LUX (LUX ARRHYTHMO)-like, CDF (CYCLING DOF FACTOR), and WRKY (**Supplementary Figure 3**). However, the analysis confirmed the presence of a cluster of FT genes directly under the QTL peak, and a marker for one of these, FTa1, showed the strongest association with SD flowering time among all the markers tested (**Table 1**).

The dramatic delay in flowering of the PI489777 parent line and the bimodal distribution of the flowering phenotypes in CRIL2 under SD suggested that the QTL could also be analyzed as a single Mendelian locus, to refine its position. **Figure 2** illustrates all recombinants identified in the CRIL2 population across the LG3A region, and shows that DTF3A can be further delimited to a region of 0.8 Mb between markers SUVH4 and GATA9/ING2 (**Supplementary Table 3**). This region contains only 59 genes, but still includes the FT cluster.

TABLE 1 | Quantitative trait loci (QTL) identified by multiple QTL mapping for flowering time, growth habit and branching index in four populations grown in different environments.

| Population | Place | Year | Trait<sup>a</sup> | Cond<sup>b</sup> | QTL | LOD<sup>c</sup> | PVE<sup>d</sup> | Marker<sup>e</sup> | LG<sup>f</sup> | Early<sup>g</sup> | Late<sup>g</sup> | Thr<sup>h</sup> |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| CRIL2 | Hobart | 2016 | DTF | LD, P | DTF3C | 2.9 | 9.6 | S1202p50545 | 3 | 29 | 30.3 | 2.8 |
| | | | | SD, P | DTF3A | 50.2 | 85.2 | FTa1 | 3 | 66.2 | 128.6 | 2.6 |
| | | 2016 | GH | SD, P | GH3 | 34 | 66.6 | FTa1 | 3 | 3.4 | 1.6 | 2.6 |
| | | | | | GH4 | 5.5 | 5.9 | S360p1277380 | 4 | 2.8 | 2.2 | 3.1 |
| | | 2016 | BI | SD, P | BI3 | 10.6 | 33.1 | FTa1 | 3 | 0.4 | 0.9 | 2.6 |
| | | | | LD, P | BI3 | 5.4 | 18.4 | FTa1 | 3 | 0.2 | 0.5 | 2.5 |
| RIP12 | Cordoba | 2001 | DTF | GLH | DTF3A | 10.8 | 46.9 | FTa1 | 3 | 14.1 | 39.6 | 3.1 |
| | | | | Field | DTF3A | 4.5 | 22 | FTa1 | 3 | 60.4 | 68.6 | 2.9 |
| | | 2004 | DTF | Field | DTF3A | 14.8 | 51.1 | FTa1 | 3 | 8.9<sup>i</sup> | 21.7<sup>i</sup> | 2.9 |
| | | | | | DTF4B | 3.6 | 9.2 | STMS11 | 4 | 17.9<sup>i</sup> | 12.6<sup>i</sup> | 3.3 |
| | | 2008 | DTF | Field | DTF3B | 6.3 | 29.6 | COLh | 3 | 70.3 | 76.8 | 2.9 |
| | | 2014 | DTF | Field | DTF3A | 8.4 | 29.8 | FTa1 | 3 | 58.3 | 64.2 | 3 |
| | | | | | DTF4A | 5.3 | 17.3 | GAA47 | 4 | 63.5 | 59 | 2.8 |
| RIP5 | Cordoba | 2003 | DTF | Field | DTF3D | 9.6 | 38.7 | WRKY | 3 | 60.4 | 64.8 | 2.7 |
| | Cordoba | | | | DTF3A | 3 | 8.7 | FTa1/2 | 3 | 61.3 | 63.9 | 2.7 |
| | Mengibar | | | | DTF3A | 5.7 | 26.9 | FTa1/2 | 3 | 64.2 | 66.7 | 2.8 |
| RIP8 | Rep1 | 2003 | DTF | Field | DTF3D | 7.5 | 29.2 | TA125 | 3 | 84.3 | 87.3 | 2.6 |
| | Rep2 | | | | DTF3D | 6.8 | 29.0 | TA125 | 3 | 84.6 | 87.3 | 2.7 |

<sup>a</sup>Trait analyzed: DTF, flowering time; GH, growth habit; BI, branching index. <sup>b</sup>Condition: LD, long days; SD, short days; P, phytotron; GLH, glasshouse. <sup>c</sup>The LOD scores for each QTL. <sup>d</sup>PVE, Phenotypic variation explained. <sup>e</sup>Marker nearest to the peak LOD score. <sup>f</sup>LG, linkage group harboring the QTL. <sup>g</sup>Marker genotype class means for early (C. arietinum accessions ICC4958, ICCL81001 and WR315 for CRIL2, RIP12, and RIP5/8, respectively) and late (C. reticulatum accessions PI489777 and Cr5-9 in CRIL2 and RIP12 and C. arietinum ILC3279 in the case of RIP5/8) parents, calculated for the marker with higher LOD. <sup>h</sup>Threshold LOD for a 0.995 confidence value, calculated through permutation test for each trait and linkage group. <sup>i</sup>Flowering time in 2004 is a relative value, as specified in Supplementary Figure 2.

# Comparison of the DTF3A Region in Other Crosses

The segregation of a major flowering time locus in CRIL2 and several other interspecific populations suggests a potential role for this locus in early crop evolution. However, a lack of common markers has made it difficult to compare the position of QTL between studies and clearly demonstrate their co-location. To investigate the position of DTF3A relative to previously described QTLs, and assess the possible relevance of this region at the intraspecific level, we selected three additional populations for parallel analysis through mapping of common markers. RIP12 is another interspecific population, for which a major flowering QTL has been reported in the TA6-TA64 region (Cobos et al., 2009). The intraspecific populations RIP5 and RIP8 were also examined, as preliminary evidence indicated an association of markers in the 3A region with flowering time in this cross (Castro, 2011). Where polymorphisms were available, the genes targeted in CRIL2 were also genotyped and added to the linkage maps in these additional populations (**Supplementary Table 1**) by recalculation of the linkage maps with markers for these genes and previously mapped markers (**Supplementary Figures 4**–**7**). These maps were then used for QTL analysis of flowering data for the three populations across different locations, years, and environments (**Supplementary Figure 2**), revealing a total of 12 significant flowering QTL (**Table 1**).

In the RIP12 population, analysis over several years, in glasshouse and field environments, yielded seven QTL: five on LG3 and two on LG4 (**Table 1**). The QTL on LG3 were defined by the same interval 3A described above for CRIL2 (**Figure 1**), and the FTa1 marker again explained the highest proportion of variation (up to 51%). In 2008, a flowering QTL, DTF3B, was detected in a second region of LG3 between markers FTa1 and Q051828. Since both the position of the interval (**Figure 1**) and the significance of the QTL (∼30% PVE) are very close to those obtained for DTF3A (**Table 1**), it seems highly probable that these two QTL are equivalent.

In the intraspecific populations, two regions on chromosome 3 influenced flowering time. One of these was region 3A, which was detected in the RIP8 population; its effect on flowering time varied with location, being strong in Mengibar and weaker in Cordoba (26.9 vs. 8.7% variance explained, respectively). An additional highly significant QTL (DTF3D) was detected on LG3, between markers LOB189 and PRT6, in both intraspecific populations (**Figure 1**). Although this QTL was not detected in RIP5 at Mengibar, in situations where it was detected it had a greater effect than DTF3A (**Table 1**).

# FT Genes in Chickpea

In view of the central location of an FT gene cluster under the DTF3A QTL, we characterized the entire chickpea phosphatidylethanolamine-binding protein (PEBP) family, which includes FT genes and the related TFL1 (TERMINAL FLOWER 1) family of flowering repressors (Wickland and Hanzawa, 2015; **Supplementary Figures 8A, 9** and **Supplementary Table 4**). Five chickpea FT-like genes were identified in the three previously described legume FT subclades; FTa, FTb, and FTc (**Supplementary Figure 10**; Hecht et al., 2011). This analysis confirmed that chickpea, like Medicago, possesses three FTa genes, with two of these (FTa1 and FTa2) located together with the single FTc gene on chromosome 3 in a tandem arrangement (Hecht et al., 2011; Laurie et al., 2011). Only one other PEBP gene was found on this chromosome (TFL1a), while the remaining genes were located on chromosomes 1 (TFL1b), 2 (FTb and FTa3), 6 (MOTHER OF FT, MFT), and 8 (TFL1c) (**Supplementary Figure 8B**). The only difference in the chickpea FT family compared to other related legume species is the apparent presence of only a single FTb gene, where Medicago and pea each have two highly similar paralogs located in tandem in a conserved genomic location on chromosome 7 and LG5, respectively (Hecht et al., 2011; Laurie et al., 2011). In the broader PEBP family, chickpea possesses single-copy orthologs of the BFT (BROTHER OF FT) and MFT genes, and also of two of the three TFL1 genes previously described in pea and Medicago, TFL1a and TFL1b. The third gene, TFL1c, was represented by three gene models in the CDC Frontier genome assembly (**Supplementary Table 4**), but was not represented at all in the other available chickpea genome (from ICC4958, assembly ASM34727v3); a discrepancy that will require clarification in future.

breakpoints across a 7.14 Mb region of chromosome 3 spanning the DTF3A locus. Numbers over the markers correspond to their physical position (in Mb) in the CDC Frontier genome assembly in NCBI (ASM33114v1; Varshney et al., 2013). Alleles from the domesticated parent ICC4958 are shown in white and those from the wild parent PI489777 in gray. Flowering phenotype is shown in the column headed SD and indicates whether the indicated lines flowered (Y) or remained vegetative (N) under an 8-h photoperiod. This phenotype showed no recombination between markers FTa1 and GATA9.

# Genes in the FTa1-FTa2-FTc Cluster Are Upregulated in Early Accessions

FT genes are well known as important positive regulators of flowering. This is also true in legumes, where several FT genes have been identified and most are capable of promoting flowering when overexpressed in Arabidopsis (Kong et al., 2010; Hecht et al., 2011; Laurie et al., 2011; Sun et al., 2011). Therefore, if one of the FT genes in the cluster was the basis for the effect of the DTF3A locus, increased activity or expression of one or more of these genes would be expected in the early-flowering parent. To evaluate this possibility, we examined the expression of FT genes in the parent lines of the mapping populations. In view of previous reports indicating tissue- and photoperiod-specific expression of FT genes in pea and Medicago, we collected samples from leaf and apex tissue under both LD and SD conditions at two timepoints. Expression of the AP1 homolog PROLIFERATING INFLORESCENCE MERISTEM (PIM) was used as an indicator of flowering commitment, as previously described for other legumes including chickpea (Hecht et al., 2011; Ridge et al., 2016).

**Figure 3** shows that 2 weeks after emergence PIM expression in shoot apices was not detectable in any of the accessions. By 4 weeks, PIM was expressed significantly above background in all three late parents under LD but not in SD, whereas it was strongly expressed under both LD and SD in the early parents. In parallel, the expression of all three genes in the chromosome 3 FT cluster (FTa1, FTa2, and FTc) was elevated in the early parents at 4 weeks under SD and LD. In ICC4958, expression of all three genes was higher than in the wild parent even by week 2; i.e., before detectable expression of PIM. Similarly, expression of FTa2 and FTc was also elevated in the early parent of RIP12 (ICCL81001) at week 2. However, FTa2 transcript could not be detected in the early parent of RIP5/8 (WR315), reflecting a complete deletion of the gene (**Supplementary Figure 11**). This result suggests that the elevated expression of FTa2 seen in the domesticated parents of CRIL2 and RIP12 is unlikely to be solely responsible for the effect of DTF3A in these populations. As in pea and Medicago, FTa1 and FTc in chickpea differed in the tissue specificity of their expression, with FTa1 expressed strongly in leaves and weakly at the shoot apex, and FTc expressed only weakly at the shoot apex. Despite these differences, both genes showed similar expression profiles, with an early upregulation in the domesticated/early flowering parents that preceded PIM induction, and they therefore represent good candidates to underlie the QTL.

Significant expression of the single FTb gene was seen in 2-week-old plants, but only under LD, and at a similar level in both early and late parents. This is similar to the strongly photoperiod-dependent expression of FTb genes previously reported in pea and Medicago (Hecht et al., 2011; Laurie et al., 2011), and indicates that FTb misexpression is not a factor in the effect of DTF3A under SD. The expression of FTa3 was restricted to leaf tissue, and only detected at a late developmental phase after commencement of flowering (**Supplementary Figure 12**), suggesting it is unlikely to make a major contribution to the observed differences in flowering time. The expression of TFL1b and TFL1c was also tested in apical tissue. Whereas expression of TFL1c in this tissue did not change significantly, TFL1b expression was higher in the wild line under non-inductive conditions and gradually decreased in both cultivated and wild accessions grown under long photoperiod, consistent with a possible role as a floral repressor. However, the level of expression observed for both genes was very low and the biological significance of these changes is therefore uncertain (**Supplementary Figure 12**).

# The DTF3A Locus Coincides With QTL for Plant Architecture

The late-flowering phenotype of wild chickpea is also associated with a prostrate growth habit (GH), reduced apical dominance and an increased number of branches (Singh and Shyam, 1959; Aryamanesh et al., 2010; Ali et al., 2015). Consistent with these reports, we also observed major differences in growth habit between CRIL2 parents and in the CRIL2 population in SD, which we quantified for genetic analysis using a four-step scale (**Supplementary Figures 13A–D**). We also recorded branching propensity in young plants (prior to visible flower initiation) under both SD and LD. Late flowering RILs also showed a shoot architecture that resembled the wild parent, so we investigated the correlation between these three traits (**Supplementary Figure 13**). A highly significant difference (p < 0.001) was found between the flowering dates of erect/semierect RILs compared to those with a prostrate/semiprostrate growth habit (**Supplementary Figure 13E**), confirming that in the segregating population, prostrate growth habit is associated with late flowering, as expected. Inspection of individual RILs showed a nearly perfect correlation, with flowering observed in all 53 erect or semi-erect RILs but in only three out of 71 lines categorized as prostrate or semi-prostrate. A strong negative correlation (r = −0.504, p < 0.001) was found between growth habit and branching index (**Supplementary Figure 13F**), indicating that erect and semi-erect plants in general also had a lower branching index (BI).

BI of the population was generally higher in SD than in LD, as might be expected in view of the longer vegetative growth phase. However, across the population, a strong positive correlation (r = 0.679, p < 0.001) was found in the BI between photoperiods, suggesting that at this stage (3-week-old plants) a genetic component of this trait is unrelated to photoperiod. QTL analysis revealed two QTLs for growth habit: a major QTL on LG3 that explained 66% of the variation for this trait, and a minor QTL on LG4. For BI, a single QTL in a similar location was identified under both photoperiods (**Table 1**). Interestingly, the QTL for both GH and BI on chromosome 3 were closely co-located with the DTF3A flowering time QTL described above (**Figure 4**). In addition, the physiology of these three QTL is similar with respect to their strong effect under SD and their absence, or minor effect, under LD, as seen in the genotype means for the FTa1 peak marker shown in **Table 1**.

## DISCUSSION

One of the critical events in chickpea evolutionary history is thought to have been its conversion from a winter to a summer crop, likely achieved by Neolithic farmers in an attempt to reduce the incidence of Ascochyta blight, whose onset is favored by the cool, wet conditions that typify Mediterranean winters (Kumar and Abbo, 2001; Abbo et al., 2003a,b). For this shift in the chickpea farming system to succeed, a major modification of phenology toward earliness would have been required in order to match the considerably shorter growing season. This selective pressure is evident today in the typically early flowering phenotype of the domesticated C. arietinum relative to wild Cicer species (Berger J. et al., 2004).

Our analyses identify a central region of chromosome 3 (referred to as region 3A) that makes a major contribution to this difference in flowering time between domesticated chickpea and its wild progenitor, C. reticulatum, in two populations utilizing different C. arietinum parents and grown in different conditions. This result is consistent with several previous reports. Das et al. (2015) found a recurrent major QTL on chromosome 3 in an interspecific cross using ICC4958 as the domesticated parent. Aryamanesh et al. (2010) found a major QTL on chromosome 3 defined by the same interval as that reported initially in RIP12 by Cobos et al. (2009) and narrowed in the present study. The fact that these studies use different and unrelated C. arietinum accessions suggests that the presence of early alleles at this locus may be a defining feature of domesticated chickpea.

Another interpretation is that the apparent importance of this locus could reflect the fact that the wild parents used in all of these studies are closely related and could conceivably carry a unique variant at this locus that is not representative of the wider C. reticulatum germplasm. However, this is discounted by the recent finding of von Wettberg et al. (2018), who examined crosses between a common domesticated parent and 29 newly collected wild accessions representing a much wider diversity, and found that all progenies shared a common major QTL in a 3.55 Mb interval of chromosome 3 encompassing the LG3A region. Interestingly, this region also appears to have a significant effect within domesticated chickpea, as revealed by our analysis of two intraspecific populations, and several other studies (e.g., Hossain et al., 2010; Rehman et al., 2011). However, its effect at this level seems to be more dependent on environment and the influence of other loci, suggesting that additional variation in this region may have also had a role in post-domestication diversification of flowering behavior. Further clarification of this scenario will require a wider analysis in both interspecific and intraspecific contexts, whether in biparental populations or through association approaches.

In addition to late phenology, wild chickpea is also distinguished from domesticated forms by the greater profusion of branches and prostrate growth habit (Ali et al., 2015), and we found that the same chromosomal region 3A also had a significant influence on both traits, particularly under SD conditions, as reflected by the presence in the region of a major QTL for each of these traits (QTL GH3 and QTL BI3). To date, two major loci, Hg and Hg2, have been reported to determine growth habit differences between C. arietinum and C. reticulatum (Muehlbauer and Singh, 1987; Kazan et al., 1993; Ali et al., 2015). Interestingly, Hg has been mapped to the central region of chromosome 3 by Winter et al. (2000), using a population derived from the same parents as CRIL2, and studies by Cobos et al. (2009), Aryamanesh et al. (2010), Ali et al. (2015) have all reported a locus influencing growth habit in this region. Since the GH3 QTL we describe here for CRIL2 is located within the intervals reported in these studies, it seems likely that all of these studies are detecting the same locus (Hg). Association of flowering with different features of shoot architecture has been previously described in a number of other legume species, including chickpea (Lichtenzveig et al., 2006; Julier et al., 2007; Lagunes Espinoza et al., 2012; González et al., 2016; Yang et al., 2017). In the case of QTL in the chickpea LG3A region, such an association could either represent the action of independent but tightly linked genes, or the pleiotropic effects of a single gene.

The discrete and approximately 1:1 segregation of flowering time in CRIL2 under controlled SD conditions enabled us to map DTF3A as a Mendelian trait to a narrower interval, thereby reducing the number of potential candidates. The only remaining clear candidates were a cluster of three FT genes orthologous to the FTa1/a2/c cluster identified in Medicago and pea by Hecht et al. (2005, 2011). FT genes have a widely conserved role as flowering promoters (Wickland and Hanzawa, 2015), and several recent studies show that this is also the case for legume FTa and FTc genes (Kong et al., 2010; Hecht et al., 2011; Laurie et al., 2011; Sun et al., 2011). We identified elevated expression of genes in the FT cluster in the early parents of all three crosses examined (**Figure 3**), implicating the general derepression of these genes as the likely molecular basis for the DTF3A effect. A comparable situation has been recently described in another legume, narrow-leafed lupin (Lupinus angustifolius), where a strong ancestral vernalization requirement has restricted production in warmer regions. This limitation has been overcome by the incorporation of dominant alleles at the major locus Ku, which confer de-repressed expression of a tightly linked FTc gene and permit flowering in the absence of vernalization (Nelson et al., 2017; Taylor et al., 2018). However, compared to lupin, where only a single FT gene is present in this genomic location, the presence of three genes in chickpea is clearly a more complex situation, and raises the question of which of them might be responsible for the QTL effects on photoperiod response, or for the QTLs for vernalization response that have been localized to the same genomic region on LG3 (Samineni et al., 2015; Pinhasi van-Oss et al., 2016).

The FTa1 gene plays a key role in regulation of flowering in both pea and Medicago, as loss-of-function mutants show significant impairment of flowering in both species, and overexpression in Medicago confers early flowering and reduced sensitivity to photoperiod and vernalization (Hecht et al., 2011; Laurie et al., 2011). FTa1 would therefore seem to be the strongest candidate for the causal gene underlying DTF3A. Although the role of FTc has not been systematically explored in either species, both MtFTc and PsFTc are strong activators of flowering when overexpressed in Arabidopsis, and their induction in apical tissues correlates closely with flowering (Hecht et al., 2011), suggesting that the higher levels of CaFTc expression could also potentially contribute to the earlier flowering of domesticated lines. Intriguingly, the most dramatic expression difference in the two interspecific comparisons was seen for FTa2, which was expressed at a low level in C. reticulatum parents and over 20 times higher in the domesticated parents. However, despite this striking association with early flowering, FTa2 was not expressed at all in the early parent of the intraspecific cross, indicating that the early flowering of domesticated relative to wild chickpea cannot result primarily from the high level of FTa2 expression. Also, in contrast to FTa1 and FTc, FTa2 from pea or Medicago is much less effective for induction of flowering when expressed in transgenic Arabidopsis, and its endogenous expression patterns are not consistently associated with flowering (Hecht et al., 2011; Laurie et al., 2011). Taken together, these observations suggest that FTa2 is less likely to be the basis for the interspecific effects of DTF3A, but it remains plausible that these effects might reflect general derepression across the cluster and a functional contribution from all three genes.

The strong photoperiod dependence of the DTF3A effect can also be interpreted in terms of the known role of FT genes in mediating environmental effects on flowering. In both pea and Medicago, photoperiod and vernalization responses appear to be integrated through FT genes, but whereas FTa genes are regulated by both photoperiod and vernalization, FTb genes are strictly regulated by photoperiod (Hecht et al., 2011; Laurie et al., 2011). In chickpea, a similar LD-specific expression of the single FTb gene is seen in both wild and domesticated parents (**Figure 3**) and may be sufficient for maximal promotion of flowering, which could provide an explanation for the minimal effect of DTF3A under these conditions. In contrast, under non-inductive SD conditions, the absence of FTb expression or other inputs would presumably expose any effects of elevated expression of the FTa/c cluster.

Whether or not one or more of the FT genes are indeed responsible for the effects of DTF3A, it is also of interest to consider what might be the molecular basis of their observed de-repression. The apparently specific effects of the QTL on expression of the underlying FT genes suggest a scenario in which the domesticated parents might have undergone modification of either a cis-acting or a closely linked trans-acting mechanism normally required for repression of the cluster. The absence of other plausible candidates in the defined region favors a cis-acting mechanism, and precedent for this is provided by recent studies in two other legumes. In Medicago, insertions in the third intron and 3′ flanking region of FTa1 confer gain-of-function phenotypes, with elevated FTa1 expression and dominant early flowering (Jaudal et al., 2013), whereas in narrow-leafed lupin, the derepression of FTc expression that underlies the effects of Ku alleles is associated with deletions in the FTc promoter (Nelson et al., 2017; Taylor et al., 2018). The recently reported role for the polycomb-group protein VRN2 (VERNALIZATION 2) in FTa1 repression in Medicago (Jaudal et al., 2016) points to the likely existence of both epigenetic and transcriptional components to this regulation.

Direct involvement of FT genes would also provide an explanation for the association of growth habit and flowering effects with the chromosome 3A region. It is becoming increasingly apparent that FT genes, in addition to being major flowering regulators, also affect plant architecture and growth habit across a wide range of plant species including Arabidopsis, tomato, rose and rice (Lifschitz et al., 2006; Tamaki et al., 2007; Hiraoka et al., 2013; Huang et al., 2013; Randoux et al., 2014; Tsuji et al., 2015; Weng et al., 2016). However, the most direct and relevant comparison with chickpea is again provided by Medicago, where MtFTa1 overexpression converts the prostrate habit of plants grown under SD to a more erect habit typical of LD (Laurie et al., 2011). This effect is clearly similar to that of the corresponding region on chromosome 3A in domesticated chickpea. In contrast, Medicago fta1 mutants show a highly branched, prostrate phenotype under LD similar to that of wild-type under SD, further emphasizing the multiple roles of FTa1. This observation strengthens the case that the major flowering time and growth habit loci in this region of chromosome 3 represent pleiotropic effects of misexpression of genes in the FT cluster, and possibly of FTa1 in particular.

An emerging theme in long-day legumes appears to be an important adaptive role for dominant genetic variants in the region of the FTa/c cluster that relax the environmental constraints on flowering and permit early flowering (Weller and Ortega, 2015). Whether a common molecular mechanism unites these adaptations and explains their repeated evolution remains to be determined. Among the ancient legume crops, chickpea in particular may represent a unique example in which modification of such a mechanism has been fundamentally important to crop success. Future, more detailed analyses should shed light on its molecular basis and physiological consequences, and its significance for chickpea domestication and adaptation.

### AUTHOR CONTRIBUTIONS

JW, RO, VH, and TM conceived the study. JW and RO designed the study. RO, RM, NC-G, RP, JR, and TM carried out the experiments and/or generated the data. RO and JW wrote the manuscript with inputs from the other authors. All authors analyzed the data.

#### FUNDING

This work was supported by an Australian Research Council Future Fellowship (FT120100048) to JW and the INIA (Instituto Nacional de Investigacion y Tecnologia Agraria y Alimentaria) project RTA2017-00041-00-00 (co-financed by the European Union through the European Regional Development Fund 2014–2020).

#### ACKNOWLEDGMENTS

We thank Michelle Lang and Tracey Winterbottom for assistance in the plant husbandry.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00824/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Ortega, Hecht, Freeman, Rubio, Carrasquilla-Garcia, Mir, Penmetsa, Cook, Millan and Weller. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Deciphering Genotype-by-Environment Interaction for Targeting Test Environments and Rust Resistant Genotypes in Field Pea (Pisum sativum L.)

Arpita Das<sup>1</sup> , Ashok K. Parihar<sup>2</sup> \*, Deepa Saxena<sup>3</sup> , Deepak Singh<sup>4</sup> , K. D. Singha<sup>5</sup> , K. P. S. Kushwaha<sup>6</sup> , Ramesh Chand<sup>7</sup> , R. S. Bal<sup>8</sup> , Subhash Chandra<sup>9</sup> and Sanjeev Gupta<sup>10</sup> \*

#### Edited by:

Jose C. Jimenez-Lopez, Consejo Superior de Investigaciones Científicas (CSIC) Granada, Spain

#### Reviewed by:

Diego Rubiales, Instituto de Agricultura Sostenible (IAS), Spain Xueyan Wang, Noble Research Institute, LLC, United States

#### \*Correspondence:

Ashok K. Parihar ashoka.parihar@gmail.com Sanjeev Gupta saniipr@rediffmail.com

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 23 January 2019 Accepted: 07 June 2019 Published: 10 July 2019

#### Citation:

Das A, Parihar AK, Saxena D, Singh D, Singha KD, Kushwaha KPS, Chand R, Bal RS, Chandra S and Gupta S (2019) Deciphering Genotype-by-Environment Interaction for Targeting Test Environments and Rust Resistant Genotypes in Field Pea (Pisum sativum L.). Front. Plant Sci. 10:825. doi: 10.3389/fpls.2019.00825

<sup>1</sup> Bidhan Chandra Krishi Viswavidyalaya, Mohanpur, India, <sup>2</sup> ICAR – Indian Institute of Pulses Research, Kanpur, India, <sup>3</sup> Chandra Shekhar Azad University of Agriculture and Technology, Kanpur, India, <sup>4</sup> ICAR – Indian Agricultural Statistics Research Institute, New Delhi, India, <sup>5</sup> Regional Agricultural Research Station, Assam Agricultural University, Jorhat, India, <sup>6</sup> G. B. Pant University of Agriculture and Technology, Pantnagar, India, <sup>7</sup> Banaras Hindu University, Varanasi, India, <sup>8</sup> Regional Research Centre, Punjab Agricultural University, Ludhiana, India, <sup>9</sup> Narendra Deva University of Agriculture and Technology, Faizabad, India, <sup>10</sup> All India Coordinated Research Project on MULLaRP, ICAR – Indian Institute of Pulses Research, Kanpur, India

Rust caused by Uromyces viciae-fabae is a major biotic constraint to field pea (Pisum sativum L.) cultivation worldwide. Exploiting host–pathogen interaction and resistant phenotypes is a modest strategy for controlling this intricate disease. However, resistance against this pathogen is partial and influenced by environmental factors. Therefore, the magnitude of environmental and genotype-by-environment interaction effects was assessed through multi-location and multi-year evaluation, in order to understand the dynamics of resistance and to identify durably resistant genotypes as well as ideal testing locations for rust screening. Initial screening was conducted with 250 diverse genotypes at rust hot spots. A panel of 23 promising field pea genotypes selected from the initial evaluation was further assessed under inoculated conditions for rust disease for two consecutive years at six locations in India. Integration of GGE biplot analysis and multiple comparison tests detected a higher proportion of variation in rust reaction due to environment (56.94%), followed by genotype × environment interaction (35.02%), which justified the need for multi-year and multi-location testing. The environmental component of disease reaction and the dominance of cross over interaction (COI) were evident from the inconsistent and non-repeatable genotypic response. The present study allocated the testing locations to categories according to their "repeatability" and "desirability index" over the years, along with their "discrimination power" and "representativeness." "Mega environment" identification helped in restructuring the ecological zonation and in location-specific breeding. Detection of non-redundant testing locations would expedite optimal resource utilization in future. Computing the confidence limit (CL) at the 95% level through bootstrapping supported the reliability of the GGE biplot and the precision of the genotype recommendations. The genotypes IPF-2014-16, KPMR-936 and IPF-2014-13 were identified as "ideal" genotypes, which can be recommended for release and exploited in resistance breeding programs for regions confronting field pea rust.

Keywords: rust, GGE biplot, repeatability, desirability index, host plant resistance, field pea

### INTRODUCTION

fpls-10-00825 July 8, 2019 Time: 16:8 # 2

Field pea or dry pea (Pisum sativum L.) is widely cultivated in West Europe, North America, India, Australia, Pakistan and South America as a cool-season food legume crop for human dietary protein and livestock feed (Kocer and Albayrak, 2012; Saxesena et al., 2013). It is predominantly an export-oriented cash crop, constituting about 40 percent of the total trade in pulses (FAOSTAT, 2017). The crop is valued primarily for its richness in digestible proteins (21.2–32.9%), coupled with important minerals and vitamins, and thus holds immense promise for alleviating protein malnutrition among resource-poor, vulnerable sections of society (Ceyhan and Avci, 2005). Given the importance of this legume, significant contributions have been made in the recent past to its genetic improvement and cultivar development. Unfortunately, biotic stresses, viz. rust, powdery mildew, downy mildew, Ascochyta blight, and root rot, are major impediments to field pea cultivation and cause yield and biomass losses worldwide.

Field pea rust incited by Uromyces spp. has become a major concern in Europe, North and South America, India, China, Australia, and New Zealand (EPPO, 2012). Uromyces viciae-fabae (Pers de Bary) is the causal organism of pea rust in tropical and subtropical regions, viz. India and China (Xue and Warkentin, 2001; Vijayalakshmi et al., 2005; Kushwaha et al., 2006; Joshi and Tripathi, 2012; Singh et al., 2015). Reports of U. pisi (Pers.) (Wint.) causing field pea rust in temperate regions of Spain, Canada, and Egypt are also available in the literature (Emeran et al., 2005; Barilli et al., 2009a,b). U. viciae-fabae is autoecious and cosmopolitan in nature and attacks all aerial parts of the plant (**Figure 1**). The pathogen mainly appears during mid-spring at the reproductive stage of the crop, from flowering initiation to pod development. It reduces the photosynthetic area, leaves pods underdeveloped on affected plants, and causes yield losses ranging from 57–100% (Upadhyay and Singh, 1994). Occurrence of the disease at early growth stages may result in complete failure of the crop. Thus, management of rust is vital for sustainable field pea production. Chemical control is not a holistic approach for controlling pea rust because of the complexity of pathogen behavior. The wide host range of this airborne pathogen, the lack of durable resistance and the quantitative nature of pea rust resistance are the crucial factors complicating disease management (Barilli et al., 2009a). Therefore, exploitation of host resistance is the most modest approach to rust control (Rubiales et al., 2013).

In grain legume–rust pathosystems, mostly incomplete resistance with no host cell necrosis is reported (Sillero et al., 2006). In some legumes, a hypersensitive reaction is also observed (Stavely et al., 1989; Sillero et al., 2000). However, in field pea, only incomplete resistance is observed against U. viciae-fabae (Xue and Warkentin, 2002; Chand et al., 2006) and U. pisi (Barilli et al., 2009c). The genetic basis of resistance to U. viciae-fabae is reported to be under either oligogenic (Katiyar and Ram, 1987) or polygenic control (Vijayalakshmi et al., 2005). Since variants exist in both the host and the pathogen, understanding the host-by-pathogen interaction patterns for a particular host–pathogen system can be difficult and challenging (Yan and Falk, 2002). Thus, identifying stable and durably resistant genotypes of field pea against rust, and then using these genotypes as donors in a resistance breeding program, would be a holistic and reliable approach to disease management.

Understanding the role of environments and genotype by environment interaction (GEI), concerning the pathosystem and host genotype stability across diverse locations, is imperative for an efficient resistance breeding program. Environmental influence on host–pathogen response often confounds the identification and recommendation of genotypes with durable resistance. It is therefore vital to identify "hot spots" having "repeatability" for evaluating genotypes and assessing their actual value with respect to the disease. Unfortunately, reports appraising field pea genotypes for durable rust resistance across different environments are meager, which creates a need to understand the dynamics of host genotype and pathosystem under varied locations. Various stability approaches have been widely used in recent years to determine GEI for disease resistance through multi-location trials (MLT) in different crops (Abamu et al., 1998; Robinson and Jalli, 1999; Forbes et al., 2005; Mukherjee et al., 2013; Tekalign et al., 2017). Among these, GGE biplot methodology, which is a graphical approach, is becoming increasingly popular among researchers for better explication of genotype and environment evaluation. Recently, the GGE biplot has been deployed to appraise genotypes with wide or specific adaptation in relation to resistance to different pathogens, viz. in faba bean for Ascochyta blight and chocolate spot (Rubiales et al., 2012; Tekalign et al., 2017), in chickpea for fusarium wilt and ascochyta blight (Sharma et al., 2012; Pande et al., 2013), in pigeonpea against sterility mosaic disease (Sharma et al., 2015), in lentil for fusarium wilt and rust (Parihar et al., 2017a, 2018) and in mungbean against MYMV (Alam et al., 2014; Parihar et al., 2017b). However, in the previous studies, "repeatability" and "desirability index" were not lucidly addressed during the assessment of test locations for proper delineation of "mega environments." Moreover, in the previous reports, the recommendation of genotypes and environments was based only on graphical biplot approaches without sound statistical assumptions, which cast doubt on the validity of the recommendations.

(d) Aeciospores of Uromyces viciae-fabae.

GGE biplots have not been expanded previously to appraise host genotypes response toward rust disease across varied locations, for identification of the best resistant genotypes, as well as "ideal" testing locations for better differentiation of resistance level among field pea genotypes. Hence, the present study was attempted through GGE biplot approach to enumerate the effect of GEI on field pea rust tested across various locations over the years, for identifying stable and superior field pea genotypes that could be recommended for future cultivation in the areas confronting rust problem. Additionally, the aim of the present study was to assess the influence of environments on host pathogen response along with identification of "ideal" test locations followed by grouping of various test locations into distinct "megaenvironments" for optimum resource allocation in future testing. In the present study, integration of bootstrapping for generating confidence limit (CL) at the 95% level validated the genotypes recommendation.

#### MATERIALS AND METHODS

#### Initial Testing

In a preliminary screening under the aegis of AICRP on MULLaRP, Kanpur, India (All India Co-Ordinated Research Project on Field pea and other pulses), a total of 250 genotypes of field pea, consisting of released varieties, germplasm accessions and advanced breeding lines, were evaluated for rust reaction at nine locations during 2013–2014 in an Augmented Block Design. Each genotype was sown in a plot of three rows of 3-meter length, spaced at 40 cm, with a plant-to-plant distance of 10 cm. All the testing locations were deliberately selected for the prevalence of U. viciae-fabae. A rust-susceptible check was planted as a spreader row after every 10 rows of the test population, and five spreader rows were planted on all sides of the experimental area. A uniform basal dose of 20 kg N, 40 kg P<sub>2</sub>O<sub>5</sub> and 40 kg K<sub>2</sub>O was applied at the time of sowing. From this preliminary evaluation, a subset of 23 promising field pea genotypes was selected on the basis of rust resistance reaction for multi-location and multi-year evaluation.

#### Multi-Environment Evaluation (MEE)

The 23 promising field pea genotypes (**Table 1**) identified in preliminary screening were further evaluated for rust reaction across six diverse locations (**Table 2**) during the winter season in two consecutive years (2014–2015 and 2015–2016) under natural epiphytotic conditions. The aecial strain of U. viciae-fabae was present at all the testing locations. The genotypes were planted as per standard agronomic practices, with 4 m row length and 40 cm × 10 cm row-to-row and plant-to-plant spacing, respectively. A standard susceptible check, "HFP 4," was sown after every 3 rows as a spreader infector row to maintain sufficient disease pressure under natural conditions. Five spreader rows were also grown around the experimental area. Potted spreader plants heavily infected with U. viciae-fabae were kept throughout the field to serve as additional sources of inoculum. To increase the humidity, fields were irrigated at regular intervals until the grain attained full size. Further, to elucidate the differences among the test environments, principal component analysis (PCA) was performed on weather parameters of the locations, viz. maximum and minimum temperature, rainfall, rainy days and relative humidity (**Figure 2**). The results of the PCA validated the significant differences among the selected environments.

#### Disease Screening and Data Recording in MEE

The disease was assessed following the 1–9 scale of Subrahmanyam et al. (1995) described earlier. On the basis of disease scoring, the tested genotypes were classified into five distinct groups: (1) highly resistant; (2–3) resistant; (4–5) moderately resistant/susceptible; (6–7) susceptible; and (8–9) highly susceptible. Observation regarding rust was also recorded by visual estimation of leaf area covered with rust pustules (%).
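The grouping of scores into reaction classes can be written down as a small lookup. The sketch below is illustrative only: the function name is hypothetical, and the handling of fractional mean scores (assigning them to the class of the next integer up) is an assumption, since the text defines the classes for integer scores.

```python
def classify_rust_score(score):
    """Map a 1-9 rust score to the five reaction classes listed in the text."""
    if not 1 <= score <= 9:
        raise ValueError("score must lie on the 1-9 scale")
    if score <= 1:
        return "highly resistant"
    if score <= 3:
        return "resistant"
    if score <= 5:
        return "moderately resistant/susceptible"
    if score <= 7:
        return "susceptible"
    return "highly susceptible"   # scores above 7 and up to 9


# Example: classify a few scores, including one fractional mean (assumption).
classes = [classify_rust_score(s) for s in (1, 3, 4.5, 7, 9)]
```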

TABLE 1 | Information regarding the field pea genotypes.

#### Construction of GGE Biplot

The GGE biplot was constructed from the first two principal components (PCs) resulting from singular value decomposition (SVD), with each element of the matrix estimated by the following formula (Yan et al., 2000; Yan and Kang, 2003):

$$Y_{ij} = \mu + e_{j} + \sum_{n=1}^{N} \lambda_{n} \gamma_{in} \delta_{jn} + \varepsilon_{ij}$$

Where,

Yij = mean rust score of genotype i in environment j.

μ = grand mean.

ej = main effect of environment j.

λn = singular value for PC axis n.

γin and δjn = genotype and environment PC scores for axis n.

N = number of PCs retained in the model.

εij = residual effect ∼ N(0, σ<sup>2</sup>).
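As a minimal sketch of how the model above can be fitted, the following example centers a genotype × environment table by environment (removing μ + e_j) and extracts the two leading singular triplets by power iteration, so that no external library is needed. The data, function names and iteration count are hypothetical; the study itself used R, and a production analysis would normally call a library SVD routine.

```python
import math

def center_by_environment(Y):
    """Subtract each column (environment) mean, i.e. remove mu + e_j."""
    n, p = len(Y), len(Y[0])
    means = [sum(row[j] for row in Y) / n for j in range(p)]
    return [[row[j] - means[j] for j in range(p)] for row in Y]

def leading_components(A, k=2, iters=500):
    """Return k triplets (lambda_n, gamma_n, delta_n) by power iteration."""
    A = [row[:] for row in A]            # work on a copy; it is deflated below
    n, p = len(A), len(A[0])
    triplets = []
    for _component in range(k):
        norm0 = math.sqrt(sum((j + 1) ** 2 for j in range(p)))
        v = [(j + 1) / norm0 for j in range(p)]   # deterministic unit start
        for _step in range(iters):
            Av = [sum(a * b for a, b in zip(row, v)) for row in A]
            w = [sum(A[i][j] * Av[i] for i in range(n)) for j in range(p)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:             # nothing left to extract
                break
            v = [x / norm for x in w]
        Av = [sum(a * b for a, b in zip(row, v)) for row in A]
        sigma = math.sqrt(sum(x * x for x in Av))
        u = [x / sigma for x in Av] if sigma > 1e-12 else [0.0] * n
        triplets.append((sigma, u, v))
        for i in range(n):               # deflate: remove the component found
            for j in range(p):
                A[i][j] -= sigma * u[i] * v[j]
    return triplets

# Hypothetical table: 4 genotypes (rows) x 3 environments (columns).
Y = [[2.0, 3.0, 5.0],
     [4.0, 4.0, 6.0],
     [6.0, 7.0, 4.0],
     [8.0, 6.0, 9.0]]
E = center_by_environment(Y)
(l1, g1, d1), (l2, g2, d2) = leading_components(E)
total_ss = sum(x * x for row in E for x in row)
explained = (l1 ** 2 + l2 ** 2) / total_ss   # share of G + GE shown by PC1 and PC2
```

Here `g1` and `g2` play the role of the genotype scores γ and `d1` and `d2` the environment scores δ; how the singular values are split between them before plotting is a scaling choice that this sketch leaves open.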

For genotype evaluation, as well as for determining stability, an "average environment coordination" (AEC) view of the GGE biplot was constructed, which facilitates genotype comparisons based on mean disease score and stability across environments within a "mega-environment" (Yan, 2001, 2002). A performance line passing through the origin of the biplot was used to determine the mean performance of each genotype in terms of rust score. The arrow on the performance line points toward higher disease reaction, i.e., higher susceptibility (Yan and Falk, 2002). Similarly, for evaluation of test environments, the "discriminating power vs. representativeness" view of the GGE biplot was constructed, in which the "ideal" test environment should be both discriminating of the genotypes and representative of the "mega-environment" (Yan et al., 2007). The "repeatability" of a test location, which reflects consistency in genotypic performance, was measured by the mean value of the genetic correlations between years within the location (Yan et al., 2011). Additionally, a "desirability index" of the test locations was computed, considering the association among the test environments and the distance from the ideal genotype, based on the AEC and taking genotypic stability and adaptability into account (Yan and Holland, 2010). To determine the relationship between test locations, the angles between the environment vectors were used to judge the correlation between environments (Yan and Kang, 2003). Additionally, to ascertain the superiority of genotypes in different test environments, as well as to group test environments into different "mega environments," a "which-won-where" view of the GGE biplot was prepared (Yan and Rajcan, 2002). Finally, to assess the validity of the GGE biplot, bootstrapping, a nonparametric resampling approach, was used to construct CLs at the 95% level for the individual principal component scores of both genotypes and environments, as suggested by Yang et al. (2009). In the raw data, columns represented environments (p = 12) and rows represented genotypes (n = 23). The raw data were centered on the mean of each environment so that each of the p dimensions had a mean of zero. Row-wise nonparametric resampling was done from the data matrix to obtain the bootstrap samples. The number of bootstrap samples was chosen to be 40 times the number of rows (B = 40 × 23 = 920). The endpoints of the 95% CLs were estimated for genotypic and environmental scores.
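The row-wise resampling step can be sketched generically as a percentile bootstrap. In the study the statistic was each principal component score; to keep the example short, the statistic used here is the standard deviation of one environment column, which is an illustrative substitute and not the authors' procedure. The data, seed and function names are hypothetical.

```python
import random

def bootstrap_limits(rows, statistic, n_boot, alpha=0.05, seed=0):
    """Percentile confidence limits from row-wise resampling with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    values = sorted(
        statistic([rows[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = values[int((alpha / 2) * n_boot)]
    upper = values[min(n_boot - 1, int((1 - alpha / 2) * n_boot))]
    return lower, upper

def column_sd(j):
    """Build a statistic: standard deviation of environment j in a sample."""
    def stat(sample):
        col = [row[j] for row in sample]
        mean = sum(col) / len(col)
        return (sum((x - mean) ** 2 for x in col) / len(col)) ** 0.5
    return stat

# Hypothetical scores: 23 genotypes (rows) x 2 environments (columns).
data_rng = random.Random(1)
scores = [[data_rng.randint(1, 9), data_rng.randint(1, 9)] for _ in range(23)]
B = 40 * len(scores)                       # 920 resamples, as in the text
low, high = bootstrap_limits(scores, column_sd(0), B)
```

Two scores whose intervals do not overlap would be read as statistically different at the chosen level, which is how the limits are used in the genotype comparison.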

#### Data Analysis

The effects of environment, genotype and their interaction were determined by analysis of variance (ANOVA) across locations and for each individual location, using mixed-model analysis in GENSTAT (trial version 18; VSN International, Hemel Hempstead, United Kingdom). The ANOVA partitioned the variation into the effects of genotype, environment and their interaction. Significant differences among genotype means and among environment means were assessed by the LSD test at the P = 0.05 probability level. The distribution of rust scores across genotypes and across environments was presented as a box plot. Relatedness of the genotypes and environments was calculated using the Ward method and represented as a hierarchical cluster. The GGE biplot analysis was done using R software (R Development Core Team, Vienna).

TABLE 3 | Analysis of variance for rust incidence in 23 genotypes of field pea evaluated at six locations in India during Year-1 (2014–2015) and Year-2 (2015–2016).

∗∗P < 0.01.

#### RESULTS

Field pea genotypes exhibited variable rust reactions across the tested locations. The pooled ANOVA of rust reaction revealed that the effects of genotype, environment and genotype × environment interaction were all significant (**Table 3**). Environment and GEI contributed 56.94 and 35.02% of the total variation, respectively, which indicated the confounding role of the environment in the rust reaction of the genotypes tested across the locations. Likewise, within the individual testing locations, the effects of genotype, year and genotype × year interaction on rust reaction were significant (**Supplementary Table 1**).
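The relative contributions quoted above are ratios of sums of squares. As a simplified illustration, the sketch below partitions a two-way table of cell means into genotype, environment and interaction sums of squares; it ignores replication and the mixed-model structure used in GENSTAT, and the function name and toy data are hypothetical.

```python
def variance_shares(Y):
    """Percent of total sum of squares due to G, E and G x E in a table of means."""
    n, p = len(Y), len(Y[0])
    grand = sum(map(sum, Y)) / (n * p)
    g_mean = [sum(row) / p for row in Y]                         # genotype means
    e_mean = [sum(row[j] for row in Y) / n for j in range(p)]    # environment means
    ss_g = p * sum((m - grand) ** 2 for m in g_mean)
    ss_e = n * sum((m - grand) ** 2 for m in e_mean)
    ss_ge = sum((Y[i][j] - g_mean[i] - e_mean[j] + grand) ** 2
                for i in range(n) for j in range(p))
    total = ss_g + ss_e + ss_ge
    return {"G": 100 * ss_g / total,
            "E": 100 * ss_e / total,
            "GE": 100 * ss_ge / total}

# Toy additive table: rows differ by 2, columns by 1, so there is no interaction.
shares = variance_shares([[1.0, 2.0], [3.0, 4.0]])
```

In a purely additive table like this one the interaction share is zero; a large interaction share signals that genotype rankings change between environments.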

Inconsistent performance of the genotypes was observed over both years and locations, as shown by the frequency distribution of rust reaction of the genotypes at each location (**Figure 3**). The average rust score of the susceptible check (HFP-4) varied from 6.0–9.0 in both years and over the locations, indicating adequate disease pressure on the tested genotypes (**Table 4**). The magnitude of rust in the field pea genotypes over both years and across the environments is illustrated in a box plot (**Figure 4**). Genotypes performed inconsistently, reflecting the presence of cross over interaction (COI) across the locations over both years. As expected, the highest rust score was found in the susceptible check, with a mean rust score of 7.2. Across the locations and over both years, Pant-P-250, Pant-P-266, IPF-2014-13, KPF-1023, KPMR-936, and Pant-P-243 were identified as moderately resistant genotypes. The association between testing environments in terms of rust score was tested by Spearman's correlation analysis (**Figure 5**). Kanpur exhibited a negative association with all the locations except Pantnagar, whereas the other five locations were positively associated with each other. The significant positive association between Gurdaspur and Varanasi confirmed that these locations closely resemble each other in the rust reaction of the tested genotypes.
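Because rust scores on a 1–9 scale contain many ties, a rank correlation between two environments needs average ranks. A minimal sketch of Spearman's coefficient computed that way follows; the function names are hypothetical and the inputs stand for the genotype scores recorded at two locations.

```python
def average_ranks(values):
    """Rank values from 1, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        for k in range(start, end + 1):
            ranks[order[k]] = (start + end) / 2 + 1
        start = end + 1
    return ranks

def spearman(x, y):
    """Spearman correlation, computed as the Pearson correlation of average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5    # undefined if either input is constant

# Hypothetical scores of five genotypes at two locations.
rho = spearman([2, 3, 3, 5, 8], [1, 4, 3, 6, 9])
```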

LSD, least significant difference-based t grouping. Mean values calculated by the least significant difference method.

FIGURE 4 | Boxplot view illustrating the distribution of rust assessment among 23 genotypes of field pea across six test locations. The box represents the area from the first quartile to the third quartile. A horizontal line goes through the box at the median. The whiskers (vertical line) go from each quartile to the minimum or maximum.

#### Evaluation of Genotypes

Mean performance and stability of the genotypes across the locations were portrayed graphically through an "AEC" view of the biplot (**Figure 6**). The single-arrowed line in the graph, known as the "AEC abscissa," passes through the biplot origin and points toward higher disease reaction. The figure shows that Pant-P-250 (16), KPF-1023 (11), Pant-P-266 (17), IPF-2014-13 (7), KPMR-936 (12), and IPF-2014-16 (8) exhibited less rust reaction. Genotypic stability is generally assessed from the absolute length of the projection of a genotype. The best performing genotypes are those with the lowest disease reaction (larger negative projection on the AEC abscissa) and the highest stability, i.e., a projection onto the AEC ordinate close to 0 (Yan, 2014). Accordingly, IPF-2014-16 (8) was the most "ideal" genotype, having a short projection from the "AEC abscissa" along with moderate resistance against rust. Genotypes located closer to the "ideal" genotype are more "desirable" than others. Therefore, KPMR-936 (12), followed by IPF-2014-13 (7), were considered "desirable" genotypes, owing to their position close to the "ideal" genotype, their low rust score and their consistent performance. The 95% CLs of the individual genotypic and environmental scores for PC1 and PC2 (**Supplementary Table 2**), computed through bootstrapping, showed that the visible differences among the genotypes in the biplot were attributable to differences in their individual PC2 scores (**Figure 7**). The 95% CLs also confirmed that the "ideal" genotype, IPF-2014-16 (8), was statistically different on the basis of PC2 scores (Lower limit: −3.60 and Upper limit: 0.67) from the two desirable genotypes, viz. KPMR-936 (12) and IPF-2014-13 (7). However, the PC2 intervals of the two desirable genotypes overlapped, so they were not statistically different from each other. On the basis of rust reaction, the tested field pea genotypes were grouped into three major clusters, with 16 genotypes in cluster-I, five in cluster-II and only two in cluster-III (**Figure 8**).
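The AEC reading described above amounts to two projections of each genotype point. The sketch below assumes, as is common for this view, that the AEC abscissa runs from the origin through the average of the environment points; the coordinates and names are hypothetical and are not taken from the study's figures.

```python
import math

def aec_projections(genotypes, environments):
    """Project genotype biplot points onto the AEC abscissa and ordinate."""
    ax = sum(e[0] for e in environments) / len(environments)
    ay = sum(e[1] for e in environments) / len(environments)
    length = math.hypot(ax, ay)
    ux, uy = ax / length, ay / length          # unit vector along the AEC abscissa
    result = {}
    for name, (x, y) in genotypes.items():
        mean_score = x * ux + y * uy           # position along the abscissa
        instability = abs(-x * uy + y * ux)    # distance from the abscissa
        result[name] = (mean_score, instability)
    return result

# Hypothetical biplot coordinates (PC1, PC2).
envs = [(1.0, 0.2), (1.0, -0.2)]
genos = {"G8": (-2.0, 0.5), "G23": (3.0, -1.0)}
proj = aec_projections(genos, envs)
```

With rust scores, a more negative first value means less disease on average, and a second value near zero means a more stable genotype.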

#### Evaluation of the Environments

Among the test locations, during the first year, Faizabad exhibited the longest environmental vector, followed by Gurdaspur and Varanasi, whereas Pantnagar had the shortest projection (**Figure 9**). Faizabad was therefore identified as the most "discriminating" location, i.e., the one best able to discriminate among genotypes. In contrast, during the second year (2015–2016), Shillongani exhibited the longest vector and hence the highest "discrimination" power, followed by Kanpur and Faizabad. The single-arrowed line in the graph is the "AEC abscissa." A smaller angle between an environment vector and the "AEC abscissa" indicates a location with stronger "representative" power. During the first year, Shillongani followed by Kanpur made the smallest angles with the AEC and were thus identified as the most "representative" test locations, whereas during the second year (2015–2016), Faizabad and Gurdaspur, with high disease pressure, were the most "representative" test locations, although Gurdaspur recorded the lowest "discrimination" power in that year. Locations with high "discrimination" power but relatively low "representativeness," such as Faizabad and Pantnagar, should be considered for detecting stable genotypes. In the present study, the "repeatability" of the testing locations over both years was assessed by visualizing their association. Among all the locations over the two years, Shillongani (R<sup>2</sup> = 0.549) and Pantnagar (R<sup>2</sup> = 0.480) were highly "repeatable" locations, able to show consistent genotypic performance with non-cross over interaction (NCOI) toward rust invasion (**Table 5**). The "desirability index" of a testing location reflects its pooled performance based on its "discriminatory" power and its "representativeness." Based on two years of data, Shillongani, with the highest "desirability index," was identified as the "ideal" testing location or "hot spot" for screening rust resistance in field pea genotypes (**Table 5**). Additionally, Faizabad and Pantnagar could also be considered for field pea rust screening.
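The location measures discussed above can be read directly off the environment vectors. The sketch below takes vector length as discriminating power and the cosine of the angle to the average-environment axis as representativeness, and combines them as a product. Treating the product as a desirability index is an assumption made here for illustration; the exact index behind Table 5 is not restated in this section. Coordinates and names are hypothetical.

```python
import math

def location_metrics(env_vectors):
    """Vector length, cosine to the average-environment axis, and their product."""
    ax = sum(v[0] for v in env_vectors.values()) / len(env_vectors)
    ay = sum(v[1] for v in env_vectors.values()) / len(env_vectors)
    a_len = math.hypot(ax, ay)
    metrics = {}
    for name, (x, y) in env_vectors.items():
        length = math.hypot(x, y)                       # discriminating power
        cosine = (x * ax + y * ay) / (length * a_len)   # representativeness
        metrics[name] = (length, cosine, length * cosine)
    return metrics

# Hypothetical environment coordinates (PC1, PC2).
m = location_metrics({"A": (2.0, 0.0), "B": (0.0, 2.0)})
```

A long vector at a wide angle to the axis discriminates well but represents the target region poorly, which is why the two quantities are combined rather than read separately.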

#### Identification of Mega Environments

The two-dimensional "which-won-where" polygon view of the GGE biplot is used to identify the best genotypes for specific test environments. Perpendicular lines are drawn from the origin of the biplot to each side of the polygon, separating the biplot into several sectors, each with one "winning" genotype located at a vertex of the polygon. In the present study, Pant-P-250 (16) had the lowest rust susceptibility but was placed far from the origin, indicating inconsistent performance (**Figure 10**). Additionally, Pant-P-266 (17), IPF-2014-13 (7), KPF-1023 (11), KPMR-936 (12), and Vikash (20) also exhibited low rust infection. Conversely, the local check (23) was located directly opposite Pant-P-250 (16), on the other side of the origin, and was thus revealed as the most susceptible genotype. Among all the genotypes showing resistant to moderately resistant responses, the most consistent performance was shown by IPF-2014-16 (8), which was placed adjacent to the "AEC abscissa" with the lowest projection onto the "AEC ordinate." The equality lines partitioned the graph into four sectors during the first year, whereas three sectors were observed in the second year. These sectors can be termed "mega environments," affirming environmental variability and the existence of COI. During the first year, Gurdaspur and Shillongani each represented a separate "mega environment" with distinct ecological features and genotypic responses toward rust. The other two "mega environments" comprised two locations each: Varanasi and Faizabad formed one, and Kanpur and Pantnagar formed the other. The pattern of COI in the second year deviated from that of the first year. In the second year, Kanpur and Varanasi each constituted a separate "mega environment," while the remaining four locations formed the third. Thus, considering the rust response of the genotypes over both years together, the tested environments could be divided into four different "mega environments."
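The polygon in the "which-won-where" view is the convex hull of the genotype points, and its vertices are the candidate "winning" genotypes. A minimal sketch using Andrew's monotone chain algorithm is given below; the point labels and coordinates are hypothetical, and drawing the perpendicular sector lines is left out.

```python
def cross(o, a, b):
    """Z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def polygon_vertices(points):
    """Names of the points on the convex hull, in counter-clockwise order."""
    items = sorted(points.items(), key=lambda kv: kv[1])
    if len(items) < 3:
        return [name for name, _ in items]

    def half(seq):
        chain = []
        for name, pt in seq:
            # Drop points that would make a clockwise or straight turn.
            while len(chain) >= 2 and cross(chain[-2][1], chain[-1][1], pt) <= 0:
                chain.pop()
            chain.append((name, pt))
        return chain[:-1]       # last point is the first point of the other half

    hull = half(items) + half(reversed(items))
    return [name for name, _ in hull]

# Hypothetical genotype coordinates; g5 lies inside the polygon.
vertices = polygon_vertices({
    "g1": (0.0, 0.0), "g2": (1.0, 0.0), "g3": (1.0, 1.0),
    "g4": (0.0, 1.0), "g5": (0.5, 0.5),
})
```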

#### DISCUSSION

Field pea rust is gaining prominence in Europe, India and China as it causes huge yield losses. Management of rust is difficult because of the wide host range of the pathogen and the quantitative nature of the host–pathogen interaction. Moreover, the influence of weather variables obscures the picture, which makes repeated appraisal of disease severity at diverse locations necessary in the search for durable resistance sources. Environmental effects as well as complex GEI may reduce genetic gain under selection and complicate the selection and ranking of resistant genotypes. The presence of COI in different environments changes the genotype ranking and reduces the correlation between phenotypic and genotypic values, and thus calls for multi-environment screening of genotypes before drawing conclusions about genotypic superiority. Unfortunately, screening for a foliar disease like rust is tedious and costly, particularly when natural screening is the only option and unpredictable weather parameters may change the disease spectrum (Sharma et al., 2016; Parihar et al., 2018). Multi-location testing places a burden on resource-poor states and therefore calls for identification of "hot spots" or ideal testing locations, as well as "mega environment" delineation based on multi-year data, for disease resistance screening.

In the present study, GGE biplot (Yan and Kang, 2003) methodology was applied to assess rust resistance in field pea genotypes with general or specific adaptation, to appraise ideal test locations, and consequently to delineate "mega environments" for restructuring of zonation. An attempt was also made to recommend durably resistant genotypes against field pea rust more precisely by integrating bootstrapping to generate CLs at 95%. Significant environment (56.94%) and GEI (35.02%) effects on rust reaction were reflected in the ANOVA (**Table 3**), confirming the impact of GEI and the dynamic nature of the rust disease spectrum in the tested environments. Testing locations with distinct agro-ecologies generated a differential response of the field pea genotypes and changed the genotype ranking. Previous reports affirmed the role of environment and GEI in complicating the selection of stable genotypes with durable resistance against various pathogens (Pande et al., 2013; Alam et al., 2014; Sharma et al., 2015, 2016; Funga et al., 2017; Parihar et al., 2017a,b, 2018).

Numbers correspond to genotypes as listed in Table 1. Locations are: For Year-1 (2014–2015): FZB\_1, Faizabad; GDP\_1, Gurdaspur; KN\_1, Kanpur; PNR\_1, Pantnagar; SLG\_1, Shillongani; and VAR\_1, Varanasi. For Year-2 (2015–2016): FZB\_2, Faizabad; GDP\_2, Gurdaspur; KN\_2, Kanpur; PNR\_2, Pantnagar; SLG\_2, Shillongani; and VAR\_2, Varanasi.

The field pea genotypes responded significantly differently to rust at the different testing locations, which also validates the GE influence. The rust reaction was relatively high in Shillongani, followed by Pantnagar, and lowest at Kanpur. In a polycyclic disease like rust, inoculum production is a crucial factor determining the rate of the epidemic, and it is highly influenced by weather variables (Kushwaha et al., 2007). The tested genotypes in the present study also recorded variable responses in different locations, confirming the presence of COI and thus implying the importance of multi-environment testing. COI is non-additive and non-separable in nature and suggests breeding for specific adaptation (Gregorius and Namkoong, 1986; Baker, 1990; Singh et al., 1999; Yan and Hunt, 2002; Rakshit et al., 2012; Xu et al., 2014). Differences in weather variables among the testing locations, as well as genetic variation in the host and pathosystem, ultimately generated variable genotypic response over the locations and over the years. Previous studies also reported inconsistent genotypic responses with variable disease reaction in other crops (Alam et al., 2014; Sharma et al., 2015, 2016; Parihar et al., 2017a,b). During screening, a sufficient disease score was corroborated by the consistent reaction of the susceptible check across the locations and over the years.

TABLE 5 | Standardized test location evaluation parameters.

In a comprehensive plant breeding program, breeders prefer genotypes with broad adaptation that interact least with environments. Unfortunately, in resistance breeding programs this is infrequent because of the complexity of the host-pathogen interaction and its consequences for disease prevalence. Multi-environment testing helps to find genotypes with small spatial variation, i.e., consistent performance over locations, along with small temporal variation, i.e., coherent performance over years (Kang, 2002).

In the "Mean vs. Stability" view of the GGE biplot, longer projections onto the "AEC ordinate," in either direction, signify a higher GE interaction effect and represent poor stability (Yan and Tinker, 2006), whereas the projections of the genotype vectors onto the "AEC abscissa" represent average performance (Yan and Falk, 2002). In the present study, Pant-P-250 (16), KPF-1023 (11), Pant-P-266 (17), IPF-2014-13 (7), and KPMR-936 (12) exhibited higher negative projection on the ATC abscissa, and thus less rust reaction. IPF-2014-16 (8) was identified as the most "stable" and "ideal" genotype with lowest projection onto the "AEC abscissa." Additionally, in the present study, KPMR-936 (12) and IPF-2014-13 (7) were identified as "desirable" genotypes and were positioned closer to the ideal genotype, IPF-2014-16 (8), than the others. Like the "ideal" genotype, these two "desirable" genotypes also showed a resistance response, i.e., higher negative projection on the ATC abscissa, with less projection on the AEC ordinate, i.e., high stability (Yan et al., 2007; Parihar et al., 2018). These strategies have been successfully deployed for identifying stable and resistant genotypes in different crops (Beyene et al., 2011; Sharma et al., 2015; Tekalign et al., 2017; Parihar et al., 2017a,b; Sillero et al., 2017). Further, by using bootstrapping to compute CL at 95%, it was confirmed that the ideal genotype, IPF-2014-16 (8), was statistically different from the two desirable genotypes, whereas there was no statistical difference between the two desirable genotypes. Thus, the "ideal" genotype, along with either of the "desirable" genotypes with durable resistance, would be a valuable genetic resource for future resistance breeding programs addressing rust in field pea. In the present study, integration of the GGE biplot with a statistical procedure such as bootstrapping increased the precision of the visual observations underlying genotype recommendation.
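The geometry of the "Mean vs. Stability" view can be illustrated with a short sketch. It assumes the common GGE construction (environment-centered data, singular value decomposition, genotype-focused scaling, first two principal components) and is not the software or exact settings used in the study. The function name `gge_mean_stability` and the demo scores are hypothetical, and at least two environments are required.

```python
import numpy as np


def gge_mean_stability(scores):
    """Project genotypes onto the average-environment axis (mean
    performance) and onto its perpendicular (instability)."""
    y = np.asarray(scores, dtype=float)
    centered = y - y.mean(axis=0)            # remove environment main effects
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    genotypes = u[:, :2] * s[:2]             # genotype-focused scaling
    environments = vt[:2].T                  # one row per environment
    avg_env = environments.mean(axis=0)      # the "average environment"
    abscissa = avg_env / np.linalg.norm(avg_env)
    ordinate = np.array([-abscissa[1], abscissa[0]])
    mean_proj = genotypes @ abscissa         # lower value = less rust
    instability = np.abs(genotypes @ ordinate)
    return mean_proj, instability


# Hypothetical rust scores: 4 genotypes (rows) x 2 environments (columns)
demo = [[10.0, 12.0], [30.0, 28.0], [20.0, 40.0], [44.0, 20.0]]
mean_proj, instability = gge_mean_stability(demo)
```

In this toy table the first genotype has the most negative projection on the abscissa (least rust), while the second has the shortest projection on the ordinate (most stable). With more than two environments the two retained components only approximate the data, so the projections become approximations as well.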

During a multi-environment trial, plant breeders should carefully select testing locations considering their "discrimination" power to categorize the genotypes, "representativeness" of the mega-environment of interest, "desirability index," and "repeatability" across years in genotype ranking (Yan et al., 2011). A previous report stated that "representativeness" is the key factor in deciding how a test location should be used in genotype evaluation, assuming adequate discriminating ability (Yan et al., 2007). Additionally, "repeatability" over the years and the "desirability index" of the testing locations can be used to assess the "representativeness" of the testing locations, allowing refinement in the selection of future test locations. In the current study, during the first year, Faizabad appeared as the most "discriminating" location and Shillongani as the most "representative" location, while during the second year the situation was reversed. Therefore, during the first year, Shillongani was identified as the "ideal" test location, and conversely during the second year, Faizabad was revealed as the "ideal" testing location. The different "ideal" environments in the two years of the study were quite apparent and reflected the large contribution of environment to the total variation. During multi-environment testing, multi-year data are essential for computing the "repeatability" of the locations and for proper visualization of repeatability in genotype × environment interaction (Yan et al., 2000, 2007, 2011; Yan and Rajcan, 2002; Yan and Holland, 2010). Shillongani and Pantnagar, owing to weather variables that were consistent over both years with respect to genotype response toward rust, were recorded as highly "repeatable" locations. Additionally, the "desirability index" suggested that Shillongani, followed by Faizabad, were the "ideal" locations for rust screening. Finally, considering the four parameters ("discrimination," "representativeness," "repeatability," and "desirability index") in our study, all the testing locations have been classified into four categories. Shillongani would be considered a "Type-I" or "ideal" testing location, serving as a core location for screening genotypes during the early breeding stage.

Partitioning testing locations into distinct "mega environments" is the only way of getting consistent genotype performance within a particular sector. GGE biplot methodology can successfully delineate "mega environments" through the "which-won-where" view (Gauch and Zobel, 1997; Yan and Kang, 2003; Yan et al., 2007). The purpose of mega-environment identification is to understand the complex GEI pattern within a region in order to exploit specific adaptation and to increase selection response (Yan et al., 2011). Previous reports defined a "mega environment" as consisting of locations exhibiting similar and repeatable genotypic responses across the years (Yan et al., 2000; Yan and Rajcan, 2002; Yan and Tinker, 2006). Conversely, "non-repeatability" during "mega environment" selection in the present study was obvious due to non-repeatable association among the different locations, as well as inconsistency in genotypic and environmental scores (Krishnamurthy et al., 2017). Locations within each "mega environment" constructed in the present study revealed identical conclusions regarding genotypic response toward rust reaction. Judicious alignment of testing locations and converging breeding efforts in a location-specific manner holds great relevance for improving precision in the resistance breeding program.

The present study examined the influence of environment and genotype-by-environment interaction on the response of field pea genotypes to rust. The inconsistent response of genotypes and locations across years reflected the influence of environment on the volatility of the rust score. Our study discriminated "ideal" and "desirable" genotypes for future rust screening of field pea in India. IPF-2014-16 as the "ideal" genotype, and KPMR-936 and IPF-2014-13 as "desirable" genotypes, all with consistent performance, should be recommended for cultivation in areas facing rust problems.

# AUTHOR CONTRIBUTIONS

SG and KK designed the overall project. AD wrote the manuscript under the supervision of SG, RC, and AP. DS and AD analyzed the data. DSa, KS, KK, RB, and SC performed the phenotyping and disease scoring. SG, AP, and RC edited and finalized the manuscript.

# ACKNOWLEDGMENTS

We acknowledge the contributions of the centers of All India Coordinated Research Project (AICRP) on field pea and other pulses for executing these trials at respective locations properly and recording observations on rust reactions meticulously.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00825/full#supplementary-material







**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Das, Parihar, Saxena, Singh, Singha, Kushwaha, Chand, Bal, Chandra and Gupta. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Resistance to Plant-Parasitic Nematodes in Chickpea: Current Status and Future Perspectives

Rebecca S. Zwart<sup>1</sup>\*, Mahendar Thudi<sup>1,2</sup>, Sonal Channale<sup>1</sup>, Praveen K. Manchikatla<sup>2,3</sup>, Rajeev K. Varshney<sup>2</sup> and John P. Thompson<sup>1</sup>

<sup>1</sup> Centre for Crop Health, Institute for Life Sciences and the Environment, University of Southern Queensland, Toowoomba, QLD, Australia, <sup>2</sup> Center of Excellence in Genomics and Systems Biology, International Crops Research Institute for the Semi-Arid Tropics, Hyderabad, India, <sup>3</sup> Department of Genetics, Osmania University, Hyderabad, India

#### Edited by:

Karam B. Singh, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

#### Reviewed by:

Paola Leonetti, Italian National Research Council (CNR), Italy

Kevin E. McPhee, Montana State University, United States

> \*Correspondence: Rebecca S. Zwart rebecca.zwart@usq.edu.au

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 26 March 2019 Accepted: 10 July 2019 Published: 24 July 2019

#### Citation:

Zwart RS, Thudi M, Channale S, Manchikatla PK, Varshney RK and Thompson JP (2019) Resistance to Plant-Parasitic Nematodes in Chickpea: Current Status and Future Perspectives. Front. Plant Sci. 10:966. doi: 10.3389/fpls.2019.00966

Plant-parasitic nematodes constrain chickpea (Cicer arietinum) production, with annual yield losses estimated to be 14% of total global production. Nematode species causing significant economic damage in chickpea include root-knot nematodes (Meloidogyne artiellia, M. incognita, and M. javanica), cyst nematode (Heterodera ciceri), and root-lesion nematode (Pratylenchus thornei). Reduced functionality of roots from nematode infestation leads to water stress and nutrient deficiency, which in turn lead to poor plant growth and reduced yield. Integration of resistant crops with appropriate agronomic practices is recognized as the safest and most practical, economic and effective control strategy for plant-parasitic nematodes. However, breeding for resistance to plant-parasitic nematodes has numerous challenges that originate from the narrow genetic diversity of the C. arietinum cultigen. While levels of resistance to M. artiellia, H. ciceri, and P. thornei have been identified in wild Cicer species that are superior to resistance levels in the C. arietinum cultigen, barriers to interspecific hybridization restrict the use of these crop wild relatives as sources of nematode resistance. Wild Cicer species of the primary genepool, C. reticulatum and C. echinospermum, are the only species that have been used to introgress resistance genes into the C. arietinum cultigen. The availability of genomic resources, including genome sequence and resequence information, the chickpea reference set and mini-core collections, and new wild Cicer collections, provides unprecedented opportunities for chickpea improvement. This review surveys progress in the identification of novel genetic sources of nematode resistance in international germplasm collections and recommends genome-assisted breeding strategies to accelerate introgression of nematode resistance into elite chickpea cultivars.

Keywords: Cicer arietinum, crop wild relatives, root-knot nematodes, cyst nematodes, root-lesion nematodes

# INTRODUCTION

Chickpea (Cicer arietinum L.) is a nutritionally rich cool-season pulse crop that plays an important role in ensuring global food security, as it is an important source of dietary protein. Chickpea also plays an important role in farming systems by fixing atmospheric nitrogen, contributing to soil fertility, acting as a disease break and controlling weeds. Currently, chickpea is grown in an area of over 14.5 Mha in 55 countries with total annual production of 14.7 Mt (FAO, 2017). India is the world's largest consumer of chickpea and also the world's largest producer, contributing over 70% of total global chickpea production (FAO, 2017). There are two types of chickpea differentiated by seed type and flower color, namely, desi and kabuli. Desi chickpeas have smaller dark-colored seeds and pink flowers, and are predominantly grown in central Asia and in the Indian subcontinent. Kabuli chickpeas, by contrast, have larger beige seeds and white flowers and are predominantly grown in the Mediterranean region (Gaur et al., 2012). In India, chickpea is grown on residual moisture with low input management by resource-poor farmers (Singh and Reddy, 1991). The world average chickpea yield is less than 1 t/ha, which is far less than the potential yield of 6 t/ha under favorable and irrigated conditions (Varshney et al., 2017). This enormous disparity between the actual and expected yield of chickpea is due to biotic stresses, caused by insects, bacteria, fungi, nematodes and viruses, and abiotic stresses, such as drought, nutrient deficiencies, salinity and chilling (Roorkiwal et al., 2016).

Globally, the loss of chickpea productivity due to plant-parasitic nematodes is estimated to be 14% (Sasser and Freckman, 1987). Important elements for effective integrated control of plant-parasitic nematodes in cropping systems include (a) correct diagnosis of the nematode species, (b) effective rotations with non-hosts or fallow periods, and (c) use of tolerant and resistant crop cultivars (Thompson et al., 2000). Accurate diagnosis of nematode species requires extensive knowledge of nematode taxonomy and/or application of molecular diagnostic tools. Options for crop rotations are restricted in fields which are infested with nematode species with wide host ranges (Greco, 1987). Application of nematicides is avoided for environmental and economic reasons. The most effective and sustainable long-term strategy to overcome constraints to chickpea production caused by plant-parasitic nematodes is the use of resistant cultivars. Resistance is the ability of a plant to reduce nematode reproduction, such that no nematode reproduction occurs in a highly resistant plant, a low level of reproduction occurs in a moderately resistant plant and unhindered nematode reproduction occurs in a susceptible plant (Roberts, 2002). Tolerance is a separately measured trait that characterizes the ability of a plant to grow and yield well even when infested with nematodes (Trudgill, 1991). Growing resistant cultivars has the advantage of preventing nematode reproduction and reducing yield losses in the current crop. Moreover, after resistant cultivars have been grown, the residual nematode populations in the soil that can damage subsequent crops are smaller than after susceptible cultivars, thus benefiting the whole farming system.

Advances in chickpea genomic resources resulting from the advent of next generation sequencing (NGS) technology have the potential to greatly assist molecular breeding approaches to improve resistance to plant-parasitic nematodes and thereby help in achieving the yield potential of chickpea (Thudi et al., 2012). Recent reviews highlight the application of gene-editing technologies to control plant-parasitic nematodes (Leonetti et al., 2018) and improvements in chickpea genetic transformation technologies (Amer et al., 2019). In this review, we provide an overview of studies on the identification of nematode resistance genes in the C. arietinum cultigen and related species, focusing on three types of nematodes causing major economic damage to chickpea crops globally, namely, root-knot nematodes (Meloidogyne artiellia, M. incognita, and M. javanica), chickpea cyst nematode (Heterodera ciceri) and root-lesion nematode (Pratylenchus thornei). We highlight the current status of nematode resistance in chickpea and discuss genomic tools available to improve the level of nematode resistance using genomic-assisted breeding.

# CHICKPEA-NEMATODE INTERACTIONS

Chickpea is a host for over 100 species of plant-parasitic nematodes (Nene et al., 1996; Sikora et al., 2018). However, only a small number of predominant species are considered to cause economic damage to chickpea crops throughout the world (**Table 1**). Crop damage due to nematode infestation can be challenging to diagnose because the above-ground symptoms are non-specific (Sharma et al., 1992). The reduced functionality of the host plant roots, due to the damage caused by plant-parasitic nematodes feeding and/or reproducing inside the root cells, results in infected plants showing the same symptoms as nutrient deficiency and water stress, namely, stunting, wilting, chlorotic leaves, reduced number of flowers and pods, reduced yield and patchiness in the field (Castillo et al., 2008). The significant root damage caused by plant-parasitic nematodes also reduces the ability of plants to cope with the abiotic stresses of drought and low levels of plant nutrients in the soil.

Plant-parasitic nematodes contribute to decreased plant vigor by reducing Rhizobium root nodulation and nitrogen-fixing ability of the host plant (Tiyagi and Parveen, 1992; Vovlas et al., 1998; Wood et al., 2018). Furthermore, plant-parasitic nematodes exacerbate crop damage caused by other biotic stresses. Nematode infection leads to enhanced severity of infection with soil-borne fungal pathogens causing Fusarium wilt (Fusarium oxysporum f. sp. ciceris) (Castillo et al., 1998, 2003) and dry root rot (Rhizoctonia bataticola) (Ali and Sharma, 2003).

# Root-Knot Nematodes

Root-knot nematodes, Meloidogyne spp., rank as the most economically damaging nematodes to agricultural crops worldwide due to their broad host range and wide geographical distributions (Jones et al., 2013). Root-knot nematodes are sedentary endoparasites. Many Meloidogyne species are parthenogenic or facultatively parthenogenic. Motile male and female second-stage juveniles penetrate the root surface. Female root-knot nematodes migrate to the vascular tissue and establish permanent feeding sites called giant cells (Vovlas et al., 2005). As the juveniles feed, they become swollen, and at maturity they produce egg masses that contain up to 600 eggs (Hernández Fernández et al., 2005). The characteristic galls on infected roots (**Figure 1A**) contain four to six giant cells that are formed by repeated nuclear division without cell division. Galls induced by M. artiellia on chickpea roots are smaller than those produced by other root-knot species (Vovlas et al., 2005).

#### TABLE 1 | Geographic distribution of plant-parasitic nematodes infecting chickpea crops.

<sup>∗</sup>Source: (FAO, 2017).

Meloidogyne incognita and M. javanica are the most prevalent species of root-knot nematodes in tropical chickpea growing countries, including Ethiopia, Zimbabwe and Malawi in Africa (Sharma et al., 1992), India, Nepal, Pakistan and Bangladesh in South Asia (Castillo et al., 2008) and Brazil in South America (Sharma and McDonald, 1990; **Table 1**). In India, M. arenaria also causes severe damage to chickpea crops (Castillo et al., 2008). M. artiellia is the most widespread root-knot nematode species in cooler chickpea growing countries of the Mediterranean region, including Italy, Spain, Syria, Turkey, Morocco, Algeria, and Tunisia (Greco et al., 1992b; Di Vito et al., 1994a,b). The root-knot nematodes, M. incognita and M. javanica, cause yield losses of 19 to 40% to chickpea in India (Ali and Sharma, 2003), with thresholds for damage for these species varying from 200 to 2000 eggs and/or juveniles per liter of soil at the time of sowing (Sharma et al., 1992). On the other hand, the damage threshold for M. artiellia is calculated to be considerably lower at 20 to 140 eggs and juveniles per liter of soil, with 2000 nematodes per liter at planting resulting in yield losses of 50 to 80% (Di Vito and Greco, 1988).

# Cyst Nematodes

Chickpea cyst nematode, H. ciceri, is the most damaging cyst nematode infecting chickpea, although several other Heterodera spp. have been reported on or in the rhizosphere of chickpea without causing damage (**Table 1**), namely, H. cajani and H. swarupi in India (Ali and Sharma, 2003) and H. goettingiana in Tunisia and Morocco (Di Vito et al., 1994a). Cyst nematodes are sedentary semi-endoparasites. Motile juvenile nematodes penetrate the root surface and move to the vascular tissue where they form a permanent feeding site characterized by syncytia cells (Greco et al., 1992a). Swollen females rupture root tissues with the posterior portion of their bodies, which then protrude from the root surface forming visible cysts about 0.5 to 1.0 mm in diameter. The females retain eggs inside their bodies. While only one generation is completed per growing season on chickpea, each cyst contains up to 300 eggs (Kaloshian et al., 1986). Moreover, eggs can survive long periods in the soil in the absence of a host (Castillo et al., 2008). Infected chickpea roots are characterized by the visible swollen adult females protruding from the root surface (**Figure 1B**). The lemon-shaped cysts change from white to brown as females mature (Kaloshian et al., 1986).

Heterodera ciceri is distributed throughout the eastern Mediterranean region in Turkey (Di Vito et al., 1994b), Syria (Greco et al., 1992b), Jordan and Lebanon (Di Vito et al., 2001). While H. ciceri predominantly affects chickpea (Greco et al., 1986), other grain legumes, fodder species and ornamental plants have been reported as hosts (Di Vito et al., 2001). H. ciceri was the most damaging plant-parasitic nematode in chickpea crops in Syria (Greco et al., 1992b). H. ciceri is aggressive on chickpea crops with economic yield losses occurring with 1000 eggs per liter soil. Moreover, yield losses of 20, 50, 80, and 100% were reported to occur with 8000, 16000, 32000, and 64000 eggs per liter soil at planting, respectively (Greco et al., 1988).

# Root-Lesion Nematodes

Root-lesion nematodes are the predominant plant-parasitic nematode found in chickpea crops in surveys in North Africa (Di Vito et al., 1994a), Turkey (Di Vito et al., 1994b), and Spain (Castillo et al., 1996). Root-lesion nematodes are migratory endoparasites that cause extensive damage to cortical cells in the pathway of migration and during feeding (Castillo et al., 1998). In the species P. thornei, male nematodes are rare and females reproduce by mitotic parthenogenesis, depositing eggs in the cavities of root cells caused by nematode feeding and movement. P. thornei takes 25 to 35 days to complete its life cycle at 20 to 25°C on carrot disk culture (Castillo et al., 1995); thus several generations can occur in a growing season (Sikora et al., 2018). P. thornei eggs and nematodes can survive in the soil in the absence of host plants. If the soil dries slowly, a high proportion of the nematodes can survive the dry conditions (Thompson et al., 2017, 2018). Infection by P. thornei is characterized by dark brown to black lesions on chickpea roots (**Figure 1C**). Damage caused by root-lesion nematodes is generally less obvious than that caused by root-knot or cyst nematodes (Sharma et al., 1992) and symptoms of P. thornei damage to the roots do not always result in visible symptoms on above-ground plant parts. The wide host range of root-lesion nematodes hampers management strategies.

Pratylenchus thornei is the predominant species of root-lesion nematode causing damage to chickpea crops throughout the world. The distribution of P. thornei extends throughout major chickpea growing countries, including Australia (Thompson et al., 2000), India (Sharma et al., 1992), North Africa (Di Vito et al., 1994a), Turkey (Di Vito et al., 1994b), and Spain (Castillo et al., 1996). In India, the world's largest producer and consumer of chickpea, P. thornei is emerging as a serious threat to chickpea production, with high populations reported in Madhya Pradesh (Baghel and Singh, 2013), Rajasthan (Ali and Sharma, 2003), Maharashtra (Varaprasad et al., 1997), and Uttar Pradesh (Sebastian and Gupta, 1995). Numerous other Pratylenchus species have been reported in surveys of chickpea crops in North Africa and the Mediterranean region, Brazil and North America (**Table 1**); however, limited information is available on the extent of crop damage they cause. The species P. thornei infects many cereal and pulse crops (Sikora et al., 2018); thus high populations can build up quickly in the soil and affect the whole farming system. In Australia, where P. thornei is ranked as the second most economically important biotic stress affecting chickpea (Murray and Brennan, 2012), yield losses of 25% were obtained in chickpea fields with 11600 P. thornei/kg of soil at planting (Thompson et al., 2000; Reen et al., 2014). A damage threshold as low as 31 nematodes per liter of soil was reported for P. thornei by Di Vito et al. (1992) in field conditions in Syria, with 2000 nematodes per liter at planting resulting in yield losses up to 58%.

#### SOURCES OF NEMATODE RESISTANCE

Accurate, reliable phenotyping is essential for screening germplasm to identify sources of resistance. Accurate phenotyping experiments require robust statistical design in a controlled environment with plants inoculated with a known initial population of nematodes and/or eggs. Resistance to root-knot nematode is generally quantified by visual inspection and rating of infected roots using a root-galling index on a 1 to 5 scale (with 1 = no galls and 5 = greater than 100 galls per root) (Rao and Krishnappa, 1995; Hassan and Devi, 2004; Haseeb et al., 2006; Chakraborty et al., 2016). In addition to scoring root-galling index, Sharma et al. (1992, 1993, 1995) evaluated gall size (on a 1 to 9 scale with 1 = no galls and 9 = very large galls) and percent galled area (on a 1 to 9 scale with 1 = no galls and 9 = more than 50% root area galled) to calculate a root damage index, as an average of the three ratings. Mechanisms of resistance, such as increased peroxidase activity of infected roots, have also been used to screen chickpea germplasm against root-knot nematode (Siddiqui and Husain, 1992; Chakrabarti and Mishra, 2002). The resistance level of a plant to chickpea cyst nematode is determined by rating the number of females and cysts on infected roots using a 0 to 5 scale (with 0 = no females and cysts and 5 = greater than 50 females and cysts) (Di Vito et al., 1988; Singh et al., 1989). In the case of migratory root-lesion nematodes, the nematodes need to be extracted from roots and/or soil before quantification is possible. Researchers have reported resistance levels to P. thornei in relation to reproduction factor (final nematode population/initial nematode population) (Tiwari et al., 1992; Di Vito et al., 1995), or as number of nematodes per unit of root and/or soil (Thompson et al., 2011; Reen et al., 2019). Measuring visible lesions on infected roots (Ali and Ahmad, 2000) is not recommended, as lesions are only symptoms and not a direct measure of nematode numbers.
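Two of the measures described above are simple arithmetic and can be sketched directly. The function names and the example values below are hypothetical. The text does not state the scale of the galling rating that enters the root damage index, so only the two 1 to 9 ratings are range-checked here.

```python
def reproduction_factor(final_population, initial_population):
    """Reproduction factor = final / initial nematode population."""
    if initial_population <= 0:
        raise ValueError("initial population must be positive")
    return final_population / initial_population


def root_damage_index(galling_index, gall_size, galled_area):
    """Average of the three root ratings described in the text."""
    for name, rating in (("gall_size", gall_size), ("galled_area", galled_area)):
        if not 1 <= rating <= 9:
            raise ValueError(f"{name} must be on the 1 to 9 scale")
    return (galling_index + gall_size + galled_area) / 3


# Hypothetical pot: 1000 nematodes inoculated, 500 recovered at harvest
rf = reproduction_factor(final_population=500, initial_population=1000)
rdi = root_damage_index(galling_index=3, gall_size=5, galled_area=7)
```

A reproduction factor below 1 means that the final population is smaller than the inoculum. Any cut-off used to label an accession as resistant would have to come from the individual study and is not assumed here.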

# Cicer arietinum Cultigen

To date, there has been relatively little success in identifying resistance to plant-parasitic nematodes in the C. arietinum cultigen, namely, chickpea cultivars, breeding lines and landraces held in global genebanks, compared with the number of accessions that have been evaluated (**Table 2**). Extensive screening efforts in Syria by the International Center for Agricultural Research in the Dry Areas (ICARDA) and the Institute for Sustainable Plant Protection, Italy, have been devoted to identifying resistance to H. ciceri, the most devastating nematode to chickpea production in the Mediterranean region. Despite screening close to 10000 chickpea accessions from global germplasm collections held by ICARDA and the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), none were found to be resistant (Di Vito et al., 1996; Singh et al., 1996) and merely 20 lines were rated as moderately resistant to H. ciceri (Di Vito et al., 1988).

Screening efforts focusing on identifying resistance to M. javanica in the C. arietinum germplasm collection held in the ICRISAT genebank proved futile, with no resistance identified in numerous studies testing several thousand accessions (Sharma et al., 1992, 1993, 1995; Ali and Ahmad, 2000; Bhagwat and Sharma, 2001; Ansari et al., 2004). Nonetheless, a few susceptible lines were deemed tolerant to M. javanica and produced a higher yield and shoot biomass in M. javanica-infested soil, even though the roots supported nematode reproduction (Sharma et al., 1992, 1993, 1995). Hussain et al. (2001) screened ten chickpea cultivars from Pakistan for resistance to M. javanica, and found all ten cultivars showed a moderate level of resistance.

Early studies were unsuccessful in finding resistance to M. incognita in Indian chickpea cultivars (Siddiqui and Husain, 1992; Rao and Krishnappa, 1995; Mhase et al., 1999; Chakrabarti and Mishra, 2002). However, more recent studies have reported resistance and moderate resistance to M. incognita in Indian chickpea cultivars and breeding lines (Hassan and Devi, 2004; Haseeb et al., 2006; Chakraborty et al., 2016). Considering the broad host range and widespread occurrence of this nematode species in India (Khan et al., 2014), it is plausible that incidental selection for resistance to M. incognita has occurred in more recent breeding programs. Sikora et al. (2018) reported that no attempts have been made to screen chickpea germplasm for resistance to M. artiellia.

Sources of resistance and moderate resistance to P. thornei in the C. arietinum cultigen have been identified in breeding lines in India (Tiwari et al., 1992; Ali and Ahmad, 2000) and in accessions in the ICRISAT genebank in India (Ali and Ahmad, 2000) and Australia (Thompson et al., 2011).

The limited diversity of resistance genes in the C. arietinum cultigen is not restricted to plant-parasitic nematodes. C. arietinum lacks diversity for a range of biotic and abiotic stresses (Smýkal et al., 2015). Abbo et al. (2003) proposed that this low level of diversity can be attributed to the following genetic bottlenecks that occurred during the evolution and domestication of chickpea: (i) the limited distribution of chickpea wild progenitor species, (ii) the founder effect arising from the domestication of only a small number of wild genotypes, which is a bottleneck common to all modern crops, (iii) a shift from winter to spring phenology to avoid devastation by Ascochyta blight (Ascochyta rabiei), and (iv) the substitution of a large number of landraces with a small number of elite cultivars from modern breeding, which caused yet further reduction in the diversity of the C. arietinum genepool.

TABLE 2 | Studies to identify resistance to root-knot nematodes (Meloidogyne incognita, M. javanica), cyst nematode (Heterodera ciceri), and root-lesion nematode (Pratylenchus thornei) in the Cicer arietinum cultigen.

The availability of large and diverse germplasm collections is a key element for the successful identification of disease resistant lines (Infantino et al., 2006). Landraces, traditional locally adapted varieties that lack formal crop improvement (Villa et al., 2005), serve as a valuable genetic resource that may help widen the narrow genetic base of chickpea by circumventing the genetic bottlenecks caused by changing from winter to spring phenology and modern breeding. While landraces hold much genetic diversity of the C. arietinum cultigen, strategic methods are crucial to mine the global chickpea germplasm collections, which have conserved close to a hundred thousand accessions (Smýkal et al., 2015). Recent developments of core, reference and mini-core collections (Upadhyaya et al., 2001, 2008) and subsampling strategies such as the focused identification of germplasm strategy (FIGS) (Khazaei et al., 2013) have created unprecedented opportunities for the systematic screening of a practical number of accessions.

A core collection is defined as a subset of all the accessions representing the genetic diversity of crop species and wild relatives with minimum repetition (Frankel and Brown, 1984). It constitutes about 10% of the total number of accessions and represents genetic diversity of the entire global germplasm collection. Based on geographic distribution and quantitative traits of accessions held at ICRISAT, a core subset was developed consisting of 1956 accessions of chickpea (Upadhyaya and Ortiz, 2001). However, the size of the core collection was still too large to be systematically evaluated for traits of interest. To overcome this limitation, a mini-core collection was developed where a subset of 211 accessions (1.1% of the entire collection) was selected based on taxonomic, morphological and geographic data (Upadhyaya and Ortiz, 2001). Also, a composite collection of 3000 accessions was formed, which represents the diversity of accessions held at ICRISAT and ICARDA collectively. From this collection, the 'Reference Set' was produced, composed of the full mini-core collection (211) and an additional 82 C. arietinum accessions, plus four C. reticulatum and three C. echinospermum genotypes (Upadhyaya et al., 2006).
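The sampling fractions quoted above can be cross-checked with simple arithmetic. The sketch below is illustrative only: the size of the entire ICRISAT collection is not stated in the text, so it is back-calculated here from the mini-core figures (211 accessions, 1.1%) and should be read as an inferred approximation.

```python
# Figures quoted in the text
core_size = 1956            # accessions in the ICRISAT core collection
mini_core_size = 211        # accessions in the mini-core collection
mini_core_fraction = 0.011  # mini-core as a share of the entire collection (1.1%)

# Inferred (not stated in the text): approximate size of the entire collection
entire_collection = mini_core_size / mini_core_fraction

# Share of the entire collection captured by the core collection
core_share = core_size / entire_collection

# Share of the core collection retained in the mini-core
mini_core_share_of_core = mini_core_size / core_size

print(f"Inferred entire collection: ~{entire_collection:.0f} accessions")
print(f"Core collection share:      {core_share:.1%}")
print(f"Mini-core share of core:    {mini_core_share_of_core:.1%}")
```

Under this assumption the core collection works out to roughly one tenth of the entire collection, consistent with the "about 10%" definition, and the mini-core is in turn roughly one tenth of the core.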

The chickpea mini-core collection and Reference Set have been phenotyped in several studies to identify traits of interest to combat biotic and abiotic stresses. These traits include resistance to multiple diseases of economic concern namely, Ascochyta blight, Fusarium wilt, dry root rot and Botrytis gray mold (Pande et al., 2006), as well as root architectural traits for optimal use of soil resources, and adaptation to drought and other abiotic challenges (Kashiwagi et al., 2005; Krishnamurthy et al., 2010, 2011). In addition to identifying germplasm with traits of interest, these collections have been used to understand the genetic basis of heat and drought tolerance traits by using genome-wide association studies (GWAS) and candidate gene-based mapping approaches (Thudi et al., 2014). These valuable repositories of germplasm covering the genetic diversity of C. arietinum offer opportunities to efficiently search for sources of resistance to plant-parasitic nematodes that were not previously available.

#### Wild Cicer Relatives

Chickpea wild relatives can be used to reintroduce traits that did not pass through the domestication bottleneck and so widen the genetic base of the C. arietinum cultigen (Abbo et al., 2003). The genus Cicer comprises 44 species, of which nine are annuals and 35 perennials (Smýkal et al., 2015). Annual Cicer species in the primary genepool (C. arietinum, C. reticulatum, and C. echinospermum) are cross-compatible, while those in the secondary genepool (C. bijugum, C. pinnatifidum, and C. judaicum) and tertiary genepool (C. chorassanicum, C. cuneatum, and C. yamashitae) have barriers to hybridization with C. arietinum (Croser et al., 2003). Despite this, accessions from all three genepools held in germplasm collections have been screened for resistance to plant-parasitic nematodes (**Table 3**).

In the search for resistance to H. ciceri, a limited number of wild Cicer relatives were screened. Singh et al. (1989) screened accessions from all eight annual wild Cicer species and identified a high level of resistance to H. ciceri only in accessions of C. bijugum. However, screening of additional germplasm identified resistance to H. ciceri in one accession of C. reticulatum, one of C. bijugum and six of C. pinnatifidum (Di Vito et al., 1996). The resistance from the cross-compatible C. reticulatum accession was then successfully transferred to C. arietinum breeding lines (Singh et al., 1996; Malhotra et al., 2002, 2008). Di Vito et al. (1995) reported resistance to P. thornei in accessions from the secondary genepool (C. bijugum and C. judaicum) and tertiary genepool (C. cuneatum and C. yamashitae), while no resistance was found in accessions from the primary genepool (C. echinospermum and C. reticulatum). Thompson et al. (2011) identified moderate resistance to P. thornei in accessions from both C. echinospermum and C. reticulatum in the primary genepool, as well as accessions of C. bijugum. Successful hybridizations of these C. echinospermum and C. reticulatum accessions with C. arietinum in the Australian chickpea breeding program have produced breeding lines with resistance at a level equivalent to the Cicer wild relative parents (Thompson et al., 2011; Rodda et al., 2016). To date, no sources of resistance to root-knot nematodes have been identified in the Cicer primary genepool. Resistance to M. artiellia has been identified in one accession of C. bijugum and one accession of C. pinnatifidum from the ICARDA genebank (Di Vito et al., 1995). No resistance was found for M. javanica in wild Cicer relatives screened by Sharma et al. (1993).

Using embryo rescue and tissue culture techniques, hybrids between C. arietinum and accessions of secondary genepool species C. bijugum, C. judaicum, and C. pinnatifidum are possible (Ahmad and Slinkard, 2004; Clarke et al., 2006). However, these techniques are extremely inefficient. Many crosses are required to recover hybrids and the few hybrids that are recovered are affected by androgenesis, infertility and lack of vigor (Clarke et al., 2011). Thus, further advancements in techniques are required to increase efficiency and cross the barriers to hybridization that exist between accessions of the secondary genepool and the C. arietinum cultigen before these sources of resistance can be applied in chickpea breeding (Pratap et al., 2018). For now, the only accessible sources of wild Cicer germplasm are accessions of C. echinospermum and C. reticulatum. However, Berger et al. (2003) highlighted the limited number of unique accessions of these wild Cicer species held in international genebanks. Of 43 C. echinospermum accessions in the world collection, only 13 are original independent accessions, with the remainder being duplicates under different accession numbers used by different genebanks. Of 139 C. reticulatum accessions, only 18 were original accessions. This under-representation of wild Cicer relatives in global genebank collections has been recently addressed with new collecting expeditions for C. echinospermum and C. reticulatum in south-eastern Turkey spanning the geographic range of these wild Cicer species (von Wettberg et al., 2018). Reen et al. (2019) recently demonstrated the value of this collection for increasing genetic diversity for resistance to plant-parasitic nematodes. Thirteen accessions were identified as significantly more resistant to P. thornei (P < 0.05) than the previously most resistant C. echinospermum accession reported by Thompson et al. (2011). Moreover, introgression populations developed by crossing wild C. echinospermum and C. reticulatum parents into C. arietinum, using elite chickpea varieties adapted to the major chickpea growing regions of the world, namely, India, Australia, Turkey, Ethiopia, and Canada (von Wettberg et al., 2018), will be invaluable resources for the identification and utilization of traits of interest in wild Cicer relatives, including resistance to plant-parasitic nematodes.

# CHICKPEA GENOMIC RESOURCES

#### Molecular Marker-Based Resources

Recent advances in genomics research have enabled the development and application of molecular markers for crop improvement (Thudi et al., 2014; Varshney et al., 2018b). Chickpea has 2n = 2x = 16 chromosomes and a genome size of ∼738 Mb (Varshney et al., 2013b), and extensive genomic and transcriptomic resources have been developed for it (Varshney et al., 2009; Nayak et al., 2010; Hiremath et al., 2011; Thudi et al., 2011; Kudapa et al., 2014; Agarwal et al., 2016; Mashaki et al., 2018). The availability of these resources has facilitated the development of molecular markers and high density genetic maps in chickpea (Thudi et al., 2011; Varshney et al., 2014b; Jaganathan et al., 2015; Kale et al., 2015). Over 2000 simple sequence repeat (SSR) markers, millions of single nucleotide polymorphism (SNP) markers, and over 15000 diversity array technology (DArT) markers have been developed for chickpea (Varshney, 2016) in the last decade. These molecular markers and genetic linkage maps, in combination with phenotypic data and quantitative trait loci (QTL) analysis, have been used to identify genomic regions responsible for complex traits in chickpea like drought tolerance (Varshney et al., 2014b), salinity tolerance (Vadez et al., 2012; Pushpavalli et al., 2015), heat tolerance (Paul et al., 2018), early flowering (Mallikarjuna et al., 2017), vernalization (Samineni et al., 2016) and resistance to Fusarium wilt and Ascochyta blight (Sabbavarapu et al., 2013). Further, using a GWAS approach, markers associated with drought and heat tolerance traits (Thudi et al., 2014) and protein content (Jadhav et al., 2015) have also been reported. Besides the use of molecular markers to help understand the molecular mechanisms of different traits, several functional genomics approaches, such as suppression subtractive hybridization (SSH), super serial analysis of gene expression (SuperSAGE), microarray, and expressed sequence tags (EST) sequencing, were also recently applied to chickpea (Buhariwalla et al., 2005; Molina et al., 2008; Varshney et al., 2009). These molecular marker-based resources, when coupled with robust and accurate phenotyping to detect marker-trait associations, can be applied to chickpea breeding to (i) assist the indirect selection of nematode resistance, (ii) facilitate pyramiding of resistance genes from several resistant or moderately resistant sources to provide cultivars with durable nematode resistance, and (iii) combine resistance to multiple biotic stresses.

TABLE 3 | Studies to identify resistance to root-knot nematodes (Meloidogyne artiellia, M. javanica), cyst nematode (Heterodera ciceri), and root-lesion nematode (Pratylenchus thornei) in Cicer wild relatives.

#### Next-Generation Sequencing-Based Resources

Several key traits have been targeted for transcriptomic studies in chickpea (Varshney et al., 2009; Hiremath et al., 2011; Kudapa et al., 2014; Kaashyap et al., 2018). In recent years, sequencing and de novo assembly of the chickpea transcriptome using short reads and high-throughput small RNA sequencing were also deployed to discover tissue-specific and stress-responsive expression profiles (Jain et al., 2014; Kohli et al., 2014). These functional genomic resources were also used to develop informative SSR and SNP markers in chickpea (Agarwal et al., 2012; Hiremath et al., 2012; Jhanwar et al., 2012; Garg et al., 2014; Kudapa et al., 2014; Pradhan et al., 2014; Parida et al., 2015). Recently, a Gene Expression Atlas (CaGEA) from 27 chickpea tissues across five developmental stages, namely, germination, seedling, vegetative, reproductive, and senescence, of a chickpea breeding cultivar, ICC 4958, has been developed (Kudapa et al., 2018). Ramalingam et al. (2015) extensively reviewed several studies on the application of proteomics and metabolomics in chickpea and other crop legumes. Integration of these technologies with genomics has the potential to inform the molecular mechanisms of plant responses to biotic stresses such as nematode infestation and to identify key candidate genes to be introgressed for chickpea improvement.

Following the release of the draft genomes of chickpea (Jain et al., 2013; Varshney et al., 2013b), efforts have been made during the last decade to improve the genome assemblies. For instance, Ruperao et al. (2014) used sequence data from chromosomes isolated by flow cytometry to identify misplaced contigs, thereby improving and validating the desi and kabuli draft chickpea genome assemblies. Similarly, Parween et al. (2015), using additional sequence data and improved genetic maps, developed an improved version of the desi genome assembly. In addition, a draft genome assembly of C. reticulatum, the wild progenitor of chickpea, has been recently reported (Gupta et al., 2017). Further, in order to design new strategies to harness the existing genetic diversity in germplasm lines conserved in genebanks across the world, re-sequencing of germplasm lines has been advocated (McCouch et al., 2013). Toward this end in chickpea, 90 elite lines, 35 parental genotypes of mapping populations, and 129 released varieties have been re-sequenced (Varshney et al., 2013b, 2019; Thudi et al., 2016a,b). Moreover, efforts are currently underway at ICRISAT to re-sequence the 3000 germplasm lines of the composite chickpea collection. Next-generation sequencing-based genomic resources can provide insights into candidate genes determining nematode resistance and in this way enable diagnostic markers for accurate and efficient indirect selection of resistance to be developed. Furthermore, insights into candidate resistance genes will enable mechanisms of resistance to plant-parasitic nematodes to be deciphered. Increased knowledge of the mechanisms of resistance in different germplasm sources would make it possible to breed for enhanced durability of nematode resistance by combining genes for different resistance mechanisms in a single chickpea cultivar.

#### Genome-Assisted Breeding

Molecular breeding approaches utilizing markers and the large-scale genetic and genomic resources that are now available for chickpea have been successful in improving chickpea for target traits. Some superior lines with enhanced tolerance or resistance to abiotic and biotic stresses as well as agronomically important traits have been successfully developed in legumes using marker-assisted backcrossing (MABC) (Lucas et al., 2015; Varshney, 2016; Varshney et al., 2018a). A genomic region in chickpea (known as "QTL-hotspot") harboring several QTL for drought component traits was identified (Varshney et al., 2014b) and successfully introgressed initially into JG 11, an elite Indian chickpea cultivar (Varshney et al., 2013a). Preliminary yield trials indicated a 12 to 24% increase in yield under drought conditions. In addition, the introgression of this genomic region into different genetic backgrounds, like chickpea cultivars KAK 2 and Chefe, was also found to enhance drought tolerance. Further, this genomic region is being introgressed into elite cultivars in Kenya, Ethiopia and India (Thudi et al., 2017). Molecular breeding lines with enhanced resistance to Fusarium wilt (Pratap et al., 2017; Mannur et al., 2019) and Ascochyta blight in different elite genetic backgrounds (Varshney et al., 2014a) have been developed. In addition to cost-effective high-throughput genotyping platforms, ICRISAT has developed highly cost-effective 10-SNP panels for several traits in legumes including chickpea that can be used for early generation selection to increase the efficiency of selection in breeding programs (Roorkiwal et al., 2018). This 10-SNP panel is being used extensively in early generation selection in south Asia and Sub-Saharan Africa. Identification of molecular markers associated with nematode resistance will enable genomics-assisted breeding to facilitate the introgression of nematode resistance in elite chickpea cultivars in breeding programs worldwide.

# FUTURE PERSPECTIVES

In this review we have outlined progress in the discovery of resistance to plant-parasitic nematodes in various germplasm sources suitable for introgression into chickpea cultivars. Screening a large number of germplasm lines is expensive and time-consuming. In the past this has either limited the number of lines that have been evaluated for nematode resistance or required large investments in resources and effort. The development of the chickpea mini-core and reference set germplasm collections of landraces and C. arietinum breeding lines provides cost-effective and manageable entry points into the vast global chickpea germplasm collections (Gaur et al., 2012). Although major genetic bottlenecks may have contributed to the lack of genetic diversity for resistance against plant-parasitic nematodes available in the C. arietinum cultigen, new opportunities exist to widen the genetic base of chickpea for traits of interest. The bottleneck caused by the small number of wild genotypes contributing to the domesticated C. arietinum cultigen can be circumvented by evaluating recent collections of chickpea wild species C. reticulatum and C. echinospermum for resistance to plant-parasitic nematodes.

To the best of our knowledge, no information is currently available on the nature of inheritance and genetics of plant-parasitic nematode resistance genes in chickpea. Considerable advancements in chickpea genomic resources since the majority of the past efforts to identify sources of resistance to various nematode species provide unprecedented opportunities to accelerate identification and characterization of nematode resistance genes. Availability of an extensive number of molecular markers and genomic resources in chickpea, coupled with robust phenotyping, will facilitate identification of markers linked with resistance to plant-parasitic nematodes. Identification of candidate genes for nematode resistance could provide diagnostic markers that could be used for indirect selection of nematode resistance. Furthermore, genomic tools can provide insights into the mechanisms of resistance to plant-parasitic nematodes in chickpea. Identification of marker-trait associations will facilitate rapid introgression of resistance to plant-parasitic nematodes and adoption of genomics-assisted breeding into chickpea breeding programs worldwide. Sources of moderate resistance can be dissected with molecular markers to identify minor genes. If additive in gene action, sources of moderate resistance could be successfully combined using genomics-assisted selection to produce nematode resistant chickpea cultivars. We have indicated a number of successes in the identification of resistance to plant-parasitic nematodes that provide encouragement to apply and exploit genomic tools and intensify efforts to have resistant cultivars available to growers in all regions where plant-parasitic nematodes diminish production of chickpea and of other host crops grown in rotation.

# AUTHOR CONTRIBUTIONS

All authors contributed to sections of the manuscript according to their expertise and have edited, read, and approved the submitted version.

# FUNDING

RZ and SC acknowledge support from University of Southern Queensland. MT acknowledges support from Science and Engineering Research Board, Government of India and CGIAR Research Program on Grain Legumes and Dryland Cereals for ongoing research on root-lesion nematode work in chickpea. JT acknowledges support of the Grains Research and Development Corporation (GRDC) through Projects USQ00017 and USQ00019 and the Queensland Department of Agriculture and Fisheries (QDAF) through the Broadacre Cropping Initiative with USQ.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Zwart, Thudi, Channale, Manchikatla, Varshney and Thompson. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Genome-Wide Association Study Reveals Candidate Genes for Flowering Time Variation in Common Bean (Phaseolus vulgaris L.)

Lorenzo Raggi<sup>1</sup> \* † , Leonardo Caproni<sup>1</sup>† , Andrea Carboni<sup>2</sup> and Valeria Negri<sup>1</sup>

<sup>1</sup> Dipartimento di Scienze Agrarie, Alimentari e Ambientali (DSA3), Università degli Studi di Perugia, Perugia, Italy, <sup>2</sup> CREA Research Centre for Cereal and Industrial Crops, Bologna, Italy

The common bean is one of the most important staples in many areas of the world. Extensive phenotypic and genetic characterization of unexplored bean germplasm is still needed to unlock the breeding potential of this crop. Dissecting the genetic control of flowering time is of pivotal importance to foster common bean breeding and to develop new varieties able to adapt to changing climatic conditions. Indeed, flowering time strongly affects yield and plant adaptation ability. The aim of this study was to investigate the genetic control of days to flowering using a whole genome association approach on a panel of 192 highly homozygous common bean genotypes purposely developed from landraces using Single Seed Descent. The phenotypic characterization was carried out at two experimental sites throughout two growing seasons, using a randomized partially replicated experimental design. The same plant material was genotyped using double digest Restriction-site Associated DNA sequencing, producing, after strict quality control, a dataset of about 50 k Single Nucleotide Polymorphisms (SNPs). The Genome-Wide Association Study revealed significant and meaningful associations between days to flowering and several SNP markers; seven genes are proposed as the best candidates to explain the detected associations.

#### Edited by:

Matthew Nicholas Nelson, Agriculture & Food (CSIRO), Australia

#### Reviewed by:

Elena Bitocchi, Marche Polytechnic University, Italy Monica Rodriguez, University of Sassari, Italy

#### \*Correspondence:

Lorenzo Raggi lorenzo.raggi@unipg.it; lorenzo.raggi@gmail.com

†These authors have contributed equally to this work as first authors

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 12 April 2019; Accepted: 10 July 2019; Published: 24 July 2019

#### Citation:

Raggi L, Caproni L, Carboni A and Negri V (2019) Genome-Wide Association Study Reveals Candidate Genes for Flowering Time Variation in Common Bean (Phaseolus vulgaris L.). Front. Plant Sci. 10:962. doi: 10.3389/fpls.2019.00962

Keywords: Phaseolus vulgaris L., flowering time control, ddRAD-seq, GWAS, candidate gene analysis

# INTRODUCTION

Achieving food security is one of the most important challenges to face in the next three decades. The 2017 revision of the world population prospects reports an expected population growth of more than 2 billion people by 2050 (United Nations, 2017). Accordingly, the demand for food will increase, especially in the areas of the world where most of the developing countries are located, mainly on the African continent (Jensen et al., 2012).

In this context, grain legumes are generally regarded as key commodities for improving food security as they are a relatively inexpensive source of amino acids and other important nutrients such as minerals, when compared to livestock and dairy products (Jensen et al., 2012). In addition, due to their ability to fix atmospheric nitrogen, legumes can generally help reduce the use of fertilizers, and thus the environmental impact of agriculture (Reay et al., 2012; Andrews and Andrews, 2017). For all these reasons the use of legumes as a key ingredient for a sustainable agricultural production system is at the core of agricultural policy debates in different countries (Zander et al., 2016).

Among grain legumes, common bean (Phaseolus vulgaris L., 2n = 2x = 22) is one of the most important staples in the world, produced over an area of 18 million hectares with a total production of 12 million tons per year (Akibode and Maredia, 2011; Faostat, 2019). Its production mainly occurs in sub-Saharan Africa and in many Latin American countries (Petry et al., 2015), where it is critical to nutritional security and farmers' income generation (Broughton et al., 2003). The cultivated common bean originated in two centers of diversity, giving rise to two genepools: the Mesoamerican, from Central America, and the Andean, from the Andes mountains in South America. Much evidence indicates that the two genepools are the result of two independent domestication events that led to many morphological and genetic differences (Singh et al., 1991a,b; Kwak and Gepts, 2009).

Upon the introduction of the common bean in Europe from the Americas, hybridization of the two genepools generated further genetic diversity (Gepts et al., 1988; Zeven, 1997; Angioi et al., 2009; Gioia et al., 2013; Maras et al., 2013). For this reason, Europe is considered a secondary center of diversification for this species (Angioi et al., 2010). This process led to the constitution of many European common bean landraces that represent a very important resource for plant breeding. In fact, they have been and still are a useful, sometimes unique, source of favorable alleles for abiotic stress tolerance and for pest and disease resistance (Esquinas-Alcázar, 1993; Angioi et al., 2010).

Landraces are distinct and variable populations that are characterized by useful agronomical traits and adaptation to the specific environments where they were cultivated for a long time. It is important to stress that landraces differ from historical varieties; in fact, they lack "formal" crop improvement and are closely tied to the knowledge, habits and uses of the people who have grown them up to the present time (Raggi et al., 2013). Even if landraces are excellent raw material for breeding new varieties, the within-population genetic diversity of such materials makes their exploitation in plant breeding challenging. This applies to common bean too, where intra-landrace genetic diversity can be rather high (Tiranti and Negri, 2007) while intra-individual heterozygosity is rather low (Caproni et al., 2018). Indeed, difficulties may arise in the attempt to associate phenotypic traits of interest with the corresponding genetic determinants when using landraces; the identification of such associations is a fundamental prerequisite for allele mining (Visioni et al., 2013). Therefore, the development of a panel of a manageable number of diverse homozygous common bean genotypes is needed to cope with the above-mentioned limitations (Pignone et al., 2015).

Single Seed Descent (SSD), initially proposed as a modification of the classical bulk breeding scheme to overcome the problem of natural selection (Goulden, 1939), represents a cost-effective approach to achieve that purpose. Given a certain cross, the application of this method to segregating generations maximizes the level of retained genetic variation in relation to cost and labor. SSD consists of passing from one generation to the next by sowing a single seed from each plant (Brim, 1966). In a self-pollinating species like common bean, SSD can be effectively exploited to generate highly homozygous genotypes starting from single individuals of different landraces (Snape and Riggs, 1975).
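Why repeated selfing yields highly homozygous genotypes can be illustrated with the standard expectation that, under strict self-pollination, heterozygosity at a locus halves each generation. The following is a minimal sketch, not the procedure used in this study; the function names and the starting heterozygosity values are hypothetical and chosen only for illustration.

```python
import math


def expected_heterozygosity(h0: float, generations: int) -> float:
    """Expected heterozygosity after a number of selfing generations.

    Assumes strict self-pollination and no selection, so that
    heterozygosity halves every generation: H_t = H_0 * (1/2)**t.
    """
    if not 0.0 <= h0 <= 1.0 or generations < 0:
        raise ValueError("h0 must be in [0, 1] and generations must be >= 0")
    return h0 * 0.5 ** generations


def generations_needed(h0: float, target: float) -> int:
    """Smallest number of selfing generations giving H_t <= target."""
    if h0 <= 0.0 or target <= 0.0:
        raise ValueError("h0 and target must be positive")
    if target >= h0:
        return 0
    return math.ceil(math.log2(h0 / target))


# Hypothetical worst case: a fully heterozygous starting plant (h0 = 1.0)
for t in range(0, 7):
    print(t, expected_heterozygosity(1.0, t))
```

Because individual landrace plants of a self-pollinating species usually start from a heterozygosity well below 1.0, a few SSD cycles are expected to be enough to approach complete homozygosity under these assumptions.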

Although bi-parental mapping has been successful in identifying many significant Quantitative Trait Loci (QTL) mapped to wide intervals in the common bean genome, our knowledge of genes controlling certain traits is still limited (Johnson and Gepts, 2002; Kelly et al., 2003; Blair et al., 2006, 2011; Miklas et al., 2006; Kwak et al., 2008; Perez-Vega et al., 2010). In fact, the resolution of QTL analysis is generally limited by the number of recombination events; this means that a QTL can span a few centiMorgans (cM), which can translate into relatively long physical distances, sometimes containing hundreds of candidate genes (Moghaddam et al., 2016). By contrast, Genome Wide Association Mapping (GWAM) considers many more recombination events by using an association panel of individuals, each potentially characterized by a unique recombination history (Visscher et al., 2017). In addition, Genome Wide Association Studies (GWAS), based on a very high number of markers, allow association of the trait of interest to be tested across a large part of the genome of the target species. Due to their low cost per data point, high robustness, reproducibility and abundance in the genome, molecular markers based on Single Nucleotide Polymorphism (SNP) detection are the markers of choice for conducting GWAS.
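The core idea of testing each marker for association with a trait can be sketched as a single-marker scan over homozygous lines. The example below is a deliberately simplified illustration on invented toy data: it ranks markers by Welch's t statistic for the difference in mean days to flowering between the two allele classes. It does not correct for population structure or kinship, which a real GWAS would need to do (for example with a mixed model), and the marker names and phenotype values are hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's t statistic for the difference in means of two groups."""
    diff = mean(group_a) - mean(group_b)
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    if se == 0:
        # No within-group variation: avoid division by zero
        return 0.0 if diff == 0 else math.copysign(math.inf, diff)
    return diff / se


def scan_markers(genotypes, phenotype):
    """Rank markers by the absolute t statistic of their allele classes.

    genotypes: dict mapping marker name -> list of allele codes (0 or 1),
               one per homozygous line, in the same order as phenotype.
    """
    results = {}
    for marker, calls in genotypes.items():
        ref = [y for g, y in zip(calls, phenotype) if g == 0]
        alt = [y for g, y in zip(calls, phenotype) if g == 1]
        if len(ref) < 2 or len(alt) < 2:
            continue  # allele class too small to estimate a variance
        results[marker] = welch_t(alt, ref)
    return sorted(results.items(), key=lambda kv: abs(kv[1]), reverse=True)


# Invented toy data: eight lines, days to flowering, three markers
days_to_flowering = [38, 40, 39, 41, 52, 55, 53, 54]
genotypes = {
    "snp_A": [0, 0, 0, 0, 1, 1, 1, 1],  # tracks the phenotype closely
    "snp_B": [0, 1, 0, 1, 0, 1, 0, 1],  # unrelated to the phenotype
    "snp_C": [0, 0, 0, 0, 0, 0, 0, 1],  # rare allele, skipped by the scan
}
ranked = scan_markers(genotypes, days_to_flowering)
for marker, t_stat in ranked:
    print(marker, round(t_stat, 2))
```

In this toy panel the marker whose alleles separate early from late lines ranks first, which is the pattern a genome-wide scan looks for across many thousands of SNPs.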

Currently, different approaches can be used to generate large SNP datasets. For example, high-density SNP arrays are already available for several crops (Hao et al., 2017) including common bean. However, such arrays are often designed starting from a limited number of elite genotypes and can produce biased data when used for characterization of non-elite materials. Next Generation Sequencing (NGS) techniques, which likewise produce large numbers of data points, are an interesting alternative as they allow inexpensive, unbiased SNP discovery and genotyping. These approaches have already been used and proven efficient in several crops including wheat, barley and pea (Poland et al., 2012; Liu et al., 2014; Annicchiarico et al., 2017). Moreover, the current availability of reference genomes of several crops (which allows in silico simulations to optimize the technique and mapping of the markers) and of collections of genetically diverse pure lines (which allow sequence coverage to be reduced owing to the absence of heterozygous loci) makes Next Generation Genotyping (NGG) extremely attractive. Among the possible different NGG strategies (Davey et al., 2011; Barilli et al., 2018), double digest Restriction-site Associated DNA sequencing (ddRAD-seq) was chosen for this work. ddRAD-seq is a technique based on a digestion of genomic DNA carried out using two restriction enzymes (instead of a single restriction enzyme as in RAD-seq); the resulting DNA fragments are then ligated to sample-specific barcode adapters for subsequent bulk genotyping on an Illumina platform (Peterson et al., 2012).
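The kind of in silico simulation mentioned above can be sketched as a virtual double digest of a reference sequence. The example below is an assumption-laden illustration, not the pipeline used in this work: it uses the EcoRI (GAATTC) and MspI (CCGG) recognition sites purely as an example enzyme pair, places each cut at the start of the recognition site rather than at the true cleavage offset, assumes complete digestion, and runs on an invented toy sequence.

```python
import re


def double_digest(sequence, site_a, site_b, min_len, max_len):
    """Return (start, end) fragments flanked by one site of each enzyme.

    Only fragments bounded by two *different* enzymes and falling inside
    the size-selection window [min_len, max_len] are kept, mimicking the
    fragments retained in a ddRAD-seq library. Cut positions are
    approximated by the start of each recognition site.
    """
    seq = sequence.upper()
    # Lookahead patterns so that overlapping sites are also reported
    cuts = [(m.start(), "A")
            for m in re.finditer(f"(?={re.escape(site_a)})", seq)]
    cuts += [(m.start(), "B")
             for m in re.finditer(f"(?={re.escape(site_b)})", seq)]
    cuts.sort()

    fragments = []
    for (pos1, enz1), (pos2, enz2) in zip(cuts, cuts[1:]):
        length = pos2 - pos1
        if enz1 != enz2 and min_len <= length <= max_len:
            fragments.append((pos1, pos2))
    return fragments


# Invented toy sequence with sites at positions 0, 26, 80 and 94
toy = "GAATTC" + "A" * 20 + "CCGG" + "T" * 50 + "CCGG" + "A" * 10 + "GAATTC"

print(double_digest(toy, "GAATTC", "CCGG", min_len=20, max_len=60))
print(double_digest(toy, "GAATTC", "CCGG", min_len=10, max_len=60))
```

Run against a reference genome, the number of fragments returned for different enzyme pairs and size windows gives a rough estimate of how many loci a given ddRAD-seq design would sample.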

Schmutz et al. (2014) published the first reference genome for P. vulgaris. This achievement opened novel possibilities for common bean NGG making the use of techniques, such as ddRAD-seq, potentially very effective.

Recently, association studies were carried out on common bean using different plant materials and genotyping approaches. These studies searched for meaningful associations between genetic determinants and agronomic traits (Moghaddam et al., 2016), nitrogen fixation (Kamfwa et al., 2015a), disease resistance (Perseguini et al., 2016), seed weight (Yan et al., 2017), and technological traits such as cooking time in dry beans (Cichy et al., 2015). In some cases, these studies allowed the identification of candidate genes that can be used to develop new genetic stocks for bean breeding programs.

As flowering time is a key trait determining the production of dry matter and seed yield in many crops such as common bean, its manipulation is a relevant plant breeding target to produce novel varieties that are better adapted to changing climatic conditions (Jung and Müller, 2009). For example, early flowering can be exploited to avoid harsh environmental conditions (e.g., drought and heat) and/or escape pathogen attacks that can both negatively affect seed production (as they occur during/after the seed set stage). On the other hand, late flowering can increase seed yield by extending the vegetative phase and increasing the photosynthate accumulation. A flowering time well-synchronized with target environmental conditions would contribute to the achievement of optimal crop performances.

Extensive studies on floral transition revealed a network of regulatory interactions among genes able to promote or inhibit the phenological transition to the reproductive phase (i.e., flowering). In Arabidopsis many of the regulatory genes have been identified and functionally characterized (Putterill et al., 2004; Bäurle and Dean, 2006). Moreover, different species, such as medicago (Medicago truncatula) (Pierre et al., 2008, 2011; Laurie et al., 2011), pea (Pisum sativum) (Lejeune-Hénaut et al., 2008) and narrow-leafed lupin (Lupinus angustifolius) (Ksiazkiewicz et al., 2016; Nelson et al., 2017) have been used to investigate the genetic control of flowering in legumes.

In P. vulgaris few studies on flowering time variation and control have been carried out to date. QTL mapping studies detected some genomic regions associated with the trait (Koinange et al., 1996; Blair et al., 2006; Perez-Vega et al., 2010). Raggi et al. (2014) found significant associations between some candidate genes and flowering time variation in a common bean collection. Recently, Kamfwa et al. (2015b) and Moghaddam et al. (2016) identified SNPs significantly associated with days to flowering.

In our study, GWAS was used to detect key genomic regions involved in flowering time control. For this purpose, a panel of highly homozygous and diverse common bean genotypes was developed using SSD. Genotypes within the panel were subjected to extensive genotyping, using ddRAD-seq, and to phenotypic characterization carried out in different years and locations.

#### MATERIALS AND METHODS

#### Plant Material

The plant material for this work was initially selected with the aim of creating a balanced collection of Andean and Mesoamerican landraces potentially representing an important portion of the European diversity of this species. A similar number of accessions from the two common bean genepools was initially considered; according to the available data on phaseolin alleles, 97 Andean (57 T + 40 C type) and 84 Mesoamerican (all S type) accessions were included. Europe is the most represented geographical area in the panel (153 accessions), followed by South and Central America (22 and 17, respectively). Italy accounts for the highest number of accessions, followed by Turkey, Spain, the Netherlands, and Portugal. A heatmap representing the origin of the materials is reported in **Figure 1**.

Starting from the collection described above, 181 highly homozygous common bean genotypes (i.e., pure lines) were obtained by applying SSD for at least 5 consecutive generations under isolated conditions. The 181 lines, together with 11 cultivars included as controls, constitute the diversity panel used in this study, for a total of 192 lines (NCBI BioSample accessions from SAMN12035168 to SAMN12035359). Further details about the lines within the panel, including the genebank from which each accession was originally obtained, are reported in **Supplementary Table 1**.

#### Phenotyping

The phenological characterization of the 192 genotypes was carried out for two consecutive seasons (2016 and 2017) at: (i) the DSA3-UNIPG experimental field located in Sant'Andrea d'Agliano, Perugia, Italy (43° 3′ 15.12″ N; 12° 23′ 41.64″ E, 175 m a.s.l.) (hereafter PG) and (ii) the CREA-CI experimental field located in Anzola dell'Emilia, Bologna, Italy (44° 34′ 30.51″ N; 11° 9′ 55.64″ E, 38 m a.s.l.) (hereafter BO). In 2016, plant material was evaluated only in PG, while in 2017 it was evaluated in both PG and BO. In both years, sowing was carried out in May: on the 4th (PG\_2016), the 11th (PG\_2017), and the 12th (BO\_2017).

The three experiments were all arranged using partially replicated randomized designs in which five entries were replicated five times and two were replicated six times, producing a total of 222 single plant samples out of 192 entries [total samples = 192 – 7 + (5 × 5) + (2 × 6)]. In PG, the 222 common bean samples were grown in 6 adjacent blocks (fixed size of 1 column × 37 rows) covered by anti-insect net; in BO the same samples were arranged in 3 adjacent blocks (fixed size of 1 column × 74 rows). Plants were grown in a net covered nursery supplied with an automatic drip-irrigation system throughout the entire duration of the trials (May to mid-October). For each sample days to flowering (dtf) was recorded as days between sowing and the opening of the first flower (Raggi et al., 2014); a value of 162 days was assigned to the genotypes that did not flower by the end of the experiments (Zhao et al., 2007).
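The sample count of the partially replicated design can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers given above; the function and variable names are illustrative and are not part of the original analysis.

```python
def total_samples(n_entries, replicated):
    """Total single-plant samples in a partially replicated design.

    replicated maps "replicates per entry" -> "number of entries with that
    many replicates"; every other entry is grown exactly once.
    """
    n_replicated_entries = sum(replicated.values())
    unreplicated = n_entries - n_replicated_entries
    return unreplicated + sum(reps * count for reps, count in replicated.items())


# Values from the text: 192 entries, 5 entries x 5 replicates, 2 entries x 6 replicates.
samples = total_samples(192, {5: 5, 6: 2})

# Field layouts from the text as (blocks, rows per block); each block is 1 column wide.
layouts = {"PG": (6, 37), "BO": (3, 74)}
layout_sizes = {site: blocks * rows for site, (blocks, rows) in layouts.items()}

print(samples, layout_sizes)
```

Both layouts provide exactly as many positions as there are samples, which is what allows the row and column coordinates to be used later in the spatial analysis.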

#### Phenotypic Data Analyses

The row-column layout of the grown plants and their partial replication allowed for a two-dimensional spatial analysis of dtf (Singh et al., 1997, 2003; Rollins et al., 2013; Raggi et al., 2017). For this purpose, "plot," "row," and "column" numbers were assigned to each sample according to its position. For each entry, dtf Best Linear Unbiased Predictors (BLUPs) of the genotype effect were calculated in GenStat® (Payne et al., 2011) using the most suitable spatial model determined for the row and column field layout, as described by Singh et al. (2003). The procedure consists of gauging the spatial variability with nine applicable models accounting for the existence of different trends, fitting each model (according to the sample position, using the Restricted Maximum Likelihood method, REML), and choosing the best one using the Akaike Information Criterion (AIC) (Akaike, 1974). The variance components were used to estimate dtf broad-sense heritability (He<sup>2</sup><sub>B</sub>), along with its standard error, on a plot basis as:

$$He_B^2 = \frac{\sigma_g^2}{\sigma_p^2} \times 100$$

where $\sigma_p^2 = \sigma_e^2 + \sigma_g^2$ (phenotypic variance), $\sigma_g^2$ = genotypic variance, and $\sigma_e^2$ = error variance.
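As an illustration of the formula, the sketch below computes broad-sense heritability from the two variance components. The input values are hypothetical and chosen only to show the calculation; they are not estimates from this study, and the standard error (obtained from the REML fit in the original analysis) is not reproduced here.

```python
def broad_sense_heritability(var_g, var_e):
    """Broad-sense heritability (%) on a plot basis.

    var_g: genotypic variance; var_e: error variance.
    The phenotypic variance is taken as var_g + var_e, as defined in the text.
    """
    if var_g < 0 or var_e < 0:
        raise ValueError("variance components must be non-negative")
    var_p = var_g + var_e
    if var_p == 0:
        raise ValueError("phenotypic variance is zero")
    return var_g / var_p * 100.0


# Hypothetical variance components, for illustration only.
example = broad_sense_heritability(var_g=90.0, var_e=10.0)
print(example)
```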

Descriptive statistics and Pearson's correlation coefficients among dtf BLUPs of the three trials were calculated using the R package "agricolae" (de Mendiburu, 2017); results were then visualized using "ggplot2" package (Wickham, 2009). BLUPs datasets were then used to perform GWAS.

#### DNA Extraction

Genomic DNA was isolated from young leaf tissues, collected from 15-day-old single seedlings, using the TissueLyser II (Qiagen) and the DNeasy 96 plant kit (Qiagen) according to the procedure provided by the manufacturer. DNA concentration and quality were estimated using UV-Vis spectrophotometry (NanoDrop 2000™, Thermo Fisher Scientific). DNA integrity was evaluated by electrophoresis on 1% agarose gels (Euro Clone) stained with ethidium bromide. DNA samples were then diluted to 30 ng/µl for subsequent genotyping.

#### Genotyping

A double digest Restriction-site Associated DNA sequencing (ddRAD-seq) approach was used for genotyping. Library preparation and sequencing were carried out by IGAtech (Udine, Italy). Before starting the procedure, DNA concentration was checked again using a fluorimetric assay to further normalize the samples. The libraries were produced using a custom protocol (IGAtech), with minor modifications with respect to the one described by Peterson et al. (2012). In silico analysis was initially performed to select the best combination of restriction enzymes using the common bean reference genome v1.0 (Schmutz et al., 2014). Since the analysis indicated SphI and MboI as the best restriction enzyme combination to maximize the number of sequenced loci, they were used for DNA digestion. Digested DNA was purified with AMPureXP beads (Agencourt) and ligated to barcode adapters. Samples were then pooled in multiplexing batches and bead purified. For each pool, the target fragment distribution was collected on a BluePippin instrument (Sage Instruments Inc., Freedom, CA, United States). The gel-eluted fraction was amplified with oligo primers that introduce TruSeq indexes and subsequently bead purified. The resulting libraries were then checked with both a Qubit 2.0 Fluorimeter (Invitrogen, Carlsbad, CA, United States) and a Bioanalyzer DNA assay (Agilent Technologies, Santa Clara, CA, United States). Libraries were processed with Illumina cBot for cluster generation on the flow cell, following the manufacturer's instructions, and sequenced with V4 chemistry in paired-end 125 bp mode on a HiSeq2500 instrument (Illumina, San Diego, CA, United States).

Raw Illumina sequences were demultiplexed using Stacks v2.0 (Catchen et al., 2013) and subsequently aligned to the common bean reference genome v1.0 (Schmutz et al., 2014) using BWA-MEM (Li and Durbin, 2009) with default parameters. Stacks v2.0 was also used to detect all covered SNP loci from the aligned reads and to filter the detected loci using the populations program (included in Stacks v2.0). In this last step, only loci represented in at least 75% of the population were retained.

#### SNP Quality Control

Several quality control steps were performed on the SNP dataset using PLINK v1.09 (Purcell et al., 2007) and TASSEL v 5.2 software (Bradbury et al., 2007). In particular, the following were removed: (i) SNP loci with more than 10% missing data, (ii) individuals with more than 10% missing loci, and (iii) markers with a Minor Allele Frequency (MAF) lower than 5%. Loci characterized by heterozygosity ≥2% were also discarded.
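The logic of these filters can be sketched in plain Python. This is not the PLINK or TASSEL implementation: it is a minimal illustration that assumes genotypes are coded as the count of the alternative allele (0, 1, or 2, with `None` for a missing call) and that the filters are applied in the order listed above. The function names and the toy data are hypothetical.

```python
def snp_stats(calls):
    """Return (missing rate, minor allele frequency, heterozygosity) for one SNP."""
    if not calls:
        return 1.0, 0.0, 0.0
    called = [g for g in calls if g is not None]
    missing = 1 - len(called) / len(calls)
    if not called:
        return missing, 0.0, 0.0
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)
    het = sum(1 for g in called if g == 1) / len(called)
    return missing, maf, het


def quality_control(genotypes, max_missing=0.10, min_maf=0.05, max_het=0.02):
    """genotypes: dict sample -> list of calls (same SNP order in every sample).

    Returns (kept sample names, kept SNP indices).
    """
    samples = list(genotypes)
    n_snps = len(genotypes[samples[0]])
    # (i) remove SNPs with more than max_missing missing calls
    snps = [j for j in range(n_snps)
            if snp_stats([genotypes[s][j] for s in samples])[0] <= max_missing]
    # (ii) remove individuals with more than max_missing missing loci
    samples = [s for s in samples
               if sum(genotypes[s][j] is None for j in snps) <= max_missing * len(snps)]
    # (iii) remove SNPs with low MAF, and SNPs with heterozygosity >= max_het
    kept = []
    for j in snps:
        _, maf, het = snp_stats([genotypes[s][j] for s in samples])
        if maf >= min_maf and het < max_het:
            kept.append(j)
    return samples, kept


# Toy data: SNP 0 passes; SNP 1 is monomorphic; SNP 2 is too heterozygous;
# SNP 3 has 50% missing calls.
toy = {
    "a": [0, 0, 1, None],
    "b": [2, 0, 0, None],
    "c": [0, 0, 0, 2],
    "d": [2, 0, 0, 0],
}
kept_samples, kept_snps = quality_control(toy)
print(kept_samples, kept_snps)
```

The strict heterozygosity cut-off is only sensible because the panel consists of pure lines, in which heterozygous calls are more likely to reflect genotyping artifacts than true variation.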

#### Detection of Population Structure and Cryptic Relatedness

The analyses of structure and cryptic relatedness of genotypes in the panel were carried out using a reduced dataset from which loci in strong Linkage Disequilibrium (LD) (r<sup>2</sup> ≥ 0.3) had been removed. To detect the population stratification of the developed panel, a Bayesian clustering approach was used. The number of clusters was initially tested in STRUCTURE v.2.3.4 (Pritchard et al., 2000) assuming an admixture model for different numbers of clusters (K), ranging from 1 to 11. For each tested K, 10 independent runs were carried out, each with a burn-in period of 30,000 iterations followed by 30,000 Markov Chain Monte Carlo (MCMC) iterations. The most likely number of clusters was then inferred using the Evanno test (Evanno et al., 2005) implemented in the online tool STRUCTURE HARVESTER (Earl and vonHoldt, 2012). According to the result, a new single run was performed at the selected K using a burn-in period of 100,000 iterations and 200,000 MCMC iterations. The resulting population Q-matrix was used (i) to generate the corresponding Q-plot using the software DiStruct (Rosenberg, 2003) and (ii) to correct the association analyses for the putative population structure. Moreover, a kinship matrix was generated using PLINK v1.09 and visualized as a heatmap and dendrogram using the R package "ggplot2" (Wickham, 2009).
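The Evanno test selects K from the second-order rate of change of the log-likelihood across successive K values, scaled by the variability among replicate runs. The sketch below illustrates that idea under a simplifying assumption: the second difference is taken on the across-run mean log-likelihoods. It is not the STRUCTURE HARVESTER code, and the log-likelihood values are invented for illustration.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp):
    """lnp: dict K -> list of log-likelihoods from replicate runs at that K.

    Returns dict K -> delta K for every K with both neighbours (K-1 and K+1).
    Delta K = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)), using across-run means.
    """
    avg = {k: mean(values) for k, values in lnp.items()}
    delta = {}
    for k in sorted(lnp):
        if k - 1 in avg and k + 1 in avg:
            second_diff = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
            sd = stdev(lnp[k])
            if sd > 0:  # delta K is undefined when all replicate runs are identical
                delta[k] = second_diff / sd
    return delta


# Invented log-likelihoods (three runs per K), for illustration only.
toy_lnp = {
    1: [-5000, -5002, -4998],
    2: [-3000, -3001, -2999],
    3: [-2900, -2910, -2890],
    4: [-2850, -2870, -2830],
}
delta_k = evanno_delta_k(toy_lnp)
best_k = max(delta_k, key=delta_k.get)
print(delta_k, best_k)
```

By construction the statistic cannot be computed for the smallest and largest K tested, which is one reason to test a range wider than the expected number of clusters.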

#### Genome-Wide Association Analysis

Marker-trait association analyses were performed using a Mixed Linear Model (MLM) implemented in TASSEL v 5.2 that includes corrections for both population structure (Q) and kinship (K). The use of such a model was necessary because P. vulgaris is characterized by a strong genetic structure (Kwak and Gepts, 2009; Raggi et al., 2013). The three BLUP datasets were used as the phenotype input matrix in a single association analysis.

The resulting p-values were then plotted as –log10(p) to produce a Manhattan plot using the R package "CMplot" (Yin, 2016). Correction for multiple testing was carried out using the Bonferroni adjustment based on the estimated number of independent recombination blocks, calculated using PLINK according to Gabriel et al. (2002). For the SNP markers that remained significant after the Bonferroni correction, possible candidate genes were identified based on proximity (maximum ± 100 kb) (Patishtan et al., 2018) and by browsing the P. vulgaris genome using the online tool JBrowse on Phytozome v. 12.1 (Goodstein et al., 2012). To take advantage of the latest version of the common bean reference genome, sequences containing the significant SNPs were positioned against the P. vulgaris reference version 2.1. Nucleotide sequences of putative candidate genes were translated into the corresponding proteins and used as queries against the Arabidopsis thaliana protein database (Araport11 protein sequences) using the online tool BLASTP (AA query, AA db) available at: https://www.arabidopsis.org/Blast/.
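A Bonferroni threshold based on independent blocks can be expressed directly on the –log10(p) scale. In the sketch below, the number of blocks (2,443) is the value reported in the Results; the family-wise error rate is an assumption, since it is not stated in the text. A value of 0.01 is used because it reproduces the reported threshold of 5.4, whereas 0.05 would give approximately 4.7.

```python
import math


def bonferroni_threshold(n_tests, alpha):
    """Significance threshold on the -log10(p) scale for n_tests independent tests."""
    if n_tests <= 0 or not 0 < alpha < 1:
        raise ValueError("n_tests must be positive and alpha must lie in (0, 1)")
    return -math.log10(alpha / n_tests)


n_blocks = 2443  # independent recombination blocks reported in the Results
alpha = 0.01     # assumption: not stated in the text, chosen to match the reported 5.4

threshold = bonferroni_threshold(n_blocks, alpha)
print(round(threshold, 1))  # 5.4
```

Counting recombination blocks rather than individual SNPs gives a less conservative threshold than a correction over all 49,518 markers, because markers within a block are not independent tests.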

#### Linkage Disequilibrium

A rough estimate of LD decay was obtained by dividing the size of the common bean genome (bp) by the number of independent recombination blocks within the panel, calculated according to Gabriel et al. (2002). To ascertain whether significant SNPs and their respective candidate genes were located in the same recombination blocks, further LD analyses were carried out. In particular, LD patterns were studied within a window of ±1.5 Mb centered on the significant marker. This analysis was only performed for SNPs located outside the identified candidate genes. Marker sets for the windows surrounding the associated SNPs were generated using PLINK v. 1.09 by subsetting the whole SNP dataset obtained after QC, and were then paired with their corresponding p-values. Pairwise LD (r<sup>2</sup>) between markers within the windows was calculated using HaploView 4.2 (Barrett et al., 2005); the same software was also used to produce graphical representations of the results.
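Both steps are simple enough to sketch. The genome size used below (587 Mb) is an assumption, since the text does not state the value used; the number of blocks is the one reported in the Results. The window helper and the marker positions are hypothetical and only illustrate the ±1.5 Mb subsetting, not the PLINK command actually used.

```python
def ld_decay_estimate(genome_size_bp, n_blocks):
    """Rough LD decay distance (bp): genome size divided by independent blocks."""
    return genome_size_bp / n_blocks


def snps_in_window(positions, center_bp, half_width_bp=1_500_000):
    """Return the positions lying within +/- half_width_bp of center_bp."""
    return [p for p in positions if abs(p - center_bp) <= half_width_bp]


genome_size_bp = 587_000_000  # assumption: approximate P. vulgaris genome size
n_blocks = 2443               # independent recombination blocks (Results)
decay_kb = round(ld_decay_estimate(genome_size_bp, n_blocks) / 1000)
print(decay_kb)  # 240

# Hypothetical marker positions (bp) on one chromosome, for illustration only.
positions = [1_000_000, 9_000_000, 10_200_000, 11_400_000, 12_000_000]
window = snps_in_window(positions, center_bp=10_000_000)
print(window)
```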

#### RESULTS

#### Phenotyping

A total of 648 (97.3%) common bean samples were successfully characterized for dtf during the three experiments. In BO-2017, the spatial analysis was more efficient than the completely randomized design: the spatial model CrdL (Completely randomized design with linear trends along rows) showed an efficiency gain of 22.4% over the Completely randomized design (Crd). The Crd was the best model for BLUP calculation from data collected in PG-2016 and PG-2017. Summary statistics of BLUPs are reported in **Table 1**. As expected, dtf showed high broad sense heritability (He<sup>2</sup><sub>B</sub>) in all trials (**Table 1**). Distribution analysis showed consistent data dispersion and the existence of a number of late flowering genotypes (**Figure 2A**); on the other hand, no differences were observed when data were analyzed separately according to the genepool (**Supplementary Figure 1**). Simple linear regressions of dtf in pairwise comparisons between years (**Figure 2B**) and experimental sites (**Figure 2C**) revealed significant and high correlation in both cases, with R<sup>2</sup> values equal to 0.90 (P < 0.001) and 0.93 (P < 0.001), respectively. The full BLUPs dataset is available in **Supplementary Table 2**.

TABLE 1 | Summary statistics, broad sense heritability, and spatial models used for the estimation of days to flowering BLUPs of 192 common bean genotypes.

<sup>a</sup>Broad sense heritability. <sup>b</sup>Completely randomized design. <sup>c</sup>Completely randomized design with linear trends along rows.

#### Genotyping

The ddRAD-seq genotyping generated a dataset of 106,072 polymorphic loci, of which 99.3% (105,319) were mapped on the reference genome v.1.0 (Pvulgaris\_218\_v1.0.fasta) (Schmutz et al., 2014). The full genotyping dataset is available at: https://www.ebi.ac.uk/ena/data/view/PRJEB33063.

After quality control, no genotype was excluded and a dataset of 49,518 SNP markers evenly distributed over the 11 common bean chromosomes was retained for association analyses. A graphical representation of the SNP distribution over the eleven bean chromosomes is reported in **Figure 3**.

#### Genetic Structure and Cryptic Relatedness

After removing SNP markers in strong LD (r<sup>2</sup> ≥ 0.3), a dataset of 2,518 SNPs was generated and used to perform STRUCTURE and cryptic relatedness analyses (**Supplementary Figure 2**). Results of the Evanno test clearly indicated K = 2 as the most suitable level of population subdivision to explain the genetic structure of the studied panel. STRUCTURE group attributions were strongly consistent with the two common bean genepools (i.e., Mesoamerican and Andean). A graphic representation of the genetic structure of the panel is reported in **Figure 4**. Using a membership threshold of q ≥ 0.8 for assignment to a group (Bitocchi et al., 2012; Klaedtke et al., 2017), 11 out of 192 genotypes were classified as the product of admixture between the two genetic groups. All eleven admixed genotypes derived from European accessions (153), indicating a level of hybridization between the Andean and Mesoamerican genepools equal to 7.2% (11 out of 153). The admixed entries derived from 9 landrace accessions (Pv\_072, Pv\_073, Pv\_077, Pv\_086, Pv\_092, Pv\_128, Pv\_131, Pv\_134, and Pv\_190) and 2 cultivars (Pv\_059 and Pv\_064).
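The admixture call and the resulting hybridization level can be sketched as follows. The classification rule assumes that a genotype is considered admixed when none of its membership coefficients reaches the threshold; the example q values are hypothetical, while the counts (11 and 153) are those reported above.

```python
def is_admixed(q_values, threshold=0.8):
    """True when no cluster membership coefficient reaches the threshold."""
    return max(q_values) < threshold


def percentage(part, whole):
    """Share of `part` in `whole`, as a percentage rounded to one decimal."""
    return round(100 * part / whole, 1)


# Hypothetical membership coefficients for K = 2, for illustration only.
print(is_admixed([0.95, 0.05]))  # assigned to one genepool
print(is_admixed([0.60, 0.40]))  # admixed

# Counts from the text: 11 admixed genotypes among 153 European accessions.
hybridization_level = percentage(11, 153)
print(hybridization_level)  # 7.2
```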

Results of cryptic relatedness are also graphically presented in **Figure 4**. According to genotype origins, inferred using the available information about phaseolins, the bluish square in the bottom-left part of the heatmap includes most of the genotypes of Mesoamerican origin (72). The plot also indicates a further possible sub-structure of the Mesoamerican genotypes into 3 subgroups, the largest of which includes about 50% of all the Mesoamerican samples (**Figure 4**). On the other hand, the bluish square in the top-right part of the heatmap groups the Andean genotypes (94), with very few exceptions. In this case too, a further subdivision of the group is evident, but only one subgroup is clearly distinct (**Figure 4**). The 11 admixed genotypes are grouped in the middle of the heatmap; they are characterized by average relatedness values with respect to all other genotypes (light blue, light red, or white color).

#### Genome-Wide Association Analysis

Across all the trials, high and consistent He<sup>2</sup><sub>B</sub> values were observed for dtf, confirming the suitability of the trait for GWAS. The Bonferroni correction for multiple testing, calculated considering the number of independent recombination blocks (2,443), resulted in a threshold equal to 5.4 (–log10(p)). GWAS results showed that multiple regions are associated with dtf in the common bean genome (**Table 2**). The lowest p-value (i.e., the strongest association) was obtained for SNP 123164\_60 on chromosome Pv08. Significant associations were also found for SNPs 66929\_307, 17455\_7, 95297\_22, 59746\_63, 59746\_36, 116028\_71, and 17777\_7. In total, 8 significant SNPs for dtf were identified on 4 different common bean chromosomes: Pv01, Pv04, Pv06, and Pv08 (**Table 2**). The Manhattan plot of GWAS results, based on the 49,518 SNP markers, is reported in **Figure 5**.

#### Candidate Gene Identification

The search for possible candidate genes for the significant SNPs, carried out using Phytozome (v. 12.2) and TAIR, resulted in the identification of 7 possible candidates.

When aligned to the P. vulgaris reference genome, the sequenced fragment containing SNP 123164\_60 produced multiple hits, making the discovery of an associated candidate gene rather complex. However, our analysis detected a relevant gene, Phvul.008G149900, located 100 kb upstream of a highly significant hit. Although no functional annotation was found on the common bean reference genome for this gene, it is noteworthy that its encoded protein is highly similar to Arabidopsis At3G12810. This protein, similar to ATP-dependent chromatin-remodeling proteins of the ISWI family, is encoded by Photoperiod-Independent Early flowering 1 (PIE1), which is involved in multiple flowering pathways.

TABLE 2 | List of the significant SNPs identified in the study including physical position, association level, phenotypic variation explained by the SNP and MAF.

<sup>a</sup>SNP names are coded as: "fragment number"\_"SNP physical position in the fragment." <sup>b</sup>SNP physical position on the respective chromosome and according to P. vulgaris genome v 1.0.

Located only 50 kb downstream of SNP 66929\_307, on Pv04, Phvul.004G112100 is our best candidate to explain the phenotypic variation associated with this marker. The gene encodes a NAD(P)-binding Rossmann-fold superfamily protein with oxidoreductase activity in the chloroplast.

The search for the best candidate gene for marker SNP 17455\_7 resulted in the identification of Phvul.001G227200. In this case, the SNP is located in the first intron of the gene. Phvul.001G227200 is a homolog of A. thaliana At1G56260, also known as Meristem Disorganization 1 (MD1), which is required for the maintenance of stem cells through a reduction in DNA damage (TAIR, 2019b). Phvul.001G236000 is the best candidate to explain the phenotypic variation associated with the second peak observed in the same region on Pv01 (i.e., SNP 17777\_7), as displayed in **Figure 6A**. The gene, located only 10 kb upstream of the SNP, encodes a protein phosphatase 2C 3-Receptor involved in abscisic acid signal transduction.

Phvul.006G215800 was the best candidate to explain the effect of SNP 95297\_22, detected on Pv06. The gene encodes Potassium Channel AKT2/3.

The two significant SNPs detected on Pv04 (SNP 59746\_36 and 59746\_63) co-localize in the same chromosome region, being separated by only 27 bp. Located 40 kb upstream of the signal, Phvul.004G085100 is the best candidate gene to explain the effect of the markers. The homologous gene in Arabidopsis encodes a sucrose transporter protein: AtSUC2.

Finally, we identified Phvul.008G055400 as the most meaningful candidate gene associated with SNP 116028\_71. The gene encodes a Leucine-Rich Repeat Receptor-Like Protein, also known as Clavata2 (CLV2).

#### Linkage Disequilibrium

In the studied panel, LD decays over an average distance of circa 240 kb. According to the results of LD analysis, SNP 17777\_7 and the candidate Phvul.001G236000 are in the same recombination block, supporting the association between the marker and the identified gene (**Figure 6A**). SNP 17455\_7 is also displayed in the same figure panel owing to its position near SNP 17777\_7. In this case, LD analysis was not necessary as the marker is physically located in the first intron of the corresponding candidate (Phvul.001G227200); it is noteworthy that several recombination events occurred between the two markers. A clear association was also observed for SNPs 59746\_36 and 59746\_63 with Phvul.004G085100 (**Figure 6B**) and for SNP 66929\_307 with Phvul.004G112100 (**Figure 6C**). A fairly high average r<sup>2</sup> value was observed between the Phvul.008G055400 recombination block and SNP 116028\_71 (**Figure 6D**).

#### DISCUSSION

The SSD strategy used in this study made it possible to produce a panel of highly homozygous common bean genotypes starting from 179 different landraces, each of which is putatively characterized by relatively high levels of diversity (Tiranti and Negri, 2007; Negri and Tiranti, 2010). Indeed, molecular data demonstrated that the genotypes in our panel are genetically uniform, with a very low level of heterozygosity. At the same time, the panel retained a high level of among-genotype diversity owing to the different origins of the initially selected landraces (**Figure 4**). This approach produced a panel of common bean pure lines that can be used indefinitely for association analyses on a plethora of traits of interest for both basic biology studies and plant breeding. Seed samples of each developed pure line are currently conserved, under long-term storage conditions, in the genebank held by DSA3 (FAO code: ITA-363).

Results of the phenotypic characterization for dtf showed a rather high level of diversity within the panel (Rodiño et al., 2003; Raggi et al., 2014; Rana et al., 2015). Results of the partially replicated experimental design indicated high levels of dtf He<sup>2</sup><sub>B</sub>, a crucial parameter for finding meaningful and promising associations. Such a design has already been used in barley for association analysis on yield performance (Al-Abdallat et al., 2017). In our study, the use of this experimental design also made it possible to test for, and if necessary correct, any bias related to sample position within the experimental plots, such as differences in soil fertility and light exposure. As expected, Crd was the best model for BLUP calculation in two out of three cases, since no such biases were detected. It is also noteworthy that this particular experimental design maximized the number of phenotypic data points while reducing the costs and space needed to characterize such a collection of germplasm.

Among the different methods that can be used to generate SNP datasets, the ddRAD-seq approach selected here resulted in a very high number of SNPs evenly distributed over the eleven common bean chromosomes (**Figure 3**). Regarding the ddRAD-seq protocol used, the in silico digestion of the common bean reference genome made it possible to select the enzyme combination that maximized the number of sequenced loci. It is noteworthy that in our study, ddRAD-seq outperformed available common bean SNP chips in terms of the number of markers successfully genotyped (Cichy et al., 2015; Kamfwa et al., 2015a,b; Moghaddam et al., 2016). In addition, we believe that the technique used for genotyping can help reduce the ascertainment bias deriving from the use of chip arrays for genotyping non-elite material.

GWAS is a powerful tool to dissect the genetic control of quantitative traits, potentially providing a higher resolution than QTL mapping. Therefore, in recent years, interest in this approach has grown in both the academic and commercial sectors (Davey et al., 2011). In our study, association analysis detected eight significant SNPs associated with dtf on four common bean chromosomes: Pv01, Pv04, Pv06, and Pv08. The analysis of the genomic regions surrounding the detected SNPs allowed the identification of seven meaningful candidate genes that could have an important role in controlling the studied trait. In previous studies, QTL for dtf on common bean chromosome Pv01 have been widely reported (Blair et al., 2006; Perez-Vega et al., 2010; Mukeshimana et al., 2014). Moreover, recent research based on GWAS further confirmed the presence of genomic regions involved in the control of this trait on the same chromosome (Kamfwa et al., 2015b; Moghaddam et al., 2016). Similarly, QTL on Pv08 were already reported (Koinange et al., 1996; Perez-Vega et al., 2010) and also confirmed in the above-mentioned studies based on GWAS. According to these published records and the data produced in our study, the associations on Pv01 and Pv08 are likely to be stable across different environments and genetic backgrounds (Kamfwa et al., 2015b). In addition, we observed significant associations on chromosome Pv04, which were previously reported by a QTL mapping study (Mukeshimana et al., 2014) and a GWAS study (Moghaddam et al., 2016). Finally, Blair et al. (2006) indicated the presence of a QTL for dtf on common bean chromosome Pv06 that was also detected in our study.

The first candidate gene identified in this study, Phvul.008G149900, encodes a protein that is highly similar to PIE1. It is noteworthy that mutations of PIE1 in A. thaliana suppress the Flowering Locus C-mediated delay of flowering and cause early flowering even under non-inductive photoperiods (Noh and Amasino, 2003).

Phvul.004G112100 was the best candidate to explain the phenotypic variation associated with SNP 66929\_307. In A. thaliana, mutants for the homologous gene (At4G23430) showed an early-flowering phenotype (TAIR, 2019d), suggesting a role for Phvul.004G112100 in common bean flowering time control. Mapping homologous Arabidopsis sequences for photoperiod sensitivity in common bean, Kwak et al. (2008) located one of the homologs of Terminal Flower1 (PvTFLx) on chromosome Pv04. In particular, PvTFLx is located 2 Mb downstream of our best candidate, suggesting that this region harbors different genes involved in flowering time control.

Phvul.001G227200, a homolog of MD1 in Arabidopsis, was the candidate identified to explain the significance of SNP 17455\_7. Interestingly, A. thaliana mutants of this gene showed several developmental defects, such as abnormal phyllotaxy and plastochron, stem fasciation, and reduced root growth (Hashimura and Ueguchi, 2011). In the same study, the authors reported that in mutants "leaves and floral buds did not develop in a spatially and temporally regulated manner," pointing to a possible role of the gene in flowering control. It is also noteworthy that, in maize, shoot apical meristem development has been associated with flowering time (Leiboff et al., 2015). Further analyses within the same chromosomal region revealed that Phvul.001G236000 is the best candidate to explain the phenotypic variation associated with marker 17777\_7. A. thaliana mutants of the homologous At4G26080 showed a late flowering phenotype. It is noteworthy that this narrow chromosomal region (circa 1.3 Mb) contains the two candidate genes detected in this study on Pv01 together with Phvul.001G221100, which encodes Phytochrome A (Kamfwa et al., 2015b). Finally, this chromosomal region overlaps with a QTL for days to flowering identified by Blair et al. (2006) using a bi-parental mapping approach: one of the two flanking markers of the QTL falls in the above-mentioned region. Taken together, this experimental evidence suggests the presence of a gene cluster involved in flowering time control on chromosome Pv01.

Phvul.006G215800, the best proposed candidate for marker 95297\_22 on chromosome Pv06, encodes Potassium Channel AKT2/3, a photosynthate- and light-dependent inward rectifying potassium channel with unique gating properties that are regulated by phosphorylation (TAIR, 2019e). It has been demonstrated that loss of function of AKT2/3 affects sugar loading into the phloem of A. thaliana, and mutants show delayed flower induction and rosette development (Deeken et al., 2002). Such a phenotype corroborates our hypothesis that Phvul.006G215800 is involved in flowering.

Phvul.004G085100 on chromosome Pv04 encodes a sucrose transport protein. Interestingly, in A. thaliana, a mutation of the homolog (At1G22710) causes dwarfism and delayed development, and it has been reported that such plants can occasionally flower but never produce viable seeds (TAIR, 2019a). At1G22710, also known as AtSUC2, was one of the first genes associated with sucrose transport (Sauer and Stolz, 1994); this gene is required for phloem loading of sucrose and its activity has been described in detail by Chandran et al. (2003). It is well known that sucrose plays a relevant role in the flowering induction process (Corbesier et al., 1998); in plants, sucrose is the main form of fixed carbon transported in the phloem and also serves as a specific signaling molecule (Teng et al., 2005; Solfanelli et al., 2006). An increase in carbohydrate export from leaves has been generally associated with floral induction in Arabidopsis (Corbesier et al., 1998). Consistently, in Nicotiana tabacum L., decreased phloem loading of sucrose, induced by antisense repression of NtSUT1, causes delayed flowering (Burkle et al., 1998). Moreover, in A. thaliana, sucrose availability in the aerial part of the plant promotes flowering even in dark conditions (Roldán et al., 1999). Taken together, this evidence strongly suggests a role of Phvul.004G085100 in controlling flowering time in P. vulgaris. In addition, Mukeshimana et al. (2014) found two QTLs for days to flowering on the mid-terminal part of Pv04 that might roughly correspond to the chromosome region in which Phvul.004G085100 is located. However, since SNP marker positions in that study are expressed in cM, it is difficult to ascertain whether our candidate falls within these regions.

Finally, Phvul.008G055400, the candidate gene identified in relation to the marker 116028\_71 on chromosome Pv08, encodes CLV2. In A. thaliana, a mutation of the homolog of this gene (At1G65380) causes altered flower development, late flowering or interrupted flowering caused by a temporary termination of the main inflorescence flower meristem (TAIR, 2019c). In a recent paper, Basu et al. (2019) identified four Clavata genes, including Clavata2, that are highly associated with days to flowering in Cicer arietinum L.

Based on the evidence discussed and the related literature, homologs of most of the genes that we propose as candidates for explaining the observed flowering time variation in the studied common bean panel are involved in different pathways regulating flowering in Arabidopsis, tobacco, maize, and chickpea. The results of this research are therefore an important step forward in understanding flowering time control in one of the most important pulses worldwide. Although this diversity panel is representative of a large portion of European common bean diversity, performing similar analyses on a wider and/or more diverse panel would help to confirm the detected associations. In addition, applying gene knockout to the proposed candidates would further confirm their involvement in the genetic control of flowering time and would make it possible to measure their contribution to its expression under different experimental conditions (e.g., short vs. long day treatments). The exploitation of the genes identified in this research will hopefully allow the development of new common bean varieties that are better able to adapt to changing climatic conditions.

#### DATA AVAILABILITY

NCBI BioSample accessions of the 192 common bean lines range from SAMN12035168 to SAMN12035359. The full genotyping dataset of the same samples is available at: https://www.ebi.ac.uk/ena/data/view/PRJEB33063.

### AUTHOR CONTRIBUTIONS

VN and LR conceived and designed the experiments. LC and AC performed plant phenotyping. LR, LC, and AC performed phenotypic data analyses. LR and LC performed GWAS and candidate gene data analyses. AC and VN contributed to phenotyping costs. VN funded reagents, materials, and analysis tools. All authors wrote the manuscript.

## FUNDING

This research was partially funded by the "Novel characterization of crop wild relative and landrace resources as a basis for improved crop breeding" (PGR Secure) project, European Community's Seventh Framework Program [GA n. 266394], the "Strategies for Organic and Low input Integrated Breeding and Management" (SOLIBAM) project, European Community's Seventh Framework Program [GA n. 245058], the "Networking, partnerships and tools to enhance in situ conservation of European plant genetic resources" (Farmer's Pride) project, European Community H2020 Framework Program [GA n. 774271], and "Progetto per l'attuazione delle attività contenute nel programma triennale 2017–2019 per la conservazione, caratterizzazione, uso e valorizzazione delle risorse genetiche vegetali per l'alimentazione e l'agricoltura" (RGV-FAO project) [DM N.0011746].

#### ACKNOWLEDGMENTS

This work is an integral part of LC's Ph.D. thesis, carried out under the supervision of VN and LR. The authors wish to acknowledge Professor S. Ceccarelli for his contribution in carrying out spatial analyses, Dr. Thâmara Figueiredo Menezes Cavalcanti for her contribution to the phenotyping activities, and all the field technicians of the DSA3-UNIPG (Sant'Andrea d'Agliano) and CREA-CI (Anzola) for managing the trials.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00962/full#supplementary-material

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Raggi, Caproni, Carboni and Negri. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Agronomic Performance and Nitrogen Fixation of Heirloom and Conventional Dry Bean Varieties Under Low-Nitrogen Field Conditions

Jennifer Wilker<sup>1</sup>, Alireza Navabi<sup>1</sup>†, Istvan Rajcan<sup>1</sup>, Frédéric Marsolais<sup>2</sup>, Brett Hill<sup>3</sup>, Davoud Torkamaneh<sup>1</sup> and K. Peter Pauls<sup>1</sup>\*

<sup>1</sup> Department of Plant Agriculture, University of Guelph, Guelph, ON, Canada, <sup>2</sup> Agriculture and Agri-Food Canada, London Research and Development Centre, London, ON, Canada, <sup>3</sup> Agriculture and Agri-Food Canada, Lethbridge Research and Development Centre, Lethbridge, AB, Canada

#### Edited by:

Matthew Nicholas Nelson, Agriculture and Food (CSIRO), Australia

#### Reviewed by:

Karl Kunert, University of Pretoria, South Africa

Elena Bitocchi, Marche Polytechnic University, Italy

\*Correspondence: K. Peter Pauls, ppauls@uoguelph.ca

†Deceased March 10, 2019

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 29 March 2019 Accepted: 09 July 2019 Published: 26 July 2019

#### Citation:

Wilker J, Navabi A, Rajcan I, Marsolais F, Hill B, Torkamaneh D and Pauls KP (2019) Agronomic Performance and Nitrogen Fixation of Heirloom and Conventional Dry Bean Varieties Under Low-Nitrogen Field Conditions. Front. Plant Sci. 10:952. doi: 10.3389/fpls.2019.00952

Common beans (Phaseolus vulgaris) form a relationship with nitrogen-fixing rhizobia through a process termed symbiotic nitrogen fixation (SNF), which provides them with a source of nitrogen. However, beans are considered poor nitrogen fixers, and modern production practices involve routine use of N fertilizer, which leads to the downregulation of SNF. High-yielding, conventionally bred bean varieties are developed using conventional production practices and selection criteria, typically not including SNF efficiency, and may have lost this trait over decades of modern breeding. In contrast, heirloom bean genotypes were developed before the advent of modern production practices and may represent an underutilized pool of genetics which could be used to improve SNF. This study compared the SNF capacity, under low-N field conditions, of collections of heirloom and conventionally bred dry bean varieties. The heirloom-conventional panel (HCP) consisted of 42 genotypes from various online seed retailers or from the University of Guelph Bean Breeding program seedbank. The HCP was genotyped using a single nucleotide polymorphism (SNP) array to investigate genetic relatedness within the panel. Field trials were conducted at three locations in ON, Canada from 2014 to 2015 and various agronomic and seed composition traits were measured, including capacity for nitrogen fixation (using the natural abundance method to measure seed N isotope ratios). Significant variation for SNF was found in the panel. On average, heirloom genotypes did not fix significantly more nitrogen than conventionally bred varieties; however, five heirloom genotypes derived >60% of their nitrogen from the atmosphere. Yield (kg ha<sup>−1</sup>) was not significantly different between heirloom and conventional genotypes, suggesting that incorporating heirloom genotypes into a modern breeding program would not negatively impact yield. Nitrogen fixation was significantly higher among Middle American genotypes than among Andean genotypes, confirming previous findings. The best nitrogen fixing line was Coco Sophie, a European heirloom white bean whose genetic makeup is admixed between the Andean and Middle American genepools. Heirloom genotypes represent a useful source of genetics to improve SNF in modern bean breeding.

Keywords: nitrogen fixation, symbiosis, heirloom, bean, breeding

#### INTRODUCTION


Since its origin in central Mexico some 2 My ago, common bean (Phaseolus vulgaris L.) has diverged into two genepools in Central America and South America, been domesticated and spread throughout the world (Kaplan and Lynch, 1999; Gepts et al., 2000; Bitocchi et al., 2017). First Nations' ancestral groups gathered wild beans and cultivated them with other crops, including maize (Zea spp.) and squash (Cucurbita spp.). Beans were among the crops which explorers brought back to Europe after they visited the Americas. Centuries of cultivation and movement of seed through human migration and trade led to beans becoming staples in diets around the world, and inseparable parts of numerous cultural heritages. Recent years have seen increases in heirloom bean popularity, stretching beyond farmers' markets and seed exchanges to specialty grocers, culinary circles, and mainstream culture.

Before the establishment of formal bean breeding programs, landraces maintained by First Nations groups and European settlers were grown throughout North America (Kelly, 2010). Aside from their historical origin and association with early farming systems, bean landraces are characterized by having local genetic adaptation, high genetic diversity and a lack of formal genetic improvement (Villa et al., 2005). In many instances, heirloom beans have distinctive characteristics such as unique seed coat colors/patterns, and desirable flavors or cooking traits. However, yield, disease resistance, and growth habit may be poor compared to conventionally bred, relatively modern, bean cultivars. In contrast, modern bean cultivars conform to standard requirements for size and color particular to a few market classes, and are bred to produce high yields under conventional production practices (Kelly, 2010). Market demands and producer requirements are believed to have led to narrow breeding objectives and reduced genetic diversity in modern bean cultivars (Singh, 1988). This reduction in genetic diversity may have also led to a reduction in diversity and capacity for nitrogen fixation in modern bean genotypes.

Between the two genepools of common bean, the Andean genepool is much less diverse than the Middle American genepool. This reduced diversity is a result of a bottleneck created when founder populations established the Andean genepool at a distance from the center of origin of bean, in present-day central Mexico (Bitocchi et al., 2012). The independent and parallel domestication of beans beginning some 8000 years ago in the Andean and Middle American regions resulted in separate genepools of domesticated bean (Papa and Gepts, 2003; Chacón et al., 2005; Kwak and Gepts, 2009; Rossi et al., 2009; Mamidi et al., 2011; Nanni et al., 2011; Bitocchi et al., 2013, 2017; Schmutz et al., 2014; Rendón-Anaya et al., 2017). The divergence has led to some difficulties in hybridization between Andean and Middle American genotypes (Johnson and Gepts, 1999). Nevertheless, introgression between genepools has been found in bean collections throughout the world (Gioia et al., 2013). In particular, introgression has influenced the diversity of the bean germplasm grown across Europe, where 40.2% of accessions show introgression compared to the much lower level of introgression in North American genotypes, which is 12.3% (Gioia et al., 2013).

Symbiotic nitrogen fixation (SNF) is an ancient trait, characteristic of the Fabaceae family. In bean, Rhizobium leguminosarum bv phaseoli bacteria inhabit root nodules and fix atmospheric nitrogen, which is utilized by the plant in exchange for carbohydrates. However, among modern leguminous crops, beans are considered to be poor nitrogen fixers (Hardarson et al., 1993). In the latter half of the 20th century, research largely concluded that rates of nitrogen fixation in bean were low, at 25 to 71 kg N<sub>2</sub> fixed ha<sup>−1</sup> for mid- to long-season cultivars (Graham, 1981). These values are considerably lower than rates for soybean at the time, which ranged from 78 to 161 kg N<sub>2</sub> fixed ha<sup>−1</sup> in one study (Muldoon et al., 1980). LaRue and Patterson (1981) reviewed multiple studies of nitrogen fixation in legume species and calculated that soybean fixed 75 kg N<sub>2</sub> ha<sup>−1</sup> on average while dry beans fixed just 10 kg N<sub>2</sub> ha<sup>−1</sup>. However, recent studies have examined hundreds of bean genotypes for traits related to nitrogen fixation and reported wide-ranging capacity for these traits (Ramaekers et al., 2013; Kamfwa et al., 2015; Diaz et al., 2017; Farid et al., 2017; Heilig et al., 2017; Wilker et al., unpublished), indicating genotypic and genetic diversity which could be exploited to enhance this trait through breeding. For example, Farid (2015) tested twelve modern genotypes and found that nitrogen fixing capacity ranged from 2.7 to 69.7 kg N<sub>2</sub> fixed ha<sup>−1</sup>, which represents a range of 5.2 to 78.5% nitrogen derived from the atmosphere (%Ndfa). Heilig (2015) examined 79 navy and black commercial cultivars and advanced breeding lines under organic production and found a similar range for nitrogen fixing capacity (16 to 94 kg N<sub>2</sub> ha<sup>−1</sup>) and for %Ndfa (9.8 to 71.7%).

Nitrogen fixation and root nodule traits are controlled by multiple genes, are affected by environmental conditions, and are difficult to measure. As a result, modern bean breeding programs do not focus on breeding genotypes efficient at nitrogen fixation but rather release high-yielding genotypes which perform consistently under conventional production practices, which include the application of 33–67 kg ha<sup>−1</sup> of nitrogen fertilizer (OMAFRA, 2009) and crop protection chemicals. In contrast, many heirloom varieties were developed and are maintained under natural growing conditions where fertility is managed using crop rotation and organic fertilizer and symbiosis with appropriate Rhizobia species occurs naturally or is enhanced by the use of inoculants. Therefore, heirloom genotypes may be a genetic resource for modern breeding programs that contain genetic diversity for nitrogen fixation and other traits that have not been eroded by modern breeding practices.

Nitrogen fixation capacity among modern dry bean varieties needs to be improved and discovery of diversity for the trait will provide genetic resources for breeding programs. The current study tests the hypothesis that heirloom beans have a greater capacity for nitrogen fixation than conventionally bred bean varieties and examines whether they could be useful germplasm sources to improve this trait. The objectives of this study were to compare heirloom and conventionally bred bean genotypes from both the Andean and Middle American genepools for their capacity for SNF, to assess whether genetic diversity has been lost over years of modern breeding, and to assess agronomic characteristics to determine the suitability of using heirloom varieties in modern breeding programs.

#### MATERIALS AND METHODS

## Plant Material


The heirloom-conventional panel (HCP) was assembled in 2014 and contained 25 heirloom and 17 conventionally bred dry bean genotypes. In the first growing season, six genotypes failed to reach physiological maturity and were removed from the panel. For the second growing season, six new genotypes were added, and the HCP consisted of 23 heirloom and 19 conventional genotypes. Only genotypes for which two or three location-years of data were collected are included in the analyses in this report. Seed images of the genotypes in the HCP are displayed in **Figure 1**.

Heirloom seeds were purchased as pure line varieties from Canadian online seed retailers (Heritage Harvest Seed<sup>1</sup>, Assiniboine Tipis<sup>2</sup>, and Annapolis Seeds<sup>3</sup>) with the intent of including a wide representation of seed coat patterns, seed sizes and plant growth habits. Heirloom seed coat patterns ranged from uniform, to bi-color spotted/speckled/striped, or tri-color, and were often very different in appearance from conventional market classes. In this study, the term "heirloom" refers to genotypes of the HCP that were not derived from a conventional bean breeding program. Given the limited information available for each heirloom genotype in this panel (see compiled variety descriptions<sup>4</sup>), it was impossible to further categorize these genotypes into groupings such as "improved landrace" or "vintage cultivar."

Seed of conventional bean genotypes was sourced from the University of Guelph Bean Breeding program's seed stores. Germplasm was chosen to represent a range of market classes, seed sizes and growth habits, mirroring the diversity found among the heirloom genotypes, where possible. Conventional genotypes were registered with the Canadian Food Inspection Agency between 1938 and 2016 and were developed by modern breeding programs and institutions [including: University of Guelph (UG), Michigan State University (MSU), United States Department of Agriculture-Agricultural Research Service (USDA-ARS), Crop Development Centre (CDC) in Saskatchewan, International Center for Tropical Agriculture (CIAT), Instituto Colombiano Agropecuario (ICA), and Agriculture and Agri-Food Canada (AAFC) in Ontario and Alberta]. Descriptions of the genotypes, including market class, origin, seed size, plant growth habit, and genepool membership are presented for the HCP in **Table 1**.

### Field Experimental Design and Maintenance

Field trial locations were selected based on low soil nitrogen levels, as measured by pre-planting soil tests which showed that NO<sub>3</sub><sup>−</sup> levels were under 5 ppm ("very low") or 5–10 ppm ("low"), and on site crop rotation histories which indicated that no dry bean crops had been produced at the sites for at least the previous decade. Soil nitrogen and growing season details can be found in **Supplementary Table 1**.

<sup>2</sup>http://www.assiniboinetipis.com

Clean seed of each genotype was coated with commercially available Nodulator (Becker-Underwood) Rhizobium leguminosarum bv phaseoli inoculant prior to planting. The day before planting, 1/8 teaspoon (approximately 0.2 g) of inoculant powder was added to each seed envelope and the contents were shaken to coat the seeds. Inoculated seed was stored at the Elora Research Station (ERS) at 4 °C until planting to maintain inoculant viability. The entire contents of each envelope (coated seed + loose inoculant powder) were planted.

The HCP was grown in three low-nitrogen field location-years using a rectangular lattice design (6 × 7) with two replications. At the ERS in 2014, 100 seeds of each genotype were grown in single-row plots 6 m in length with approximately 6 cm between plants and 60 cm spacing between entry rows. In 2015, the HCP was grown in another field at the ERS and at an offsite location near Belwood, Ontario. Increased seed availability enabled planting of 135 seeds in 4-row plots (150 cm × 90 cm, 37.5 cm between rows) with approximately 5 cm between plants within rows.

Throughout the growing season, plots were maintained with standard practices, except that no nitrogen fertilizer was used. Pre-plant fertilizer (0–20–20) at a rate of 200 kg ha<sup>−1</sup> was applied approximately 1 week prior to planting. Pre-plant herbicides [200 ml ha<sup>−1</sup> Pursuit (BASF) and 1.5 L ha<sup>−1</sup> Frontier (BASF)] were applied to control broadleaf and grass weeds. At Elora 2014, insecticides against leaf hoppers were applied on July 11 [1.0 L ha<sup>−1</sup> Lagon (Loveland products) and 40 ml ha<sup>−1</sup> Matador (Syngenta)], fungicides against Anthracnose and root rot were applied on July 11 [0.5 L ha<sup>−1</sup> Quadris (Syngenta) and 1.0 L ha<sup>−1</sup> Allegro (Syngenta)], and again against Anthracnose on August 7 [400 ml ha<sup>−1</sup> Headline (BASF) and 1 L ha<sup>−1</sup> Allegro]. At Elora 2015, herbicides were applied on July 15 [2.25 L ha<sup>−1</sup> Basagran (BASF), 0.67 L ha<sup>−1</sup> Excel Super (Excel Crop Care), and 1 L ha<sup>−1</sup> Assist (BASF)], followed by insecticides against leaf hoppers [1.0 L ha<sup>−1</sup> Cygon (FMC Corporation) and 40 ml ha<sup>−1</sup> Matador] and fungicides against Anthracnose [400 ml ha<sup>−1</sup> Headline (BASF) and 1 L ha<sup>−1</sup> Allegro] on July 16. Fungicide against Anthracnose (0.5 L ha<sup>−1</sup> Quadris) and insecticide against leaf hoppers [200 ml ha<sup>−1</sup> Admire (Bayer)] were again applied on August 6. At Belwood 2015, insecticides (1.0 L ha<sup>−1</sup> Cygon and 40 ml ha<sup>−1</sup> Matador) and fungicides (400 ml ha<sup>−1</sup> Headline and 1 L ha<sup>−1</sup> Allegro) were applied on July 16. The Belwood plots were treated against Anthracnose (0.5 L ha<sup>−1</sup> Quadris) and leaf hoppers (200 ml ha<sup>−1</sup> Admire) again on August 6. Plots at all locations were manually weeded once before canopy closure each year.

<sup>1</sup>http://www.heritageharvestseed.com

<sup>3</sup>http://www.annapolisseeds.com

<sup>4</sup>https://doi.org/10.5683/SP2/NZY3W5

TABLE 1 | Market class, seed size, growth habit, genepool and race for 42 dry bean (Phaseolus vulgaris L.) genotypes of the heirloom-conventional panel.

†Small, 13 to 29 g per 100 seeds; medium, 30 to 45 g per 100 seeds; large, 46 to 63 g per 100 seeds. ¶Growth habit according to Singh (1982). ‡Genepool assigned according to STRUCTURE analysis. Threshold genetic contribution from assigned genepool was >50%. Genotypes marked with (<sup>∗</sup>) were assigned to genepool according to market class appearance; these genotypes were not SNP genotyped. MA, Middle American.

#### Phenotyping

Flowering was monitored throughout July and August, and days to flowering was recorded as the date when 50% of the plants in a plot had one open flower. The days to flowering measurements were converted into growing degree days to flowering (GDDf) by summing daily GDD values calculated from daily maximum and minimum temperatures. Hourly temperatures were recorded at the ERS by the University of Guelph School of Environmental Sciences Agrometeorology group<sup>5</sup>. For the Belwood site, temperature data from the nearest Government of Canada weather station were used (Fergus Shand Dam<sup>6</sup>).
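The growing degree day conversion described here could be sketched as follows. This is a minimal illustration, not the authors' calculation: the averaging formula is the common max/min method, and the base temperature of 10 °C, the function names, and the example temperatures are all assumptions, since the text does not state them.

```python
def daily_gdd(t_max, t_min, t_base=10.0):
    """Growing degree days for one day from max/min air temperature (deg C).

    t_base is an ASSUMED base temperature; the source does not give one.
    Days whose mean temperature is below the base contribute zero.
    """
    return max(0.0, (t_max + t_min) / 2.0 - t_base)


def cumulative_gdd(daily_temps, t_base=10.0):
    """Sum daily GDD over (t_max, t_min) pairs from planting to the event date."""
    return sum(daily_gdd(t_max, t_min, t_base) for t_max, t_min in daily_temps)


# Hypothetical temperatures for four days (illustrative only, not study data).
example_temps = [(24.0, 12.0), (26.0, 14.0), (18.0, 8.0), (12.0, 4.0)]
gdd_f = cumulative_gdd(example_temps)
```

The same function would apply to growing degree days to maturity by extending the list of daily temperatures to the maturity date.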

Relative leaf chlorophyll content was measured twice during the growing season, when, on average, plots had reached (1) the second trifoliate stage and (2) 100% flowering, using a SPAD 502 Plus Chlorophyll Meter (Konica Minolta). The meter was calibrated according to the manufacturer's instructions each time the unit was powered on<sup>7</sup>. The middle leaflet of the top-most, fully expanded trifoliate leaf was used for the measurements and three plants were sampled per plot.

Plots were rated for days to maturity throughout September and early October. Plots were considered to have reached maturity when they were ready for harvest. Days to maturity measurements were converted into growing degree days to maturity (GDDm) in the same way as for GDDf (see above).

Three plants were randomly sampled from mature plots, placed in large paper bags, and dried in a re-purposed tobacco kiln (De Cloet Bulk Curing Systems, model TPG-360, Tillsonburg, ON, Canada) at 33 °C at the ERS for 24–48 h. Prior to weighing, roots were cut from each plant and aboveground biomass was measured. Plants were then threshed using an indoor belt thresher (Agriculex SPT-1A, Guelph, ON, Canada), and their seed was collected, weighed and counted. Harvest index (biomass/seed weight) as well as 100-seed weight (HSW) were calculated.

At Elora 2014, the harvest was staggered according to maturity. The plots were pulled by hand at maturity and threshed at the side of the field using a Wintersteiger plot combine (Wintersteiger AG, Upper Austria, Austria) with a Classic Seed-Gauge weighing system by Harvest-Master (Juniper Systems Inc., UT, United States) and plot seed weight and moisture content were recorded. In 2015, plot harvest took place after all plots reached maturity with the same Wintersteiger combine.

#### Seed Isotope Analysis

The natural abundance method (Shearer and Kohl, 1986) was used to calculate percent nitrogen derived from the atmosphere (%Ndfa) for each genotype. Seed was used for this assessment because seed N at maturity represents the total N accumulated over the growing season, whereas shoot N is transitory and fluctuates over the plant life cycle (Masclaux-Daubresse et al., 2010) making coordination of sampling times challenging in studies with multiple genotypes. Additionally, %Ndfa levels measured in shoot and seed samples are highly correlated, and processing of seed samples is faster and less expensive than shoot tissue (Barbosa et al., 2018).

Nodule traits (number and size), as an indicator of nitrogen fixing capacity, were not measured in this study. Numerous studies in dry bean have found that nodule traits are not correlated with nitrogen fixation capacity. For example, Farid (2015) found no correlation between nodule numbers and SNF, and a study of SNF in the Middle American Diversity Panel (Wilker et al., unpublished) found no correlation between SNF and nodule size or nodule number. An in-field ureide assay was not feasible and a controlled environment study was not initiated for this panel.

<sup>6</sup>http://climate.weather.gc.ca/index\_e.html

To prepare for gas chromatography mass spectrometry (GCMS) analysis, a 5 g subsample of seed from each plot was oven-dried (Blue M Electric, SPX Corporation) at 60 °C at the University of Guelph for 24 h prior to being ground to a coarse powder in a coffee grinder (various models used). The coarse seed powder was further processed into a fine powder suitable for GCMS analysis by grinding a sub-sample in a small Eppendorf tube along with a steel bead in a bead mill (Beadruptor 12, Omni International Inc.). Samples (5 mg) of bean powder were measured into small tin capsules (8 mm × 5 mm, standard weight, Elemental Microanalysis) using an analytical balance (Quintix 65-1S, Sartorius Lab Instruments GmbH & Co.), then enveloped and compressed into a tiny pellet so that no atmosphere remained in the capsule. The bean powder pellets were collected in 96-well plates and sent to the Agriculture and Agri-Food Canada (AAFC) GCMS facility in Lethbridge, Alberta for analysis. The samples were analyzed with a Finnigan Delta V Plus (Thermo Electron, Bremen, Germany) Isotope Ratio Mass Spectrometer (IRMS) fitted with a Flash 2000 Elemental Analyzer (Thermo Fisher Scientific, Voltaweg, Netherlands) and a Conflo IV (Thermo Fisher Scientific, Bremen, Germany) interface between the IRMS and the analyzer. A standardized curve for nitrogen content was created using an alfalfa standard provided by the AAFC GCMS facility. Further isotope standards, L-glutamic acid USGS40 and USGS41 (United States Geological Survey), were included with each plate of samples processed to normalize isotope values and enable inter-lab comparison. Samples were analyzed for %N, δ<sup>15</sup>N (‰), and δ<sup>13</sup>C (‰).

The natural abundance method uses the following equation,

$$\%\text{Ndfa} = \frac{\delta^{15}\text{N}_{\text{ref. plant}} - \delta^{15}\text{N}_{\text{fixing plant}}}{\delta^{15}\text{N}_{\text{ref. plant}} - B} \times 100$$

where δ<sup>15</sup>N<sub>ref. plant</sub> is the δ<sup>15</sup>N of the reference genotype (R99), δ<sup>15</sup>N<sub>fixing plant</sub> is the δ<sup>15</sup>N of the N-fixing bean genotype, and B is the average δ<sup>15</sup>N of beans grown in an environment where their entire N source is fixation (Peoples et al., 2009). The B-value was obtained for this experiment as described by Farid (2015). Briefly, δ<sup>15</sup>N was measured and averaged for 20 bean genotypes from both the Andean and Middle American genepools which were grown in a growth room in N-free media. Normalized δ<sup>15</sup>N values were used for all genotypes and an average of the δ<sup>15</sup>N values for R99 was used in %Ndfa calculations.
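As a worked illustration of the natural abundance calculation, the following sketch computes %Ndfa, expressed as a percentage, from the three quantities defined above. The function name and the numerical δ<sup>15</sup>N values are hypothetical and chosen only to make the arithmetic easy to follow; they are not measurements from this study.

```python
def percent_ndfa(delta_ref, delta_fixing, b_value):
    """Percent nitrogen derived from the atmosphere (natural abundance method).

    delta_ref    : delta-15N of the reference genotype (per mille)
    delta_fixing : delta-15N of the N-fixing genotype (per mille)
    b_value      : delta-15N of plants relying entirely on fixation (per mille)
    """
    denominator = delta_ref - b_value
    if denominator == 0:
        raise ValueError("reference delta-15N must differ from the B-value")
    return 100.0 * (delta_ref - delta_fixing) / denominator


# Hypothetical values (assumptions for illustration only).
example_ndfa = percent_ndfa(delta_ref=6.0, delta_fixing=2.0, b_value=-2.0)
```

With these illustrative inputs, the fixing genotype sits halfway between the reference plant and the B-value, so half of its nitrogen is attributed to fixation.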

#### Genotyping

Leaf tissue samples were collected from young plants of 42 genotypes grown in a controlled environment (16 h photoperiod, 22 °C) at the University of Guelph. For 29 genotypes, DNA was extracted with the NucleoSpin Plant II kit (Macherey-Nagel, Germany) according to the manufacturer's instructions, and for the remaining 13 genotypes the DNeasy Plant Mini Kit (Qiagen, Canada) was used. DNA quality was tested using a spectrophotometer (ND-1000, Nanodrop) and a fluorometer (Qubit 2.0, Invitrogen by Life Technologies), and DNA of 39 genotypes was determined to be of sufficient quality to send for genotyping. Genomic DNA was analyzed at the Genome Quebec Innovation Centre (McGill University, Montreal, QC, Canada) for single nucleotide polymorphisms (SNPs) using the Illumina Infinium iSelect Custom Genotyping BeadChip (BARCBEAN6K\_3) containing 5398 SNPs (Song et al., 2015).

<sup>5</sup>https://www.uoguelph.ca/ses/service/weather-records

<sup>7</sup>https://www.specmeters.com/assets/1/22/2900P\_SPAD\_502.pdf

#### Identity by State Analysis

Single nucleotide polymorphism data from the above analysis were imported into TASSEL (Bradbury et al., 2007) and filtered so that the retained SNPs were present in at least 95% of the panel and had a minor allele frequency of at least 0.05. This resulted in 39 genotypes and 4704 SNPs being retained for further analysis. TASSEL was used to generate a genotype distance matrix and R software (R Core Team, 2013) was used to create a dendrogram using the dendextend package (Galili, 2015). The hierarchical clustering function, hclust (Müllner, 2013), was used to perform the cluster analysis using the UPGMA method. The as.dendrogram function was used to create dendrograms which were then modified in R using the dendextend package and the circlize package (Gu et al., 2014). STRUCTURE (Pritchard et al., 2000) was used to determine the population genetic structure of the HCP. The analysis was performed (20 replications) with the length of burn-in set at 5000 and the number of MCMC replications after burn-in set at 50000. A range of genetic groups (K = 2 to K = 9) was tested and the number that best fit the data was determined by visualizing the STRUCTURE results and using the ΔK statistic in STRUCTURE HARVESTER online (Evanno et al., 2005<sup>8</sup>; Earl and vonHoldt, 2012).
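The SNP filtering step can be illustrated with a short sketch. The study performed this filtering in TASSEL; the code below is only a plain-Python approximation of the same two criteria, assuming genotypes coded as the number of copies of one allele (0, 1, or 2) with `None` for missing calls, and assuming both thresholds are inclusive. The function and marker names are hypothetical.

```python
def filter_snps(genotypes, min_call_rate=0.95, min_maf=0.05):
    """Return names of SNPs passing call-rate and minor-allele-frequency filters.

    genotypes maps a SNP name to a list of calls, one per line in the panel:
    0, 1, or 2 copies of one allele, or None for a missing call (assumed coding).
    """
    kept = []
    for snp, calls in genotypes.items():
        called = [c for c in calls if c is not None]
        if not calls or not called:
            continue  # nothing to evaluate for this marker
        call_rate = len(called) / len(calls)
        allele_freq = sum(called) / (2.0 * len(called))
        maf = min(allele_freq, 1.0 - allele_freq)
        if call_rate >= min_call_rate and maf >= min_maf:
            kept.append(snp)
    return kept


# Hypothetical panel of 20 lines and four markers (illustrative only).
example_genotypes = {
    "snp_a": [0] * 10 + [2] * 10,    # fully called, MAF 0.5
    "snp_b": [0] * 20,               # monomorphic, MAF 0
    "snp_c": [0] * 9 + [2] * 8 + [None] * 3,  # call rate 0.85
    "snp_d": [0] * 18 + [2] * 2,     # fully called, MAF 0.1
}
retained = filter_snps(example_genotypes)
```

In this toy panel the monomorphic marker and the marker with too many missing calls are dropped, mirroring the reduction from 5398 to 4704 SNPs reported above.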

### Nucleotide Diversity Analysis

The levels of genetic diversity in the heirloom vs. conventional categories and the Andean vs. Middle American categories of the HCP were assessed. The π statistic provides an indication of polymorphism within a population as measured by nucleotide diversity (Nei and Li, 1979), and Tajima's D provides an indication of selection pressure (Tajima, 1989). The 5K SNP dataset was used to calculate π and Tajima's D with VCFtools 0.1.12b (Danecek et al., 2011), using MAF ≥ 0.01 and a window of 1000 bp. Genome-wide averages of π and Tajima's D for each germplasm category were generated by taking the average across all windowed calculations. A t-test (GraphPad Prism 8) was used to determine differences in both π and Tajima's D-values between heirloom and conventional categories within each genepool.
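To make the two statistics concrete, the following sketch computes nucleotide diversity and Tajima's D for a single window from allele counts at biallelic sites. It follows the standard textbook formulas (Tajima, 1989) rather than reproducing the VCFtools implementation, and the function names, sample size, and allele counts are assumptions made for illustration.

```python
import math


def pairwise_diversity(allele_counts, n):
    """Mean number of pairwise differences, summed over biallelic sites.

    allele_counts holds the copies of one allele at each site among n sequences.
    """
    return sum(2.0 * k * (n - k) / (n * (n - 1)) for k in allele_counts)


def tajimas_d(allele_counts, n):
    """Tajima's D for one window; None when it is undefined.

    D is undefined with no segregating sites or fewer than four sequences
    (the variance term is zero for n < 4).
    """
    segregating = [k for k in allele_counts if 0 < k < n]
    s = len(segregating)
    if s == 0 or n < 4:
        return None
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    pi = pairwise_diversity(segregating, n)
    return (pi - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical window of 1000 bp with three SNPs among four sequences.
window_bp = 1000
per_site_pi = pairwise_diversity([2, 2, 2], 4) / window_bp
d_rare = tajimas_d([1, 1, 1], 4)      # only rare alleles
d_common = tajimas_d([2, 2, 2], 4)    # only intermediate-frequency alleles
```

An excess of rare alleles pulls D below zero and an excess of intermediate-frequency alleles pushes it above zero, which is why the statistic is read as an indicator of selection or demographic history.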

#### Statistical Analysis

Analysis of variance (ANOVA) tests were performed on the data collected from each environment and on the environments combined using the MIXED procedure in SAS (version 9.4; SAS Institute, 2012, Cary, NC, United States). In each ANOVA, genotypes were considered fixed effects while all other effects and the interaction effects were considered random. The Shapiro–Wilk test (Shapiro and Wilk, 1965) was performed on the residuals in the UNIVARIATE procedure to test their normality. Random and independent distributions of the residuals were visually examined by plotting the studentized residuals against the predicted values. Data that generated outlier residuals were removed from the data set. Further, single degree of freedom contrasts were conducted in ANOVA between genotype categories: heirloom vs. conventional and Middle American vs. Andean. Repeated measures of leaf chlorophyll content (SPAD) were taken, and a separate ANOVA test was used to compare SPAD values at each time point. In each ANOVA, the genotype least squares means (LSmeans) were computed using the LSMEANS statement in the MIXED procedure.

The pair-wise Pearson's coefficients of correlation were computed for all traits measured using the CORR procedure in SAS. The PRINCOMP and PRINQUAL procedures were used in SAS to generate the principal component (PC) values, to estimate the proportion of variance accounted for by each PC, and to plot PC1 against PC2 to generate a genotype × trait (GT) biplot (Yan and Rajcan, 2002) to determine genotype and trait interactions overall and in each environment.
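For readers unfamiliar with the statistic, a pair-wise Pearson coefficient can be sketched in a few lines of Python. This is an illustration of the formula only, not the SAS CORR procedure; the function name and the trait values below are hypothetical.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical genotype means for two traits (not data from this study).
days_to_flowering_gdd = [780.0, 800.0, 820.0, 860.0]
ndfa_percent = [45.0, 50.0, 58.0, 63.0]
r = pearson_r(days_to_flowering_gdd, ndfa_percent)
print(round(r, 3))
```

Applying the function to every pair of trait columns yields the kind of correlation matrix reported in the supplementary tables.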

# RESULTS

### Origins and Phenotypic Characteristics of Selected Beans

The germplasm comprising the HCP includes genotypes with a wide diversity of seed traits (colors, patterns, shapes, and sizes) found in dry bean. According to the descriptions from the source seed retailers, 16 of the heirloom genotypes are part of the cultural heritage of North American First Nations communities (the Algonquin, the Iroquois, the Seneca, and the Mohawk from the Great Lakes region of North America; the Arikara, the Hidatsa, and the Mandan from the Plains region in the present-day United States). Genotype descriptions for the remaining nine heirloom genotypes suggest the varieties were passed down through communities or families from as far back as colonial times. For example, Sweeney Family Heirloom was first grown by the Sweeney family in Nova Scotia and has moved with the family and been grown in Alberta (Heritage Harvest Seed). Further, while Sweeney Family Heirloom shows similarities to other heirloom genotypes, it is considered a unique variety by heirloom seed growers. Coco Sophie is a European variety from the 1700s (Heritage Harvest Seed). Amish Gnuttle (Amish Nuttle; also known as Cornhill Bean or Mayflower) is described by some retailers as a variety that was introduced to America with the early settlers and has been grown by Amish communities for generations, while other variety descriptions suggest that Amish Gnuttle originated with the Seneca First Nation.

The heirloom category was equally split between Andean and Middle American types (**Table 1**) and a variety of seed coat color patterns are represented, including bi-color, yellow eye, pinto/cranberry, and uncommon solid colors (**Figure 1**), which make them unique and difficult to categorize using conventional market classes. The conventional category was equally split between Andean and Middle American types and could mostly be categorized as kidney (dark red, light red, and white), cranberry, yellow, white, or black market class beans (**Figure 1** and **Table 1**).

<sup>8</sup>http://taylor0.biology.ucla.edu/structureHarvester/

### Field Conditions

Fields with low nitrogen levels were used in this study to maximize the potential for SNF activity. In the growing seasons prior to 2014 and 2015, fields at the ERS had been planted with high-N demanding cereal crops to remove as much available nitrogen from the soil as possible. At the Belwood location, the field had been used to produce mixed hay with minimal inputs in the growing seasons prior to our trial. Soil test results showed that nitrate (NO3<sup>−</sup>) levels ranged between 3.7 and 8.6 ppm and ammonium (NH4<sup>+</sup>) levels ranged between 2.6 and 6.1 ppm in the bean root zone. Soil analysis laboratory guidelines indicate that levels of NO3<sup>−</sup> below 10 ppm are considered low (A & L Canada Laboratories Inc.).

Planting in 2015 occurred 2 weeks later than in 2014 as a result of wet spring weather. Despite the late start to the 2015 season, accumulated growing degree days (GDD) over the growing season were similar for all three locations (Elora 2014 – 1912.8, Elora 2015 – 1862.6, and Belwood 2015 – 2012.3). A summary of pre-plant soil test results, precipitation and total GDD for all location-years is provided in **Supplementary Table 1**.
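Accumulated GDD totals of this kind are commonly obtained by summing daily heat units above a base temperature. The sketch below shows that common averaging formula only. The 10 °C base temperature, the function names, and the daily temperatures are assumptions for illustration and are not taken from this study, which does not state its GDD formula in this section.

```python
from typing import Iterable, Tuple


def daily_gdd(t_max: float, t_min: float, t_base: float = 10.0) -> float:
    """Heat units for one day; days cooler than the base contribute zero."""
    return max(0.0, (t_max + t_min) / 2.0 - t_base)


def accumulated_gdd(daily_temps: Iterable[Tuple[float, float]],
                    t_base: float = 10.0) -> float:
    """Sum of daily heat units over (t_max, t_min) pairs in degrees Celsius."""
    return sum(daily_gdd(t_max, t_min, t_base) for t_max, t_min in daily_temps)


# Hypothetical three-day record of (t_max, t_min) in degrees Celsius.
toy_season = [(25.0, 15.0), (8.0, 2.0), (30.0, 18.0)]
total = accumulated_gdd(toy_season)
print(total)
```

Expressing flowering and maturity in GDD rather than calendar days, as done in this study, makes location-years with different planting dates more directly comparable.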

### Genetic Analysis of Relatedness

The HCP was composed of genotypes from both the Middle American and Andean genepools; however, the genepool composition and genetic relatedness of the genotypes were unknown. An identity-by-state (IBS) analysis on SNP genetic data from 39 genotypes of the HCP was undertaken to confirm genotype membership in either genepool and to determine the genetic relationships among them. The IBS analysis found that the panel is composed of three sub-groupings, with 19 genotypes belonging to the Andean genepool and 20 belonging to the Middle American genepool (11 race Mesoamerica and 9 race Durango-Jalisco). In the dendrogram (**Figure 2A**), large-seeded genotypes generally sorted into the Andean grouping while smaller-seeded genotypes sorted into the Middle American grouping. STRUCTURE analysis (**Figure 2B**) and determination of the best-fit ΔK value for the panel (**Figure 2C**) using STRUCTURE HARVESTER confirmed that there were three genetic groupings in the panel, corresponding to the Andean genepool and the two races (Mesoamerica and Durango-Jalisco) present in the Middle American genepool. The IBS analysis also revealed the degree of genetic relatedness between modern and heirloom genotypes. For example, all of the black seed coat genotypes belong to the race Mesoamerica grouping of the Middle American genepool, and the University of Guelph breeding line, "Hi N" (**Figures 2A,B**, #9), is most closely related to the heirloom genotype Mandan Black (#18) and the conventional genotype ICA Pijao (#35), but it is less similar to Zorro (#30) and ICB-10 (#37). Assignment of varieties to either genepool based on genetic composition was generally in agreement with genepool assignments using seed characteristics, except for a few cases. For example, the large, flat-seeded Limelight (#48) and Flagg (#10) genotypes, which appear to be of Andean origin, belong, according to the genetic analysis, to the Middle American genepool.
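The ΔK criterion used to choose the number of groupings can be sketched as follows. This is a simplified reading of the Evanno et al. (2005) statistic, computed from the mean log-likelihood at each K across replicate runs; it is not the STRUCTURE HARVESTER source code, and the function name and log-likelihood values are hypothetical.

```python
from statistics import mean, stdev
from typing import Dict, List


def delta_k(log_likelihoods: Dict[int, List[float]]) -> Dict[int, float]:
    """Delta K for every K that has both neighbours K - 1 and K + 1.

    Delta K = |L(K + 1) - 2 L(K) + L(K - 1)| / sd(L(K)), where L is the mean
    log-likelihood over replicate runs and sd is taken across those runs.
    Replicates with identical likelihoods (sd = 0) would need special handling.
    """
    avg = {k: mean(runs) for k, runs in log_likelihoods.items()}
    result = {}
    for k in sorted(avg):
        if k - 1 in avg and k + 1 in avg:
            second_difference = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1])
            result[k] = second_difference / stdev(log_likelihoods[k])
    return result


# Hypothetical replicate log-likelihoods for K = 1..5 (three runs each).
toy_runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-800.0, -801.0, -799.0],
    3: [-600.0, -600.5, -599.5],
    4: [-590.0, -595.0, -585.0],
    5: [-585.0, -592.0, -578.0],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
print(best_k, dk)
```

In this toy example the likelihood climbs steeply up to K = 3 and then plateaus with noisier replicates, so ΔK peaks at K = 3, the same pattern of evidence reported for the panel in **Figure 2C**.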

Evidence of admixture is apparent for a number of genotypes in the panel. Within the Middle American genepool, five of the genotypes are of entirely Durango-Jalisco ancestry and five are of entirely Mesoamerican ancestry. The remaining 10 Middle American genotypes are admixed between the Durango-Jalisco and Mesoamerican races, with 4 genotypes also containing <10% genetic material from the Andean genepool. Less admixture is evident within the Andean genepool, where 10 genotypes are entirely Andean and 8 genotypes contain <10% Middle American genetic material. Coco Sophie (**Figures 2A,B**, #46), a round, white bean of European heritage, is unique in that it is approximately 50% Andean and 50% Middle American. In the principal component analysis (**Figure 2D**), Coco Sophie falls midway between the three genepool/race clusters. Repeated iterations of the STRUCTURE analysis of the panel assigned Coco Sophie to the Andean genepool 60% of the time, whereas on the basis of its seed color and shape this genotype would have been assigned to the Middle American genepool.

### Nucleotide Diversity Among Genotype Categories

Nucleotide diversity was measured in the HCP to ascertain whether genotypes comprising the heirloom category are more diverse than those in the conventional category, and similarly whether genotypes belonging to the Middle American genepool are more diverse than those belonging to the Andean genepool. According to the π and Tajima's D statistics, nucleotide diversity for the heirloom category overall (π = 3.64 × 10<sup>−4</sup>, D = 7.262 × 10<sup>−3</sup>) was very similar to that found in the conventional category overall (π = 3.88 × 10<sup>−4</sup>, D = 7.908 × 10<sup>−3</sup>).

The number of SNPs among the Middle American genotypes in the HCP was 3294 compared to 2696 for the Andean genotypes. Nucleotide diversity, as measured by π, was significantly (p = 0.0014) larger for the Middle American group (π = 3.64 × 10<sup>−4</sup>) than for the Andean group (π = 2.13 × 10<sup>−4</sup>). Similarly, Tajima's D statistic for the Middle American genepool (D = 0.79) was significantly higher (p = 0.0009) than for the Andean genepool (D = −0.18).

Nucleotide diversity between heirloom and conventional categories was further analyzed within the genepools. In the Middle American genepool, nucleotide diversity was not significantly different (π: p = 0.4137; D: p = 0.9783) between the heirloom (π = 4.08 × 10<sup>−4</sup>, D = 0.63) and the conventional genotypes (π = 3.61 × 10<sup>−4</sup>, D = 0.64). However, within the Andean genepool, nucleotide diversity was significantly higher (p = 0.0082) in conventional genotypes (π = 3.98 × 10<sup>−4</sup>) than in heirloom genotypes (π = 2.35 × 10<sup>−4</sup>), but Tajima's D values were not significantly different (p = 0.1310) between heirloom (D = −0.09) and conventional genotypes (D = 0.47).

### Diversity for Seed Isotope Traits

Significant differences were seen among the genotypes for the seed traits analyzed by GCMS, including nitrogen derived from the atmosphere (%Ndfa; p = 0.0002), seed nitrogen content (%N; p < 0.0001), and carbon discrimination (δ<sup>13</sup>C; p < 0.0001) (**Figure 3** and **Supplementary Table 2**). Among the categories overall, significant differences were found for %Ndfa (p < 0.0001), where Middle American genotypes (mean 62.16%) outperformed Andean (mean 54.82%) genotypes, and for seed nitrogen content (p < 0.0001), where heirloom genotypes (mean 3.97%N) contained higher levels of N than conventional (mean 3.79%N) genotypes. Significant differences were not found for other category comparisons of seed composition traits. While the effect of environment alone was not significant, the environment by genotype interaction effect (env∗ENTRY) was significant for all seed composition traits (**Supplementary Table 2**), and warranted further exploration.

FIGURE 2 | Analysis of genetic structure and relatedness of thirty-nine genotypes of the heirloom-conventional panel. (A) Dendrogram of genetic relatedness generated in R. Andean genotypes above and Middle American genotypes below the mid-line. Heirloom or conventional category membership is denoted by an "h" or "c," respectively, along with the genotype code number; (B) STRUCTURE plot indicating the division of the panel into three genetic sub-groupings, Andean (red), Mesoamerica (green), and Durango-Jalisco (blue); (C) Delta K plot from fastSTRUCTURE indicating that the most appropriate sub-division of the panel is into three genetic groupings; (D) Principal component analysis plot confirming three genetic groupings in the panel.

When seed composition traits were analyzed for each location, significant genotype effects were found. At Elora 2014 (**Figure 3** and **Supplementary Table 3**), significant differences were found between genotypes for %Ndfa (p = 0.0072), seed N content (p = 0.0105), and carbon discrimination (p = 0.0031). A comparison of genotype categories found significantly higher levels for %Ndfa (p = 0.0144) in Middle American genotypes (mean 54.37%) compared to Andean genotypes (mean 45.94%); conventional genotypes (mean 53.06%) fixed more nitrogen than heirloom genotypes (mean 48.23%), although this difference was not statistically significant. For seed N content, significant differences (p = 0.0070) were seen at Elora 2014, where the heirloom category (mean 4.14%N) had higher seed N content than the conventional category (mean 3.88%N); however, no significant differences were seen between Andean (mean 4.1%N) and Middle American (mean 3.97%N) genotypes. For carbon discrimination (δ<sup>13</sup>C), significant differences (p = 0.0452) were found between heirloom (mean −27.5) and conventional (mean −27.8) genotypes, but not between Andean (mean −27.54) and Middle American (mean −27.75) genotypes. Although significant differences were found among genotypes for %Ndfa (p = 0.0049), seed N content (p = 0.0126), and carbon discrimination (p = 0.0001) at Belwood in 2015 (**Figure 3** and **Supplementary Table 4**), the only genotype category comparison where significant differences were found was for seed N content (p = 0.0251), where heirloom genotypes had higher %N (mean 3.71) than conventional genotypes (mean 3.52). At Elora 2015 (**Figure 3** and **Supplementary Table 5**), significant differences were found between genotypes for %Ndfa (p = 0.0026), seed N content (p < 0.0001), and carbon discrimination (p = 0.0078), and comparisons of genotype categories found further significant differences. Similar to results for 2014, at Elora 2015 Middle American genotypes (mean 63.54%) fixed significantly (p = 0.0020) more nitrogen than the Andean genotypes (mean 54.19%), while the difference between heirloom (mean 58.36%) and conventional (mean 59.59%) genotypes was not significant (p = 0.6980). For seed N content at Elora 2015, no significant differences were seen between heirloom (mean 4.08%N) and conventional (mean 3.95%N) or between Andean (mean 4.04%N) and Middle American (mean 4.02%N) categories. For carbon discrimination (δ<sup>13</sup>C), significant differences (p = 0.0233) were found between heirloom (mean −27.8) and conventional (mean −27.41) genotypes. Additionally, significant differences (p = 0.0049) were found between Andean (mean −27.9) and Middle American (mean −27.34) genotypes.

### Diversity for Agronomic Traits

For agronomic traits, significant differences in the combined environments analysis were found among genotypes for days to flowering (GDD; p < 0.0001), days to maturity (GDD; p < 0.0001), yield (kg ha<sup>−1</sup>; p = 0.0003), and hundred seed weight (g; p < 0.0001) (**Figure 4** and **Supplementary Table 2**). Among categories overall, significant differences were found for days to flowering, where heirloom genotypes (mean 819.44 GDD) flowered significantly earlier than conventional genotypes (mean 849.83 GDD), and Andean genotypes (mean 798.80 GDD) flowered significantly earlier than Middle American genotypes (mean 865.44 GDD). Similarly, for days to maturity, heirloom genotypes reached maturity significantly earlier (mean 1811.24 GDD) than conventional genotypes (mean 1857.28 GDD). Significant differences were not found for either genotype category comparison for yield (kg ha<sup>−1</sup>); however, significant differences were found for 100 seed weight, where heirloom genotypes (mean 40.7 g) were larger than conventional genotypes (mean 28.8 g), and Andean (mean 48.35 g) genotypes were larger than Middle American genotypes (mean 22.70 g). While the effect of environment alone was not significant, the environment by genotype interaction effect (env∗ENTRY) was significant for days to flowering, yield and 100 seed weight (**Supplementary Table 2**), and warranted further exploration.

When agronomic traits were analyzed for each location, significant genotype effects were found. At Elora 2014 (**Figure 4** and **Supplementary Table 3**), significant differences were found between genotypes for days to flowering (GDD; p = 0.0485), days to maturity (GDD; p < 0.0001), yield (kg ha<sup>−1</sup>; p = 0.0033), and 100 seed weight (g; p < 0.0001), and comparisons of genotype categories found further significant differences. For days to flowering, Andean genotypes (mean 783.60 GDD) flowered significantly earlier than Middle American genotypes (mean 820.86 GDD); heirloom genotypes (mean 795.02 GDD) flowered earlier than conventional genotypes (mean 813.74 GDD), although this difference was not statistically significant. For days to maturity, heirloom genotypes (mean 1780.40 GDD) matured significantly earlier than conventional genotypes (mean 1842.20 GDD), and Andean genotypes (mean 1783.29 GDD) matured significantly earlier than Middle American genotypes (mean 1829.42 GDD). For yield, no significant differences were found between heirloom and conventional genotypes nor between Andean and Middle American genotypes. For 100 seed weight, heirloom genotypes had significantly higher weights (mean 40.82 g) than conventional genotypes (mean 31.35 g), and Andean genotypes (mean 49.97 g) were significantly heavier than Middle American genotypes (mean 22.69 g). At Belwood 2015 (**Figure 4** and **Supplementary Table 4**), significant differences were found between genotypes for days to flowering (p < 0.0001), days to maturity (p < 0.0001), and 100 seed weight (p < 0.0001). No significant differences were found among genotypes for yield. When category comparisons were performed, significant differences were found for days to flowering, with Andean genotypes (mean 782.02 GDD) flowering earlier than Middle American genotypes (mean 848.38 GDD). For 100 seed weight, heirloom genotypes (mean 41.51 g) were significantly heavier than conventional genotypes (mean 28.68 g), and Andean genotypes (mean 48.23 g) were significantly heavier than Middle American genotypes (mean 23.59 g). At Elora 2015 (**Figure 4** and **Supplementary Table 5**), significant differences were found between genotypes for days to flowering (p < 0.0001), days to maturity (p = 0.0002), and yield (p < 0.0001), and comparisons of genotype categories found further significant differences for days to flowering and 100 seed weight. In particular, heirloom genotypes (mean 860.49 GDD) flowered significantly earlier than conventional genotypes (mean 903.76 GDD), and Andean genotypes (mean 833.02 GDD) flowered significantly earlier than Middle American genotypes (mean 924.12 GDD). For 100 seed weight, heirloom genotypes (mean 39.87 g) were significantly heavier than conventional genotypes (mean 27.53 g), and Andean genotypes (mean 47.24 g) were significantly heavier than Middle American genotypes (mean 21.80 g).

FIGURE 3 | Means for seed composition traits measured from seed harvested at three field locations from genotypes of the heirloom-conventional panel. Comparisons within each year and subcategory ± standard error are presented. Means labeled with different letters within categories are significantly different according to ANOVA, p = 0.05.

When random effects in the combined ANOVA are considered, the effect of environment is not significant for any trait; however, the genotype by environment interaction was significant for all traits except days to maturity (**Supplementary Table 2**), indicating that genotype performance for most traits was affected by the growing environment. The block within environment interaction was not significant at any location; however, the incomplete block within the environment by block interaction was significant for %Ndfa, yield, and days to flowering (**Supplementary Table 2**), indicating some variation in performance across the field sites.

FIGURE 4 | Means for agronomic traits measured at three field locations for the genotypes of the heirloom-conventional panel. Comparisons within each year and subcategory ± standard error are presented. Means labeled with different letters within categories are significantly different according to ANOVA, p = 0.05.

### Diversity for Leaf Chlorophyll Content

As a repeated measure, leaf chlorophyll content (SPAD) was analyzed in separate F-tests. Overall, SPAD values differed significantly by genotype during each field season (p < 0.0001, **Supplementary Table 6**) and at all locations, significant differences were found among genotypes for leaf chlorophyll content (p < 0.0001, **Supplementary Table 6**). In 2015, at both locations, significant differences were seen between SPAD measurements taken at different growth stages (early vegetative stage vs. reproductive stage) (**Supplementary Table 6**). Furthermore, at each location the growth stage at which leaf chlorophyll content was measured had a significant effect on genotype SPAD performance (significant SPADT∗G interaction; **Supplementary Table 6**). The observation within block by genotype by SPAD time interaction was significant in all environments (**Supplementary Table 6**).

Leaf chlorophyll content rating comparisons were also made between genotype categories using ANOVA. In 2014, no significant difference was found between heirloom and conventional genotypes (p = 0.7372), whereas Middle American genotypes had significantly higher SPAD ratings (mean SPAD value 37.19) than Andean genotypes (mean SPAD value 34.39). At Belwood 2015, significant differences (p = 0.0121) were found between heirloom (mean SPAD value 37.15) and conventional (mean SPAD value 38.90) genotypes; in addition, SPAD sampling time (p = 0.0002) and the category∗SPADT interaction (p = 0.0164) were significant for the heirloom vs. conventional comparison. When genotypes were categorized according to genepool membership, significant differences (p = 0.0013) were found between Middle American (mean SPAD value 39.24) and Andean (mean SPAD value 36.43) genotypes. In addition, SPAD sampling time was significant (p = 0.0007), as was the interaction between genepool category and SPAD sampling time (p = 0.0222). At Elora 2015, no significant difference was found between heirloom and conventional genotypes (p = 0.7840), and neither SPAD sampling time nor the interaction (SPADT∗breeding category) was significant. When genotypes were compared according to genepool membership, significant differences (p < 0.0001) were found, where Middle American genotypes had significantly higher SPAD ratings (mean SPAD value 35.07) than Andean genotypes (mean SPAD value 31.86). Neither the SPADT nor the genepool∗SPADT interaction was significant at Elora in 2015.

### Nitrogen Fixation in the HCP

**Table 2** ranks all genotypes in the panel for nitrogen fixing capacity as measured by %Ndfa. At Elora 2014, the %Ndfa range was between 20.8% (Jacob's Cattle, heirloom, Andean) and 76.4% (Flagg, heirloom, Middle American) with an average value of 48.3%. At Elora 2015, the %Ndfa range was from 19.9% (Fisher, heirloom, Andean) to 70.9% (Coco Sophie, heirloom, Middle American) with an average value of 53.0%. At Belwood 2015, the %Ndfa range was from 43.5% (Limelight, conventional, Andean) to 76.3% (Hi N line, conventional, Middle American) with an average value of 60.3%.

Although no differences were found in nitrogen fixing capacity between the heirloom and conventional genotype categories, when ranked overall, four of the top five genotypes for nitrogen fixation capacity in this study were heirloom genotypes (Coco Sophie, Mandan Black, Roja de Seda, and PI207262). The conventional genotypes which ranked in the top ten for nitrogen fixation consist of two breeding lines (Hi N and Vax 4) and two recently released cultivars (OAC Inferno and Zorro).

In addition to desirable growth habit, the modern cultivars also possess disease resistance; the cream-colored Vax 4 is resistant to Common Bacterial Blight (CBB) and Bean Common Mosaic (BCM) virus (Singh et al., 2001), the light red kidney bean OAC Inferno is BCM and Anthracnose resistant (Smith et al., 2012), and the black bean Zorro is resistant to rust and Anthracnose and partially resistant to CBB (Kelly et al., 2009). Disease resistance and good nitrogen fixing performance make these genotypes desirable candidates for breeding programs. Nitrogen fixing capacity was consistently higher in Middle American than Andean genotypes, and four of the top five nitrogen fixing genotypes belong to the Middle American genepool (Mandan Black, Roja de Seda, PI207262, and Hi N line).

### Trait Correlation

At Elora 2014, the correlation between days to flowering and days to maturity and the correlation between the first and second SPAD measurement time were positive and significant (**Supplementary Table 7**). At Elora 2015, significant, positive correlations were found between %Ndfa and all traits except yield; a significant, negative correlation was seen that year between seed N and yield (**Supplementary Table 8**). Similarly, at Belwood 2015, significant, positive correlations were seen for %Ndfa and all traits except yield and δ<sup>13</sup>C (**Supplementary Table 9**). Yield was not found to be significantly correlated with any trait in 2015 at either location (**Supplementary Tables 8**, **9**).

The first two principal components in trait biplots (**Figure 5**) accounted for 49.9% of the variation in Elora 2014 (**Figures 5A,B**), 64.9% in Elora 2015 (**Figures 5C,D**), and 51.3% in Belwood 2015 (**Figures 5E,F**). The positive relationships between days to flowering and %Ndfa at each location-year are indicated by the acute angle formed by the vectors for these traits. The near-right angles formed by the %Ndfa and SPAD vectors at each location-year indicate that no relationship exists between these traits. The obtuse angle formed by the carbon discrimination (δ<sup>13</sup>C) and %Ndfa vectors in Elora 2014 indicates a negative relationship between these traits, while in 2015 the vectors are closer together, forming a smaller angle and indicating a closer relationship.
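The angle-based reading of a biplot can be made concrete with a short sketch: the cosine of the angle between two trait vectors approximates their correlation, so acute angles suggest a positive relationship, near-right angles little or none, and obtuse angles a negative one. The function names, the 10° tolerance around 90°, and the trait loadings are hypothetical choices for illustration, not values read from **Figure 5**.

```python
from math import acos, degrees, sqrt
from typing import Tuple

Vector = Tuple[float, float]


def angle_between(u: Vector, v: Vector) -> float:
    """Angle in degrees between two trait vectors in the PC1-PC2 plane."""
    dot = u[0] * v[0] + u[1] * v[1]
    norms = sqrt(u[0] ** 2 + u[1] ** 2) * sqrt(v[0] ** 2 + v[1] ** 2)
    cosine = max(-1.0, min(1.0, dot / norms))  # guard against rounding error
    return degrees(acos(cosine))


def relationship(angle_deg: float, tolerance: float = 10.0) -> str:
    """Qualitative reading of a biplot angle (tolerance is an assumption)."""
    if abs(angle_deg - 90.0) <= tolerance:
        return "little or no relationship"
    return "positive" if angle_deg < 90.0 else "negative"


# Hypothetical (PC1, PC2) loadings for three traits.
ndfa = (1.0, 0.0)
days_to_flowering = (1.0, 1.0)
spad = (0.0, 1.0)
print(relationship(angle_between(ndfa, days_to_flowering)))
print(relationship(angle_between(ndfa, spad)))
```

Because the first two PCs capture only part of the total variation (roughly half to two thirds here), angles in the plane are an approximation of the underlying correlations rather than an exact measure.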

When genotypes are categorized according to breeding history (**Figures 5A,C,E**), the conventional and heirloom genotypes occupy largely overlapping areas of the plot. However, when the genotypes are categorized according to genepool membership (Andean vs. Middle American, **Figures 5B,D,F**), a significant fraction of the Andean population falls exclusively into areas defined by PC1. In these representations the Middle American genotypes are clustered in the direction of the %Ndfa vector.

TABLE 2 | Nitrogen derived from the atmosphere (%) and differential ranking of common bean genotypes at three locations (Elora and Belwood) and in two seasons (2014 and 2015).

#Code for genotypes shown in Figure 1. ‡Genepool according to Gepts (1988) MA, Middle American.

# DISCUSSION

### Genetic Diversity Is Greater in the Middle American Than the Andean Genepool

The IBS and nucleotide diversity analyses of the HCP were in accordance with the often-observed higher level of genetic diversity within the Middle American genepool compared to the Andean genepool. Multiple studies have found higher levels of diversity in the Middle American genepool than the Andean (Koenig and Gepts, 1989; Beebe et al., 2000, 2001; Papa and Gepts, 2003; McClean et al., 2004; Mamidi et al., 2011, 2013; Bellucci et al., 2014; Schmutz et al., 2014). In a study of AFLP and SSR marker diversity in domesticated and wild bean populations, Rossi et al. (2009) found evidence of a bottleneck event before domestication in the Andean genepool. Bitocchi et al. (2012) also found significant differences in genetic diversity between wild Middle American and Andean genotypes, lending support to the occurrence of a genetic bottleneck prior to domestication of the Andean genepool. Therefore, the current low level of diversity among domesticated Andean genotypes is attributed to bottlenecks during the establishment of the wild progenitor bean populations and during domestication (Bitocchi et al., 2013). Within the HCP, heirloom and conventional genotypes have similar nucleotide diversity and show no genetic differentiation.

Decades of breeding based on the use of a limited pool of elite cultivars have generated concern that this practice has led to a narrowing of crop genetic diversity in modern varieties (Plucknett et al., 1987; Gepts, 2006). However, the perception that heirloom genotypes are more genetically diverse than varieties from modern breeding programs was not supported by the genetic diversity analysis of the HCP. The interspersion of heirloom and conventional genotypes around the dendrogram (**Figure 2**) suggests that decades of isolated development of these two germplasm categories have not led to genetic divergence. Furthermore, genetic diversity measurements with the π and Tajima's D statistics were not significantly lower for conventional genotypes than heirloom genotypes in this study. This was true for the overall comparison and the comparison within the Middle American genepool. Within the Andean genepool, greater nucleotide diversity was indicated by π and Tajima's D within the conventional genotypes compared to the heirloom genotypes.

While this finding is in accordance with analyses performed in other crop species, which concluded that modern breeding practices have not reduced genetic diversity (van de Wouw et al., 2010), it contradicts a recent comprehensive study in bean based on SSR marker diversity among wild, landrace and modern American genotypes of each genepool, which concluded that genetic diversity has been lost as a result of breeding practices (Gioia et al., 2019). The contradictory conclusions may be related to differences in the marker systems and number of markers that were used in the studies; 24 SSR markers were used in the Gioia et al. (2019) study versus more than 4700 SNP markers in the current study. In addition, the number of individuals that were analyzed differed, with 192 advanced bean cultivars plus 349 accessions of wild plus domesticated beans used in the Gioia et al. (2019) study versus 25 heirloom and 17 conventionally bred dry bean genotypes in the present study. However, the difference is more likely related to the fact that both the heirloom and conventional varieties used in the present study are selected materials that have been subjected to a domestication bottleneck. Our results suggest that modern practices have not introduced another significant loss in genetic diversity.

### Nitrogen Fixation Capacity in Middle American Genepool Exceeds That Found in Andean Genepool

Although the range for nitrogen fixation among genotypes in the Middle American genepool (Mist, 35.2%Ndfa to Hi N, 76.3%Ndfa) was narrower than in the Andean genepool (Fisher, 19.9%Ndfa to Coco Sophie, 75.7%Ndfa), nitrogen fixation among Middle American genotypes (average = 62.2%Ndfa) was significantly higher than among the Andean genotypes (average = 54.8%Ndfa). This suggests that the genes controlling nitrogen fixation capacity may differ between the genepools, perhaps both in number of loci and their diversity. However, few studies exist that compare the nitrogen fixing capacities of Middle American genotypes with Andean genotypes. Ramaekers (2011) identified a few quantitative trait loci (QTL) associated with SNF capacity using a recombinant inbred line (RIL) population created from a cross between an Andean and a Middle American genotype. Other studies have used sets of either Middle American or Andean genotypes. For example, Kamfwa et al. (2015) studied 259 genotypes belonging to the Andean Diversity Panel (Cichy et al., 2015), and Kamfwa et al. (2019) studied a population of 188 F<sub>4:5</sub> RILs derived from two Andean parents; both found a number of QTL associated with nitrogen fixation. Similar studies with Middle American germplasm have identified similar as well as unique QTL associated with nitrogen fixation (Farid, 2015; Diaz et al., 2017; Heilig et al., 2017; Wilker and Pauls, 2019). Further research to identify QTL associated with nitrogen fixation in a panel composed of genotypes from each genepool, followed by assessment of haplotype diversity at the QTL, would provide information on whether Middle American genotypes contain a greater number of active sites for N fixation than Andean genotypes or unique, more effective alleles. The higher levels of SNF in the Middle American genepool may be attributable to the higher level of genetic diversity in the Middle American genepool overall, as confirmed in this study. Alternatively, the Middle American genotypes may have performed better with the Rhizobia inoculant and/or strains present in the soil.

### Diversity for Nitrogen Fixation in Conventional Bean Genotypes Similar to Other Studies

Nitrogen fixation (%Ndfa) among the 18 conventional genotypes in the HCP (excluding R99) ranged from the lowest overall ranked AAFC-bred Limelight historic variety at 35.8% to the highest overall ranked University of Guelph breeding line Hi N at 66.9% (**Table 2**). These results fall generally within the range of %Ndfa reported for beans in contemporary research studies using conventional genotypes, but other studies of nitrogen fixation using conventional genotypes have reported a broader range for this trait. For example, Kamfwa et al. (2015) found a range from 3.6 to 98.2%Ndfa in their study of the 259-genotype Andean Diversity Panel, and a study with 79 Middle American genotypes under organic production (Heilig et al., 2017) reported a range of 9.8 to 71.1%Ndfa. Early studies of nitrogen fixation in bean (Graham and Rosas, 1977; Graham, 1981) reported that fixation varied according to plant architecture, where determinate bush types had poorer performance than indeterminate climbing types. Economically viable seed yields (1000–2000 kg ha<sup>−1</sup>) were not attainable when plant %Ndfa levels were low, although variation for nitrogen fixation was acknowledged (Bliss, 1993). Therefore, the 18 conventional genotypes in the HCP, spanning decades of cultivar releases by breeding programs across North America, likely represent the mid-range of nitrogen fixing capacity among conventional bean genotypes.

# Modern Breeding Has Not Reduced SNF Capacity

This study showed that despite decades of modern production and breeding practices, which include the use of nitrogen fertilizer that downregulates SNF activity, SNF capacity has not been lost from conventional genotypes. Recently released varieties such as Zorro, a black bean developed at Michigan State University (Kelly et al., 2009), and OAC Inferno, a light red kidney bean developed at the University of Guelph (Smith et al., 2012), showed good performance for nitrogen fixation in our study. OAC Inferno also performed well in a study examining SNF in the Andean Diversity Panel in Michigan (Kamfwa et al., 2015). The breeding methodologies used to develop Zorro and OAC Inferno are representative of modern breeding practices. Zorro was developed by pedigree and pure line selection from a backcross population generated from a bi-parental cross of Michigan State University black bean breeding lines (B00103 and X00822), with emphasis on selection for disease resistance, plant architecture and yield. OAC Inferno was derived from a conical cross of diverse kidney bean variety parentage (HR85-1885/Montcalm//USWA-39/AC Litekid///Foxfire/AC Elk//Sacramento/AC Calmont) sourced from across North America, using disease resistance and yield as selection criteria. Kamfwa et al. (2015) found that OAC Inferno was the only genotype in that study to contain major effect alleles for Ndfa at three loci. The complex pedigree of OAC Inferno may have contributed to its genetic diversity and higher than usual capacity for nitrogen fixation in this Andean genotype.

The finding that SNF in the heirloom category overall was not superior to the conventional category did not support the hypothesis on which the study was based and may be attributable to the composition of the HCP. The panel is small and was designed to include a broad representation of bean genotypes; the heirloom cultivars come from wide geographic origins and are of unspecified breeding heritage (landraces and vintage varieties), and the modern genotypes include those released across recent decades as well as recent, elite modern cultivars. Different results may have been achieved had the study included wild bean germplasm and landraces and more-recently registered modern cultivars.

# Incorporating Heirloom Genotypes Into Breeding for Improved SNF Holds Potential

Prior to this study, there was no indication that nitrogen fixation capacity would be superior in heirloom bean genotypes. The discovery of the diversity in capacity for nitrogen fixation among the 23 heirloom genotypes in the HCP [ranging from the lowest overall ranked genotype (Fisher at 32.1%) to the highest overall ranked (Coco Sophie at 69.0%, **Table 2**)] suggests that heirloom varieties may be an excellent germplasm resource for studying this trait. Furthermore, we found a wide range in capacity for nitrogen fixation and yield performance among the heirloom genotypes of the HCP that was on par with conventional genotypes, indicating the suitability of heirloom beans for incorporation into breeding programs. In addition, the ranked panel for SNF performance (%Ndfa) was dominated by heirloom genotypes. Heirloom bean landraces are not routinely used to breed conventional varieties. For example, Navabi et al. (2014) undertook a pedigree analysis of Canadian dry bean varieties since the 1930s, and while a few introgressions of P. coccineus and P. acutifolius were made, heirloom genotypes were not evident, except among the oldest crosses. Heirloom beans possess diversity that could be exploited without the challenges encountered when breeding with wild relatives, such as infertile crosses and reintroduction of 'wild' traits. In addition, heirloom varieties grown by First Nations groups for centuries in the Great Lakes region of North America are well-adapted to the climate and soils, and perhaps the Rhizobium, of this region.

Additionally, Coco Sophie (#46), which is unique in the HCP for its admixture between the genepools and is representative of European bean germplasm (Gioia et al., 2013), might be used as a bridge parent to transfer desirable traits from one genepool to the other (Duc et al., 2015). In particular, because Coco Sophie already possesses good SNF capacity it could be useful to introgress SNF traits from higher-fixing Middle American germplasm to lower-fixing Andean germplasm.

The similar yield of heirloom and conventional categories indicates heirloom genotypes have breeding potential in modern programs. When genotypes of the HCP were compared based on breeding history, no significant difference was found in yield between heirloom (1651 kg ha<sup>−1</sup>) and conventional (1714 kg ha<sup>−1</sup>) groups. A number of explanations for the similar yield performance of heirloom and conventional genotypes in the present study are plausible. Firstly, heirloom varieties were sourced from commercial seed suppliers and the HCP may have been enriched in heirloom lines that had reasonable performance characteristics. Secondly, low soil nitrogen levels may have limited the yield performance of conventional genotypes, which have been bred to perform under intensive management regimes. Finally, the conventional genotypes were not chosen for the panel based on superior yield potential but on market class similarity to heirloom genotypes in the panel. Some of the conventional genotypes were registered as long ago as the 1940s, whereas yields in bean crops grown in Ontario have increased by 1000 kg ha<sup>−1</sup> in three decades (OMAFRA, 2016), and comparisons of bean varieties released over 40 years and produced under conventional conditions show that breeding has increased their yield potential by more than 1% per year (Navabi et al., personal communication). Overall, the yield performance of the heirloom genotypes in our study suggests that incorporation of these genotypes into a modern breeding program for organic production would not introduce significant yield drag. Singh et al. (2011) suggested that the use of well-adapted heirloom genotypes in bean breeding could be "crucial for developing high-yielding broadly adapted cultivars for sustainable organic and conventional production systems, thus reducing research and production costs."

## Heirloom Beans May Be Particularly Suited to Breeding for Organic Agriculture

The rise in demand for organic food has broadened societal interest in heirloom varieties. Heirloom genotypes may be inherently well suited to organic production practices where growing conditions share similarities with the environments in which First Nations peoples grew them (Singh et al., 2011). Heirloom beans often possess characteristics such as attractive seed coat colors and patterns, desirable texture and flavor, and heritage value which increase their marketability and make them attractive to organic growers (Boyhan and Stone, 2016). Culinary characteristics were found to be of particular importance to heirloom bean growers in one study (Brouwer et al., 2016), while unique seed coat patterns as well as flavor and texture characteristics were emphasized by growers in another study (Swegarden et al., 2016).

Conventional varieties lack traits that would give them a competitive advantage in low-input production systems, which may hamper their yield performance. However, modern, conventionally bred crop varieties account for more than 95% of varieties grown in organic production (Lammerts van Bueren et al., 2011). Direct comparisons of the yield performance of heirloom and conventional genotypes under organic production show mixed results. Miles et al. (2015) found that yield did not differ significantly between heirloom (1852 kg ha<sup>−1</sup>) and conventional (1983 kg ha<sup>−1</sup>) groups, whereas Swegarden et al. (2016) found that heirloom genotypes (1362 kg ha<sup>−1</sup>) yielded significantly less than the conventional genotypes (2447 kg ha<sup>−1</sup>). In an evaluation of a large panel of conventional black and navy bean genotypes under organic production, the yields ranged from 1228 to 1762 kg ha<sup>−1</sup> (Heilig et al., 2017), which is similar to the range found in the current study (1160–2002 kg ha<sup>−1</sup>) of heirloom and conventional genotypes under low nitrogen management.

In the present study, weed growth was difficult to manage, and lesions symptomatic of Common Bacterial Blight or Anthracnose were found on various genotypes (disease notes not recorded). Therefore, the development of genotypes exhibiting early canopy closure and disease resistance might be particularly advantageous for organic production systems. Studies in bean comparing the outcome of selection under organic and conventional growth conditions resulted in different genotypes being chosen based on yield performance (Singh et al., 2011). Similarly in soybean, Boyle (2016) found that selection performed under organic production favored genotypes with improved performance for resource acquisition traits (early canopy development, nodule mass, and root length).

# CONCLUSION

This study represents the first comparison of SNF in a panel of heirloom and conventional dry beans and will serve as a starting point for further research on promising heirloom genotypes. The finding that genetic diversity is similar between heirloom and conventional categories is consistent with the finding that %Ndfa in heirloom and conventional categories is not significantly different. This result does not support the hypothesis that genetic diversity for nitrogen fixation has been eroded over years of modern breeding practices. The heirloom genotypes, as a group, had similar yield performance to the conventional genotypes under low-input field conditions, and although their capacity for nitrogen fixation was not significantly better than the conventional genotypes, they dominate the list of the best nitrogen fixers. Considering these characteristics, heirloom genotypes hold some promise for breeding to improve nitrogen fixation capacity in modern bean varieties. Heirloom beans represent an underutilized resource which could be exploited to improve nitrogen fixation in breeding for organic production and conventional production where reduction of synthetic inputs and improved environmental stewardship are of growing concern.

## DATA AVAILABILITY

Publicly available datasets were analyzed in this study. This data can be found here: https://doi.org/10.5683/SP2/NZY3W5.

# AUTHOR CONTRIBUTIONS

JW, AN, and KP designed the project. JW performed the experiments and carried out the seed analysis. BH and FM helped in the seed analysis experiments. DT carried out the genetic diversity analysis. JW and AN analyzed the data. AN, KP, and IR guided the experimental work. JW and KP wrote the manuscript.

# FUNDING

This study was funded by the Ontario Bean Growers, Agriculture and Agri-Food Canada through their Cluster Program, the Ontario Ministry for Food and Rural Affairs, the Ontario Ministry for Research and Innovation, and the Food from Thought Program at the University of Guelph funded by the Canada First Research Excellence Fund.

#### ACKNOWLEDGMENTS


The authors wish to thank Tom Smith, Lyndsay Schram, and other field staff for their support in the field and lab work that was essential for this publication.

#### REFERENCES


#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00952/full#supplementary-material



Institute, S. A. S. (2012). SAS Enterprise Miner 13.1. Cary, NC: SAS Institute Inc.


origin and domestication of common bean unveils its closest sister species. Genome Biol. 18:60. doi: 10.1186/s13059-017-1190-6


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Wilker, Navabi, Rajcan, Marsolais, Hill, Torkamaneh and Pauls. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Waterlogging Tolerance at Germination in Field Pea: Variability, Genetic Control, and Indirect Selection

Md Shahin Uz Zaman<sup>1,2,3</sup>\*, Al Imran Malik<sup>1†</sup>, Parwinder Kaur<sup>1,2‡</sup>, Federico Martin Ribalta<sup>1</sup> and William Erskine<sup>1,2</sup>

<sup>1</sup> Centre for Plant Genetics and Breeding, UWA School of Agriculture and Environment, The University of Western Australia, Crawley, WA, Australia, <sup>2</sup> The UWA Institute of Agriculture, The University of Western Australia, Crawley, WA, Australia, <sup>3</sup> Pulses Research Centre, Bangladesh Agricultural Research Institute, Ishwardi, Bangladesh

#### Edited by:

Jose C. Jimenez-Lopez, Consejo Superior de Investigaciones Científicas (CSIC), Granada, Spain

#### Reviewed by:

Silvia Pampana, University of Pisa, Italy Wricha Tyagi, Central Agricultural University, India

#### \*Correspondence:

Md Shahin Uz Zaman md.shahinuzzaman@research.uwa.edu.au

†Present address: Al Imran Malik, International Center for Tropical Agriculture (CIAT), Vientiane, Laos ‡orcid.org/0000-0003-0201-0766

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 29 March 2019 Accepted: 09 July 2019 Published: 30 July 2019

#### Citation:

Zaman MSU, Malik AI, Kaur P, Ribalta FM and Erskine W (2019) Waterlogging Tolerance at Germination in Field Pea: Variability, Genetic Control, and Indirect Selection. Front. Plant Sci. 10:953. doi: 10.3389/fpls.2019.00953

In the Eastern Gangetic Plain of South Asia, field pea (Pisum sativum L.) is often grown as a relay crop where soil waterlogging (WL) causes germination failure. To assess if selection for WL tolerance is feasible, we studied the response to WL stress at germination stage in a recombinant inbred line (RIL) population from a bi-parental cross between WL-contrasting parents and in a diversity panel to identify extreme phenotypes, understand the genetics of WL tolerance and find traits for possible use in indirect selection. The RIL population and the diversity panel were screened to test the ability to germinate under both waterlogged and drained soils. A total of 50 genotypes (the most WL tolerant and the most WL sensitive) from each of the RIL population and the diversity panel were further evaluated to assay testa integrity/leakage in CaSO<sub>4</sub> solution. Morphological characterization of both populations was undertaken. A wide range of variation in the ability to germinate in waterlogged soil was observed in the RIL population (6–93%) and the diversity panel (5–100%) with a high broad-sense heritability (H<sup>2</sup> > 85%). The variation was continuously distributed, indicating polygenic control. Most genotypes with a dark colored testa (90%) were WL tolerant, whereas those with a light colored testa were all WL sensitive in both the RIL population and diversity panel. Testa integrity, measured by electrical conductivity of the leakage solute, was strongly associated with WL tolerance in the RIL population (r<sub>G</sub> = −1.00) and the diversity panel (r<sub>G</sub> = −0.90). Therefore, testa integrity can be effectively used in indirect selection for WL tolerance. A response to selection for WL tolerance at germination is confidently predicted, enabling the adaptation of the ancient model pea to extreme precipitation events at germination.

Keywords: germination, waterlogging tolerance, indirect selection, secondary traits, Pisum sp.

# INTRODUCTION

Pea (Pisum sativum L.) is an important pulse crop, ranking second in global production after beans among the pulse crops (Food and Agriculture Organization [FAO], 2017). Pea seeds are rich in protein, slowly digestible starch, soluble sugars, fiber, minerals, and vitamins (Dahl et al., 2012). Pea has economic and agronomic importance in cropping systems (Yang et al., 2018). The crop is also an important component of agroecological cropping systems in diverse regions of the world. In South Asia, there is a history of relay-sowing of pea into standing rice on waterlogged soil (Ali and Sarker, 2013). Waterlogging (WL) can cause germination failure (Crawford, 1977) and lead to reduced plant population in pea (Zaman et al., 2018).

Global climate change causes WL events to be more frequent, severe, and unpredictable (Intergovernmental Panel on Climate Change [IPCC], 2014). Climate change predictions for South Asia suggest alterations in the intensity of rainfall events, an increase in inter-annual precipitation variability (Sivakumar and Stefanski, 2010), and delayed monsoon rains (Li et al., 2017). This constitutes a major threat to regional crop production. Pea is very prone to WL, even more than other grain legumes (Solaiman et al., 2007; Pampana et al., 2016). In recent years, unseasonal rain during sowing exposed the pea crop to WL stress (Zaman et al., 2018). Therefore, it is crucial to develop stress-resistant peas and to improve agricultural practices to cope with WL stress.

Developing pea genotypes tolerant to WL might be an effective strategy to mitigate WL stress. Variation in WL tolerance at germination among three pea genotypes was demonstrated by Zaman et al. (2018), indicating valuable diversity within the species. WL tolerance at germination has also been identified in lentil (Lens culinaris Medik. ssp. culinaris) (Wiraguna et al., 2017), pigeonpea (Cajanus cajan (L.) Millsp.) (Sultana et al., 2013), soybean (Glycine max (L.) Merr.) (Hou and Thseng, 1991), wheat (Triticum aestivum L.) (Ueno and Takahashi, 1997), maize (Zea mays L.) (Zaidi et al., 2012), and barley (Hordeum vulgare L.) (Takeda and Fukuyama, 1986). However, the long history of breeding focused on high yield and food quality has led to a loss of genetic diversity and stress resistance. Therefore, breeders have to adopt more efficient methods of selection and take advantage of the large genetic diversity present in the pea genepool. Recently, Simple Sequence Repeat marker panels have been developed that could be useful for identifying markers linked to WL tolerance and for marker-assisted selection (Burstin et al., 2015), but no markers linked to WL tolerance have been identified yet. The value of morpho-physiological traits as indirect selection criteria for WL tolerance is also worthy of evaluation. Several traits are associated with WL tolerance at germination. Small seeds in soybean showed a higher germination rate than large seeds when exposed to WL (Sayama et al., 2009). Testa (seed coat) color is sometimes associated with WL tolerance (Hou and Thseng, 1991; Ueno and Takahashi, 1997; Zhang et al., 2008). 
Several studies on the role of the testa in preventing cellular damage during imbibition showed that seeds with cracked testa and seeds without testa had rapid imbibition and higher solute leakage than those with intact testa and no cracks [Larson, 1968, pea; Powell and Matthews, 1978, pea; Duke and Kakefuda, 1981, soybean, navy bean (Phaseolus vulgaris L.), pea, and peanut (Arachis hypogaea L.); Duke et al., 1983, soybean]. Furthermore, a short period (i.e., 24 h) of seed submergence showed rapid imbibition leading to solute leakage, and was associated with low seedling vigor (Perry and Harrison, 1970, pea; Yaklich et al., 1979, soybean; and Kantar et al., 1996, faba bean). Testa integrity appears to be a key trait for WL tolerance at germination.

Here, to assess if selection for WL tolerance is feasible in peas, we studied the response to WL stress at germination stage in a recombinant inbred line (RIL) population from a bi-parental cross between WL-contrasting parents and a diversity panel to: (i) identify extreme phenotypes for WL tolerance, (ii) understand the genetic basis of WL tolerance, and (iii) find traits for possible use in indirect selection for WL tolerance.

### MATERIALS AND METHODS

#### Plant Materials

A RIL population and a diversity panel of pea germplasm were used in this study.

The RIL population (108 lines) originated from a bi-parental cross between the WL-tolerant genotype Kaspa and the WL-sensitive genotype BM-3 (Zaman et al., 2018). Hybridization was done at the University of Western Australia (UWA) in 2015. The F<sub>1</sub> generation was allowed to self-pollinate and 250 F<sub>2</sub> seeds were produced in the glasshouse at an average temperature of 25°C in 2016. Generation advancement from F<sub>2</sub> to F<sub>6</sub> was undertaken by a rapid generation system using single seed descent from May 2016 to June 2017. The rapid generation acceleration involves growing the plants under conditions optimized to induce rapid flowering, tagging the flowers at anthesis, then removing pods from the plant prior to treatment to induce precocious germination. The second generation plants are then returned to soil and the process is repeated until the desired generations are achieved. Seeds from the last generation are then left to mature on the plant. Plants were grown under far red-enriched LED light (AP67 spectrum) from B series Valoya lights (Helsinki, Finland), with the temperature set at 24/20°C and a photoperiod of 20 h (Croser et al., 2016; Ribalta et al., 2017). Seeds were sown in 0.4 L plastic pots filled with steam pasteurized potting mix (UWA Plant Bio Mix – Richgro Garden Products Australia Pty Ltd.). Plants were watered daily and fertilized weekly with a water soluble N–P–K fertilizer (19–8.3–15.8) with micronutrients (Poly-feed, Greenhouse Grade, Haifa Chemicals Ltd.) at a rate of 0.3 g per pot.

The diversity panel of 110 genotypes comprised five Australian varieties and germplasm accessions from the Australian Temperate Field Crops Collection (ATFCC), Department of Primary Industries, Victoria. The panel included the WL contrasting genotypes – Kaspa and BM-3. The germplasm represents global pea diversity and originates from the geographic regions of South Asia (21 genotypes), former USSR (18), Northern Europe (18), Mediterranean (17), North America (12), Australia (9), South America (8), and Africa (7).

#### Methods

Three types of experiments were conducted, each comprising a RIL population trial and a diversity panel trial.

#### Experiment 1: Studies on Waterlogging Tolerance

#### **RIL population**

The experiment to assay WL tolerance was conducted in a glasshouse of the Plant Growth Facility at UWA as in Zaman et al. (2018) using the 108 RILs and the parents, tolerant Kaspa and sensitive BM-3. The experimental design was split-plot with three replicate blocks. Main plots were WL treatments [two levels: drained control and 8 d (days) WL] while the genotypes (108 RILs and two parents) were in sub-plots. The experimental unit was a free-draining plastic pot placed in a sealed base pot. Free-draining pots contained gravel at the bottom and 3.5 kg of a 1:1 sand and soil mixture [pH 6.7 and electrical conductivity (EC) 0.46 dS m<sup>−1</sup> at 1:5 w/v soil/water] at the top. Soil was collected from Mukinbudin (30°78′ S, 118°31′ E), Western Australia (Kotula et al., 2015). Each free-draining pot (19-cm height × 21-cm diameter) was placed in a sealed base pot (24-cm height × 26-cm diameter). Platinum (Pt) electrodes were inserted in the substrate at a depth of 100 mm in 10 pots for redox measurement (Patrick et al., 1996). For the waterlogged treatment, deionized (DI) water was added to the sealed base pots so that the free-draining pots were waterlogged from the bottom, maintaining a water table at 10 mm below the soil surface. Pots were waterlogged for 4 d prior to sowing to ensure hypoxia at sowing. Water was added to sealed base pots daily as required to maintain the water table. For drained control treatments, there was no water in the sealed base pots, but the soil moisture in free-draining pots was maintained at ∼80% of field capacity. Seeds were treated with tetramethylthiuram disulfide (Thiram) at the rate of 3 g/kg seeds just before sowing. Twenty seeds of each genotype were sown in a free-draining pot by dibbling at 5 mm soil depth. The seed rate for WL screening followed the WL-screening protocol of Zaman et al. (2018). All pots were covered for 3 d after sowing to ensure darkness for germination. Waterlogged pots were drained after 8 d of WL treatment. Drained control pots were weighed every day and watered to ∼80% field capacity. Within replicates, pots were moved every 5 d to minimize the effects of varying conditions in the glasshouse. The experiment was conducted at 25°C and was terminated 23 d after sowing, when there was no sign of further emergence.
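The split-plot layout described above can be sketched in code. The following is a minimal illustration only: the text does not describe how pots were randomized, so the function name `split_plot_layout`, the shuffling procedure, and the genotype labels are all assumptions made for the example.

```python
import random


def split_plot_layout(genotypes, treatments, n_blocks, seed=None):
    """Return a list of (block, treatment, position, genotype) tuples.

    Hypothetical helper: treatments (main plots) are shuffled within each
    block, and genotypes (sub-plots) are shuffled within each main plot.
    """
    rng = random.Random(seed)
    layout = []
    for block in range(1, n_blocks + 1):
        main_plots = list(treatments)
        rng.shuffle(main_plots)          # randomize main plots within the block
        for treatment in main_plots:
            sub_plots = list(genotypes)
            rng.shuffle(sub_plots)       # randomize genotypes within the main plot
            for position, genotype in enumerate(sub_plots, start=1):
                layout.append((block, treatment, position, genotype))
    return layout


# 108 RILs plus the two parents, two WL treatments, three replicate blocks.
# The labels "RIL001"... are placeholders, not the authors' line names.
genotypes = [f"RIL{i:03d}" for i in range(1, 109)] + ["Kaspa", "BM-3"]
layout = split_plot_layout(genotypes, ["drained", "waterlogged"], n_blocks=3, seed=1)
print(len(layout))  # one entry per pot
```

With 110 genotypes, two treatments, and three blocks, the layout contains one entry for each of the 660 pots.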

#### **Diversity panel**

The experiment to assay WL tolerance was conducted on 110 genotypes of the diversity panel including the WL controls – WL-tolerant Kaspa and WL-sensitive BM-3 at germination under similar growth conditions and management practices as described above for the RIL population. The experimental design was split-plot in three replicate blocks with WL treatments (as above) as main plots and genotypes as sub-plots.

Seed emergence was recorded daily during WL and during the recovery period (after draining of pots at 8 d WL), and expressed as a percentage of the total number of seeds sown. Emergence was assessed until 23 d after sowing, the final day of the experiment. Seeds with an epicotyl longer than 5 mm were considered as germinated (i.e., emerged). The redox potential of the soil was recorded daily from 10 pots with a Pt electrode and silver/silver chloride reference electrode attached to a millivolt-meter following the procedure described by Patrick et al. (1996).
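As a small worked illustration of how emergence is expressed, the sketch below converts cumulative daily counts of emerged seedlings in one pot into percentages of seeds sown. The function name and the example counts are hypothetical; only the 20 seeds per pot comes from the text.

```python
def emergence_percent(cumulative_counts, n_sown=20):
    """Convert cumulative emerged-seedling counts into % of seeds sown."""
    if n_sown <= 0:
        raise ValueError("n_sown must be positive")
    percentages = []
    for count in cumulative_counts:
        if not 0 <= count <= n_sown:
            raise ValueError("count must lie between 0 and n_sown")
        percentages.append(100.0 * count / n_sown)
    return percentages


# Hypothetical cumulative counts for one pot on four scoring days.
print(emergence_percent([0, 3, 9, 15]))  # [0.0, 15.0, 45.0, 75.0]
```

The last value in the list corresponds to the final emergence reported for that pot.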

#### Experiment 2: Agro-Morphological Traits and WL Tolerance

#### **RIL population**

The RIL population (108 lines) and two parents were screened for agro-morphological traits in the UWA glasshouse from May to September 2017 in a randomized complete block design with two replications. The experimental unit was a plastic pot (diameter 26 cm and height 23 cm). Each pot was filled with gravel at the bottom with 4.0 kg of potting mix (composition described above) on top. Five seeds of each genotype were sown in each pot. After 3 weeks, plants were thinned to two plants per pot. Four weeks after sowing, a water soluble N–P–K fertilizer (19–8.3–15.8) with micronutrients (Poly-feed, Greenhouse Grade, Haifa Chemicals Ltd.) was applied at a rate of 0.3 g per pot, and this rate was doubled after 6 weeks. The fertilizer was applied weekly until the end of grain filling. The insecticide Spinetoram (DOW Agrosciences Australia Limited) was applied as required to control fungus gnat larvae (Orfelia and Bradysia sp.). Pots were watered to ensure the plants had access to adequate moisture. Watering of individual pots was stopped when pod color turned light yellow. The average temperature of the glasshouse was 18°C from May to September 2017.

#### **Diversity panel**

The diversity panel (110 genotypes including two controls, WL-tolerant Kaspa and WL-sensitive BM-3) was screened for agro-morphological traits in the UWA glasshouse in a randomized complete block design with two replications. Seed sowing and other management practices were the same as for the RIL population above. The experiment was conducted from September to December 2016 at an average temperature of 22°C.

Stem base width and plant height of five plants were measured 3 weeks after sowing using a digital Vernier caliper (Kincrome, Australia) and a 30-cm plastic scale (Promotion Products, Australia), respectively. Flower color and leaf axil pigmentation were scored at flowering using UPOV pea descriptors (International Union for the Protection of New Varieties of Plants [UPOV], 2009), with a 1–3 scale for flower color (1 = white, 2 = pink, and 3 = purple) and a 1–2 scale for leaf axil pigmentation (1 = absent and 2 = present as single ring). Time to 50% flowering (d) was recorded for individual plants. Testa color and seed weight were recorded after drying for 3 months at room temperature following harvest. Testa color was scored on a 1–9 scale (1 = light yellow, 2 = yellow pink, 3 = waxy, 4 = yellow–green, 5 = gray–green, 6 = dark green, 7 = light brown, 8 = brown, and 9 = black) (Pavelkova et al., 1986). Flower, leaf axil, and testa colors were confirmed against a horticultural color chart (Wilson, 1942).

#### Experiment 3: Testa Leakage and WL Tolerance

#### **RIL population**

To assay for testa integrity/leakage under WL conditions, 50 genotypes with contrasting responses to WL treatment (i.e., 25 tolerant and 25 sensitive RILs) were selected from the 108 RIL population. Testa color was categorized into two groups based on the scoring scale of Pavelkova et al. (1986), with scores of 1–6 classed as light and 7–8 as dark colored testa. In the WL-tolerant group all genotypes had dark testa, but the sensitive group comprised 3 dark and 22 light testa genotypes. The testa of the tolerant parent (Kaspa) was dark in color, whereas the sensitive parent BM-3 had a light colored testa. The 50 genotypes were subjected to a submergence treatment with eight replications in a completely randomized design. Each replicate consisted of an individual seed submerged in a 50 ml centrifuge tube (SARSTEDT, Germany) containing 40 ml of 0.5 mM CaSO<sub>4</sub> solution and incubated in a germination cabinet at 25°C with a 12:12 light–dark cycle for 6 d.

#### **Diversity panel**

A total of 50 genotypes with contrasting responses (i.e., 25 tolerant and 25 sensitive) to WL were selected from the 110 genotype diversity panel. The 25 tolerant genotypes comprised 20 with dark and 5 with light colored testa, whereas the 25 sensitive genotypes comprised 23 with light and 2 with dark testa. Experimental design and growth conditions were similar to those of the RIL population.

Electrical conductivity of the submergence solution was measured after 6 d of treatment with an AQUA-PH v1.0 conductivity meter (TPS, Brisbane, QLD, Australia). Seeds were germinated in CaSO<sub>4</sub> solution, so germination was counted at the end of the experiment on day 6. Seeds with a radicle longer than 3 mm were considered as germinated. Germination was reported as the percentage of seeds germinated out of the eight seeds of each genotype.

#### Statistical Analysis

Data were analyzed using GenStat 16th edition for Windows statistical software (VSN International, United Kingdom). Analyses of variance (ANOVAs) were undertaken to determine the effects of the different treatments, and least significant differences (l.s.d.) at P = 0.05 were calculated to compare treatment, genotype, and interaction means. A one-way ANOVA was also conducted by region of origin. Spearman's rank correlation coefficient was calculated with STAR statistical software, version 2.0.1 2014 (Biometrics and Breeding Informatics, PBGB Division, International Rice Research Institute, Los Baños, Philippines). Chi-square tests for goodness-of-fit were conducted to assess the inheritance of testa color.
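To illustrate the chi-square goodness-of-fit test mentioned above, the sketch below computes the statistic by hand in plain Python. The observed counts and the 1:1 expected ratio are hypothetical values chosen for the example; the text does not state which ratio was tested or which counts were observed.

```python
def chi_square_gof(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic for counts against a ratio."""
    if len(observed) != len(expected_ratio):
        raise ValueError("observed and expected_ratio must have equal length")
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    # Expected counts under the hypothesized segregation ratio.
    expected = [total * part / ratio_sum for part in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical example: 60 dark and 48 light testa lines among 108 RILs,
# tested against an assumed 1:1 ratio (expected 54:54).
statistic = chi_square_gof([60, 48], [1, 1])
critical_value = 3.841  # chi-square critical value, df = 1, alpha = 0.05
print(round(statistic, 3), statistic > critical_value)
```

A statistic below the critical value means the observed counts do not deviate significantly from the assumed ratio at the 5% level.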

#### Response to Selection

The broad-sense heritability was estimated by: $H^2 = \sigma_g^2 / (\sigma_g^2 + \sigma_e^2)$, where $\sigma_g^2$ and $\sigma_e^2$ are the estimated genotypic and error variances, respectively (Nyquist and Baker, 1991). The estimated genotypic and error variances were calculated as: $\sigma_g^2 = (MS_g - MS_e)/r$ while $\sigma_e^2 = MS_e/r$, where $MS_g$ is the mean square of the population, $MS_e$ is the residual error, and $r$ is the number of replicates.
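A minimal sketch of the heritability calculation follows, using the variance component definitions given in the text. The function names and the mean squares in the example are hypothetical and are not values from this study.

```python
def variance_components(ms_genotype, ms_error, n_reps):
    """Return (sigma2_g, sigma2_e) from ANOVA mean squares."""
    if n_reps <= 0:
        raise ValueError("n_reps must be positive")
    sigma2_g = (ms_genotype - ms_error) / n_reps  # genotypic variance
    sigma2_e = ms_error / n_reps                  # error variance, as defined in the text
    return sigma2_g, sigma2_e


def broad_sense_heritability(ms_genotype, ms_error, n_reps):
    """H^2 = sigma2_g / (sigma2_g + sigma2_e)."""
    sigma2_g, sigma2_e = variance_components(ms_genotype, ms_error, n_reps)
    return sigma2_g / (sigma2_g + sigma2_e)


# Hypothetical mean squares with three replicates.
print(broad_sense_heritability(ms_genotype=900.0, ms_error=100.0, n_reps=3))
```

With these definitions the expression simplifies algebraically to $H^2 = 1 - MS_e/MS_g$, so the number of replicates cancels out of the ratio.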

Genetic correlations between traits were computed as $r_{G12} = r_{P12} / \sqrt{H^2_1 \times H^2_2}$ (Cooper et al., 1996), where $r_{G12}$ is the genotypic correlation between traits 1 and 2, $r_{P12}$ is the phenotypic correlation between the same pair of traits, and $H^2_1$ and $H^2_2$ are the heritabilities of traits 1 and 2, respectively.

The efficiency of indirect selection was estimated as (Cooper et al., 1996; Kumar et al., 2008) $CR_G/DR_G = r_G \sqrt{H^2_s / H^2_g}$, where $CR_G$ is the correlated response to indirect selection for germination based on secondary traits, $DR_G$ is the direct response to selection for germination, $r_G$ is the genotypic correlation, and $H^2_s$ and $H^2_g$ are the heritabilities of the secondary trait and of germination, respectively, under waterlogged stress.
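The two formulas above can be chained, as in the following Python sketch. The function names and all input values are hypothetical and were chosen for round numbers; they are not taken from Table 2.

```python
import math


def genetic_correlation(r_phenotypic, h2_trait1, h2_trait2):
    """Genotypic correlation from a phenotypic correlation and two heritabilities."""
    return r_phenotypic / math.sqrt(h2_trait1 * h2_trait2)


def indirect_selection_efficiency(r_genetic, h2_secondary, h2_target):
    """Ratio of correlated response (CR_G) to direct response (DR_G)."""
    return r_genetic * math.sqrt(h2_secondary / h2_target)


# Hypothetical inputs: a secondary trait negatively correlated with germination.
r_g = genetic_correlation(r_phenotypic=-0.72, h2_trait1=0.81, h2_trait2=1.00)
efficiency = indirect_selection_efficiency(r_g, h2_secondary=1.00, h2_target=0.81)
print(round(r_g, 3), round(efficiency, 3))
```

The sign of the ratio only reflects the direction of the correlation. An absolute value above 1 would suggest that selecting on the secondary trait is expected to be at least as effective as selecting on germination itself.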

#### RESULTS

#### Redox Measurements

In drained control soil the redox potential in the RIL population experiment was 585 ± 5 mV throughout the experimental period. By contrast, the redox potential in waterlogged pots was 318 ± 6 mV throughout the WL period, and this increased on draining the pots to 565 ± 13 mV by 23 d. In the experiment with the diversity panel, the redox potential in drained and waterlogged pots followed the same trend as in the RIL population experiment.

#### Variation of Waterlogging Tolerance

In the RIL population, all genotypes, including the parents, showed close to 100% germination in drained soil. However, in waterlogged soil the RIL parents showed contrasting germination (measured as emergence): 73% for the tolerant Kaspa and 20% for the sensitive BM-3 (LSD<sub>P = 0.05</sub> = 22). The population of 108 RILs segregated from 6 to 93% germination (**Figures 1A,B**). The mean germination of the RIL population was 41%, mid-way between the parents. Significant transgressive segregation was not recorded in either direction. A high broad-sense heritability of H<sup>2</sup> = 89% was found for germination under waterlogged conditions in this RIL population.

In the experiment with the diversity panel, all genotypes showed close to 100% germination in drained soil (controls). However, in waterlogged soil germination ranged widely from 5 to 100% and showed a continuous distribution (**Figures 1C,D**). The mean germination in the diversity panel was 48%, mid-way between the tolerant control Kaspa (68%) and the sensitive control BM-3 (22%) (LSD<sub>P = 0.05</sub> = 25). Five genotypes significantly (P < 0.05) exceeded the tolerant control Kaspa in germination under WL, but no genotype was significantly less tolerant than the sensitive control BM-3. In the diversity panel the broad-sense heritability for germination in waterlogged soil was high at H<sup>2</sup> = 87%.

In the diversity panel, a one-way ANOVA by region of origin showed that geographic region accounted for significant (P < 0.001) variation in WL tolerance at germination (**Figure 2**). Genotypes from Africa (i.e., Ethiopia in the current study) showed the highest germination (80% on average) when exposed to waterlogged soil. The poorest performance under waterlogged conditions was shown by genotypes from the former USSR.

#### Morphological Traits and Waterlogging Tolerance

Correlation coefficients showed pair-wise associations between WL tolerance and morphological traits as well as among the morphological traits (**Table 1**). In the RIL population the strongest positive correlations with WL tolerance were found for three traits: flower color (r = 0.62), leaf axil pigmentation (r = 0.66), and testa color (r = 0.59) (**Table 1A**). Furthermore, a detailed analysis of testa color showed two distinct parental groups: dark testa like WL-tolerant Kaspa (**Figure 3A**) and light testa like WL-sensitive BM-3 (**Figure 3B**). Overall, 52 of the 108 genotypes had a dark-colored testa and the remaining 56 had a light-colored testa, a segregation that fits a 1:1 ratio (χ<sup>2</sup> = 0.15, P = 0.70) and indicates that a single gene controls the trait (**Figure 3C**). The average germination of dark testa RIL genotypes was 58%, which was significantly (P < 0.001) higher than the mean for genotypes with light-colored testa (26%). Germination ranged from 8 to 92% for dark and from 8 to 65% for light testa genotypes.
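The goodness-of-fit test for testa color can be recomputed from the reported counts (52 dark, 56 light) against a 1:1 expectation. The sketch below uses only the Python standard library; the function names are illustrative, and the closed-form P value applies to one degree of freedom only. In a goodness-of-fit test, a large (non-significant) P value is what indicates agreement with the expected ratio.

```python
import math


def chi_square_gof(observed, expected_ratio):
    """Chi-square goodness-of-fit statistic for counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * part / ratio_sum for part in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_one_df(statistic):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    # For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(statistic / 2.0))


stat = chi_square_gof(observed=[52, 56], expected_ratio=[1, 1])
p = p_value_one_df(stat)
print(f"chi2 = {stat:.2f}, P = {p:.2f}")  # chi2 = 0.15, P = 0.70
```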

In the diversity panel, correlations with WL tolerance were similar to those in the RIL population, with flower color (r = 0.57), leaf axil pigmentation (r = 0.51), and testa color (r = 0.51) again showing strong positive correlations (**Table 1B**). Stem base width exhibited a weaker correlation with WL tolerance. Analysis of testa color showed that 34 of the 110 genotypes had a dark-colored testa and the remaining 76 had a light-colored testa (**Figure 3D**). The mean germination of dark testa genotypes was 71%, which was significantly (P < 0.001) higher than the mean for genotypes with light-colored testa (37%). However, the range of percent germination was similar for dark and light testa genotypes.

#### Solute Leakage/EC and WL Tolerance

In a sub-group of the RIL population comprising contrasting genotypes (25 tolerant and 25 sensitive) selected from the variation in WL tolerance experiment (**Figure 1**), germination and the EC of the 0.5 mM CaSO<sub>4</sub> solution were measured after 6 d of submergence and were found to be strongly correlated (r = −0.94) (**Figure 4A**). In this association there was a clear boundary at an EC of 200 µS cm<sup>−1</sup> g<sup>−1</sup> seed between the tolerant and sensitive groups. All the genotypes in the tolerant group had a dark testa and a low EC (61–161 µS cm<sup>−1</sup> g<sup>−1</sup> seed). In contrast, all the WL-sensitive genotypes, comprising 22 light and 3 dark testa genotypes, had a higher EC (220–498 µS cm<sup>−1</sup> g<sup>−1</sup> seed). Visual observation showed that genotypes in the WL-tolerant group had an intact testa and low EC (**Figure 4B**). Conversely, many of the genotypes in the WL-sensitive group showed a dissolved testa and higher EC (**Figure 4C**).

Similarly, in a sub-group of the diversity panel (25 tolerant and 25 sensitive genotypes), germination and the EC of the 0.5 mM CaSO<sub>4</sub> solution were measured after 6 d of submergence and were found to be strongly correlated (r = −0.89) (**Figure 4D**). This association again had a clear boundary at an EC of 200 µS cm<sup>−1</sup> g<sup>−1</sup> seed separating WL-tolerant and sensitive genotypes. In the WL-tolerant group (i.e., 20 dark and 5 light testa), all the dark testa genotypes had a low EC (25–172 µS cm<sup>−1</sup> g<sup>−1</sup> seed), but the five light testa genotypes showed a higher EC (222–374 µS cm<sup>−1</sup> g<sup>−1</sup> seed), similar to the sensitive group. In contrast, all the genotypes in the WL-sensitive group (i.e., 23 light and 2 dark testa) had a higher EC (240–588 µS cm<sup>−1</sup> g<sup>−1</sup> seed). The genotypes in the tolerant group again had a visually intact testa and low EC (**Figure 4E**). In contrast, many of the genotypes in the sensitive group exhibited a dissolved testa and high EC values (**Figure 4F**).
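The EC boundary of 200 µS cm<sup>−1</sup> g<sup>−1</sup> seed reported above suggests a simple screening rule, sketched below in Python. The rule, the function name, and the genotype labels are illustrative assumptions; the EC values are hypothetical numbers placed within the reported ranges. As the five light testa genotypes of the diversity panel show, such a rule would not reproduce the germination-based grouping in every case.

```python
EC_BOUNDARY = 200.0  # µS cm^-1 g^-1 seed, boundary reported in the text


def classify_by_ec(ec_by_genotype):
    """Label each genotype 'tolerant' or 'sensitive' from its solute leakage (EC)."""
    return {
        genotype: "tolerant" if ec < EC_BOUNDARY else "sensitive"
        for genotype, ec in ec_by_genotype.items()
    }


# Hypothetical genotype labels and EC readings, for illustration only.
readings = {"line_A": 61.0, "line_B": 172.0, "line_C": 222.0, "line_D": 498.0}
labels = classify_by_ec(readings)
print(labels)
```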

#### Direct and Indirect Response to Selection for Waterlogging Tolerance

Direct and indirect responses to selection for WL tolerance were estimated from germination on waterlogged soil, as there was a strong concurrence between germination in waterlogged soil and germination of seed submerged in CaSO<sub>4</sub> solution in both the RIL population (r = 0.95) and the diversity panel (r = 0.95) (**Supplementary Figure S1**). The direct response to selection for WL tolerance was based on germination values, while the indirect responses to selection were based on four secondary traits (EC, testa color, flower color, and axil pigment). All the secondary traits exhibited even higher heritability (H<sup>2</sup> = 0.95–1.00) than germination percentage (H<sup>2</sup> = 0.89 in the RIL population; 0.87 in the diversity panel) when grown on waterlogged soil (**Table 2**). Among the four secondary traits, EC had the highest genetic correlation with germination in both the RIL population (r<sub>G</sub> = −1.00) and the diversity panel (r<sub>G</sub> = −0.90). EC also had the highest efficiency of indirect selection for germination under waterlogged conditions in both the RIL population (CR<sub>G</sub>/DR<sub>G</sub> = −1.10) and the diversity panel (CR<sub>G</sub>/DR<sub>G</sub> = −0.98) (**Table 2**).

#### DISCUSSION

Waterlogging is a major constraint to crop production globally. Genetic variation is a prerequisite for any crop to mitigate WL stress, which is predicted to become more frequent and extreme with climate change in temperate-tropical cropping regions (Lobell et al., 2008). In pea, variation for WL tolerance at germination has previously been reported for only three cultivars (Zaman et al., 2018). The present study extended this known variation in germination under WL (5–100%) first to a RIL population from a bi-parental cross and then to a broad germplasm diversity panel. During WL, due to the shortage of oxygen (Armstrong and Drew, 2002), ATP formation is inhibited (Jackson and Drew, 1984), the oxidation–reduction state between cell membranes becomes unbalanced, and membrane permeability increases. This leads to increased solute leakage (Hsu et al., 2000) (i.e., increased EC in the current experiments) and decreased germination in our study. Thus, testa integrity is an indirect measure of seed vigor. Furthermore, high broad-sense heritability estimates for WL tolerance at germination were found in both the RIL population (H<sup>2</sup> = 0.89) and the diversity panel (H<sup>2</sup> = 0.87), indicating that most of the variation observed is genetic (Visscher et al., 2008). The frequency distribution of RILs for germination under WL showed continuous variation, indicating polygenic control of the trait. This was reinforced by the continuous distribution of germination under WL in the diversity panel.

Environmental stress is a powerful force to generate local adaptation through strong directional selection and rapid evolution (Erskine, 1997; Hoffmann and Parsons, 1997). We found that the germplasm most tolerant to WL was from Africa (i.e., Ethiopia), where peas are generally sown at the start of the rains (mid-June to July) at elevations from 1800 to 3000 m a.s.l. (Telaye, 1979; Tsidu, 2012). The prevailing temperature at germination in Ethiopia is warmer than in the pea's domestication region in the Near East, where germination occurs during cool, wet winter conditions. With the rains in Ethiopia being more intense than those in a Mediterranean winter, the tolerance to WL of Ethiopian genotypes is probably due to their adaptation to excess soil moisture during germination. Similar adaptive potential has been identified in lentil genotypes from Bangladesh, where the crop is often sown onto waterlogged soil in the rice-based cropping system (Malik et al., 2016; Wiraguna et al., 2017). However, such directional selection causes a genetic bottleneck in plant breeding; thus, we linked WL tolerance to some phenotypic traits.

Testa color is associated with WL tolerance; for example, dark (red/black/brown) testa genotypes of wheat (Ueno and Takahashi, 1997), rapeseed (Zhang et al., 2008), and soybean (Hou and Thseng, 1991) are more tolerant to WL than light (white/yellow) testa genotypes. The difference in tolerance between dark and light testa genotypes is probably due to the levels of phenolic compounds in the testa, as in the rapeseed study the dark testa genotypes had higher levels of phenolic compounds than the sensitive light testa genotypes (Zhang et al., 2008). Higher levels of phenolic or tannin compounds in the testa are considered a barrier to imbibition, since the dark testa genotypes of pea, faba bean (Vicia faba L.), and Arabidopsis (Arabidopsis thaliana L.) are restricted in imbibition, whereas a light testa is completely permeable to water and to subsequent solute leakage (Marbach and Mayer, 1974; Kantar et al., 1996; Debeaujon et al., 2000). The present study showed that dark testa genotypes in both the RIL population and the diversity panel had a high percentage of germination with lower solute leakage; in contrast, light testa genotypes had a low percentage of germination and higher solute leakage. Furthermore, genes of the wound-responsive family protein, which are involved in supplying lignin and phenolic precursors to the wounded surface and in plant defense, are highly upregulated in dark testa genotypes of pea during WL stress (Zaman et al., 2019). Therefore, it is likely that testa pigmentation plays a protective role against imbibition damage during WL stress.

Additionally, in the current study, the RIL lines with a dark testa, similar to the WL-tolerant parent Kaspa, all had pigmented leaf axils and purple/pink flowers, while the light testa lines, similar to the sensitive parent BM-3, had green unpigmented leaf axils and white flowers. Similarly, in the diversity panel, dark testa genotypes predominantly had pigmented leaf axils and purple/pink flowers, whereas the light testa genotypes had non-pigmented axils and white flowers. The exceptions were a few (6%) genotypes with dark testa and pigmented leaf axils but white flowers, indicating that the effects are not pleiotropic. Such exceptions suggest that the loci for the three traits (flower color, leaf axil pigmentation, and testa color) are linked, and thus any of the traits could be a potential marker/indicator to identify WL tolerance. This finding is consistent with the location of genes for testa and flower color in linkage group II, as reported by Reid and Ross (2011). Similarly, Statham et al. (1972) found that flower color in Pisum is controlled by six major genes, where the A gene is necessary for general flavonoid production in the plant and for anthocyanin production in the flowers, axils, and pods. However, Mendel observed that colored seed coats were always associated with colored (purple) flowers, and that these colored varieties possessed pigmentation in the leaf axils. By contrast, a colorless testa was always associated with white flowers and the absence of pigmentation in the leaf axils, indicating that these were pleiotropic effects of a single gene. Our study is the first to report an association of flower color and leaf axil pigmentation with WL tolerance in the model crop pea.

TABLE 1 | Spearman's rank phenotypic correlation coefficients (r) between morphological traits for (A) RIL population (n = 108) and (B) diversity panel (n = 110 genotypes).

TF, time to flower. Significant correlations are shown in bold. ∗P < 0.05; ∗∗P < 0.01; ∗∗∗P < 0.001.

TABLE 2 | Heritability (H<sup>2</sup>), genetic correlation (r<sub>G</sub>) of secondary traits with germination on waterlogged soil, and the efficiency of indirect selection for germination (CR<sub>G</sub>/DR<sub>G</sub>) estimated in the RIL population and the diversity panel.

Germination was on the basis of data from waterlogged soil. CR<sub>G</sub> is the correlated response to indirect selection for germination based on secondary traits and DR<sub>G</sub> is the direct response to selection for germination.

Testa integrity is a prerequisite for germination under waterlogged stress. In the present study, testa integrity measured as the EC of the submergence solution showed a very strong correlation with germination for both the RIL population (r = −0.94) and the diversity panel (r = −0.89), indicating that testa integrity might be an effective trait for WL tolerance selection. Visual observation in the RIL population and diversity panel showed that all the genotypes in the tolerant group had an intact testa with low EC; in contrast, around 90% of genotypes in the sensitive group had a dissolved testa with higher EC. This is consistent with sudangrass (Sorghum sudanense Stapf), where testa integrity is associated with germination (Hsu et al., 2000). During WL the integrity of the testa is lost due to lipid peroxidation in the testa membrane (Crawford and Braendle, 1996) by two possible pathways, enzymatic and non-enzymatic. In the enzymatic pathway, due to decreased ATP formation during WL stress, lipid metabolic enzymes such as lipase and lipoxygenase are induced and cause membrane damage (Rawyler et al., 1999), which is supported by the highly upregulated lipid metabolic genes in the sensitive pea genotype during WL (Zaman et al., 2019). In the non-enzymatic pathway, excessive amounts of reactive oxygen species (ROS) accumulate during WL stress and react with lipids in the cell membranes, causing oxidative damage and eventually cell death in the testa membrane. As a result of membrane damage in the testa, electrolytes, in particular potassium ions (K+), along with seed solutes including sugars and amino acids, are released from seeds (De Vos, 1993). Thus, the amount of electrolyte leaked from the seeds can be used as a proxy for the extent of testa leakage and for tolerance under waterlogged stress. However, tolerant genotypes control lipid peroxidation, and thereby maintain testa integrity, by neutralizing ROS activity in cells through increased production of antioxidant enzymes such as superoxide dismutase, ascorbate peroxidase, glutathione reductase, and catalase during WL (Kumutha et al., 2009). The concentrations of antioxidants are positively correlated with phenolic compounds [canola seed, Jun et al., 2014; hazelnuts (Corylus avellana L.), walnuts (Juglans nigra L.), and pistachios (Pistacia vera L.), Arcan and Yemenicioğlu, 2009]; and seeds with dark testa/pigmentation exhibit higher levels of phenolic compounds, lower leakage, and higher WL tolerance than seeds with a lighter colored testa (Zhang et al., 2008, rapeseed). In the present study, 90% of dark testa genotypes were tolerant with an intact testa, whereas all the light testa genotypes were sensitive with a dissolved testa. Therefore, it may be inferred that the antioxidant properties of dark testa seeds are key to testa integrity/WL tolerance in pea.

Peas are a major pulse crop globally, but are more sensitive to WL than other pulses (Solaiman et al., 2007). Predictions of global warming and climate variability in South Asia suggest a change in the inter-annual precipitation pattern – alterations in the intensity of rainfall events (Sivakumar and Stefanski, 2010), and delayed monsoon rains (Li et al., 2017), which could significantly change crop productivity – particularly in pea which is highly sensitive to WL and sown as a relay crop in waterlogged soil. However, as the present study has illustrated the extent of variation in WL tolerance at germination in pea, its polygenic control and the possibilities for indirect selection for WL tolerance, the prospect is raised of adapting the pea, which originates from a rainfed Mediterranean environment, into stable production under relay-sowing in moist soils with rice. We anticipate that selection for WL tolerance at an early stage of crop growth could significantly improve the reliability of relay sowing and help climate-proof production as part of a strategy to enhance productivity in South Asia.

#### DATA AVAILABILITY

All datasets generated for this study are included in the manuscript and/or the **Supplementary Files**.

#### AUTHOR CONTRIBUTIONS

MZ designed the study in consultation with AM, PK, and WE. FR was involved in the RIL development with the advancement from F<sub>2</sub> to F<sub>6</sub> generation. MZ conducted the experiment, analyzed the data, and wrote the first draft of the manuscript. All authors contributed to the manuscript revision and editing and approved the submitted version of the manuscript.

#### FUNDING

This work was supported through projects CIM-2009-038 and CIM-2014-076 funded by Australian Centre for International Agricultural Research and by the Centre for Plant Genetics and Breeding (PGB), The University of Western Australia, Crawley, WA, Australia.

#### ACKNOWLEDGMENTS

MZ gratefully acknowledges the support of a John Allwright Fellowship award from the Australian Centre for International Agricultural Research (ACIAR).

#### REFERENCES


#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00953/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Zaman, Malik, Kaur, Ribalta and Erskine. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# KnowPulse: A Web-Resource Focused on Diversity Data for Pulse Crop Improvement

Lacey-Anne Sanderson, Carolyn T. Caron, Reynold Tan, Yichao Shen, Ruobin Liu and Kirstin E. Bett\*

*Department of Plant Sciences, University of Saskatchewan, Saskatoon, SK, Canada*

KnowPulse (https://knowpulse.usask.ca) is a breeder-focused web portal for pulse breeders and geneticists. With a focus on diversity data, KnowPulse provides information on genetic markers, sequence variants, phenotypic traits and germplasm for chickpea, common bean, field pea, faba bean, and lentil. Genotypic data is accessible through the genotype matrix tool, displayed as a marker-by-germplasm table of genotype calls specific to germplasm chosen by the researcher. It is also summarized on genetic marker and sequence variant pages. Phenotypic data is visualized in trait distribution plots: violin plots for quantitative data and histograms for qualitative data. These plots are accessible through trait, germplasm, and experiment pages, as well as through a single page search tool. KnowPulse is built using the open-source Tripal toolkit and utilizes open-source tools including, but not limited to, species-specific JBrowse instances, a BLAST interface, and whole-genome CViTjs visualizations. KnowPulse is constantly evolving with data and tools added as they become available. Full integration of genetic maps and quantitative trait loci is imminent, and development of tools exploring structural variation is being explored.

Keywords: legumes, pulses, web resource, diversity, genotypic data, phenotypic data

#### INTRODUCTION

Legumes are immensely important in agricultural ecosystems, with the legume family (Leguminosae) being second only to the grass family (Poaceae) in economic and nutritional value (Graham and Vance, 2003). Grain legumes, also known as "pulses," are primarily marketed for human consumption and are a good source of dietary fiber, protein, slow-release carbohydrates, B vitamins, iron, copper, magnesium, manganese, zinc, and phosphorus (Tharanathan and Mahadevamma, 2003; Polak et al., 2015). They are also naturally low in fat, virtually free of saturated fat, and cholesterol free (Polak et al., 2015). In recent years there has been an explosion of genome assemblies for legumes (Varshney et al., 2009, 2012, 2013; Schmutz et al., 2010, 2014; O'Rourke et al., 2014; Tang et al., 2014; Parween et al., 2015; Pandey et al., 2016). In addition, there has been a dramatic increase in sequence variation data (Kamfwa et al., 2015; Boutet et al., 2016; Moghaddam et al., 2016; Pandey et al., 2016; Gali et al., 2018; Ogutcen et al., 2018). To maximize the usefulness of these data, they should be curated, with verified connections between phenotypic and genotypic data, in a web resource that is friendly to both breeders and researchers.

#### Edited by:

*Matthew Nicholas Nelson, Agriculture and Food (CSIRO), Australia*

#### Reviewed by:

*Ethalinda K. S. Cannon, Iowa State University, United States Elisa Bellucci, Marche Polytechnic University, Italy*

> \*Correspondence: *Kirstin E. Bett k.bett@usask.ca*

#### Specialty section:

*This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science*

Received: *03 April 2019* Accepted: *10 July 2019* Published: *31 July 2019*

#### Citation:

*Sanderson L-A, Caron CT, Tan R, Shen Y, Liu R and Bett KE (2019) KnowPulse: A Web-Resource Focused on Diversity Data for Pulse Crop Improvement. Front. Plant Sci. 10:965. doi: 10.3389/fpls.2019.00965*

Several legume-focused databases have been developed, including Legume Information System (LIS; https://legumeinfo.org, Dash et al., 2015), Medicago truncatula Genome Database (http://www.medicagogenome.org, Krishnakumar et al., 2014), SoyBase (https://www.soybase.org/, Grant et al., 2009), PeanutBase (https://peanutbase.org; Dash et al., 2016), and Cool Season Food Legume Database (https://www.coolseasonfoodlegume.org/). While these resources are invaluable to their crop-specific and comparative communities, none provide the integration between germplasm, genotypic and phenotypic data to adequately develop the genetic markers useful in pulse breeding programs.

Over 100 plant and animal databases use Tripal (https://www.drupal.org/project/tripal; http://tripal.info/sites\_using\_tripal, Sanderson et al., 2013), an open-source, highly customizable toolkit providing efficient development of biological web portals. Tripal extends the popular Drupal content management system (CMS). Use of a CMS enables developers to focus on the specific needs of their community without the overhead of user and security management, or the database schema design frequently associated with web portal development. Tripal's use of the Generic Model Organism Database (GMOD) Chado schema (Mungall and Emmert, 2007) provides flexible support for biological data, while facilitating the exchange of data and expertise among Tripal sites through common infrastructure.

KnowPulse, a breeder-focused web portal, was first released in 2010 to serve the pulse breeders at the University of Saskatchewan. There is a focus on common bean, chickpea, field pea, lentil and faba bean, as these are the crops of interest in their program. KnowPulse is built using Tripal, with the purpose of serving as a reliable data storage solution with metadata preservation. It has since evolved into a public resource by housing a large number of continually expanding datasets focused on genetic variation. We describe the novel genetic variation display and tools of KnowPulse below to inform the greater legume community.

#### MATERIALS AND METHODS

#### Datasets

KnowPulse houses data for chickpea, dry bean, field pea, lentil, and faba bean. The magnitude of all data is summarized by type (e.g., germplasm, genotypes, phenotypes) on the home page. There is information on Genebank accessions and University of Saskatchewan cultivars. Users can access a number of genotypic (i.e., genetic markers, sequence variants, and genotypic calls) and phenotypic (i.e., traits, experiments, and measurements) datasets. Lastly, the pre-release genomic sequence information for Lens culinaris is available through the web portal by request. In an effort to provide researchers with data as soon as possible, KnowPulse houses unpublished data. However, all data is required to have a long-term data management plan ensuring integrity and availability.

#### Implementation

KnowPulse uses Drupal 7 (https://www.drupal.org/), an open-source enterprise-level content management system, and Tripal 3, which extends Drupal for biological data. The modular PHP framework provided by Drupal and Tripal allows KnowPulse to use community-contributed extensions and an advanced administrative interface to reduce development time and provide more functionality to users. The core Tripal modules power the ontology-driven content pages (e.g., genetic markers, germplasm accessions, research projects), content-type specific searches, and semantic web-ready web services for all content. Customized displays were developed through extension modules. The entire technology stack is open-source and all extension modules are publicly available on GitHub and open to collaboration (https://uofs-pulse-binfo.github.io/our-modules/).

All data, excluding the BLAST databases, are stored in a single PostgreSQL instance using the Drupal schema and GMOD Chado schema (Mungall and Emmert, 2007) for web-related data and biological data, respectively. PostgreSQL constraints and data type checking ensure data integrity and standards compliance. For example, genotypic data must be linked to the germplasm assayed, the experiment, and the genetic marker including assay information. Well-chosen indices and materialized views mitigate any performance issues incurred by use of a relational database by speeding up queries. This combination allows us to meet the speed and data integrity needs of the user.
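The kind of integrity rule described above, in which a genotype call cannot exist without its germplasm, experiment, and marker, can be illustrated with foreign-key constraints. The sketch below is an assumption-laden illustration: it uses Python's built-in `sqlite3` module rather than PostgreSQL, and the table and column names are simplified, hypothetical stand-ins that do not reproduce the Chado schema. The marker and experiment names are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE germplasm  (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE experiment (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE marker     (id INTEGER PRIMARY KEY, name TEXT NOT NULL UNIQUE);
CREATE TABLE genotype_call (
    id            INTEGER PRIMARY KEY,
    germplasm_id  INTEGER NOT NULL REFERENCES germplasm(id),
    experiment_id INTEGER NOT NULL REFERENCES experiment(id),
    marker_id     INTEGER NOT NULL REFERENCES marker(id),
    allele_call   TEXT NOT NULL
);
INSERT INTO germplasm  (id, name) VALUES (1, 'CDC Rosetown');
INSERT INTO experiment (id, name) VALUES (1, 'hypothetical_experiment');
INSERT INTO marker     (id, name) VALUES (1, 'hypothetical_marker');
""")


def try_insert_call(connection, germplasm_id, experiment_id, marker_id, allele_call):
    """Insert a genotype call; return False if a constraint rejects it."""
    try:
        with connection:  # commits on success, rolls back on error
            connection.execute(
                "INSERT INTO genotype_call "
                "(germplasm_id, experiment_id, marker_id, allele_call) "
                "VALUES (?, ?, ?, ?)",
                (germplasm_id, experiment_id, marker_id, allele_call),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(try_insert_call(conn, 1, 1, 1, "AA"))   # linked to existing records
print(try_insert_call(conn, 1, 1, 99, "AA"))  # marker 99 does not exist
```

Placing the rule in the database rather than in application code means that every loader or import script is held to the same standard.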

#### Permissions and Accessibility

KnowPulse acts as both a public data portal and a private breeding program management system. All the functionality described herein is publicly available unless otherwise stated. Since KnowPulse provides access to pre-publication data, you may find restrictions on download for specific datasets and watermarked charts. Private data and tools can be accessed via a user account with specific permissions. If you need access to private data for your research, please contact Dr. Kirstin Bett, corresponding author, with an explanation and in most cases we will be happy to collaborate with you.

#### RESULTS AND DISCUSSION

#### Genomic Variation

In the genomic context, genotypic data are particularly important in KnowPulse. These data are used by researchers for marker development and association studies with the ultimate goal of facilitating pulse crop breeding. KnowPulse provides a germplasm-by-variant genotype matrix for researchers to explore genotypic data for their germplasm set (**Figure 1**). Since genotypic datasets are continually expanding, this tool provides filter options including experiment, variant list, genomic position, marker or variant type, and pairwise polymorphisms. Additionally, if the data are too extensive to analyze within the browser, users can request permission to download them via KnowPulse in a variety of formats (e.g., comma-separated values, hapmap).
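To make the idea of a marker-by-germplasm matrix with a pairwise polymorphism filter concrete, the following Python sketch pivots a list of genotype calls into such a table. It is not the KnowPulse implementation: the function names, marker names, and allele calls are hypothetical, and only the germplasm names are taken from Figure 1.

```python
from collections import defaultdict


def build_genotype_matrix(calls, germplasm):
    """Pivot (marker, germplasm, call) records into {marker: {germplasm: call}}."""
    wanted = set(germplasm)
    matrix = defaultdict(dict)
    for marker, line, call in calls:
        if line in wanted:
            matrix[marker][line] = call
    return dict(matrix)


def pairwise_polymorphic(matrix, line_a, line_b):
    """Return markers where both lines have a call and the calls differ."""
    return sorted(
        marker
        for marker, row in matrix.items()
        if line_a in row and line_b in row and row[line_a] != row[line_b]
    )


# Hypothetical calls, for illustration only.
calls = [
    ("marker_1", "CDC Rosetown", "AA"),
    ("marker_1", "CDC Blaze", "GG"),
    ("marker_2", "CDC Rosetown", "CC"),
    ("marker_2", "CDC Blaze", "CC"),
    ("marker_3", "CDC Rosetown", "TT"),
    ("marker_3", "ILL 7502", "AA"),
]
matrix = build_genotype_matrix(calls, ["CDC Rosetown", "CDC Blaze"])
print(pairwise_polymorphic(matrix, "CDC Rosetown", "CDC Blaze"))  # ['marker_1']
```

Markers with a missing call in either line are skipped by this filter, a design choice that avoids treating missing data as a polymorphism.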

Sequence variants and genetic markers are each represented with their own pages in KnowPulse. Sequence variant pages list all the markers available for a given genomic position, whereas genetic marker pages provide details for a specific marker assay.

FIGURE 1 | Germplasm by variant genotype matrix. This screenshot shows the genotype matrix for CDC Rosetown, CDC Blaze, CDC Vantage, and ILL 7502 restricted to the beginning of LcChr1. The form near the top provides additional filter options while the color-coded table below shows the allele calls for each known variant. Researchers can use this tool to inspect the genotypes of a region of interest (e.g., QTL region) for their germplasm set. This tool can be accessed in the right side menu under Genomic Data > Sequence Variants > Lentil Genotypes.

This distinction allows researchers to evaluate genotypes in context of the assay. Additionally, genetic marker pages pinpoint the location of the variant on each available genome assembly. More advanced features include: the flanking sequence with additional known variants indicated using their IUPAC codes, a pie chart summarizing the allele calls recorded, and a link to the genotype matrix to access specific calls for germplasm of interest (**Figure 2A**). Sequence variant pages reveal similar information with the context of all markers for that variant for comparison (**Figure 2B**).
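The use of IUPAC codes to mark known variants in a flanking sequence can be sketched as follows. The lookup table follows the standard IUPAC nucleotide ambiguity codes; the function name, the example sequence, and the variant positions are hypothetical and do not come from KnowPulse.

```python
# Standard IUPAC nucleotide ambiguity codes, keyed by the set of alleles.
IUPAC_CODES = {
    frozenset("A"): "A", frozenset("C"): "C", frozenset("G"): "G", frozenset("T"): "T",
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}


def annotate_flanking(sequence, variants):
    """Replace bases at known variant positions (0-based) with IUPAC codes."""
    bases = list(sequence.upper())
    for position, alleles in variants.items():
        key = frozenset(allele.upper() for allele in alleles)
        bases[position] = IUPAC_CODES[key]
    return "".join(bases)


# Hypothetical flanking sequence with an A/G variant at index 2 and a C/T variant at index 5.
annotated = annotate_flanking("ACGTACGT", {2: {"A", "G"}, 5: {"C", "T"}})
print(annotated)  # ACRTAYGT
```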

A number of tools which provide further context to these genetic markers through whole-genome visualizations include CViTjs (https://github.com/LegumeFederation/cvitjs) and JBrowse (Buels et al., 2016). CViTjs provides whole-genome views of specific datasets such as gene and genetic marker distribution. These are available on KnowPulse for chickpea, common bean, lentil, soybean, and medicago (**Figure 3A**). CViTjs charts allow researchers to see broad trends across the genome; whereas, JBrowse instances are highly suitable for graphical browsing of a specific region of interest. KnowPulse has JBrowse instances for kabuli chickpea (v1.0, Varshney et al., 2013), common bean (v1.0, Schmutz et al., 2014), lentil (v1.2, Ramsay et al., 2014), soybean (v2.0, Schmutz et al., 2010), and medicago (v4.0, Tang et al., 2014) with tracks for gene sets, genetic markers, and putative orthologs from related species (**Figure 3B**).

Tripal BLAST (https://www.drupal.org/project/tripal\_blast) provides sequence alignment searches for users with a region of interest but no prior information about its location in hosted genome assemblies. In KnowPulse, users can BLAST against pulse-specific datasets such as genome and transcript assemblies for crops (i.e., chickpea, common bean, field pea, and lentil), related wild species, and model legume species (i.e., soybean, lotus, medicago). The user simply enters their sequence in the search box, selects the dataset to BLAST against and clicks BLAST, which uses NCBI BLAST+ command-line tools (Camacho et al., 2009) to perform the search. The results are then displayed in a table with links to the appropriate JBrowse.

#### Phenomic Variation

With our focus on variation data, phenomics is a very important component of KnowPulse. Not only are phenotypic data used for association studies and marker discovery, they are also used for breeding activities such as germplasm selection and identification. As such, visualizations focus on the distribution of phenomic data, often in reference to specific germplasm and between site-years within an experiment.

KnowPulse provides trait distribution plots to summarize phenotypic data for a given experiment. Data from different site-years are stored separately but averaged across replicates. For quantitative data, violin plots are used to demonstrate data structure (i.e., median, interquartile range, and 95% confidence interval) and distribution. The x-axis labels each site-year, whereas the y-axis labels the observed values for the given trait (**Figure 4A**). Qualitative data is summarized with histograms which consist of a series for every site-year (**Figure 4B**). In both plot types, the phenotypic value for a given germplasm can be highlighted within the context of the larger dataset. This proves quite helpful in breeding programs to provide additional data for selections, highlight potential planting errors, and plan crosses.
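The summary behind such a plot can be sketched in a few lines of Python: replicate values are first averaged per germplasm within each site-year, and the per-site-year distribution is then described by its median and quartiles. The observations, germplasm labels, and function names below are hypothetical, and the quartile method is simply the default of Python's `statistics.quantiles`, which may differ from the one used by the plotting code in KnowPulse.

```python
from collections import defaultdict
from statistics import mean, median, quantiles

# Hypothetical observations: (germplasm, site-year, replicate, observed value).
observations = [
    ("G1", "Site A 2016", 1, 50.0), ("G1", "Site A 2016", 2, 52.0),
    ("G2", "Site A 2016", 1, 47.0), ("G2", "Site A 2016", 2, 49.0),
    ("G3", "Site A 2016", 1, 55.0), ("G3", "Site A 2016", 2, 57.0),
    ("G4", "Site A 2016", 1, 60.0), ("G4", "Site A 2016", 2, 62.0),
]


def replicate_means(rows):
    """Average replicates so each germplasm has one value per site-year."""
    grouped = defaultdict(list)
    for germplasm, site_year, _replicate, value in rows:
        grouped[(site_year, germplasm)].append(value)
    per_site = defaultdict(dict)
    for (site_year, germplasm), values in grouped.items():
        per_site[site_year][germplasm] = mean(values)
    return per_site


def summarize(per_site):
    """Median and quartiles of the replicate means for each site-year."""
    summary = {}
    for site_year, by_germplasm in per_site.items():
        values = sorted(by_germplasm.values())
        q1, _q2, q3 = quantiles(values, n=4)
        summary[site_year] = {
            "n": len(values), "median": median(values), "q1": q1, "q3": q3,
        }
    return summary


per_site = replicate_means(observations)
print(summarize(per_site))
# Highlighting one germplasm within the distribution is then a simple lookup.
print(per_site["Site A 2016"]["G3"])
```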

Trait distribution plots can be accessed in a number of different ways. Plots are found on all associated trait, germplasm, and experiment pages. There is also a tool which allows users to generate their own plots based on KnowPulse-housed data. This kind of integration ensures that the system is intuitive to all users. Context and summaries for the trait, experiment or germplasm being viewed are also provided.

Additionally, trait pages in KnowPulse contain an overview describing the trait, linking it to ontologies and describing the methodology used for data collection (**Figure 5A**). Experiments in which the traits were measured are listed, along with information on the number of associated site-years (**Figure 5B**).

observed values labeled by the y-axis. This allows researchers to see the data structure (i.e., median, interquartile range, and 95% confidence interval) and distribution per site year. Qualitative phenotypic data (B) is shown as a multi-series histogram with each series representing a site-year and the observed phenotypes defined on the x-axis. The quantity of germplasm exhibiting each phenotype is shown on the y-axis allowing researchers to evaluate how prevalent a phenotype is in their population. These plots can be accessed via the trait distribution plot tool under Phenotypic data in the right side menu, as well as through trait, germplasm, and project pages with associated phenotypic data.

Trait pages can be accessed via the crop-specific trait search in the right side menu under phenotypic data.

Traits can be searched for by keyword and filtered by a minimum number of site-years or germplasm.

#### Germplasm

At the core of KnowPulse are the germplasm collections including both public diversity panels and private crossing blocks. Germplasm pages contain all metadata stored in KnowPulse (e.g., origin, name, synonyms, accessions, known parents). Known pedigrees are displayed in a tree diagram with collapsible nodes (**Figure 6A**). The magnitude of genotypic data available for that individual is indicated, followed by a quick marker search and a link to the genotype matrix (**Figure 6B**). Similarly, the phenotypic data section contains an indication of magnitude, trait quick search, and access to the trait distribution plot (**Figure 6C**). Specialized searches depending on the type of germplasm (e.g., accessions vs. breeding material) with specific filter criteria are available. For example, accessions can be searched by name or accession; whereas, breeding material can also be restricted by crossing block.

#### CONCLUSION

As a breeder-focused resource, KnowPulse emphasizes germplasm information and variation. Both genotypic and phenotypic data are supported with rich visualizations and detailed pages. Future enhancements include support for genetic maps and quantitative trait loci (QTL), as well as enhanced displays for exploring structural variation. KnowPulse is continually updated as new data become available.

#### DATA AVAILABILITY

All datasets analyzed for this study are included in the manuscript and/or the supplementary files.

# AUTHOR CONTRIBUTIONS

L-AS provided the initial concept and design for the resource with extensive input from KB. L-AS and CC wrote the manuscript. L-AS, CC, RT, and YS contributed significantly to the development of the resource. CC, RL, and L-AS curated the data and populated the resource. KB contributed a lot of the data held in the resource. All authors contributed to manuscript revision, read and approved the submitted version.

#### FUNDING

This work was supported by Saskatchewan Pulse Growers [grant: BRE1516, BRE0601], Western Grains Research Foundation, Genome Canada [grant: 8302], Government of Saskatchewan [grant: 20150331], and the University of Saskatchewan.

#### ACKNOWLEDGMENTS

KnowPulse would not be where it is today without the continued development and maintenance of Tripal, and thus we extend a big thank you to the entire Tripal community and Dr. Stephen Ficklin. KnowPulse is a member of the Legume Federation and, as such, we would like to thank them for their guidance and their collaborative efforts. We are also grateful for the support, guidance, and feedback from our community and colleagues in the Pulse group at the University of Saskatchewan. Some of the development work was done under AGILE, a Genome Canada funded project managed by Genome Prairie.

#### REFERENCES



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Sanderson, Caron, Tan, Shen, Liu and Bett. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.


# Transcriptional Reprogramming of Pea Leaves at Early Reproductive Stages

Karine Gallardo<sup>1</sup>\*, Alicia Besson<sup>1</sup>, Anthony Klein<sup>1</sup>, Christine Le Signor<sup>1</sup>, Grégoire Aubert<sup>1</sup>, Charlotte Henriet<sup>1</sup>, Morgane Térézol<sup>1</sup>, Stéphanie Pateyron<sup>2</sup>, Myriam Sanchez<sup>1</sup>, Jacques Trouverie<sup>3</sup>, Jean-Christophe Avice<sup>3</sup>, Annabelle Larmure<sup>1</sup>, Christophe Salon<sup>1</sup>, Sandrine Balzergue<sup>2</sup> and Judith Burstin<sup>1</sup>

<sup>1</sup> Agroécologie, AgroSup Dijon, Institut National de la Recherche Agronomique, Université Bourgogne Franche-Comté, Dijon, France, <sup>2</sup> IPS2, Institute of Plant Sciences Paris-Saclay (Institut National de la Recherche Agronomique, Centre National de la Recherche Scientifique, Université Paris-Sud, Université d'Evry, Université Paris-Diderot, Sorbonne Paris-Cité, Université Paris-Saclay), POPS-Transcriptomic Platform, Saclay Plant Sciences (SPS), Orsay, France, <sup>3</sup> Normandie Université, Institut National de la Recherche Agronomique, Université de Caen Normandie, UMR INRA–UCBN 950 Ecophysiologie Végétale et Agronomie, SFR Normandie Végétal FED 4277, Caen, France

#### *Edited by:*

Penelope Mary Smith, La Trobe University, Australia

#### *Reviewed by:*

Alistair McCormick, University of Edinburgh, United Kingdom Pedro Carrasco, University of Valencia, Spain John William Patrick, University of Newcastle, Australia Yong-Ling Ruan, University of Newcastle, Australia

*\*Correspondence:* Karine Gallardo karine.gallardo-guerrero@inra.fr

#### *Specialty section:*

This article was submitted to Plant Physiology, a section of the journal Frontiers in Plant Science

*Received:* 12 February 2019 *Accepted:* 19 July 2019 *Published:* 07 August 2019

#### *Citation:*

Gallardo K, Besson A, Klein A, Le Signor C, Aubert G, Henriet C, Térézol M, Pateyron S, Sanchez M, Trouverie J, Avice J-C, Larmure A, Salon C, Balzergue S and Burstin J (2019) Transcriptional Reprogramming of Pea Leaves at Early Reproductive Stages. Front. Plant Sci. 10:1014. doi: 10.3389/fpls.2019.01014

Pea (Pisum sativum L.) is an important source of dietary proteins. Nutrient recycling from leaves contributes to the accumulation of seed proteins and is a pivotal determinant of protein yields in this grain legume. The aim of this study was to unveil the transcriptional regulations occurring in pea leaves before the sharp decrease in chlorophyll breakdown. As a prelude to this study, a time-series analysis of <sup>15</sup>N translocation at the whole plant level was performed, which indicated that nitrogen recycling among organs was highly dynamic during this period and varied depending on nitrate availability. Leaves collected on vegetative and reproductive nodes were further analyzed by transcriptomics. The data revealed extensive transcriptome changes in leaves of reproductive nodes during early seed development (from flowering to 14 days after flowering), including an up-regulation of genes encoding transporters, particularly sulfate transporters that might sustain sulfur metabolism in leaves of the reproductive part. This developmental period was also characterized by a down-regulation of cell wall-associated genes in leaves of both reproductive and vegetative nodes, reflecting a shift in cell wall structure. Later on, 27 days after flowering, genes potentially switching the metabolism of leaves toward senescence were pinpointed, some of which are related to ribosomal RNA processing, autophagy, or transport systems. Transcription factors differentially regulated in leaves between stages were identified and a gene co-expression network pointed out some of them as potential regulators of the above-mentioned biological processes. The same approach was conducted in Medicago truncatula to identify shared regulations with this wild legume species. Altogether the results give a global view of transcriptional events in leaves of legumes at early reproductive stages and provide a valuable resource of candidate genes that could be targeted by reverse genetics to improve nutrient remobilization and/or delay catabolic processes leading to senescence.

Keywords: legumes, leaves, reproductive period, nitrogen remobilization, transcriptomics, co-expression, transcription factors, transporters

# INTRODUCTION

Grain legumes accumulate large amounts of proteins in their seeds, which are widely used for human and animal nutrition. In legumes, symbiotic nitrogen fixation and nitrate uptake by roots are two complementary modes of nitrogen acquisition that decline during the reproductive period (Salon et al., 2001). Nitrogen stored in plant parts is then remobilized to sustain seed protein accumulation. The contribution of nitrogen remobilization to seed protein yield varies from 45 to 90%, depending on the species and conditions (Warembourg and Fernandez, 1985; Kurdali et al., 1997). In pea (Pisum sativum L.), 70% of the amount of nitrogen in mature seeds is derived from remobilization processes (Jensen, 1987; Schiltz et al., 2005). The chloroplast enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase, which plays an essential role in carbon fixation, is one major source of nitrogen in leaves (Jiang et al., 1993). Its degradation starts before leaf senescence, a catabolic process leading to yellowing, chloroplast disassembly, and finally cell death (Kohzuma et al., 2017). Because most leaf nitrogen is stored in the form of proteins with roles in the photosynthetic machinery, nitrogen remobilization may affect photosynthetic activities, which may curtail the reproductive period and limit seed yield. Nutrient deficiencies, high temperature, and drought are environmental factors that accelerate leaf senescence, thereby shortening the reproductive period and negatively impacting seed filling (Olsson, 1995; Srivalli and Khanna-Chopra, 1998). Stay-green varieties, where leaf senescence is delayed, are used in some cereal improvement programs since they display a greater grain yield under post-anthesis drought (Borrell et al., 2001). 
However, stay-green phenotypes are not necessarily associated with higher yields, especially when chlorophyll catabolism is blocked, since the active degradation of chlorophyll is a prerequisite for nitrogen remobilization from the pigment-associated proteins (Thomas, 1997; Thomas and Howarth, 2000). Hence, optimizing the balance between nutrient recycling and leaf longevity is necessary to increase and stabilize protein yield. This requires the identification of the underlying molecular determinants that could be targeted in breeding programs for higher and stable protein yields.

The mechanisms controlling nutrient recycling have been mainly studied during senescence associated with leaf yellowing. Genes up-regulated during this process, generally referred to as senescence-associated genes (SAGs) or senescence-enhanced genes, were identified (Buchanan-Wollaston et al., 2005). Several SAGs are related to autophagy, a vesicular trafficking process that regulates nutrient recycling and remobilization by participating in the methodical degradation of the cell constituents (Masclaux-Daubresse et al., 2017). Several lines of evidence indicate that senescence-related transcription factors (TFs) can directly regulate autophagy genes in plants (Garapati et al., 2015). Transcriptomics revealed that a large number of NAC (no apical meristem, transcription activation factors, and cup-shaped cotyledon) TFs are expressed during leaf senescence (Balazadeh et al., 2010; Breeze et al., 2011; Yang et al., 2016). Functional studies in Arabidopsis showed that NACs can act as positive or negative regulators of senescence (Yang et al., 2011; Liang et al., 2014; Garapati et al., 2015; Zhao et al., 2015; Pimenta et al., 2016). However, we are far from a comprehensive understanding of the pathways and regulatory networks influencing nutrient recycling in crops, especially in grain legumes such as pea, a monocarpic species that exhibits different patterns of whole plant senescence compared to Arabidopsis, and in which the production of seeds triggers nutrient remobilization (Noodén and Penney, 2001; Pic et al., 2002). The aim of the present study was to unveil the transcriptional reprogramming of pea leaves at stages preceding the sharp decrease in chlorophyll breakdown. Nitrogen remobilization between tissues was highly dynamic during this period, as shown through a time-series analysis of the translocation of <sup>15</sup>N absorbed in the form of nitrate up to flowering. 
Leaves of the vegetative and reproductive nodes were analyzed by transcriptomics and a gene co-expression approach was used to highlight potential regulators of specific biological processes. The same approach in the fodder legume species M. truncatula revealed a number of shared co-expression modules.

# MATERIALS AND METHODS

#### Plant Growth Conditions

Pea (Pisum sativum L., genotype "Caméor") and Medicago truncatula (M. truncatula, Gaertn., A17 genotype) plants were grown in a greenhouse under controlled temperature (at least 18 °C during the day and 15 °C during the night) and photoperiod (16 h/d). M. truncatula seeds were scarified and vernalized for 4 d at 5 °C before sowing. Plants were grown in 7 L (pea) or 3 L (M. truncatula) pots containing 40% attapulgite and 60% clay balls. Plants were not inoculated with Rhizobia. Nitrogen nutrition of all plants relied on the absorption of nitrate for the purpose of long-term <sup>15</sup>N-labeling. Two nitrogen availability conditions were used. Control plants (N+) were supplied with the nutrient solution previously described (Zuber et al., 2013) until tissue collection. For N– plants, nitrate was depleted at the beginning of flowering using the same solution without KNO<sub>3</sub> and Ca(NO<sub>3</sub>)<sub>2</sub> (replaced by 1.85 mM KCl and 0.25 mM CaCl<sub>2</sub>). Leaf chlorophyll content at the first flowering node was measured using a SPAD-502 chlorophyll meter on 12–16 plants per condition and stage (Minolta Camera Co. Ltd., Japan). The plant, pod and seed characteristics in **Table S1** were measured at maturity (63 days after flowering) from eight biological replicates (i.e., individual plants). An analysis of variance was performed to reveal significant effects of nitrogen limitation on these traits (Statistica v7.0 software).

# Dynamic of Nitrogen Remobilization at the Whole Plant Level in Pea and *M. truncatula*

For each time point [beginning of flowering, 14, 27, and 63 days after flowering (DAF)], six plants were used per condition (N+, N–): four plants were supplied with the nutrient solutions described above labeled with 3 atom% excess of <sup>15</sup>N (as K<sup>15</sup>NO<sub>3</sub>) until flowering (i.e., 35 days labeling), and two unlabeled plants served to estimate natural <sup>15</sup>N abundance. The pots were organized in a randomized complete-block design. For each time point and condition, leaves of the vegetative nodes (lower leaves) and reproductive nodes (upper leaves), stems, roots, pods (M. truncatula), seeds, and pod wall (pea) were harvested separately. The dry matter of each tissue was determined after oven-drying at 80 °C for 48 h. All tissues were ground using the cutting mill SM200 (Retsch, Haan, Germany), then using the ZM 200 grinder (Retsch). Total N and the <sup>15</sup>N/<sup>14</sup>N ratio were determined from 5 mg powder using a PDZ Europa ANCA-GSL elemental analyzer interfaced to a PDZ Europa 20-20 isotope ratio mass spectrometer (Sercon Ltd., Cheshire, UK). The amount of endogenous nitrogen (i.e., stored during the vegetative phase) remobilized across plant tissues between two developmental stages was calculated from elemental and isotope amounts in the different organs using the PEF (Plant Elemental Flux) tool developed in Visual Basic for Applications (Salon et al., 2014). The quantitative values for nitrogen remobilized (mg) from or to each tissue between two time points were subjected to a t-test using Statistica software (v7.0) to reveal significant effects of nitrogen deficiency on the quantity of nitrogen remobilized from each tissue.
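The isotope arithmetic underlying such labeling experiments can be sketched as follows. This is not the PEF tool: it only shows the generic conversion of a <sup>15</sup>N/<sup>14</sup>N ratio to atom%, the subtraction of natural abundance measured on unlabeled plants, and a simple isotope-dilution estimate of the nitrogen derived from the labeled source. The function names are hypothetical, and the assumption that the source enrichment equals the nominal 3 atom% excess is ours.

```python
def atom_percent(ratio_15n_14n):
    """Convert a 15N/14N ratio R into atom% 15N = 100 * R / (1 + R)."""
    return 100.0 * ratio_15n_14n / (1.0 + ratio_15n_14n)


def atom_percent_excess(sample_ratio, control_ratio):
    """Enrichment above natural abundance, estimated from unlabeled plants."""
    return atom_percent(sample_ratio) - atom_percent(control_ratio)


def labeled_n_mg(total_n_mg, ape_tissue, ape_source=3.0):
    """Nitrogen (mg) derived from the labeled source, by isotope dilution.

    Assumes the source enrichment equals the nominal 3 atom% excess.
    """
    return total_n_mg * ape_tissue / ape_source


# Hypothetical tissue: 10 mg total N at 1.5 atom% excess.
print(labeled_n_mg(10.0, 1.5))
```

Differences in labeled nitrogen per organ between two harvest dates would then indicate net export or import of nitrogen taken up before flowering, which is the quantity the flux calculation reported here is concerned with.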

#### Leaf Samples and RNA Extraction

Lower and upper leaves were collected from 6 to 8 individual plants deprived or not of nitrate, at three stages: flowering, 14 and 27 DAF. The absence of nodules on the root system was checked at the time of tissue collection. Lower leaves corresponded to leaves of the two last vegetative nodes and upper leaves corresponded to leaves of nodes carrying flowers at the flowering stage, and to leaves of the third and fourth reproductive nodes at 14 and 27 DAF. The leaf samples were immediately frozen in liquid nitrogen, then stored at −80 °C. RNA was extracted from 100 mg of frozen powder using the RNeasy Plant Mini Kit according to the manufacturer's protocol (Qiagen, Courtaboeuf, France). RNA quality was checked on a 1.5% agarose gel, then using the Agilent 2100 Bioanalyzer.

# RT-qPCR Using *ELSA* as Indicator of Leaf Senescence

For profiling the expression of the Early Leaf Senescence Abundant cysteine protease gene (ELSA) (Pic et al., 2002) by RT-qPCR, leaf samples collected at flowering, 14 and 27 DAF on plants deprived or not of nitrate (n = 6–8) were used. RT-qPCR was performed with the iScript cDNA synthesis kit according to the manufacturer's protocol (Bio-Rad, Marnes-la-Coquette, France) and the GoTaq qPCR Master Mix (Promega, Charbonnières, France) using 10 ng cDNA and 0.2 µM of each primer in a final volume of 5 µl. Analyses were performed in triplicate from each biological replicate using the LightCycler 480 system (software v1.5.0, Roche, Meylan, France) as previously described (Zuber et al., 2013). The normalization method was ΔΔCt using actin, histone, and EF1α as reference genes (primers in **Table S2**). Analyses of variance and Student-Newman-Keuls (SNK) tests using the Statistica software (v7.0) revealed significant changes in gene expression between stages and/or in response to nitrate deficiency.
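For readers unfamiliar with the ΔΔCt method, a minimal sketch is given below. It assumes 100% amplification efficiency and combines the three reference genes by averaging their Ct values, which is one common convention and not necessarily the exact procedure followed here. All Ct values and function names are hypothetical.

```python
from statistics import mean


def delta_ct(target_ct, reference_cts):
    """ΔCt: target Ct minus the mean Ct of the reference genes."""
    return target_ct - mean(reference_cts)


def relative_expression(sample, calibrator):
    """Fold change 2^(-ΔΔCt) of a sample relative to a calibrator.

    Each argument is a (target_ct, [reference_cts]) pair. Assumes the
    template doubles at every cycle (100% efficiency).
    """
    ddct = delta_ct(*sample) - delta_ct(*calibrator)
    return 2.0 ** (-ddct)


# Hypothetical Ct values: target gene, then three reference genes.
flowering = (26.0, [20.0, 21.0, 22.0])    # calibrator stage
late_stage = (24.0, [20.0, 21.0, 22.0])   # stage being compared

print(relative_expression(late_stage, flowering))
```

In this example the target crosses the threshold two cycles earlier at the later stage while the reference genes are unchanged, so the estimated fold change is 2<sup>2</sup>.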

# Transcriptomics of Leaves and Validation by RT-qPCR

Three biological replicates of leaves from vegetative and reproductive nodes were subjected to transcriptomics. Pea NimbleGen-microarrays were developed to profile expression of 40795 sequences: 40454 mRNA originating from the PsCameor\_Uni\_Lowcopy set (Alves-Carvalho et al., 2015), 323 putative precursors of miRNA predicted in the "Test assembly multiple k-mer" contig set (Alves-Carvalho et al., 2015), and 18 controls. Two specific oligonucleotides were used for each mRNA sequence and one oligonucleotide was used per miRNA precursor sequence (forward and reverse). These probes were spotted in triplicate on the GENOPEA array. M. truncatula NimbleGen-microarrays (Herrbach et al., 2017) were used in parallel. They represent 83029 probes (spotted in triplicate) corresponding to transcribed regions of the M. truncatula genome from the Symbimics program (https://iant.toulouse.inra.fr/symbimics/). The Ambion MessageAmp™ II aRNA Amplification Kit was used to amplify sufficient amounts of copy RNA extracted, as described above, from upper leaves and lower leaves of three biological replicates (independent plants). Double-stranded cDNA synthesis was performed using T7-oligo-dT and the antisense RNA (aRNA) was created by in vitro transcription according to the manufacturer's protocol (Life technologies SAS, Saint Aubin, France). The labeling with Cy3 or Cy5 was performed by reverse transcription of aRNA using labeled nucleotides (Cy3-dUTP or Cy5-dUTP, Perkin-Elmer-NEN Life Science Products). For each nutritional condition and leaf type, the following co-hybridizations were performed: 14 DAF vs. flowering, 27 DAF vs. 14 DAF. For each comparison, a dye swap was performed. The hybridization of labeled samples on the slides, scanning and data normalization were performed as previously described (Lurin et al., 2004). 
Differential analysis was based on the log<sub>2</sub> ratios averaged on the dye-swap: the technical replicates were averaged to get one log<sub>2</sub> ratio per biological replicate and these values were used to perform a paired t-test. The raw P-values were adjusted by the Bonferroni method, which controls the family-wise error rate, and probes were considered as differentially expressed when the Bonferroni-corrected P-value was <0.05. Transcriptome datasets were deposited in the NCBI Gene Expression Omnibus database with the accession numbers GSE109789 for pea and GSE109521 for M. truncatula. All pea sequences with "PsCam" accession numbers could be retrieved from the pea RNAseq gene atlas at http://bios.dijon.inra.fr/ (PsUniLowCopy data set).
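The Bonferroni step can be illustrated with a short sketch: each raw P-value is multiplied by the number of tests and capped at 1, and a probe is retained when its adjusted value is below 0.05. The probe names and raw P-values are hypothetical, and the sketch does not reproduce the upstream normalization or the t-test itself.

```python
def bonferroni(p_values):
    """Bonferroni adjustment: multiply each P-value by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


def differentially_expressed(raw_p_by_probe, alpha=0.05):
    """Return {probe: adjusted P} for probes passing the corrected threshold."""
    probes = list(raw_p_by_probe)
    adjusted = bonferroni([raw_p_by_probe[probe] for probe in probes])
    return {probe: adj for probe, adj in zip(probes, adjusted) if adj < alpha}


# Hypothetical raw P-values for four probes.
raw = {"probe_a": 0.001, "probe_b": 0.02, "probe_c": 0.5, "probe_d": 0.0001}
print(sorted(differentially_expressed(raw)))
```

With tens of thousands of probes on an array the multiplier is large, so this correction is stringent and favors a low false-positive rate over sensitivity.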

Twenty genes differentially regulated between two stages were selected for RT-qPCR analyses (as described above) in leaves from three biological replicates of plants well-supplied with nitrate. For each leaf sample (lower and upper leaves) and developmental period (14 DAF vs. flowering, 27 vs. 14 DAF), Pearson's correlation coefficient (r) between microarray and RT-qPCR expression levels was calculated (**Table S3**, primers in **Table S2**). Hierarchical clustering of transporter and TF genes was performed using the Genesis software (v1.8.1; default parameters) (Sturn et al., 2002). Gene Ontology (GO) term enrichment analysis was performed using topGO (elim method and Fisher's exact test) in Bioconductor v2.9 implemented in BIOS (Architecture BioInformatique Orientee Services, http://bios.toulouse.inra.fr/). Phylogenetic trees were generated from protein sequences using the Neighbor-joining method of the ClustalW2 program available at https://www.ebi.ac.uk/Tools/phylogeny/. Orthologous genes between pea and M. truncatula (v4.02) were identified using OrthoFinder v1.1.8 (MCL clustering algorithm and DIAMOND v0.9.10.111 for the alignment with default parameters). Of the 19055 clusters identified, 15445 were retained for transcriptome comparisons because they were made of a unique gene per species (14980 sequences with probes on the arrays).
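The retention of clusters containing a unique gene per species amounts to a simple filter on the orthogroup table, sketched below. This is a post-processing illustration only and does not call OrthoFinder; the in-memory layout, the species labels, the function name, and all gene identifiers are hypothetical.

```python
def one_to_one_orthologs(orthogroups, species=("pea", "medicago")):
    """Keep orthogroups that contain exactly one gene in every species.

    orthogroups maps a group id to {species label: [gene ids]}
    (a hypothetical in-memory form of a parsed orthogroup table).
    """
    pairs = {}
    for group_id, members in orthogroups.items():
        if all(len(members.get(sp, [])) == 1 for sp in species):
            pairs[group_id] = tuple(members[sp][0] for sp in species)
    return pairs


# Invented identifiers, for illustration only.
groups = {
    "OG1": {"pea": ["PsCam_x1"], "medicago": ["Medtr_y1"]},
    "OG2": {"pea": ["PsCam_x2", "PsCam_x3"], "medicago": ["Medtr_y2"]},
    "OG3": {"pea": ["PsCam_x4"], "medicago": []},
}

print(one_to_one_orthologs(groups))
```

Restricting comparisons to such one-to-one pairs avoids ambiguity when the expression of a pea gene is compared with that of its M. truncatula counterpart, at the cost of discarding gene families that expanded in either species.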

#### Gene Co-expression Network Construction

Log<sub>2</sub> intensity values from each of the red and green channels were normalized based upon quantiles using the preprocessCore package (v1.34.0) available in R (v3.3.1). Gene variance was calculated using the genefilter R package (Gentleman et al., 2018) (v1.54.2) and only sequences displaying a variance >0.2 were retained for co-expression studies. Gene co-expression networks were built using the Expression Correlation plugin (v1.1.0, http://apps.cytoscape.org/apps/expressioncorrelation) of Cytoscape (v3.5.1) (Cline et al., 2007). We chose r cut-offs of 0.95 and −0.95 (r<sup>2</sup> >0.9) to build P-REMONET from the pea transcriptome dataset, and of 0.90 and −0.90 (r<sup>2</sup> >0.81) to build M-REMONET from the M. truncatula transcriptome dataset. The node degree of the networks followed a power-law distribution. A Prefuse Force Directed layout was used to visualize the entire networks in Cytoscape. For ease of visualization of TF-related modules, the genes connected to the TFs were organized using the Circular Layout algorithm.
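The logic of this network construction, a variance filter followed by thresholding of pairwise Pearson correlations, can be sketched in plain Python as follows. This is an illustration rather than the Cytoscape plugin or the R packages used here: the expression profiles and gene names are hypothetical, the sample variance is assumed, and treating the cut-off as inclusive is our choice.

```python
from itertools import combinations
from math import sqrt
from statistics import mean, variance


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def coexpression_edges(profiles, min_variance=0.2, r_cutoff=0.95):
    """Edges between genes whose profiles correlate at |r| >= r_cutoff."""
    # Drop nearly invariant genes before computing correlations.
    kept = {g: v for g, v in profiles.items() if variance(v) > min_variance}
    edges = []
    for a, b in combinations(sorted(kept), 2):
        r = pearson(kept[a], kept[b])
        if abs(r) >= r_cutoff:
            edges.append((a, b, r))
    return edges


# Hypothetical normalized log2 intensities across four samples.
profiles = {
    "gene_a": [1.0, 2.0, 3.0, 4.0],
    "gene_b": [2.0, 4.0, 6.0, 8.0],   # positively correlated with gene_a
    "gene_c": [4.0, 3.0, 2.0, 1.0],   # negatively correlated with gene_a
    "gene_d": [5.0, 5.1, 5.0, 5.1],   # low variance, filtered out
    "gene_e": [1.0, 3.0, 2.0, 4.0],   # r = 0.8 with gene_a, below cut-off
}

for a, b, r in coexpression_edges(profiles):
    print(a, b, round(r, 2))
```

Keeping both strongly positive and strongly negative correlations, as done with the paired cut-offs above, allows candidate repressors as well as activators to appear as neighbors of a gene in the network.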

### RESULTS

# Dynamics of Nitrogen Remobilization During the Reproductive Phase in Pea

An overview of nitrogen remobilization between tissues was obtained through a time-series analysis of the translocation of <sup>15</sup>N absorbed in the form of nitrate during the vegetative phase (**Figure 1**). From the beginning of flowering to seed filling in the first pods (14 days after flowering, DAF), nitrogen taken up during the vegetative period was remobilized from leaves below the first flowering node (lower leaves; 46.5%, **Figure 1A**) and roots (53.5% of the total amount of remobilized nitrogen). This pool of nitrogen was mainly redistributed toward leaves of the reproductive part (upper leaves) and to pod walls. Then, from 14 DAF until the end of 1st pod seed filling (27 DAF), nitrogen was remobilized from stems (20%) and from lower and upper leaves (80%) to seeds, pod wall, and roots. Roots behaved as a transient sink of nitrogen during this period, probably because leaves and stems provided sufficient amounts of nitrogen to fulfill seed nitrogen requirements. At later stages (27–63 DAF), nitrogen was remobilized from all tissues to seeds, which at maturity contained 54% of nitrogen derived from remobilization processes (**Figure 1A**). This shift to systemic remobilization to seeds coincided with the beginning of chlorophyll degradation in leaves (starting 33 DAF, **Figure S1A**). The increased expression of the early senescence marker ELSA in lower and upper leaves 27 DAF was indicative of a molecular switch toward proteolysis (**Figure S1B**). The 4-fold higher expression of ELSA in upper leaves 27 DAF, compared to lower leaves, suggests higher proteolytic activities in these leaves. Altogether, the data indicate that 27 DAF is a transition stage toward leaf senescence.

Nitrate deficiency during the reproductive phase triggered major changes in the dynamics of nitrogen remobilization (**Figure 1B**). From flowering to 14 DAF, nitrogen remobilization from roots decreased while nitrogen remobilization from lower leaves increased significantly in response to nitrate deficiency. From 14 to 27 DAF, roots became the major source of nitrogen specifically under nitrate deficiency and nitrogen remobilization from other tissues was significantly reduced in that condition, especially from lower leaves that became a transient sink for nitrogen. This may be part of the mechanisms used by plants to avoid precocious senescence in response to nitrogen deficiency. While leaf nitrogen content decreased continuously from flowering to maturity under nitrate-sufficient conditions, it remained unchanged between 14 and 27 DAF in nitrate-deprived plants (**Figure S2**). These data and the lower expression of PsELSA in lower and upper leaves of these plants, suggest a lower remobilization rate in response to nitrate deficiency (**Figure S1B**), associated with a maintained chlorophyll content (**Figure S1A**). At later stages (27–63 DAF), nitrogen remobilization from almost all tissues was significantly reduced in response to nitrate deficiency and, at maturity, these plants were characterized by a reduced seed yield and one-seed weight (**Table S1**).

# Transcriptome Changes in Pea Leaves at Early Reproductive Stages

The molecular processes regulated in pea leaves at stages characterized by dynamic nitrogen remobilization between tissues, from flowering to 27 DAF, were investigated by transcriptomics. An analysis of transcriptome changes occurring in leaves of the vegetative and reproductive nodes under both nitrate-sufficient and -deficient conditions was carried out. The GENOPEA array representing 40777 pea sequences was used. Quantitative RT-PCR data for 20 differentially expressed genes showed high correlations with array data (Pearson's correlation coefficient r ranging from 0.80 to 0.93, **Table S3**), confirming the robustness of the approach to identify genes differentially regulated in pea leaves. An analysis of gene ontology (GO) terms significantly enriched (Fisher's P-value < 0.005) in the lists of genes differentially regulated during the time course provided an overview of the biological processes activated or repressed (**Figure 2**). Major changes occurred in the upper leaf transcriptome from flowering to 14 DAF regardless of nitrate supply. Between 14 and 27 DAF, 2074 and 2193 genes were, respectively, up- and down-regulated in lower leaves specifically under nitrate supply. Many GO terms in **Figure 2** are related to transport processes. Expression patterns and annotations of the 678 transport-related probes differentially regulated between at least two developmental stages are presented in **Table S4**, thus providing a set of candidate genes for controlling the transfer of nutrients. The most differentially regulated genes (more than 4-fold) are presented in **Figure 3A**. These 88 genes were classified into six main clusters based on hierarchical clustering of their expression patterns. GO analysis revealed an over-representation of genes encoding transporters of sulfate (SULTR), metal ions, and lipids. The previously reported role of sulfate-derived molecules in controlling autophagy and SAGs (Álvarez et al., 2012; Yarmolinsky et al., 2014) prompted us to study the expression and homologies of SULTR genes. A phylogenetic tree based on alignments of all SULTRs present in the Pea Gene Atlas (Alves-Carvalho et al., 2015) and a search for the well-characterized Arabidopsis homologs revealed that the differentially regulated genes belong to groups 2 and 3 of low-affinity SULTR (**Figure 3B**). Of the five differentially regulated SULTR genes, four were up-regulated in leaves of the reproductive nodes 14 DAF (**Figure 3C**), suggesting they could contribute to sulfate transport in these leaves.

nitrogen remobilized in response to nitrogen deprivation: \*P < 0.1, \*\*P < 0.05, \*\*\*P < 0.01 (t-test, data from 4 individual plants).
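The GO enrichment criterion used in this section (Fisher's P-value < 0.005) can be illustrated with a minimal sketch. It assumes a one-sided Fisher's exact test computed as a hypergeometric tail; the function name and the gene counts are hypothetical and are not taken from the study.

```python
from math import comb


def go_enrichment_pvalue(k: int, n: int, K: int, N: int) -> float:
    """One-sided Fisher's exact P-value for GO term enrichment.

    k: genes in the regulated list that carry the GO term
    n: size of the regulated gene list
    K: genes on the array that carry the GO term
    N: total number of genes on the array
    Returns the probability of observing at least k annotated genes
    in a random list of n genes (hypergeometric upper tail).
    """
    upper = min(n, K)
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical counts, for illustration only (not values from the paper).
p = go_enrichment_pvalue(k=12, n=200, K=150, N=20000)
print(f"P = {p:.3g}; enriched at the 0.005 threshold: {p < 0.005}")
```

Exact integer arithmetic is used so that the sketch needs only the standard library; a statistics package would be the usual choice for array-scale analyses.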

#### TF Genes Differentially Regulated in Pea Leaves Between Stages

To identify putative regulators in pea leaves, genes belonging to the categories "TF activity" (GO:0003700) and "regulation of transcription" (GO:0045449), and significantly regulated between at least two stages, were selected. The annotation and expression patterns of these 625 probes are available in **Table S5**. We subsequently focused on the 78 TF genes displaying more than a 4-fold change in expression. They belonged to various families, the most enriched TF families in this dataset being NAC and ethylene response factor (ERF), followed by myeloblastosis (MYB), nuclear factor Y (NF-Y), and WRKY TFs (**Figure 4A**). These were classified into eight main clusters based on hierarchical clustering of their expression patterns (**Figure 4B**). The regulation of NAC and ERF genes suggested specialized functions at early or late stages and/or in leaves at specific positions. For example, while NAC2/PsCam033601 and NAC100/PsCam038037 were up-regulated in all sample comparisons, NAC1/PsCam050102 expression only increased in upper leaves 14 DAF. The well-known regulation of NAC transcript abundance by miR164 in Arabidopsis (Guo et al., 2005; Kim et al., 2009) prompted us to examine whether it could also apply to pea. By exploiting an internal miRNA database, we observed that NAC1 and NAC100 are indeed predicted targets of members of the miR164 family in pea (**Table S6**).

flowering (A, 14 DAF) and/or between 14 and 27 days after flowering (B, 27 DAF). The number of genes whose expression varied regardless of nitrate nutrition is shown in gray boxes, while the number of genes whose expression varied specifically under nitrate-supply (N+) or nitrate-deprivation (N–) is shown in red and blue boxes, respectively. The circles are proportional to the number of genes in each box. GO terms significantly enriched (Fisher P values < 0.005) in each gene list are sorted according to P values (lowest at the top).

#### TF-Related Co-expression Modules in Pea Leaves

To predict putative regulations by the TFs, a co-expression network based on high Pearson correlations (r < -0.95 or > 0.95) was built from the normalized intensities (Log2) of the 48 samples hybridized on the arrays. Variables with low overall variance were filtered out to reduce the impact of noise (see Materials and Methods). The filtered dataset (11949 probes), provided in **Table S7**, can be imported in Cytoscape and easily converted into an interaction network using the Expression Correlation package (Cline et al., 2007). This Pea REMObilization NETwork (P-REMONET) consisted of 4523 nodes (i.e., genes) and 67447 edges (i.e., co-expression links). A total of 436 components were identified in P-REMONET, the largest containing 3225 nodes/genes (**Figure S3A**). Of the TF genes differentially regulated at least 4-fold, 39 were connected to one, two, or many genes. Several TFs were linked together, leading to 30 different TF-related modules (**Table 1**). The list of genes in each module is available in **Table S8** along with the strength (r), type of interaction (i.e., correlation either positive or negative), and expression patterns. Several modules contain TF genes whose regulation depends on nitrogen availability, such as NAC073#2 and NAC043, which were downregulated 27 DAF specifically under nitrate deficiency (**Table 1** and **Table S8**).
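The network construction described above (variance filtering, then edges wherever the Pearson correlation is below -0.95 or above 0.95, then counting connected components) can be sketched as follows. This is only an illustration with NumPy, not the Cytoscape workflow used in the study; the function names, the variance cut-off, and the toy expression values are assumptions.

```python
import numpy as np


def coexpression_edges(expr, probe_ids, r_cut=0.95, min_var=0.0):
    """Return (probe_a, probe_b, r) for every probe pair with |r| > r_cut.

    expr: probes x samples matrix of log2 normalized intensities.
    min_var: hypothetical variance cut-off; probes at or below it are dropped.
    """
    expr = np.asarray(expr, dtype=float)
    keep = expr.var(axis=1) > min_var
    kept_ids = [p for p, k in zip(probe_ids, keep) if k]
    r = np.corrcoef(expr[keep])  # Pearson correlation between probes (rows)
    rows, cols = np.where(np.triu(np.abs(r) > r_cut, k=1))  # each pair once
    return [(kept_ids[i], kept_ids[j], float(r[i, j])) for i, j in zip(rows, cols)]


def connected_components(edges):
    """Group the nodes of an edge list into components with union-find."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, _ in edges:
        parent[find(a)] = find(b)
    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())


# Toy data (5 hypothetical probes x 5 samples), for illustration only.
toy = [
    [1, 2, 3, 4, 5],
    [2, 4, 6, 8, 10.5],   # positively correlated with g1
    [5, 4, 3, 2, 1],      # negatively correlated with g1
    [1, 3, 2, 5, 1],      # weakly correlated with the others
    [7, 7, 7, 7, 7],      # zero variance, removed by the filter
]
edges = coexpression_edges(toy, ["g1", "g2", "g3", "g4", "g5"])
components = connected_components(edges)
print(len(edges), "edges;", len(components), "component(s)")
```

Keeping the sign of r on each edge preserves the distinction between positive and negative co-expression links that the module tables report.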

To investigate the robustness of P-REMONET for predicting TF-TF or TF-target interactions, a search for the best Arabidopsis homologs was performed for each gene in the TF-related modules. The P-REMONET predictions showed similarities to interactions validated in Arabidopsis. For example, module M22 consisted of two positively correlated genes, PsCam002187 and PsCam001382, respectively, homologous to MYC2 and JAZ5 (jasmonate-zim-domain protein 5), which interact in yeast two-hybrid assays (Chini et al., 2009). Module M4 was enriched for genes related to cell wall biosynthesis and contains two potential regulators, MYB46 (PsCam038865) and MYB83 (PsCam038898), shown in Arabidopsis to bind to the same secondary wall MYB-responsive element consensus sequence and activate the same set of direct targets involved in secondary wall biosynthesis (Zhong and Ye, 2012). Module M7, which contains two NAC073 TFs sharing 70% homologies (NAC073#1 and NAC073#2), was enriched in genes for cellulose biosynthesis, including two cellulose synthase genes. Consistently, NAC073 in Arabidopsis was named SND2 for Secondary wall-associated NAC Domain protein 2 and transactivates the cellulose synthase 8 promoter (Hussey et al., 2011). These observations validated P-REMONET as a useful tool to predict relevant regulations.

The largest TF-related modules in P-REMONET contain genes down-regulated during the time course (TFs belonging to cluster VI in **Table 1**). The highest number of connections was identified for module M1, which contained 197 genes connected to the ethylene response factor/Apetala2 TF (ERF/AP2#1, PsCam039498, **Table 2**), suggesting this TF acts as a hub. Several TFs in these modules could act in concert since they were positively connected: ERF/AP2#1 and a plant AT-rich sequence and zinc-binding protein (PLATZ) in module M1; bZIP61, bHLH#2, and a GATA-type zinc finger TF in module M3; bZIP34, bHLH70, MYB12, and ERF/AP2#3 in module M5 (**Table 1**). An analysis of GO terms for the co-expressed genes predicted biological processes that could be repressed in coordination with the down-regulation of the TFs (**Table S8**).

Other TFs were positively connected with genes upregulated at 14 or 27 DAF, thus identifying some putative transcriptional activators of processes induced during the time course (**Table 1** and **Table S8**). Two of these modules, M11 and M12, are depicted in **Figures S3B,C** since they contain the highest number of positive links with the TFs. The TF in module M11 (PsCam055941) was homologous to the subunit A3 of the nuclear factor Y (NF-YA3, AT1G72830), which in Arabidopsis stimulates the transcription of various genes by recognizing and binding to a CCAAT motif in promoter regions (Leyva-González et al., 2012). In pea, NF-YA3 was up-regulated in lower and upper leaves 14 DAF compared to flowering, then down-regulated 27 DAF (cluster V in **Figure 4B**), highlighting important regulations of this gene during the time course investigated. In contrast, the TF gene in module M12 (PsCam025011), homologous to MYB63 (AT1G79180), was down-regulated 14 DAF, then up-regulated 27 DAF in both vegetative and upper leaves (cluster VII in **Figure 4B**), suggesting a role at the transition stage toward chlorophyll breakdown and senescence. Annotations of the co-expressed genes indicated that MYB63 may activate defense responses. These data were summarized in **Figure 5**, which provides a global view of the TF-related co-expression modules identified in pea leaves, depending on the developmental stages and nitrate availability.

TABLE 1 | TF-related modules in the P-REMONET co-expression network.

The table describes the co-expression modules containing the TFs differentially expressed at least 4-fold in leaves between two developmental stages. The modules were retrieved from P-REMONET (*Figure S3A*). † indicates that gene expression varied significantly in response to nitrate (N) nutrition. The cluster in *Figure 4B* to which the TFs belong is indicated, along with the number of positive (pos) and negative (neg) connections, the number of edges, the proportion of genes regulated by nitrogen availability, the module, and the number of different IDs/genes in the module. The last column lists additional TFs in the modules that were regulated <4-fold. Details about genes in each module are provided in *Table S8*.

TABLE 2 | TF-related co-expression modules conserved between pea and M. truncatula.

The pea sequence IDs are from the Pea Gene Atlas (http://bios.dijon.inra.fr/) and the M. truncatula IDs are from the Symbimics program (https://iant.toulouse.inra.fr/symbimics/). The best M. truncatula homologs (v4.02) were identified using OrthoFinder v1.1.8. The signs indicate whether the correlation with TF gene expression was positive (+) or negative (–). Genes were annotated by homology with sequences in The Arabidopsis Information Resource (TAIR, https://www.arabidopsis.org/). † indicates that gene expression varied significantly in response to nitrate nutrition.

#### Comparing Nitrogen Remobilization and TF Modules Between Pea and *M. truncatula*

A comparative study in M. truncatula was performed by coupling nitrogen remobilization analysis at the whole plant level with a transcriptome analysis of leaf samples collected under the same conditions as the pea samples. The dynamics of nitrogen remobilization were similar between pea and M. truncatula from flowering to 14 DAF (**Figure S4**). Some differences occurred between 14 and 27 DAF: unlike pea, nitrogen was mainly remobilized from lower leaves of M. truncatula during this period. For transcriptomics comparisons, we focused on the 14980 orthologous sequences with a unique gene per species. A Pearson's distance correlation matrix was generated to compare transcriptomics data (expressed in log<sub>2</sub> ratio) between pea and M. truncatula (**Figure 6A**). The correlations were positive between species (0.13 ≤ r ≤ 0.45) for all pairwise comparisons, indicating transcriptional regulations at least in part conserved between the species.
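A between-species correlation matrix of the kind described above can be sketched as follows, assuming two matrices of log2 ratios whose rows are one-to-one orthologs in the same order and whose columns are sample comparisons. The function name and the toy values are hypothetical.

```python
import numpy as np


def cross_species_correlation(pea_ratios, mt_ratios):
    """Pearson r between every pea comparison and every M. truncatula comparison.

    Both inputs are orthologs x comparisons matrices of log2 ratios, with
    row i of each matrix referring to the same orthologous gene pair.
    Returns a (pea comparisons) x (M. truncatula comparisons) matrix.
    """
    pea = np.asarray(pea_ratios, dtype=float)
    mt = np.asarray(mt_ratios, dtype=float)
    n_pea = pea.shape[1]
    r = np.corrcoef(pea.T, mt.T)  # comparisons of both species stacked as rows
    return r[:n_pea, n_pea:]      # keep only the between-species block


# Toy log2 ratios for 4 hypothetical orthologs and 2 comparisons per species.
pea_toy = [[1, 0], [2, 1], [3, -1], [4, 0.5]]
mt_toy = [[2, 0], [4, -1], [6, 1], [8, -0.5]]
r_block = cross_species_correlation(pea_toy, mt_toy)
print(np.round(r_block, 2))
```

Restricting the result to the off-diagonal block avoids mixing within-species correlations with the between-species values that are of interest here.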

To identify shared regulators between pea and M. truncatula, we focused on the 39 TF genes highly regulated in pea leaves and for which putative targets were identified in P-REMONET. An M. truncatula ortholog was found for 31 of these TFs (**Table S9**). After building a gene co-expression network (M-REMONET) from the normalized intensities (Log2) of the 48 leaf samples hybridized on the M. truncatula arrays (21164 probes, 8778 nodes, 108210 edges), a search for co-expression modules containing these TFs was performed. Four TFs (ERF/AP2#1, MYB83, bHLH70, NAC073#1) were closely connected to genes orthologous between the species (**Figure 5**). These putative conserved targets were listed in **Table 2** along with the type of correlation with the TFs (positive or negative). Notably, the connected genes in modules M1, M4, and M7 (**Table 2**) were related to cell wall metabolism/structure, suggesting important transcriptional regulation of cell wall structure in leaves of both species. In the M7 module depicted in **Figure 6B**, of the eight genes similarly connected to NAC073#1 in both species, seven were related to cell wall metabolism (**Table 2**). Almost all genes in this module were down-regulated in lower and upper leaves 14 DAF (**Table S8**), suggesting major modifications of cell wall structure in these leaves at early reproductive stages.
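One way to picture the search for conserved targets is as an intersection of the genes connected to a TF in each network, mapped through the ortholog table. The sketch below assumes that a target counts as conserved only when the sign of the correlation agrees in both species; that criterion, the function name, and all gene identifiers are hypothetical.

```python
def conserved_targets(pea_neighbors, mt_neighbors, orthologs):
    """Genes connected to a TF in both networks with the same correlation sign.

    pea_neighbors: {pea_gene: "+" or "-"} for genes linked to the pea TF
    mt_neighbors:  {mt_gene: "+" or "-"} for genes linked to its ortholog
    orthologs:     {pea_gene: mt_gene} one-to-one ortholog map
    """
    shared = {}
    for pea_gene, sign in pea_neighbors.items():
        mt_gene = orthologs.get(pea_gene)
        if mt_gene is not None and mt_neighbors.get(mt_gene) == sign:
            shared[pea_gene] = (mt_gene, sign)
    return shared


# Hypothetical identifiers, for illustration only.
pea_links = {"pea_A": "+", "pea_B": "+", "pea_C": "-", "pea_D": "+"}
mt_links = {"mt_A": "+", "mt_B": "-", "mt_C": "-"}
ortholog_map = {"pea_A": "mt_A", "pea_B": "mt_B", "pea_C": "mt_C"}
print(conserved_targets(pea_links, mt_links, ortholog_map))
```

In this toy case pea_B is excluded because the sign differs between species, and pea_D because it has no ortholog in the map.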

# DISCUSSION

To provide a first overview of the transcriptional regulations occurring in pea leaves during seed development, we focused on stages of the reproductive phase preceding the sharp decrease in chlorophyll breakdown, up to a transition stage toward senescence (27 DAF, **Figure 1**). A long-term <sup>15</sup>N-nitrate-labeling experiment indicated that these stages were associated with dynamic nitrogen recycling and remobilization between tissues, leaves from vegetative and reproductive nodes contributing, respectively, to 29 and 44% of the total amount of nitrogen remobilized during this period (**Figure 1**). The subsequent stages were associated with nitrogen recycling from all tissues, including pod walls, and at maturity, 54% of nitrogen accumulated in pea seeds was derived from remobilization processes (**Figure 1**). Our data demonstrated that leaves are the main source of remobilized nitrogen, followed by pod walls, roots, and stems, which is consistent with data previously obtained in a pulse-chase <sup>15</sup>N-labeling experiment (Schiltz et al., 2005). A transcriptome analysis of leaves from vegetative and reproductive nodes from flowering to 27 DAF showed that most of the well-known SAGs, such as the cysteine protease gene SAG12, were not significantly up-regulated in our leaf samples. By contrast, genes that might help promote nutrient recycling while maintaining leaves in a healthy metabolic state, i.e., with limited protein degradation, were identified. Complemented by a gene co-expression approach targeted on the most regulated TFs, this study provides a repertoire of regulatory predictions, some of which were conserved in the forage legume species M. truncatula (**Table 2**), that can broadly serve as a backdrop for studying the role of individual genes in legumes.

# Molecular Features of Leaves From the Reproductive Nodes During Seed Embryogenesis

#### Sulfur Transport and Metabolism

From the beginning of flowering to 1st node seed filling (14 DAF), seeds progress through embryogenesis on the reproductive nodes. This period was associated with deep transcriptional changes in leaves of the reproductive nodes regardless of nitrate availability (**Figure 2A**), as exemplified by transporter gene expression (**Figure 3A** and **Table S4**). SULTR genes were among the most up-regulated transporter genes in upper pea leaves 14 DAF as compared to flowering (**Figure 3A**). The most up-regulated was homologous to SULTR2;1 (PsCam025051, **Figure 3C**), which has been shown in Arabidopsis to be expressed in vascular tissues and proposed to regulate internal translocation and distribution of sulfate (Takahashi et al., 2000). The overrepresentation of genes related to methionine metabolism in the set of genes up-regulated 14 DAF in leaves of the reproductive nodes suggests sulfate can be used for methionine metabolism in these leaves (**Figure 2A**). Sulfate transport in upper pea leaves can also contribute to avoid precocious senescence owing to the role of sulfate-derived compounds in preventing autophagy and senescence in Arabidopsis and tomato (Álvarez et al., 2012; Yarmolinsky et al., 2014).

The gene co-expression approach enabled us to deduce some possible regulators of SULTR2;1/PsCam025051. The transporter was positively connected to five genes in P-REMONET, one of which encodes a Ser/Thr kinase (PsCam034543, **Figure 3D**). In the green alga Chlamydomonas reinhardtii, a Snf1-like Ser/Thr kinase positively regulates sulfate transporters (Davies et al., 1999), and in Arabidopsis all the substitutions at the phosphorylation site Thr-587 of a SULTR led to a complete loss of sulfate transport (Rouached et al., 2005). Hence, the Ser/Thr kinase may be a promising candidate for investigating the signal transduction system regulating sulfate homeostasis in upper leaves. In addition, SULTR2;1/PsCam025051 was negatively connected to two TF genes homologous to bHLH70 and MYB12 (module M5 and **Figure 3D**). Many MYB/bHLH complexes have been described in plants (Pireyre and Burow, 2015) and MYB factors have been shown to regulate genes related to sulfate assimilation (Koprivova and Kopriva, 2014), reinforcing the interest of further studies on the interplay of these genes.

#### Other TF Candidates for Maintaining Leaf Metabolism or Preventing Senescence

The above-mentioned bHLH70 gene was among the most down-regulated TFs 14 DAF (cluster VI in **Figure 4B**). It was connected to genes with different functions in module M5, suggesting pleiotropic roles. In particular, bHLH70 was negatively connected to WEB1 (weak chloroplast movement under blue light 1) in both P- and M-REMONETs (**Table 2**), pointing to bHLH70 as a putative repressor of WEB1 expression in leaves of both forage and grain legume species. WEB proteins maintain the velocity of chloroplast movements via chloroplast-actin filaments in response to ambient light conditions (Kodama et al., 2010). By controlling chloroplast redistribution, they prevent the dismantling of the photosynthetic apparatus by excess light. The increased WEB1 expression in upper leaves at 14 DAF may be part of the mechanisms by which photosynthesis is maintained before senescence initiation. Our data suggest bHLH70 to be a good candidate for investigating the regulation of these mechanisms. Another TF candidate up-regulated in upper pea leaves at 14 DAF is NAC1/PsCam050102 (**Figure 4B**). Overexpression of a NAC1-type TF in wheat delayed leaf senescence, leading to a stay-green phenotype (Zhao et al., 2015). Therefore, the up-regulation of NAC1 in upper leaves could contribute to prevent senescence, even when nitrate absorption by roots becomes limiting (**Figure 1**). The mRNA abundance of NACs, including NAC1, is controlled by miR164 in Arabidopsis (Guo et al., 2005; Kim et al., 2009). It was therefore interesting to observe that all four predicted targets of miR164 in pea belong to the NAC family (**Table S6**), of which one corresponds to NAC1. This reinforces the possible regulation of NAC1 transcript abundance by miR164 in pea leaves.

### The Early Reproductive Phase Is Accompanied by a Reprogramming of Cell Wall-Related Genes in Leaves of Both Vegetative and Reproductive Nodes

Genes of lignin catabolism and cell wall organization were enriched in the list of genes down-regulated 14 DAF in lower and upper pea leaves (**Figure 2A**), reflecting a shift in cell wall structure at early reproductive stages. Interestingly, three TF-related modules conserved between pea and M. truncatula contained genes of cell wall metabolism/organization (**Table 2**). These conserved modules, described in **Table 2**, were identified for:

(i) ERF/AP2#1, which shares homologies with Arabidopsis AP2 TFs that have roles in plant protective layers such as the cuticle (Aharoni et al., 2004). In pea and M. truncatula, ERF/AP2#1 was positively linked to five genes related to cell wall organization and to a PLATZ TF responsible for A/T-rich sequence-mediated transcriptional repression (Nagano et al., 2001). The identification of PLATZ and ERF/AP2 in the cell wall network built from a co-expression analysis in rice (Hirano et al., 2013) reinforces their possible coordinated function in controlling cell wall structure.

(ii) MYB83, which was similarly connected to the PLATZ TF and co-expressed with a gene encoding an oxidative enzyme (laccase, LAC17, **Table 2**) proposed to determine the pattern of cell wall lignification (Schuetz et al., 2014). The role of MYB83 in secondary wall biosynthesis has been demonstrated in Arabidopsis, where its overexpression induced the expression of secondary wall biosynthetic genes and resulted in an ectopic deposition of secondary wall components. In P-REMONET, MYB83 was linked to a third TF, MYB46 (module M4 in **Table S8**), shown in Arabidopsis to act redundantly with MYB83 in regulating secondary cell wall biosynthesis (McCarthy et al., 2009). The authors have shown that simultaneous RNAi inhibition of MYB83 and MYB46 reduced secondary wall thickness in fibers and vessels. Other authors demonstrated that MYB46 was sufficient to induce the entire secondary wall biosynthetic program (Zhong et al., 2007).

(iii) NAC073#1, which was positively linked to eight genes orthologous between pea and M. truncatula, of which five may have roles in cell wall formation/organization (2 cellulose synthases, a trichome birefringence-like protein, a glycosyl hydrolase and a Fasciclin-like Arabinogalactan protein, **Figure 6B**). In pea, NAC073#1 was positively connected to two additional NAC TFs: NAC073#2 and NAC043 (also named NST1 for Secondary Wall Thickening Promoting Factor1) (**Figure 6B**). Evidence is accumulating to suggest that a subset of closely related NACs act as master transcriptional switches governing secondary wall biosynthesis and fiber development (Zhong et al., 2008). In Arabidopsis, NAC073 and NAC043/NST1 contribute to the formation of secondary cell wall, and their repression resulted in a remarkable reduction in the secondary wall thickening (Zhong et al., 2008).

Taken altogether, the data indicate that the transcriptional regulation of cell wall organization and metabolism in leaves of legumes occurs at early reproductive stages and may involve seven transcription factors pinpointed here for the first time in pea: ERF/AP2#1, a PLATZ TF, MYB83, MYB46, NAC073#1, NAC073#2, and NAC043. The expression of NAC073#2 and NAC043 decreased 27 DAF under nitrate-deficiency only (**Figure 5B**), indicating that the intricate control of cell wall metabolism in pea leaves may rely on nitrate-dependent regulations. Although data accumulate in the literature on the role of NAC TFs in regulating cell wall metabolism (Zhong et al., 2007, 2008; McCarthy et al., 2009; Hirano et al., 2013; Schuetz et al., 2014), the full list of their targets remains to be established. The present study highlighted some putative targets for further investigations (**Figure 6B**).

#### Transcriptional Reprogramming of Leaves at a Transition Stage Toward Senescence

Transcriptome changes in pea leaves 27 DAF, which marks the switch toward senescence-associated yellowing (**Figure S1**), contributed to our understanding of molecular events underlying this transition.

#### Autophagy-Related Processes

GO enrichment analysis of genes up-regulated in leaves 27 DAF revealed an over-representation of genes involved in defense responses, such as disease resistance proteins (R proteins, **Figure 2B**). Accordingly, several defense-related genes known to be induced by pathogens were found to be expressed during Arabidopsis leaf senescence in a pathogen-independent manner (Quirino et al., 1999). Seven R protein genes up-regulated 27 DAF were in the MYB63-related module (M12 in **Figure S3C**). All contain an NB-ARC domain (Nucleotide-Binding adaptor shared by Apoptotic protease-activating factor-1, R proteins, and Caenorhabditis elegans death-4 protein) essential for protein activity (van Ooijen et al., 2008; **Table S10**). Interestingly, in rice, an R protein with NB-ARC domain has been named RLS1 (Rapid Leaf Senescence 1) because the disruption of the gene accelerated leaf senescence due to a rapid loss of chlorophyll (Jiao et al., 2012). The authors showed that RLS1 is involved in the autophagy-like programmed cell death and partial degradation of chloroplast. The R proteins in module M12 could play a similar role in the autophagy-mediated programmed cell death to promote nutrient remobilization while avoiding rapid senescence. By activating the autophagy process, reactive oxygen species (ROS) are key players in the regulation of programmed cell death (Pérez-Pérez et al., 2012). Therefore, the increased expression 27 DAF of RRTF1 (**Figure 5A**, module M23 in **Table S8**), encoding the Redox-Responsive TF1 that controls positively the accumulation of ROS in Arabidopsis shoots and roots (Matsuo et al., 2015), might contribute to orchestrate autophagy-mediated programmed cell death. Because autophagy allows the remobilization of nutrients while preserving cell longevity, identifying autophagy regulators is of particular interest. 
In module M12, all R proteins were positively connected to MYB63, which plays a dual function in regulating secondary cell wall formation and genes involved in disease resistance in Arabidopsis (Zhou et al., 2009). MYB63 and three R proteins were also positively connected to a Zinc finger-type TF whose closest Arabidopsis homolog (AT2G40140) was ROS-responsive (Gadjev et al., 2006). All these features indicate these two TFs may regulate autophagy, possibly through ROS perception.

#### Transporters

In the quest to identify transporters contributing to the recycling of nutrients at the transition toward senescence, the list of transporter genes differentially regulated during the developmental period was examined (**Table S4**). The most upregulated nitrogen transporters were high-affinity transporters of basic amino acids (e.g., CAT5) and nitrate (NRT2.5). In Arabidopsis, NRT2.5 plays a role in nitrate loading into the phloem during remobilization processes under nitrogen starvation (Lezhneva et al., 2014). The NRT2.5 homolog in pea was up-regulated 27 DAF in lower and upper leaves regardless of nitrate supply (**Table S4**), suggesting a contribution to nitrogen recycling not restricted to low nitrate environments in pea. Although a role for CAT5 in leaf nitrogen remobilization has not yet been demonstrated, one CAT5 gene (At2g34960) was up-regulated in senescing Arabidopsis leaves (van der Graaff et al., 2006). The up-regulation of CAT5 in both leaf types 14 DAF and specifically in response to nitrate-deficiency 27 DAF (**Table S4**) suggests this gene could contribute to the recycling of amino acids in pea leaves, notably under nitrate-deficiency at later stages. However, nitrogen/amino acid transporters were not among the most regulated genes at 27 DAF, in contrast to genes encoding transporters of nucleotides, sugars, lipids, phosphate, potassium, nickel, and copper, which were up-regulated at least 4-fold at this stage (**Table S4**). Although these genes have not been reported to play a role in preventing rapid senescence, potassium homeostasis is known to play an essential role in stress-induced senescence (Anschütz et al., 2014), and a recent study highlighted the need to maintain potassium levels in leaves during nitrate starvation to prevent senescence (Meng et al., 2016). The potassium transporter identified (PsCam042603) was homologous to the high affinity K+ transporter gene HAK5. 
In P-REMONET, HAK5 was connected to a TF gene homologous to WRKY75 (**Figure 6A**), which has been shown to be induced during potassium starvation in Arabidopsis (Devaiah et al., 2007), highlighting the interest to investigate the relationship between this TF and the regulation of potassium transport in leaves.

#### Translation-Associated Processes

By influencing ribosome structure and function, ribosomal RNA (rRNA) processing and modifications play key roles in protein synthesis and thereby control metabolic activities (Bohne, 2014). Interestingly, genes of rRNA processing and modifications, and of translation, were among the most represented in the list of genes down-regulated 27 DAF, compared to 14 DAF, in lower leaves, especially under nitrate-deficient conditions (**Figure 2B**). This suggests reduced translational activities in these leaves at the transition toward senescence. Genes related to these functional categories were among the most over-represented in the set of genes up-regulated 14 DAF, compared to flowering, in lower and upper pea leaves (**Figure 2A**), emphasizing the importance of these processes at early reproductive stages. The relationship between these genes and the progression toward senescence in leaves has not yet been established. However, perturbations of rRNA biogenesis are closely related to cell senescence in human cells (Yuan et al., 2017). Importantly, six of these genes were positively connected to the NF-YA3 TF (**Figure S3B** and **Table S8**), making it a good candidate for controlling metabolic activities in leaves. In Arabidopsis, overexpression of NF-YA members resulted in dwarf late-senescing plants (Leyva-González et al., 2012). Furthermore, overexpression of the soybean gene NF-YA3 in Arabidopsis enhanced drought resistance (Ni et al., 2013), indicating this nuclear factor subunit may be associated with protective roles in plants, but the targets potentially coregulated by the NF-Y complex are yet to be identified. Our results pinpoint genes in module M11 (**Table S8**) as attractive candidates for a deeper study of NF-YA3 function in leaves.

Overall, our results provide new information for understanding the complexity of the transcriptional regulations governing leaf metabolism during seed development in pea up to the transition toward senescence. These findings could serve future in-depth investigations of specific genes or TF-related modules.

### DATA AVAILABILITY

The datasets generated for this study can be found in the NCBI Gene Expression Omnibus database, under accessions GSE109789 for pea and GSE109521 for M. truncatula.

#### AUTHOR CONTRIBUTIONS

JB conceived the project. JB, GA, and SB conceived the pea 40k arrays. KG, AK, CS, and AL designed the nitrogen remobilization experiment and the overall research plan with JB. AB contributed to all experiments with GA and MS (molecular aspects), AK and CL (phenotyping and greenhouse experiments), SP and SB (microarray analyses). KG developed the gene networks with contribution of CH and built the nitrogen remobilization diagrams with contributions of JT and J-CA. MT provided the miR164 data and performed the orthology search between species. KG analyzed all the data, with contribution of AB for phenotypic characteristics, and wrote the manuscript.

### FUNDING

This study was supported by the French National Research Agency GENOPEA project (ANR-09-GENM-026).

#### ACKNOWLEDGMENTS

We thank colleagues in UMR1347 Agroécologie (France) for helping with growth of the plants (Eric Vieren and members of the 4PMI, Plant Phenotyping Platform for Plant and Microorganisms Interactions, platform), tissue collection (Françoise Jacquin, MS), nitrogen measurements (Anne-Lise Santoni), bio-informatics aspects (Vincent Savois, Jonathan Kreplak), and very helpful comments and corrections on the manuscript (Vanessa Vernoud, Richard Thompson). We are grateful to Véronique Brunaud (Institut des Sciences des Plantes—Paris-Saclay) for bio-informatics support and Julia Buitink (IRHS Angers) for helpful advice regarding network construction.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01014/full#supplementary-material


directly targeting senescence-associated genes in rice. Proc. Natl. Acad. Sci. U.S.A. 111, 10013–10018. doi: 10.1073/pnas.1321568111


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Gallardo, Besson, Klein, Le Signor, Aubert, Henriet, Térézol, Pateyron, Sanchez, Trouverie, Avice, Larmure, Salon, Balzergue and Burstin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Harnessing Implementation Science and Self-Determination Theory in Participatory Research to Advance Global Legume Productivity

#### Simon Mark Payne<sup>1</sup>\*, Phillipa Nicholas-Davies<sup>2</sup> and Robert Home<sup>3</sup>

<sup>1</sup> Department of Psychology, Faculty of Earth and Life Sciences, Aberystwyth University, Aberystwyth, United Kingdom, <sup>2</sup> Faculty of Earth and Life Sciences, Institute of Biological, Environmental, and Rural Sciences, Aberystwyth University, Aberystwyth, United Kingdom, <sup>3</sup> Department of Socio-Economic Science, Research Institute of Organic Agriculture, Frick, Switzerland

#### Edited by:

Jose C. Jimenez-Lopez, Consejo Superior de Investigaciones Científicas (CSIC) Granada, Spain

#### Reviewed by:

Karl Kunert, University of Pretoria, South Africa; Ilias Travlos, Agricultural University of Athens, Greece; Chrysanthi Charatsari, Aristotle University of Thessaloniki, Greece; Peter Gresshoff, University of Queensland, Australia

> \*Correspondence: Simon Mark Payne smp14@aber.ac.uk

#### Specialty section:

This article was submitted to Agroecology and Ecosystem Services, a section of the journal Frontiers in Sustainable Food Systems

> Received: 21 April 2019; Accepted: 19 July 2019; Published: 16 August 2019

#### Citation:

Payne SM, Nicholas-Davies P and Home R (2019) Harnessing Implementation Science and Self-Determination Theory in Participatory Research to Advance Global Legume Productivity. Front. Sustain. Food Syst. 3:62. doi: 10.3389/fsufs.2019.00062

There are many challenges associated with increasing global legume production, and overcoming them will require stakeholders to modify certain perceptions and behaviors. Unfortunately, stakeholder motivation has been under-appreciated in global legumes research, despite its central role as a predictor of research uptake. Observational studies exist, but motivation theory is often wielded with a lack of conviction, and intervention studies have not yet emerged. Thus, participatory intervention research that embeds insight from contemporary understandings of motivated behavior is a fruitful line of investigation. Participatory/transdisciplinary, reflective learning methodologies have demonstrated an ability to create new, and maximize existing, pathways to impact in legume productivity. Conversely, successes from the burgeoning field of implementation science have yet to be translated to agriculture research; frameworks exist that simplify the researcher's task of planning, applying, reporting, and replicating their transdisciplinary research. This review describes a novel methodological approach which promotes cross-fertilization of ideas between scientific, extension, farmer, and industry co-actors, engendering a dynamic learning culture; partners co-plan, co-execute, and co-disseminate their work in an equitable arrangement. This ensures that outputs are targeted to the needs of end-users and that both scientific and practical (local) knowledge is taken into account.
Despite a recent proliferation of useful articles on knowledge co-creation in sustainable agriculture, this review is the first to rationalize to researchers the need to design participatory research which is informed by social psychology (Self-Determination Theory) and adheres to procedures championed in implementation science (e.g., feasibility and fidelity studies, systematic reporting). Theoretical rigor is added to the participatory research agenda, but this review also offers some practical suggestions for application in legumes research. While the focus is on legumes, this guidance is equally applicable to other crops and agricultural systems.

Keywords: motivation, participatory, implementation, uptake, barriers, solutions, self-determination, legumes

# INTRODUCTION: GLOBAL CHALLENGES

Among the global challenges facing humanity, providing food security for a growing population, addressing climate change through reducing production and release of greenhouse gases, and securing sustainable and renewable sources of energy feature strongly (see UN Sustainable Development Goals two, thirteen, and seven, respectively; sustainabledevelopment.un.org/). Legumes can play a central role in addressing these challenges through: the provision of nutrient-rich diets for humans and livestock; fixing atmospheric nitrogen via rhizobial symbiosis, consequently contributing to improved soil quality, reduced need for synthetic nitrogen fertilizers and the fossil fuels associated with their production; providing feedstock for biofuel production and other industrial processes; contributing soil nutrition, biodiversity and biocontrol benefits to the sustainable intensification of farming in developing countries and in sustainable mixed farming systems in currently intensively farmed regions (e.g., Europe and North America). Forage legumes in particular, as part of sustainable grassland-based animal production, can contribute to addressing these challenges by increasing forage yield, mitigating and facilitating adaptation to climate change (as elevated atmospheric CO2, higher temperatures and drought-stress periods increase), increasing the nutritive value of herbage and raising the efficiency of conversion of herbage to animal protein (Lüscher et al., 2014; Watson et al., 2017).

### The Potential Contribution of Legumes to Address Pressing Global Challenges

Despite so many potential benefits associated with the increased use of legumes, numerous significant challenges will have to be overcome if this strategy is to be realized. Of the total global plant protein produced, less than half is used for human consumption (Forum for the Future, no date) and this includes high quality soya protein which could be used for human nutrition. The shift toward industrialized animal farming systems creates significant demand for grain and other plant proteins as feed for animals, as well as contributing to production challenges of waste, pollution, deforestation, greenhouse gas (GHG) emissions, and soil degradation. The recent rise in prices of grain legumes due to this livestock feed demand has also led to an increase in demand for legumes worldwide through both income and population growth (Nedumaran et al., 2015). In addition to this increased nutritional demand, there is also a significant demand for soya in the bio-diesel industry. Nedumaran et al. (2015) predict that based on these changing demands, there will, in the near future, be substantive shifts in the utilization patterns and price structure of grain legumes. Interested readers are referred to existing reviews which compare global legumes production statistics, document historical trends, and discuss the hypothetical implications (cf. Asner et al., 2004; Lüscher et al., 2014; Nedumaran et al., 2015; Phelan et al., 2015; Stagnari et al., 2017; Watson et al., 2017).

There is general agreement on the potential of legumes to provide a healthy, affordable, and sustainable contribution as a food source for humans (cf. Lüscher et al., 2014; Polak et al., 2015; Ivarsson and Wall, 2017; Joshi and Rao, 2017; Mottet et al., 2017; Röös et al., 2018). However, challenges associated with realizing this potential include variable and low yields, poor seed availability, lack of market, low awareness of indigenous legumes, and the lack of convenient food applications (Philips, 1993; Mtambanengwe and Mapfumo, 2009; Mhango et al., 2012), shifting consumer preferences away from meat-heavy consumption, educating consumers about how to cook legumes and integrate them into their staple diets (Bezner Kerr et al., 2010; Polak et al., 2015), empowering women as agents of improved nutrition outcomes, taking local contexts into account and providing small producers with support to capitalize on changing market demand for delivering agricultural and nutritional improvements (Hawkes and Ruel, 2008).

With regard to legumes as biofuels, research has intensified into the use of second generation biomass feedstocks (Timilsina et al., 2010; Carriquiry et al., 2011), for example, crop residues, wood residues, and dedicated energy crops such as perennial legumes, cultivated primarily for the purpose of biofuel production (Ben-Iwo et al., 2016). Perennial legumes, including alfalfa, clovers, and various tree (e.g., Pongamia pinnata) and shrub legumes, are not only non-competitive with human nutrition, they also have the benefit of being able to grow in marginal soil and climatic conditions, fix nitrogen through rhizobial symbiosis, and provide a source of protein for grazing livestock (Jensen et al., 2012). If numerous barriers to their development can be overcome (e.g., long reproductive cycle and genetic variability of cross pollinated tree legumes), it is environmentally, economically, socially, and politically beneficial (the "Quadruple Bottom-line") to grow this group of plants in nutritionally depleted and stressed soils and use them for purposes such as biomaterial and biodiesel/biofuel feed stock production (Biswas et al., 2011).

Oft-cited benefits of the use of legumes in cropping systems center on sustainability and climate resilience outcomes. For example, legume-based systems can convey advantages to soil fertility, water quality, the requirement for N fertilization in subsequent crops, weed regulation, pest and disease mitigation, reduced GHG emissions, increased light interception, and more (Barbery, 2002; Peoples et al., 2009; Jensen et al., 2012; Ngwira et al., 2012; Seymour et al., 2012; Voisin et al., 2014; Preissel et al., 2015; Stagnari et al., 2017; Watson et al., 2017; Kinama and Habineza, 2018). Unfortunately, the magnitude of the impact varies across legume species, soil properties and climatic conditions (Stagnari et al., 2017). Moreover, reports from varied farming contexts indicate that significant concerns and barriers exist around technical knowledge, management skills, poor seed availability, perceived (and often realized) low and variable yields, inadequate policy support, lack of markets, lack of proper quantification (and recognition) of long-term benefits of legumes within cropping systems, lack of persistence and stress tolerance (temperature, N, phosphorus and water), nodulation efficiency, and the supply of seed of adapted varieties with appropriate inoculant (Mtambanengwe and Mapfumo, 2009; Ncube et al., 2009; Peoples et al., 2009; Bues et al., 2013; Preissel et al., 2015; Stagnari et al., 2017). In an attempt to address the challenge associated with the availability (production and dissemination) of good-quality seed, initiatives such as the Alliance for a Green Revolution in Africa (AGRA) established the Programme for Africa's Seed System (PASS), though the impact of this system at grassroots level is yet to be evaluated.

The overall picture is one of a research area in need of a coherent strategy to expedite uptake and impact; the potential advantages (or hypotheses) of increased global legume productivity touched upon above (and reviewed extensively elsewhere) must be tested using methodologies that give research its best chance of generating knowledge that is quickly translatable into policy and sustainable practices. Good participatory legumes research certainly exists (Payne et al., 2017) but still, questions persist about uptake, impact, and sustainability. As such, the purpose of the present review is to rationalize and provide practical suggestions for a novel, more theoretically rigorous approach to participatory legumes research: harnessing insight from implementation science and Self-Determination Theory will help researchers to establish the organizational and collaborative conditions in which each knowledge co-creation project fulfills its ambition.

# RESEARCHING SOLUTIONS TO THE SIGNIFICANT CHALLENGES ASSOCIATED WITH INCREASING LEGUME PRODUCTION

A gradual philosophical shift is being witnessed in the agriculture literature: a growing body of qualitative, participatory, and psychologically-informed research is being published, and the number of journals that support this philosophy is increasing (e.g., Journal of Agricultural Development & Policy, Journal of Rural Studies, International Journal of Agricultural Extension & Rural Development). However, traditional approaches to research still predominate, characterized by hegemonic power hierarchies and beneficiaries-as-passive-recipients of the researcher's scientific expertise. Furthermore, poor uptake of legumes research can be interpreted as a residual effect of previous research and dissemination that was not grounded in knowledge co-creation approaches. The limited cultivation of legumes raises the question of how farmers can be engaged and motivated to commit resources to overcoming these challenges. An obvious approach, given political will, would be to provide subsidies for legume production. However, subsidies have themselves proved to be a problematic market intervention and often produce unintended consequences. Cowe (2012) points out that "subsidies given to farmers as part of the CAP are blamed for encouraging intensive farming that degrades land, water and habitats. Similarly, rich-world subsidies, like the CAP, make life even tougher for poor farmers in developing countries" (NB, The European Union's Common Agricultural Policy, or CAP, is "a partnership between agriculture and society, and between Europe and its farmers;" see https://ec.europa.eu/info/food-farming-fisheries/key-policies/common-agricultural-policy/cap-glance\_en). The challenge is to find interventions that work for, rather than against, the environment and international development. In other words, how can researchers design interventions that can be adapted and scaled up in ways that are accessible and equitable?
What insights from social psychology can contribute to addressing the challenge and motivating farmer engagement? To find answers to these questions, we look to Self-Determination Theory, implementation science, the participatory research paradigm, and a novel integration of all three.

### Harness Insight From Self-Determination Theory Regarding Human Motivation and Behavior

Self-Determination Theory (SDT) is a broad theoretical framework that explains human motivation and the functions of personality (Deci and Ryan, 1985, 2000; Ryan and Deci, 2000). Research has applied SDT in a range of domains (e.g., organizations, religion, education, health, medicine, sport, and physical activity) and a vast literature supports its explanatory and predictive utility. Self-determination refers to the degree to which individuals feel that their behavior is controlled vs. autonomous, and SDT posits contrasting motivational consequences associated with this perception (i.e., positive vs. negative). SDT comprises six mini-theories, of which Cognitive Evaluation Theory, Organismic Integration Theory, and Basic Psychological Needs Theory can be especially helpful in understanding the psychology, behavior, and by implication performance, of stakeholders in legume production.

Cognitive Evaluation Theory is concerned with intrinsic motivation, which is "the innate, natural propensity to engage one's interests and exercise one's capacities, and in so doing, to seek and conquer optimal challenges" (Deci and Ryan, 1985, p. 43). Human development, as characterized by learning, adaptation, and a growth in competencies, is greatly facilitated by intrinsic motivation (Deci and Ryan, 1985). Interest and intrinsic motivation can be supported or thwarted by one's social context, and in particular, the presence of factors such as environmental controls and rewards. Theoretically, farmers who feel that they operate within a controlling system, where rewards are dependent on behaviors that they do not truly believe in, or where there are constraints on their opportunities to exercise their capacities (to learn, adapt, and grow) are unlikely to experience intrinsic motivation, and might instead suffer from disinterest and stagnation (cf. Deci and Moller, 2005). For example, market forces that encourage specialization in soya beans (Stagnari et al., 2017) might stifle farmers' desire to master a mixed legume farming system. Similarly, researchers should employ participatory methods to produce solutions that are co-created with farmers and other stakeholders, thereby supporting their intrinsic motivation and maximizing eventual uptake of the research.

On the other hand, Organismic Integration Theory focuses on extrinsic motivation, which is reflected in behavior that serves an instrumental purpose rather than being done "for its own sake" (Ryan and Deci, 2002). According to Organismic Integration Theory, the instrumental purposes underpinning behavior are less or more extrinsically motivating depending on how internalized or integrated they are to the individual's sense of self. The less internalized forms of instrumentality are characterized by the salience of extrinsic rewards or punishments to the individual, their perception of the need for compliance in the situation, and/or a focus on approval from self or others (Ryan and Deci, 2000). For example, in Thailand, public standards of good agricultural practices have been established, but most farmers "do not understand the underlying rationale for these guidelines and therefore do not feel intrinsically motivated to follow them, but rather perceive the guidelines as requirements that need to be fulfilled explicitly and exclusively for the audit" (Schreinemachers et al., 2012, p. 525). In stark contrast, more internalized forms of instrumentality are characterized by a conscious valuing of the activity, self-endorsement of goals associated with the activity, and/or a sense of congruence between the activity and one's sense of self (Ryan and Deci, 2000). Effective farming relies on a set of conditions and behaviors that are instrumental (extrinsically motivated) to the goal of keeping the farm running (e.g., early mornings, long hours, low pay, grueling manual labor, often isolation), but unlikely to achieve the status of an intrinsically motivated behavior (e.g., done for enjoyment). 
On the other hand, farmers may experience "integrated regulation"—the least extrinsically motivating force—because their work responsibilities are integral to their core identity and help fulfill their basic psychological needs (BPNs) (see below; NB: "Farming is much more than an occupation: it is the reproduction of the family; it is work; it is their public role; it is their social status; and, it is their self-image. These multiple layers of meaning combine in such a way that the work of farming becomes an end in itself and survival its own logic," Pile, 1990, pp. 160–161).

In some cases, extrinsically motivated behavior can be difficult to sustain in the long term because the effects of the external inducements tend to "wear off" (cf. Deci and Moller, 2005; NB. perhaps the extrinsically regulating force is a particular policy, and the policy changes). CAP payments in the EU are an example of an external incentive to keep one's farm running, but that and similar motives do not necessarily filter down to motivated behavior on a day-to-day basis; theoretically, extrinsic motivation can contribute to more frequent lapses in any behavior, to the detriment of the desired goal (see Vande Velde et al., 2018, for an excellent discussion of the perils of regulation and economic rationality as extrinsic regulators of farmers' use of anthelmintic treatment strategies). Implementing a crop rotation suggestion that is based on research evidence is a specific example of behavior change: although the behavior serves instrumental purposes (e.g., increased income), the farmer is more likely to sustain the new practice if they are assisted to quickly internalize and integrate it into their modus operandi, as contrasted with feeling impelled to do it or otherwise controlled. To evidence the importance of this theory, legume production research could compare pertinent outcomes for carefully matched participant groups that either receive or do not receive an SDT-informed version of a legumes trial. These theoretical principles warrant investigation in a general sense but also in diverse agricultural contexts (developed vs. developing countries, for example), where the factors regulating farmers' behaviors might look different on the face of it, but should follow these SDT tenets nonetheless.

Some published studies rely on the psychological construct of intrinsic motivation (cf. Greiner and Gregg, 2011; Mzoughi, 2011; Besser and Mann, 2015; Greiner, 2015; Carlisle, 2016) but too often do not theoretically define it, do not refer to the SDT or Cognitive Evaluation Theory formulation, confound motives and motivation, and/or confound intrinsic and extrinsic motivation. For example, Kessler et al. (2016) documented an "integrated soil fertility management" intervention that was tested in Burundi, the aim of which was to foster "farmers' intrinsic motivation to invest in activities that make the household more resilient and profitable, while moving toward sustainable agricultural intensification" (p. 249). Referring to the above definitions of intrinsic motivation, however, will make it clear that these are not intrinsic regulators of behavior (they are instrumental motives and therefore more extrinsic). A non-SDT but nevertheless interesting example is provided by studies on farmers' adoption of conservation actions in the context of land management and land use (cf. Pannell et al., 2006; Farmar-Bowers and Lane, 2009): behavioral decisions are often made for what Cognitive Evaluation Theory would consider to be instrumental, and thus extrinsically motivated, reasons (e.g., to make money which secures a stable family lifestyle). Conversely, many farmers choose to build long-term soil health for non-economic reasons, such as environment protection, land conservation, and to "do right by my downstream neighbors" (Carlisle, 2016). Cognitive Evaluation Theory would theorize that these farmers have integrated such behaviors into their sense of self, and the behavior is at the lowest end of the extrinsic motivation spectrum.

Pertaining to a smallholder dairy development project in Kenya, Uganda, and Rwanda, Kiptot et al. (2016) investigated the motivations of volunteer farmer-trainers, a community-based extension approach. Kiptot et al. (2016) observed that the farmers and trainers were generating income from inputs and services associated with the training activities. They were concerned that this conflicts with the volunteerism philosophy of the scheme, and that the introduction of rewards would undermine the trainers' intrinsic motivation over time (leading to their withdrawal from the scheme). Lioutas and Charatsari (2017) designed a questionnaire to assess farmers' motives for the adoption of "green innovations." Whilst the study was not explicitly grounded in Cognitive Evaluation Theory, a factor emerged which captured boredom and lost interest. Lioutas and Charatsari labeled this sub-scale, "Need for change," and it is interpretable in SDT terms as the farmer's drive to seek and conquer new challenges, to learn, adapt, and grow in their competencies (Deci and Ryan, 1985).

Other studies have explicitly employed Cognitive Evaluation Theory to interpret farmer intrinsic motivation and behavior. Herzfeld and Jongeneel (2012) argue that, in some cases, the introduction of incentives and penalties (extrinsic forms of motivation) can detract from a farmer's intrinsic motivation to participate in voluntary EU agri-environmental schemes and comply with EU regulations. Similarly, Kvakkestad et al. (2015) argued that "the wider meaning of being a farmer" (representative of the integrated form of extrinsic motivation) is often more important than profit maximization. In Luhmann et al. (2016), dairy farmers demonstrated long-term willingness to participate in an initiative to promote high animal welfare standards. Where financial inducements were reported as a weak motivator, and/or farmers were willing to adhere even if they incurred additional costs, Luhmann et al. (2016) interpreted this behavior to reflect personal belief in the sustainable activities, appreciation of society's recognition of their commitment to the standard, and/or a sense of personal joy stemming from taking responsibility for the welfare of their animals. Unfortunately, the authors take a lack of financial motivation for the behavior to indicate an intrinsic motivation to do it, which is a limited view of Cognitive Evaluation Theory and Organismic Integration Theory.

The final SDT mini-theory of interest, Basic Psychological Needs Theory, suggests that humans have evolved to seek activities which fulfill three innate needs: autonomy ("the need to self-regulate one's experiences and actions. . . associated with feeling volitional, congruent, and integrated"), competence ("our basic need to feel effectance and mastery. People need to feel able to operate effectively within their important life contexts"), and relatedness ("feeling socially connected. . . a sense of being integral to social organizations beyond oneself;" Ryan and Deci, 2017, pp. 10–11). The three BPNs are considered essential to human functioning, and at a universal level this relevance has been demonstrated across a variety of life domains (cf. Chen et al., 2015; Nishimura and Suzuki, 2016; Yu et al., 2018). Farmers have the opportunity to seek workplace opportunities that fulfill their BPNs, but this has not been investigated. Theoretically, for example, a farmer will function well both on-farm and off if they feel able to (1) exercise self-determination in their professional decisions (autonomy), (2) competently master those work tasks that, to them, most strongly reflect their identity as a farmer, and (3) contribute to a wider social purpose, which is inherent in the farmers' profession (e.g., environmental stewardship, combating food insecurity). Unfortunately, the converse is also true: when the farmer's BPNs are thwarted by their workplace circumstances (e.g., constraining systems, isolation), sub-optimal functioning will likely manifest (Ryan and Deci, 2000). Indeed, the scientific literature is replete with studies of farmer mental health (often negative), and a Basic Psychological Needs Theory perspective on this issue is long overdue.

Research has indirectly demonstrated that extrinsic motivators (such as payments for ecosystem services) for behavior which the farmer would otherwise willingly undertake because it contributes to a common goal they share with others can thwart fulfillment of the relatedness BPN and cause dissatisfaction with the initiative (cf. Kerr et al., 2012; Narloch et al., 2012). Using structural equation modeling, Gyau et al. (2012) demonstrated a positive impact of Cameroonian kola producers' intrinsic motivation for engaging in collective action on their perceptions of the "ease of use" and usefulness of such activities (e.g., "group training in production and storage facilities, negotiation abilities and group marketing, and aiming to improve small-holder benefits in the value chain have been used to improve market access and bargaining power of producer," p. 43). It is possible that the farmers fulfilled multiple BPNs during this collective action, reciprocally benefiting their intrinsic motivation. Theoretically applicable to farmer-consultant dyads, it has been argued that the degree of knowledge transfer between parties is influenced by their shared understanding and personal relationship (akin to the relatedness BPN), as well as a cumulative sense of intrinsic motivation (Ko et al., 2005). Membership of community-supported agriculture activities in Wisconsin (USA) has been explained in BPN terms (Zepeda et al., 2013). In Greece, Charatsari et al. (2017b) found that farmer participation in competence development projects is, perhaps not surprisingly, associated with the autonomy and competence BPNs, as well as motivation to seek knowledge. Similarly, participation in farmer field schools was both motivated by, and helped to fulfill, farmers' relatedness BPN, especially for those whose needs were not supported prior to participation (Charatsari et al., 2017a). Triste et al.
(2018) provided a compelling argument for sustainable farming initiatives (SFIs) to be underpinned by SDT. Specifically, they have supportive data for the need to both market SFIs in order to appeal to BPNs, and to design SFIs in such a way as to support farmers' autonomous motivational process, via the BPNs. Similarly, Rothmann's (2013) findings led them to urge South African agricultural organizations to train managers to support the autonomy and relatedness satisfaction of employees, as these BPNs were shown to mediate the relationship between employee-manager relations and intention to leave.

#### Conclusion

Farmers live with "multiple uncertainties and indeterminacies in their farming presents and futures" (Robinson, 2017, p. 168), and these transient conditions thwart self-determination (i.e., detract from the fulfillment of autonomy and competence needs, minimize the desire to internalize vocational behaviors, and remove opportunities to experience a sense of intrinsic motivation). Moreover, many of the challenges to global legumes production referred to in section Introduction: Global Challenges point to a controlling motivational climate, and the consequences to farming have been made evident. Despite its widespread and successful adoption in many other domains of human experience, there have been only small pockets of observational research that have applied SDT in agricultural contexts. Hence, some progress has been made to understand what regulates (motivates) farmers' behavior on a day-to-day basis, but the field requires SDT-informed intervention studies. If researchers build into their study designs the explicit aim of satisfying farmers' BPNs and intrinsic motivation, improved research uptake should follow.

Armed with the robust SDT framework and full intention to integrate it into their participatory research (see section Combining Insight From SDT, Implementation Science, and the Participatory Research Paradigm to Solve the Challenge of Increasing Global Legume Production for practical applications), legumes researchers can shift attention to the systematic use of guidance on research planning, application, and reporting from the field of implementation science. The aim is to design research methodologies that have the maximum likelihood of quickly achieving "demonstrated sustainability" status.

### Learn From the Field of Implementation Science

Countless scientific studies are published each year that evaluate potentially game-changing techniques and interventions. Examples include oncology-based drug developments, health education and health promotion programmes, and innovations in agriculture. Unfortunately, the "lag" that is witnessed between the completion of research and its implementation in the field—whether it is ∼17 years in health research translation (Morris et al., 2011) or ∼30 years in agriculture (Alston et al., 2009), for example—too often renders these "solutions" redundant, or at the least, compromised. The burgeoning field of implementation science is fundamentally devoted to "the scientific study of methods to promote the systematic uptake of research findings and other evidence-based practices into routine practice" (Eccles and Mittman, 2006, p. 1), and it ". . . examines what works, for whom and under what circumstances, and how interventions can be adapted and scaled up in ways that are accessible and equitable" (Global Alliance for Chronic Diseases; gacd.org/research/implementation-science). The speed of research uptake is prioritized equally alongside accurate translation of the research, thereby helping to reduce the lags that plague applied science. Implementation science helps researchers to interrogate their design decisions, critically evaluate the outcomes of their projects, and effectively share the insight that is gained. Intended beneficiaries of research are intimately involved in the entire process and thus co-create new knowledge. Implementation science recognizes that there are many social actors with a role to play in the uptake of research into practice, the number of which—and the complexity of interrelationships within their systems—depends on each context. 
It is beneficial to explore how comprehensive frameworks designed to maximize uptake in natural contexts (e.g., RE-AIM, below)—and shown to be effective in other disciplines and challenges (e.g., sustainability of health interventions in Sub-Saharan Africa; cf. Iwelunmor et al., 2016)—might map over to agriculture. Specifically, if increased legume production is to help solve the global challenges outlined in section Introduction: Global Challenges, what research is needed to overcome the numerous barriers that have been identified, and how should this research be designed to create optimized pathways to impact and maximize the uptake of its findings?

#### Research Planning

The first step in research planning is to identify the most appropriate research approach. The traditional approach would be to leave the research to researchers in the formal Agricultural Knowledge System (AKS), which consists of agricultural research, education and extension establishments (Rivera and Sulaiman, 2009). The AKS paradigm assumes that knowledge and innovation need come only from official science, which is free from the need to take the views, needs, and knowledge of the end users of innovation into consideration (Dosi, 1988). However, this neglect of societal actors as contributors to innovation (Leeuwis and Van den Ban, 2004; Knickel et al., 2009) reduces the capability of the AKS to address the goals of the agricultural sector or to support sustainable rural development. Systems approaches are therefore replacing the linear view (e.g., Röling and Engel, 1991; Hall et al., 2003; Sumberg and Reece, 2004; Knickel et al., 2009) and the formal institutions of the AKS have shifted toward the inclusion of farmers as important actors who participate in joint learning and negotiation to shape innovations (Leeuwis and Van den Ban, 2004). The overarching term to describe this shift is participatory research, or sometimes transdisciplinary research, in which non-scientific stakeholders take ownership of both research and results by deciding on research objectives and strategies, while staying within the framework of scientific inquiry (Pohl and Hirsch Hadorn, 2008). Schneider et al. (2009) point out that social learning takes place when the knowledge of (in this case) farmers, scientists, advisors, and other experts is integrated in a participatory process in which stakeholders and researchers collaborate to identify and rank specific problems, agree on methods to find the causes, and find ways to solve them realistically and practically (cf. Bradford and Burke, 2005).
Transdisciplinary research appears appropriate to meet the challenge of creating the conditions for meaningful and successful collaboration between researchers and stakeholders (Wicks and Reason, 2009; Caister et al., 2012), and is therefore clearly compatible with SDT and implementation science ("what works, for whom and under what circumstances").

Participatory approaches do have problems, however, although none is insurmountable. For example, participatory research is susceptible to reproducing and reinforcing existing power relationships among the participants (or to ignoring women), a common example being a hierarchical relationship between academics and participants (Cooke and Kothari, 2001). A transdisciplinary research approach must carefully consider the implications of the processes at the local level to encourage and facilitate co-learning processes. This calls for an approach with continued reflection on the participatory process (Loeber et al., 2007), which in turn requires skills that an academic researcher might not fully possess. However, implementation science gives sufficient encouragement that the advantages of this process can outweigh its disadvantages (Pain, 2004), and harnessing SDT principles inherently breaks down power inequities.

Calls for transdisciplinary research to motivate transitions to more sustainable agriculture became loud in the late 2000s, with prominent scholars such as Aeberhard and Rist (2009) and Vandermeulen and Van Huylenbroeck (2008) highlighting its potential to elicit change. Participatory research is advocated by the European Union in its long-term strategy for European agricultural research and innovation and reflected in the substantial Horizon 2020 funding stream. Common to most transdisciplinary research methodologies, in addition of course to participation of relevant stakeholders, are iteration and reflection, leading to ownership and implementation. These characteristics are evident in the following examples. Nyang'au et al. (2018) found that collaborative leadership enhanced implementation of a method using intercropping with a moth-repellent fodder legume to control stem borer pests in maize crops in Ethiopia. Sousa et al. (2016) concluded that participatory video, a transdisciplinary research method, contributed to uptake of novel composting methods by giving ownership of the video-based information, which thereby extended its outreach. Although fewer examples can be found in the literature about grain-legume production, the common themes of ownership and sustainability suggest that such methodologies have at least the potential for application to inspire change. Indeed, Magrini et al. (2016) suggest the factors that hinder grain-legume development are primarily social rather than technical, and that engaging farmers is essential to promoting grain-legumes. SDT provides an overarching framework to understand and better promote stakeholder engagement.

A characteristic of participatory research approaches is that they consistently meet their aims. Home and Rump (2015) evaluated 17 diverse Learning and Innovation Networks for Sustainable Agriculture (LINSA) in Europe. As defined by network members (i.e., researchers and agriculture stakeholders), successful collaboration was characterized by strong internal engagement, co-development of strategy, creation of concrete outputs, equal give-and-take of benefits (new knowledge or improved practical solutions), joint reflection, mutual trust and commitment, and finding a "balance between guidance and listening, interactions and freedom, and positive and critical reflection" (Home and Rump, 2015, p. 73). Implicit in such research is the need-fulfillment and intrinsically motivating properties of the collaborative research process. Many examples of impactful participatory research exist in developing countries (cf. Kangmennaang et al., 2017), and excellent guidance documents are available for this context (cf. Garibaldi et al., 2017). Unfortunately, review articles still warn that participatory research is not as widespread as might be expected and that suitable evaluation measures are inconsistently employed (Schindler et al., 2015; Smith et al., 2017). This is especially true in the vital areas of innovation platforms and technology adoption.

Participatory research offers clear advantages over traditional approaches in crop and animal science, but still, little participatory research has explicitly addressed stakeholder motivation to the level of theoretical rigor afforded by SDT. In terms of increasing global legume production, researchers who perhaps lack confidence in participatory methods (cf. Payne et al., 2017) or awareness of the mechanisms by which they work (e.g., social learning theory, SDT), are urged to treat the present article as a catalyst to gain further methodological experiences in the integration of SDT, participatory approaches, and implementation science procedures (see section Combining Insight From SDT, Implementation Science, and the Participatory Research Paradigm to Solve the Challenge of Increasing Global Legume Production).

#### Research Application and Reporting

RE-AIM (cf. Glasgow et al., 2019) stands for: Reach (the intervention's target population), Effectiveness (or efficacy, of the intervention), Adoption (the population who are willing to initiate the intervention), Implementation (consistency, costs and adaptations made during delivery), and Maintenance (of intervention effects in individuals and settings over time; see **Figure 1** for more detail). The RE-AIM framework would help focus the researcher's attention if they wanted to investigate, for example, the high variability in yield and susceptibility to biotic and abiotic stresses of grain legumes (Nedumaran et al., 2015); it facilitates an examination of what works, for whom and under what circumstances, and "how interventions can be adapted and scaled up in ways that are accessible and equitable." Despite its widespread use in other fields of applied research, an early 2019 Google Scholar search of academic publications since 2015 using the term "re-aim" AND "agriculture OR farming" (minus patents and citations) provided just 321 hits, and very few were related to food production. The RE-AIM framework assists stakeholders to (i) organize the results of their research for reporting, (ii) translate their research into practice, (iii) organize reviews of existing literature, (iv) plan programs with an enhanced chance of achieving impact in the field, and (v) weigh up and understand the relative (hypothetical) costs and benefits of taking alternative approaches to a single challenge. All of these aims are pertinent to researchers interested in increasing legume production to meet the global challenges of food insecurity, climate change resilience, and sustainable energy.
In sum, then: "The overall goal of the RE-AIM framework is to encourage program planners, evaluators, readers of journal articles, funders, and policy-makers to pay more attention to essential program elements including external validity that can improve the sustainable adoption and implementation of effective, generalizable, evidence-based interventions" (www.re-aim.org). So-called "essential program elements" (e.g., clearly defined primary and secondary outcome measures and the levels at which they were measured, how sample size was determined, baseline demographic and clinical characteristics of each group, number of study units in each group included in each analysis, sources of potential bias or imprecision) can be incorporated into applied research in legumes production by working backwards from a checklist of information to include when reporting a feasibility trial or full randomized controlled trial (RCT).
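To make the five RE-AIM dimensions concrete, the sketch below shows one way a project team might tabulate them for reporting. It is an illustration only: the class `ReAimCounts`, its field names, and the choice to express reach, adoption, implementation, and maintenance as simple proportions are assumptions made here rather than definitions taken from RE-AIM guidance, and the effectiveness estimate is simply passed in from whatever outcome analysis the trial uses.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ReAimCounts:
    """Hypothetical tallies from a participatory legume trial."""
    eligible_farmers: int       # farmers in the target population who were approached
    enrolled_farmers: int       # farmers who actually took part
    invited_groups: int         # e.g., cooperatives invited to deliver the intervention
    adopting_groups: int        # groups willing to initiate the intervention
    planned_sessions: int       # training sessions specified in the protocol
    delivered_sessions: int     # sessions delivered as specified
    farmers_at_follow_up: int   # enrolled farmers still using the practice at follow-up


def proportion(part: int, whole: int) -> float:
    """Return part / whole, refusing an empty denominator."""
    if whole <= 0:
        raise ValueError("denominator must be positive")
    return part / whole


def re_aim_summary(counts: ReAimCounts, effect_estimate: float) -> Dict[str, float]:
    """Summarize the five dimensions as simple, assumed indicators."""
    return {
        "reach": proportion(counts.enrolled_farmers, counts.eligible_farmers),
        "effectiveness": effect_estimate,  # supplied by the trial's own analysis
        "adoption": proportion(counts.adopting_groups, counts.invited_groups),
        "implementation": proportion(counts.delivered_sessions, counts.planned_sessions),
        "maintenance": proportion(counts.farmers_at_follow_up, counts.enrolled_farmers),
    }
```

For a hypothetical trial in which 50 of 200 eligible farmers enrol, `re_aim_summary` would report a reach of 0.25. Reporting all five values side by side makes it harder to overlook a dimension, such as maintenance, that a yield-focused analysis might otherwise omit.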

The REFLECT statement (Reporting guidElines For randomized controLled trials for livEstoCk and food safeTy; Sargeant et al., 2010) is an evidence-based checklist of items that should be included when an RCT is reported with production, health, and food-safety outcomes (www.reflect-statement.org). REFLECT, much like the CONSORT statement from which it was adapted (Consolidated Standards of Reporting Trials; Altman et al., 2001; Moher et al., 2010), is much more than a list to facilitate the transparent reporting of an RCT; it is an a priori guide to the level of methodological rigor that is required for a study's findings to have a chance at being implemented in the field. Following these recommendations can help to mitigate the "startling lack of consensus amongst experts about how best to measure agricultural sustainability" (de Olde et al., 2017, p. 1327). Where a full RCT is not suitable, and a pilot or feasibility trial is the best option (albeit still randomized), legumes researchers can refer to the appropriate CONSORT extension (Eldridge et al., 2016) and modify the REFLECT checklist accordingly.

Feasibility and fidelity work should be an essential component of legumes research, just as it is in the most impactful medical and psychological research (Cohen et al., 2008; Gearing et al., 2011). A thorough feasibility study would assess stakeholder enthusiasm for the project and the probability of successful recruitment to it (including participant adherence and retention), predict associated risks and determine the safety of participants, and increase researcher experience with the intervention methods; it would investigate economic, market, technical, financial, and management aspects of feasibility (cf. Van Hemelrijck and Guijt, 2016). As such, a feasibility study can protect against potential misallocation of research funds and establish strong foundations for a project's eventual success. Feasibility studies would be essential in translating findings from the laboratory to help farmers increase utilization of locally grown, less commonly demanded varieties of legumes that will be affordable for low-income families, for example.

FIGURE 1 | The RE-AIM framework (http://www.re-aim.org/about/frequently-asked-questions/).

Intervention fidelity refers to those "back-stage" factors which influence the outcomes of the intervention, whether a feasibility/pilot trial or RCT. A trial may work perfectly in the laboratory and even the field, but will the farmer maintain the corresponding behaviors once the study closes? The answer depends on many factors, of course, including their self-determination (section Harness Insight From Self-determination Theory Regarding Human Motivation and Behavior), but how the intervention is delivered is also vitally important (cf. Cook and Thigpen, 2019). For example, if a participatory project seeks to help willing farmers who are used to farming monocultures to include grain legumes in cropping sequences (cf. Stagnari et al., 2017), this represents a behavior change intervention and detailed reporting is required of (i) how training providers were themselves trained, (ii) the credentials of the trainers and providers, (iii) the theoretical model on which the behavior change intervention is based (e.g., SDT), (iv) a method to ensure that the content of the intervention was being delivered as specified, (v) a mechanism to assess whether the providers adhered to the intervention plan, and (vi) assessment of farmer comprehension and implementation of the intervention during and beyond the study period (see Borrelli et al., 2005 for full guidance). Without such detailed reporting, how can future researchers hope to replicate the positive results, or identify where things did not work so well in an intervention? Hence, fidelity should be treated as a core component of intervention research and stimulate detailed reporting of factors which influence the probability of eventual uptake by stakeholders.
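The six reporting items listed above lend themselves to a simple completeness check before a manuscript is submitted. The following is a minimal sketch: the key names and the `missing_fidelity_items` helper are hypothetical labels introduced here for illustration and are not part of Borrelli et al.'s guidance.

```python
from typing import Dict, List, Tuple

# Hypothetical short keys for the six fidelity reporting items (i)-(vi) in the text.
FIDELITY_ITEMS: Tuple[str, ...] = (
    "provider_training",      # (i) how training providers were themselves trained
    "provider_credentials",   # (ii) credentials of the trainers and providers
    "theoretical_model",      # (iii) theoretical model behind the intervention (e.g., SDT)
    "delivery_check",         # (iv) method to ensure content was delivered as specified
    "adherence_check",        # (v) mechanism to assess provider adherence to the plan
    "farmer_comprehension",   # (vi) farmer comprehension and implementation over time
)


def missing_fidelity_items(report: Dict[str, str]) -> List[str]:
    """Return the items with no description, or only whitespace, in the draft report."""
    return [item for item in FIDELITY_ITEMS if not report.get(item, "").strip()]


draft_report = {
    "provider_training": "Two-day workshop led by the project's extension partner.",
    "provider_credentials": "Extension officers with agronomy diplomas.",
    "theoretical_model": "Self-determination theory.",
    "delivery_check": "   ",  # left blank by mistake
}

# Items still to be written up before submission.
print(missing_fidelity_items(draft_report))
```

Running the example lists the delivery check, the adherence check, and the farmer comprehension item as still undocumented, which is the kind of gap that would otherwise hinder replication.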

RE-AIM, the REFLECT statement, and associated concepts of intervention feasibility and fidelity have a theoretically sound basis for upskilling researchers: they are grounded in reliable evidence and provide sufficient detail to raise the researcher's self-efficacy for the challenge of comprehensive and transparent study design and implementation. Indeed, it is worth exploring whether the integration of these approaches would also help researchers and partners fulfill their own needs for competence and autonomy within a participatory legumes production project. Despite this, the REFLECT statement has not been adopted to anywhere near the same extent in agriculture as its parent approach (CONSORT) has in health and medicine. Indeed, the CONSORT statement has been cited more than 8,000 times (Eldridge et al., 2016), and its use is associated with an improvement in the quality of reporting of RCTs in these fields (Turner et al., 2011). In medicine, poorly designed and/or reported RCTs can lead to overestimation of the treatment effect, diminished quality of pooled analyses (e.g., meta-analyses), and impaired clinical practice decisions (cf. Moher et al., 1998; Péron et al., 2012); parallels to crop and animal science can be made and should not be ignored, and the REFLECT guidelines can address this concern.

In conclusion, implementation science urges researchers to adhere to a systematic process, from the conception of a research idea through to dissemination and monitoring of uptake. Such an approach, while more prescriptive than typically seen in agriculture, does not mean that the research team loses flexibility to manage factors as they unfold on the ground during research in complex scenarios. Participatory research imbued with SDT compensates for this, but the implementation science frameworks add a level of ("meta") rigor. Existing frameworks which facilitate this process in health/medicine research translation can be modified to suit the legume production context at hand; legume research planning, transparent reporting, and replication efforts will benefit.

# COMBINING INSIGHT FROM SDT, IMPLEMENTATION SCIENCE, AND THE PARTICIPATORY RESEARCH PARADIGM TO SOLVE THE CHALLENGE OF INCREASING GLOBAL LEGUME PRODUCTION

A commitment to RE-AIM, the REFLECT statement, and the need for intervention feasibility and fidelity work will provide participatory legume projects with a greatly enhanced chance of success, whether they want to (i) explore the challenges associated with increasing legumes production, (ii) explore the feasibility of hypothetical solutions to known challenges (i.e., prior to RCTs), (iii) test hypothetical solutions to known challenges (i.e., fidelity studies, RCTs), (iv) test the sustainability of a demonstrated solution, and/or (v) design follow-up research where a viable solution demonstrated non-sustainability in the field. The suggestions made in section Learn From the Field of Implementation Science provide a clear "road map" for incorporating insights from implementation science into legume research. Crop scientists may find the concept of harnessing stakeholder psychology a more challenging prospect, however. Section Harness Insight From Self-determination Theory Regarding Human Motivation and Behavior rationalized the importance of self-determination in agriculture: full volitional stakeholder engagement in legume production research, from study conception through dissemination of results to the evaluation of impact, would theoretically stimulate an internalization of the science that underpins the research (Baard et al., 2004; Gagné and Deci, 2005). Thus, the behaviors required of farmers to implement the findings of the research into routine practice would become intrinsically motivated, and therefore sustainable (cf. Pelletier et al., 2011). 
Farmers' direct involvement in framing the research questions and informing project modifications via real-time feedback would foster a sense of autonomy; being involved in a constructive two-way dialogue with the project's science and industry partners would foster a sense of competence and relatedness; fulfillment of these BPNs is associated with intrinsic motivation and optimal human functioning (Deci and Ryan, 2000; Ryan and Deci, 2000). Farmers who are asked to implement recommendations from research that did not include them in the decision-making process are less likely to internalize the necessary behaviors (via a thwarting of their BPNs), and this extrinsic, controlling sense of motivation is difficult to sustain. Indeed, farmers in many regions already have to cope with structural issues over which they have little control (e.g., non-availability of quality seed, prohibitive market structures, poor funding mechanisms), but SDT-informed research can at least assuage this cold reality and help farmers work around such constraints. Section Combining Insight From SDT, Implementation Science, and the Participatory Research Paradigm to Solve the Challenge of Increasing Global Legume Production will describe how legumes researchers can make their first foray into SDT-informed participatory methodologies.

# Practical Tools to Harness Stakeholder Psychology

In workplace, education, and healthcare contexts, SDT-based interventions have proven effective with managers, teachers, and healthcare practitioners, respectively; such leaders can be trained to communicate and behave in a way that satisfies their employees'/students'/patients' BPNs, and this is associated with an increase in their intrinsic motivation in the context, enhanced task engagement, satisfaction, and performance (Baard et al., 2004; Williams et al., 2006; Entwistle et al., 2010; Su and Reeve, 2011; Cheon et al., 2012). Autonomy-supportive communication, in particular, is key to promoting optimal conditions for success, and is a leader and team member characteristic that can be trained. When autonomy support is emphasized in a working relationship, collaborators tend to experience a sense of being "in synch" with each other, where the behaviors of one member are understood to influence the behavior of many others (Reeve et al., 2004; Lee and Reeve, 2012). Each partner in a legumes production project is integral to its success and the project will rely on reciprocal knowledge sharing: legumes research leaders are encouraged to build autonomy support training for all partners into their research plans (and funding applications).

Prior to a project's first formal meeting, all partners (farm, industry, extension, science) are encouraged to communicate on a secure online forum to "get the conversation started" about their legumes challenges, both common and unique (autonomy support and promotion of relatedness). In project meeting number one the work package members and farmer network representatives could receive training on how to promote need fulfillment and intrinsically motivating opportunities in their work with all project partners. Such training represents the primary mechanism by which the researchers will ensure that the project influences stakeholder behavior in a sustainable way: possessing a logical rationale and a sense of competence for the associated tasks is important if one is to invest time and energy to a new course of action; and if the behavior is to be maintained, a sense of self-determination is absolutely vital (cf. Deci and Ryan, 1985, 2000). This training will also cover the proper use of suitable tools for monitoring the psychological outcomes associated with the methodology (e.g., farmers' autonomous motivation, need fulfillment; see section Measurement Tools to Help Legumes Researchers Assess Progress and Project Outcomes).

# Measurement Tools to Help Legumes Researchers Assess Progress and Project Outcomes

SDT has been extensively applied in a variety of applied research contexts. Associated with this activity is the availability of well-validated measurement tools that tap SDT constructs such as the behavioral regulations (intrinsic through to extrinsic forms) and BPNs. For example: (1) basic psychological need satisfaction and frustration scales (BPNSFS) assess the degree to which people feel that their BPNs of autonomy, competence and relatedness are being satisfied, and this is important because need satisfaction is associated with well-being whereas need frustration is associated with ill-being (Chen et al., 2015). Domain-specific BPNSFS exist (education and physical education, relationships, training, sport, physical exercise, work), and while an agriculture version has yet to be constructed, the work domain version will certainly suffice in the meantime (cf. Kasser et al., 1992; Ilardi et al., 1993; Deci et al., 2001); (2) the "Work Climate Questionnaire" (WCQ; Baard et al., 2004) asks respondents to indicate their perception of the autonomy support provided by a target other (e.g., their manager or work package leader) or group (e.g., organization), and its wording can be adapted to suit the particular situation; and (3) self-regulation questionnaires tap into the reasons why individuals perform a certain behavior, i.e., for relatively controlled (external and introjected) or autonomous (identified and integrated) reasons (cf. Ryan and Connell, 1989; Williams and Deci, 1996; Black and Deci, 2000; Ryan and Deci, 2000). The SDT-based constructs measured by these scales have consistently demonstrated expected positive or negative relationships with certain other psychological constructs (i.e., convergent and divergent validity), and perhaps most importantly, the ability to predict theoretically associated behaviors (e.g., perception of an autonomy supportive work climate predicting task engagement and performance).
Hence, stakeholders (science, industry, farm) could anonymously complete a relevant questionnaire—perhaps with additional space for open-ended answers to allow for elaboration of important issues—to help project leaders longitudinally track a project's ability to satisfy (vs. thwart) partners' BPNs and intrinsically motivate them; a positive trend would theoretically predict project sustainability in the field once the research element comes to an end. Legumes researchers are directed to www.selfdeterminationtheory.org to further their understanding of SDT and the available measurement tools, and encouraged to discuss potential applications with motivation specialists.
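A minimal sketch of the longitudinal tracking described above is given below. The item-to-subscale mapping, the numeric response scale implied by the example, and the function names are all hypothetical; a real project would use the published scoring key for its chosen questionnaire. The sketch averages items within each need subscale for one survey wave, then fits an ordinary least-squares slope across waves as a crude indicator of trend.

```python
from statistics import mean
from typing import Dict, List, Sequence

# Hypothetical mapping of questionnaire items to the three basic psychological needs.
SUBSCALES: Dict[str, List[str]] = {
    "autonomy": ["a1", "a2"],
    "competence": ["c1", "c2"],
    "relatedness": ["r1", "r2"],
}


def subscale_means(responses: Dict[str, float]) -> Dict[str, float]:
    """Average one respondent's item scores within each subscale."""
    return {
        name: mean([responses[item] for item in items])
        for name, items in SUBSCALES.items()
    }


def trend_slope(wave_scores: Sequence[float]) -> float:
    """Least-squares slope of scores against wave index (0, 1, 2, ...)."""
    n = len(wave_scores)
    if n < 2:
        raise ValueError("at least two survey waves are needed")
    xs = list(range(n))
    x_bar, y_bar = mean(xs), mean(wave_scores)
    numerator = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, wave_scores))
    denominator = sum((x - x_bar) ** 2 for x in xs)
    return numerator / denominator
```

For instance, if the project-wide mean autonomy score were 3.0, 3.5, and 4.0 across three waves, `trend_slope` would return 0.5, a positive trend of the kind the text associates with sustainability. A slope is only a summary; open-ended answers would still be needed to interpret why scores move.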

## Acceptability of Methodological Suggestions to Legumes Researchers and Their Partners

Adopting the SDT-implementation science approach corresponds to a minimal amount of additional project planning and execution. The cost-benefit ratio is favorable. If communicated effectively to project partners, the theoretical and practical rigor it adds to the participatory research agenda should stimulate their implicit buy-in. Of course it is also possible to formally evaluate their perceptions of acceptability, difficulty, complexity, and applicability. The need for feasibility and fidelity work is once again foregrounded: as previously explained, a feasibility study would assess stakeholder enthusiasm for the project and the probability of successful recruitment to it (including participant adherence and retention), predict associated risks and determine the safety of participants, and increase researcher experience with the intervention and process evaluation methods. By necessity the SDT-informed training and evaluation tools would be included in the feasibility study (sections Practical Tools to Harness Stakeholder Psychology and Measurement Tools to Help Legumes Researchers Assess Progress and Project Outcomes). Similarly, where intervention fidelity refers to the extent to which an intervention is delivered as intended (e.g., legume intercropping), an awareness of the requirements of a test of intervention fidelity allows legume researchers to maximize likelihood of this essential outcome (cf. Borrelli et al., 2005). Researchers are compelled to scrutinize their laboratory protocols, intervention training methods, and communication/dissemination strategies, and fidelity should be a logical consequence; as with project feasibility, the SDT-informed training and evaluation tools would comprise a component of this in-depth scrutiny.

The comprehensive SDT-implementation science participatory approach would essentially militarize all types of legumes research as a powerful weapon against the threat of climate change, food insecurity, and dwindling energy reserves (see **Table 1** for examples). The examples given to illustrate the suggestions made in this review have mostly focused on farmers, but are equally applicable to all stakeholders in global legume production. For example, researchers and their principal investigators—as well as their home research institution and associated funding bodies—can seek to create and contribute to a need-supportive and intrinsically motivating work climate (cf. Lam, 2011; Mamiseishvili and Rosser, 2011; Lechuga and Lechuga, 2012; Lyness et al., 2013; Biondi et al., 2015).

# CONCLUDING STATEMENTS

The threats of climate change, food insecurity, and dwindling energy reserves are ever more pressing. Similarly, the challenges of achieving associated legumes research objectives are sizeable and complex (section Introduction: Global Challenges and **Table 1**). Multi-stakeholder collaborative and participatory approaches that account for (stakeholder) human factors, group dynamics, environmental and biological influences, as well as structural constraints and enablers, are urgently needed (cf. Payne et al., 2017). Hence, legume production in the global context will be advanced by participatory research methods that harness SDT principles and are underpinned by the rigorous planning and reporting standards advocated by implementation science. Specifically, this integrated approach can help us to address what works, for whom, under what circumstances, and collectively, help researchers to design interventions that can be "adapted and scaled up in ways that are accessible and equitable." Pathways to impact are created, utilized, and ultimately streamlined throughout each research project because stakeholders are actively involved from the genesis of the research question. Researchers come to embed stakeholder motives and motivation in all aspects of their project by employing SDT as a guiding framework. This, in turn, helps stakeholders to internalize the behaviors that are incumbent on them to enact if the implementation is to be beneficial and sustainable. While the focus of this review is legumes production, the guidance is equally applicable to other crops and agricultural systems.

TABLE 1 | Examples of legumes research challenges that would benefit from the suggested approach.

#### AUTHOR CONTRIBUTIONS

SP conceived of the premise of the manuscript, wrote the SDT and implementation science sections, matched the legumes challenges to these approaches, and acted as overall editor of the work. PN-D wrote the legumes-specific content (state-of-knowledge and associated challenges) and provided feedback on SP and RH's sections. RH wrote the participatory research content and participated in discussions whilst the team finalized the premise of the manuscript and its structure.

#### FUNDING

This review was undertaken as part of the authors' substantive roles with their respective institutions; the Institute of Biological, Environmental, and Rural Sciences (IBERS) at Aberystwyth University funded its open access publication. IBERS receives strategic funding from the BBSRC, including a grant to assist with costs of Gold Open Access Publishing of BBSRC-funded science.

#### REFERENCES

Aeberhard, A., and Rist, S. (2009). Transdisciplinary co-production of knowledge in the development of organic agriculture in Switzerland. Ecol. Econ. 68, 1171–1181. doi: 10.1016/j.ecolecon.2008.08.008

Alston, J. M., Pardey, P. G., James, J. S., and Andersen, M. A. (2009). The economics of agricultural R&D. Annu. Rev. Resour. Econ. 1, 537–565. doi: 10.1146/annurev.resource.050708.144137

Altman, D. G., Schulz, K. F., Moher, D., Egger, M., Davidoff, F., Elbourne, D., et al. (2001). The revised CONSORT statement for reporting randomized trials: explanation and elaboration. Ann. Intern. Med. 134, 663–694. doi: 10.7326/0003-4819-134-8-200104170-00012


education teachers be more autonomy supportive toward their students. J. Sport Exerc. Psychol. 34, 365–396. doi: 10.1123/jsep.34.3.365


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Payne, Nicholas-Davies and Home. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Genetic Diversity Linked to Haplotype Variation in the World Core Collection of *Trifolium subterraneum* for Boron Toxicity Tolerance Provides Valuable Markers for Pasture Breeding

*Hediyeh Tahghighi1,2, William Erskine1,2, Richard G. Bennett1,2, Philipp E. Bayer3, Maria Pazos-Navarro1,2 and Parwinder Kaur1,2\*†*

#### *Edited by:*

*Jose C. Jimenez-Lopez, Consejo Superior de Investigaciones Científicas (CSIC) Granada, Spain*

#### *Reviewed by:*

*Hamid Khazaei, University of Saskatchewan, Canada*

*Zhiying Ma, Hebei Agricultural University, China*

*Raul Huertas Ruz, Noble Research Institute, United States*

#### *\*Correspondence:*

*Parwinder Kaur, parwinder.kaur@uwa.edu.au*

*†ORCID: Parwinder Kaur orcid.org/0000-0003-0201-0766*

#### *Specialty section:*

*This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science*

*Received: 11 February 2019; Accepted: 26 July 2019; Published: 30 August 2019*

#### *Citation:*

*Tahghighi H, Erskine W, Bennett RG, Bayer PE, Pazos-Navarro M and Kaur P (2019) Genetic Diversity Linked to Haplotype Variation in the World Core Collection of Trifolium subterraneum for Boron Toxicity Tolerance Provides Valuable Markers for Pasture Breeding. Front. Plant Sci. 10:1043. doi: 10.3389/fpls.2019.01043*

*<sup>1</sup> Centre for Plant Genetics and Breeding, School of Agriculture and Environment, The University of Western Australia, Perth, WA, Australia, <sup>2</sup> Institute of Agriculture, The University of Western Australia, Perth, WA, Australia, <sup>3</sup> School of Biological Sciences, The University of Western Australia, Perth, WA, Australia*

In alkaline soils in arid and semi-arid areas, toxic concentrations of the micronutrient boron (B) are problematic for many cereal and legume crops. Molecular markers have been developed for B toxicity in cereals and *Medicago*. There is a need for such tools in clovers (*Trifolium*). To this end, we undertook a genome-wide association study (GWAS) with a diversity panel of subterranean clover (*Trifolium subterraneum* L.), an established model pasture legume for genetic and genomic analyses for the genus. The panel comprised 124 *T. subterraneum* genotypes (97 core collection accessions and 27 Australian cultivars). Substantial and useful diversity in B toxicity tolerance was found in *T. subterraneum*. Such variation was continuously distributed and exhibited a high broad-sense heritability (*H*<sup>2</sup> = 0.92). Among the subspecies of *T. subterraneum*, ssp. *brachycalycinum* was most susceptible to B toxicity (*P* < 0.05). From the GWAS, the most important discoveries were single-nucleotide polymorphisms (SNPs) located on Chr 1, 2, and 3, which mapped to haplotype blocks providing potential genes for a B toxicity tolerance assay and meriting further investigation. A SNP identified on Chr 1 aligned with *Medicago truncatula* respiratory burst oxidase-like protein (TSub\_g2235). This protein is known to respond to abiotic and biotic stimuli. The identification of these novel potential genes and their use to design markers for marker-assisted selection offer a pathway in pasture legumes to manage B toxicity by exploiting B tolerance.

Keywords: abiotic stress, boron toxicity, genome-wide association study, haplotype analyses, hydroponic, forage legumes, subterranean clover

# INTRODUCTION

Boron (B) is one of the essential micronutrients for healthy plant growth (Tomić et al., 2015) and is available to plants as boric acid (Reid, 2014). Due to its small molecular size and high membrane permeability in comparison to many other nutrients, the uptake and diffusion of B can be difficult for plants to control (Reid, 2014). Boron deficiency and toxicity are known to have adverse effects on agricultural production around the world (Oertli and Kohl, 1961; Nable et al., 1997; Hamilton et al., 2015). Although B deficiency is relatively easy to manage using B-rich fertilizers, B toxicity is more difficult to manage. Soil B concentration can be reduced by leaching, and B availability can be modified by pH adjustment, but this is impractical on a large scale (Yau and Ryan, 2008). Therefore, the use of genetic variation and plant breeding for tolerance is likely the best way to overcome toxicity (Yau and Ryan, 2008).

Boron toxicity mostly occurs in dry areas with an alkaline soil pH, particularly above pH 9, and in areas with low rainfall or in heavier clay soils, where B does not readily leach into deep soil layers below the root zone (Yau and Ryan, 2008). In 1983, a widespread B toxicity problem was reported in South Australia (Cartwright et al., 1984). More recently, ~4.9 million hectares (31%) of the agricultural zone in South Australia (Howie, 2012) and ~15% in Western Australia (Lacey and Davies, 2009) have been identified as at risk of B toxicity. A concentration of B in the range of only 10 to 54 mg kg<sup>-1</sup> in the soil inhibits plant growth (Javid et al., 2015), and soils in the southern Australian cropping region can reach 52 mg kg<sup>-1</sup> B (Nuttall et al., 2003). Yield losses due to B toxicity have been reported for cereals (Cartwright et al., 1984; Paull et al., 1988), annual medics (*Medicago* spp.), field pea (*Pisum sativum* L.) (Paull et al., 1992), and lentil (*Lens culinaris* Medik.) (Yau and Erskine, 2000).

Strategies to cope with low and high soil B vary among plant species and genotypes (Reid, 2014). Plant cells are able to adjust the flow of most nutrients by selective membrane transport proteins, but in this regard, B is exceptional as it exists as uncharged boric acid at physiological pH and is therefore highly permeable through the lipid bilayers that form the basis of biological membranes (Reid, 2014). The three known pathways by which B enters and exits cells are: 1) passive, bidirectional diffusion through the lipid bilayer; 2) passive, bidirectional diffusion through selective or non-selective channels; and 3) active efflux pumping (Reid, 2014). Hayes and Reid (2004) showed that a B-tolerant barley cultivar (Sahara) was able to maintain an internal B concentration lower than the external medium, presumably with an associated need for energy to preserve the gradient across the plasma membrane.

Boron toxicity symptoms vary with its mobility within the plant (de Abreu Neto et al., 2017). In common crop species, B is largely immobile once inside the cell wall, which leads to an accumulation at the leaf margins where the xylem vessels terminate (Reid, 2014), thereby causing chlorosis or necrosis of leaf tips and margins in older leaves (Brown and Shelp, 1997; Yau and Ryan, 2008). Lentil, barley (*Hordeum vulgare* L.), alfalfa (*Medicago sativa* L.), faba bean (*Vicia faba* L.), chickpea (*Cicer arietinum* L.), bread wheat (*Triticum aestivum* L.), durum wheat (*T. durum* Desf.), vetch (*Vicia* spp.), and field pea exhibit this type of B toxicity symptom (Yau and Ryan, 2008).

Reid et al. (2004) demonstrated that B inhibited growth by 40 to 60% in a monocot (barley), dicot (*Arabidopsis*—*Arabidopsis thaliana* L.) and an alga (*Chara*) when its soluble concentration reached 10 mM in the growth medium. Additionally, Kohl and Oertli (1961) reported the same concentration of B caused necrotic leaf margins in various plants. There is clearly variation in B tolerance among plant species and even among cultivars of the same species (Nable, 1988), indicating that natural variation in B toxicity tolerance exists within species, which could be used for selection and breeding of tolerant genotypes (Reid et al., 2004), leading to the development of several screening methodologies in crop species (Paull et al., 1992; Yau and Ryan, 2008; Schnurbusch et al., 2010; Javid et al., 2015; Bennett et al., 2017).

*Arabidopsis* has been used as a model of B tolerance to identify genes involved in B uptake and translocation, namely AtBOR1 and AtNIP5;1 (Takano et al., 2002; Takano et al., 2006), which conferred tolerance to plants under B-deficient conditions. Over-expression in transgenic *Arabidopsis* of AtBOR4 (an AtBOR1 paralog) and of AtTIP5;1 (OxAtTIP5;1), both of which encode transport molecules, prevents or regulates excess intercellular B (Miwa et al., 2007; Pang et al., 2010). Other studies have identified homologous genes related to B toxicity tolerance in other species: HvBot1 in barley (Sutton et al., 2007), MtNIP3 in the model legume *Medicago truncatula* (Bogacki et al., 2013), along with a single chromosomal region controlling tolerance to B in lentil (Kaur et al., 2014), two additive loci with incomplete dominance that conferred tolerance to excess B in peas (Bagheri et al., 1996), and the Bo1 marker allele in durum and bread wheat (Schnurbusch et al., 2007; Schnurbusch et al., 2008). However, there are no reports of the genetic basis of B tolerance in *Trifolium*.

Subterranean clover (*Trifolium subterraneum* L.) is the most important sown annual pasture legume species in southern Australia and is grown over an estimated area of 29 million ha in the 250 to 1200 mm annual average rainfall band (Nichols et al., 2013). *T. subterraneum* is established as a model for *Trifolium* for genetic and genomic studies (Kaur et al., 2017a) on the basis of its diploidy (2n = 16), self-pollinating habit, and presence of major genomic resources. The species consists of three subspecies: 1) ssp. *subterraneum*, 2) ssp. *yanninicum*, and 3) ssp. *brachycalycinum*, of which *subterraneum* and *yanninicum* are adapted to acidic soils, and ssp. *brachycalycinum* is better adapted to neutral–alkaline soils where B is often problematic (Nichols et al., 2013). Although there is no information available on variation in B toxicity tolerance within the genus *Trifolium*, *T. subterraneum* has a wide natural distribution, which includes various soil types, presumably with variable B content, leading to the expectation that there may be a wide range of tolerance to B toxicity in this species.

The objectives of this study, using *T. subterraneum* as a *Trifolium* model, were to 1) develop a hydroponic screening system and identify a suitable concentration of B to differentiate B tolerance, 2) investigate variation for B toxicity tolerance in a wide range of germ plasm of *T. subterraneum*, and 3) investigate the genetic and molecular basis for B toxicity tolerance in *T. subterraneum* by using the candidate gene approach. We tested the specific hypotheses that: 1) there exists a significant level of variation for boron toxicity tolerance within the existing *T. subterraneum* germ plasm diversity panel, 2) ssp. *brachycalycinum* is most likely to demonstrate tolerance to excess B among the three subspecies, and 3) a genome-wide association study (GWAS) will indicate potential genomic associations with B tolerance in *T. subterraneum*. The outcome of this study will help to enhance the efficiency of breeding for B toxicity tolerance in *T. subterraneum* and other *Trifolium* species. Increasing the tolerance of *T. subterraneum* to B toxicity may have a direct productivity benefit in soils with high B levels in the subsoil, enabling plants to better access subsoil moisture reserves in dry seasons (Holloway and Alston, 1992).

# MATERIALS AND METHODS

# Plant material: A Diversity Panel of Core Collection Lines and Cultivars

A diverse panel of 124 *T. subterraneum* genotypes (**Supplementary Table S1**) was selected for the study, which included 97 core collection accessions (Nichols et al., 2013) and 27 diverse Australian cultivars (Kaur et al., 2017b). The core collection was developed by K. Ghamkhar, R. Appels and R. Snowball to represent the genetic diversity within the world collection of >10,000 phenotypes (Nichols et al., 2013; Ghamkhar et al., 2015). Selection of the core collection followed the methodology of Ghamkhar et al. (2008) to identify a subset of 760 lines, on the basis of 1) diversity for eco-geographical data from their sites of collection; and 2) agromorphological data obtained by the Australian Trifolium Genetic Resource Centre (ATGRC) of the Department of Agriculture and Food Western Australia (DAFWA). DNA was then extracted from leaf material of each short-listed line, and 48 simple sequence repeat (SSR) primers, spread across each of the eight *T. subterraneum* chromosomes, were selected from the results of Ghamkhar et al. (2012) to identify the most diverse lines. Analysis using MSTRAT software (Gouesnard et al., 2001) to maximize diversity within the minimum number of lines resulted in an optimum core collection of 97 lines, covering 80.1% of the genetic diversity within the whole *T. subterraneum* collection. For these wild accessions with passport data, a total of 19 bioclimatic variables representing the climate of collection sites were derived from the WorldClim database (Hijmans et al., 2005).

# Protocol Development for B Toxicity Phenotyping

All phenotyping experiments were carried out in the Plant Growth Facility at The University of Western Australia. A preliminary experiment was conducted to develop a hydroponic screening system for B toxicity tolerance in *T. subterraneum* and also to identify the level of B which showed the maximum discrimination among a selection of genotypes. Ten diverse genotypes of *T. subterraneum* (**Supplementary Table S1**) were subjected to four concentrations of B (0, 15, 30, 45 mg B L<sup>-1</sup>) in a hydroponic system under controlled temperature and photoperiod, an adaptation of the method reported in Bennett et al. (2017). The temperature was set at 24/20°C day/night with a 20-h photoperiod supplied by LED lights (4:3 ratio of model 108D18-V12 tubes from S-Tech Lighting, Australia and AP67L series tubes from Valoya, Helsinki, Finland). Forty seeds of each genotype were scarified to ensure uniform germination, and seeds were placed in plastic Petri dishes on moist filter paper to imbibe. Petri dishes were wrapped in Parafilm to avoid evaporation and aluminum foil to maintain seeds in darkness. The Petri dishes were stored at 15°C for 2 days. Then, for each treatment, 10 seeds of each genotype were sown into moist peat plugs within Styrofoam trays (Garden city plastic, PLT288S). One Styrofoam tray was allocated to each B treatment and placed in separate storage tubs (35L Icon Plastics) in the controlled environment room. The experimental design was based on a strip-plot with full replication among treatments. Genotypes were arranged in rows, and rows were randomized in each B treatment. Seedlings were watered immediately after sowing and again after 24 h with tap water. After a further 24 h (Day 5), 20 L of tap water was added to each storage tub to float the Styrofoam trays. The final number of individuals was greater than 5 for all genotype–treatment combinations. On day 8, the water in the tubs was replaced with a nutrient solution in DI water (adapted from Hoagland and Arnon, 1938) (**Supplementary Table S2**). The pH was maintained between 6 and 7 with bi-weekly adjustments using KOH to raise pH and H<sub>3</sub>PO<sub>4</sub> to lower pH. On day 14, B (as H<sub>3</sub>BO<sub>3</sub>) was added into the hydroponic solution to achieve the desired concentration for treatments. On day 19, individual plants were scored for severity of B toxicity leaf symptoms using a 0.0- to 8.0-rating scale adapted for *T. subterraneum* from that used by Bagheri et al. (1992) (**Supplementary Table S3**). Leaf symptom scores were chosen as the B toxicity metric as they are routinely used in phenotyping similar legume species for B toxicity (Yau and Ryan, 2008). Leaf symptom scores for each B treatment were analyzed separately using "LSD.test" (agricolae package) in RStudio (Version 0.99.484, RStudio, Inc. R Core Team 2009-2015). The B treatment that provided the greatest level of discrimination (15 mg B L<sup>-1</sup>) was selected for further screening of the panel of 125 genotypes.
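Because B was supplied as boric acid, the mass of H<sub>3</sub>BO<sub>3</sub> needed per tub follows from the ratio of molar masses. The sketch below illustrates that arithmetic only; the 20 L solution volume is an assumption (the text gives 20 L for the tap water used to float the trays, not for the nutrient solution), and the function name is hypothetical.

```python
# Standard atomic/molar masses in g/mol.
B_ATOMIC_MASS = 10.81
H3BO3_MOLAR_MASS = 61.83

# ASSUMPTION: each tub holds 20 L of solution; the source states this volume
# only for the tap water added on Day 5.
ASSUMED_TUB_VOLUME_L = 20.0


def boric_acid_mass_mg(target_b_mg_per_l: float, volume_l: float) -> float:
    """Return the mass of H3BO3 (mg) supplying the target elemental B level."""
    elemental_b_mg = target_b_mg_per_l * volume_l
    # Scale elemental B up to the mass of the compound that carries it.
    return elemental_b_mg * H3BO3_MOLAR_MASS / B_ATOMIC_MASS


if __name__ == "__main__":
    for conc in (0, 15, 30, 45):  # the four treatments, in mg B per litre
        mass = boric_acid_mass_mg(conc, ASSUMED_TUB_VOLUME_L)
        print(f"{conc:>2} mg B/L -> {mass:7.1f} mg H3BO3 per tub")
```

Under these assumptions, each mg of elemental B corresponds to roughly 5.7 mg of boric acid.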

# Diversity Panel Screening for B Toxicity

Screening the panel of 125 genotypes (**Supplementary Table S1**) was conducted in two sub-experiments, with genotypes allocated randomly to sub-experiments. Each sub-experiment contained four hydroponic culture tubs subjected to identical growth conditions, making a total of eight "blocks" for the 125 genotypes' screening. To test and correct for block effects, each tub contained five "check" cultivars (Antas, Dalkeith, Gosse, Izmir, and Losa) arranged randomly as partial replicates and 15 "test" entries in rows of up to ten individual plants (**Supplementary Figure S1**). Test entries were randomly allocated to tubs. Seeds were germinated, planted, and grown for 14 days as previously described. Five days after the plants were transferred to Hoagland's solution, we observed some mild leaf symptoms on some individuals that could potentially confound B toxicity symptom expression. Hence, all plants were scored for leaf damage prior to the addition of B, and these data were used as a covariate for correction in the final analysis as described below. On day 14, 15 mg B L<sup>-1</sup> (as H<sub>3</sub>BO<sub>3</sub>) was added to the hydroponic solution. Boron exposure was increased from 5 to 7 days when screening the 125 genotypes to improve the phenotyping result in this wider selection of germ plasm. Plants were scored individually for B toxicity symptoms on day 21 (**Supplementary Figures S2**, **S3A** and **Supplementary Table S3**). The final number of individuals was greater than 3 for all genotype–treatment combinations. After correction for block effects and prior leaf damage, the corrected means of B toxicity score (**Supplementary Figure S3B**) were used for further analysis.

A covariate was applied to each tub to correct for block effects by averaging the B toxicity scores of all five check lines in each tub. This covariate and a covariate derived from prior leaf damage scores were used to correct the average B toxicity score of each of the 125 genotypes (on an individual plant basis) using "UNIANOVA" in SPSS (IBM Corp., 2013). This corrected mean B toxicity score (B score) was plotted and used in the GWAS, and in further analysis of correlations between B tolerance and data from passport information. The B score was also compared to the climate at the collection site of genotypes using Bioclimatic variables (Hijmans et al., 2005; Nichols et al., 2013) (**Supplementary Table S1**) as described below.
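The two-covariate correction can be illustrated with a simplified sketch. It is not the SPSS "UNIANOVA" model used in the study: here the slopes for the tub check mean and the prior leaf damage score are estimated by ordinary least squares pooled over all plants, ignoring the genotype term, and each plant's score is then adjusted to the average covariate values. All names and the toy records are hypothetical.

```python
from collections import defaultdict
from statistics import mean


def adjusted_genotype_means(records):
    """records: iterable of (genotype, raw_score, tub_check_mean, prior_damage).

    Returns {genotype: mean score adjusted to average covariate values}.
    SIMPLIFICATION: slopes come from a pooled two-covariate regression,
    not from a model that also fits genotype as a factor.
    """
    records = list(records)
    y = [r[1] for r in records]
    x1 = [r[2] for r in records]
    x2 = [r[3] for r in records]
    c1 = [v - mean(x1) for v in x1]  # centred block covariate
    c2 = [v - mean(x2) for v in x2]  # centred prior-damage covariate
    cy = [v - mean(y) for v in y]

    s11 = sum(a * a for a in c1)
    s22 = sum(b * b for b in c2)
    s12 = sum(a * b for a, b in zip(c1, c2))
    s1y = sum(a * v for a, v in zip(c1, cy))
    s2y = sum(b * v for b, v in zip(c2, cy))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("covariates are collinear or constant")
    b1 = (s22 * s1y - s12 * s2y) / det  # slope for the block covariate
    b2 = (s11 * s2y - s12 * s1y) / det  # slope for prior leaf damage

    adjusted = defaultdict(list)
    for (genotype, score, _, _), a, b in zip(records, c1, c2):
        adjusted[genotype].append(score - b1 * a - b2 * b)
    return {g: mean(v) for g, v in adjusted.items()}


if __name__ == "__main__":
    toy = [("A", 2.5, 1, 0), ("A", 4.0, 2, 1), ("B", 3.5, 3, 0), ("B", 4.5, 1, 2)]
    print(adjusted_genotype_means(toy))
```

In the toy data the raw scores are fully explained by the two covariates, so both genotypes end up with the same adjusted mean; real data would retain genotype differences after adjustment.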

Differences in the B score among non-continuous variables in the passport data (soil texture, country of origin, subspecies and category) were analyzed by one-way ANOVA in RStudio with B score as the dependent variable, and results were tabulated and plotted (box plot, RStudio default function). Continuous variables (latitude, longitude, altitude, soil pH and 19 BioClim variables) were analyzed by ANOVA in RStudio to produce rcorr correlation coefficients (Hmisc package) and their significance (*R*<sup>2</sup> and *P* value).
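As a rough illustration of the two kinds of test described above, the sketch below computes a one-way ANOVA *F* statistic for a categorical variable and a Pearson correlation for a continuous one. It is written in plain Python rather than R, omits *P* values (which need the F and t distributions), and uses made-up numbers; it is not the authors' script.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


if __name__ == "__main__":
    # HYPOTHETICAL B scores grouped by a categorical passport variable.
    scores_by_group = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
    print(one_way_anova_f(scores_by_group))
    # HYPOTHETICAL B scores against a continuous passport variable.
    r = pearson_r([5.5, 6.0, 7.3, 8.1], [2.9, 2.1, 1.8, 2.4])
    print(r, r ** 2)
```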

#### Genotyping of the Diversity Panel and Genome-Wide Associations

Phenotypic B tolerance information obtained from the core collection of 97 accessions and the 27 elite Australian cultivars of *T. subterraneum* was associated with specific regions in the advanced assembly (Tsub\_Refv2.0) (Kaur et al., 2017a) using GWAS analyses. Genomic DNA (gDNA) was extracted from a single plant of each of the 124 genotypes of *T. subterraneum* and sequenced. High-quality whole-genome resequencing (WGRS) data were generated for all 124 of these accessions and cultivars as described by Kaur et al. (2017b). SNP identification was conducted using samtools and bcftools (Li et al., 2009; Li, 2011). SNPs with at least one heterozygous allele, those with a minor allele frequency (MAF) ≤ 5%, and those that were not present in at least one individual were then removed to keep only homozygous SNPs and remove errors of mis-mapping heterozygotes. This led to removal of clustered SNPs, which is further confounded by the relatively small population size. A QQ plot was produced to evaluate the effect of the small population size (**Supplementary Figure S4**). Consecutive SNPs were merged using PLINK v1.9 (Purcell et al., 2007; Chang et al., 2015) into haplotype blocks if their *r*<sup>2</sup> values were above 0.8. Linkage disequilibrium was visualized using Haploview v4.2 (Barrett et al., 2005).
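The filtering and merging logic can be sketched in a few lines. This is a toy illustration, not the samtools/bcftools/PLINK pipeline: genotypes are assumed to be coded as alternative-allele counts (0, 1, 2, or `None` for a missing call), the missing-data rule is interpreted as "discard a SNP lacking a call in any individual", and blocks are built by a simple greedy merge of neighbouring SNPs, which is not PLINK's actual block algorithm.

```python
def minor_allele_frequency(genotypes):
    """MAF from alternative-allele counts (0/1/2) with no missing calls."""
    alt_freq = sum(genotypes) / (2 * len(genotypes))
    return min(alt_freq, 1 - alt_freq)


def passes_filters(genotypes, maf_threshold=0.05):
    """Keep a SNP only if fully called, homozygous everywhere, and MAF > 5%."""
    if any(g is None for g in genotypes):   # ASSUMED reading of the missing-call rule
        return False
    if any(g == 1 for g in genotypes):      # at least one heterozygous call
        return False
    return minor_allele_frequency(genotypes) > maf_threshold


def r_squared(a, b):
    """Squared Pearson correlation between two genotype vectors."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov * cov / (var_a * var_b)


def merge_blocks(snps, threshold=0.8):
    """Greedily merge consecutive SNPs whose r^2 with the previous SNP > threshold.

    snps: list of (position, genotypes) sorted by position on one chromosome.
    Returns a list of blocks, each a list of positions.
    """
    blocks = []
    for pos, geno in snps:
        if blocks and r_squared(blocks[-1][-1][1], geno) > threshold:
            blocks[-1].append((pos, geno))
        else:
            blocks.append([(pos, geno)])
    return [[pos for pos, _ in block] for block in blocks]


if __name__ == "__main__":
    a = [0, 0, 2, 2, 0, 2]
    c = [0, 2, 0, 2, 2, 0]
    snps = [(100, a), (200, list(a)), (300, c)]
    kept = [(p, g) for p, g in snps if passes_filters(g)]
    print(merge_blocks(kept))
```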

The population structure for GWAS was of two sub-populations: the first sub-population comprised 27 cultivars released in Southern Australia for grazing, while the second sub-population of 97 accessions was a core germ plasm collection, a stratified sample of the world collection of *T. subterraneum* (Kaur et al., 2017b). Despite these two sub-populations, a principal component analysis reported in Kaur et al. (2017b) revealed the diversity between the sub-species: ssp. *subterraneum*, ssp. *yanninicum* and ssp. *brachycalycinum*. Individuals were split into three subpopulations corresponding to the three sub-species (**Supplementary Figure S5**). GWAS was conducted *via* a logistic regression in PLINK v1.90b3.42 (Chang et al., 2015), using four principal components as covariates to correct for population stratification. BLASTP v2.2.30+ was used to link the identified genes with known genes within the annotations of *M. truncatula* Mt4.0v2 (Tang et al., 2014) and *Arabidopsis thaliana* (TAIR10) (Haas et al., 2005; Lamesch et al., 2012).

#### Marker-Trait Association Studies and Putative Candidate Gene Analysis

Each significant marker-trait association (MTA) resulting from the GWAS was checked for any overlaps with haplotype blocks with *r*<sup>2</sup> values above 0.8. Where an overlap was found, sequences 25 bp upstream and downstream of the SNP were extracted from the reference and used in primer3 v2.3.7 (Koressaar and Remm, 2007; Untergasser et al., 2012) to design PCR-ready markers for marker-assisted selection (MAS) for this B toxicity tolerance trait (settings: primer product size, 250–500; primer optimum size, 300; primer minimum temperature, 55°C; optimal temperature, 57°C; maximum temperature, 60°C).
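Extracting the flanking sequence around a SNP is a simple slicing step. The sketch below assumes a 1-based SNP coordinate and a chromosome held as a plain string; the function name and the toy reference are hypothetical, and the primer3 design step itself is not reproduced.

```python
def flanking_sequence(reference: str, snp_pos: int, flank: int = 25):
    """Return (upstream, reference_allele, downstream) around a 1-based SNP.

    Flanks are truncated at the chromosome ends rather than padded.
    """
    idx = snp_pos - 1  # convert to a 0-based string index
    if not 0 <= idx < len(reference):
        raise ValueError("SNP position lies outside the reference sequence")
    upstream = reference[max(0, idx - flank):idx]
    downstream = reference[idx + 1:idx + 1 + flank]
    return upstream, reference[idx], downstream


if __name__ == "__main__":
    toy_reference = "ACGT" * 20  # HYPOTHETICAL 80-bp chromosome
    up, allele, down = flanking_sequence(toy_reference, 40)
    print(f"{up}[{allele}]{down}")
```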

Putative candidate genes were proposed for each significant MTA by extracting the genes upstream, downstream, or overlapping with GWAS candidate SNPs.
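A minimal sketch of this lookup follows, assuming gene models are available as (identifier, start, end) tuples on the same chromosome as the SNP. "Upstream" and "downstream" are taken here to mean lower and higher coordinates, ignoring strand, which is an assumption; the gene names and coordinates are invented for illustration.

```python
def annotate_snp(snp_pos, genes):
    """Find genes overlapping a SNP and its nearest neighbours on each side.

    genes: list of (gene_id, start, end) with inclusive coordinates.
    Returns a dict with 'overlapping' (list of ids) and 'upstream' /
    'downstream' as (distance_bp, gene_id) or None.
    """
    overlapping = [gid for gid, start, end in genes if start <= snp_pos <= end]
    upstream = [(snp_pos - end, gid) for gid, start, end in genes if end < snp_pos]
    downstream = [(start - snp_pos, gid) for gid, start, end in genes if start > snp_pos]
    return {
        "overlapping": overlapping,
        "upstream": min(upstream) if upstream else None,
        "downstream": min(downstream) if downstream else None,
    }


if __name__ == "__main__":
    # HYPOTHETICAL gene models on one chromosome.
    toy_genes = [("geneA", 100, 200), ("geneB", 500, 700), ("geneC", 1000, 1200)]
    print(annotate_snp(450, toy_genes))
    print(annotate_snp(600, toy_genes))
```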

#### RESULTS

#### Phenotypic Traits and Boron Toxicity Tolerance

To establish a suitable screening system and to determine the best concentration of B for phenotypic traits in a hydroponic system, ten genotypes of *T. subterraneum* were tested under four different concentrations of B (preliminary experiment). Tip chlorosis and necrosis were apparent in leaves after 5 days of B treatment, and plants were scored (**Supplementary Figure S2**). ANOVA indicated that both treatment and line had a significant effect (*P* < 0.05) on the score, with a significant interaction between these factors. Overall, the severity of symptoms increased with increasing B concentration (*P* < 2.2e-16). An LSD test in RStudio for leaf symptom score revealed that 15 mg B L<sup>-1</sup> provided the greatest level of discrimination among genotypes, and so this concentration was selected to screen the panel of 125 genotypes (**Table 1**).

In the subsequent screening experiment, 125 genotypes of *T. subterraneum* were subject to 15 mg B L<sup>-1</sup>. Among the 125 accessions, there was a continuous distribution of tolerance to B toxicity (**Figure 1**). The genotype most tolerant to excess B concentration was L44 (ssp. *subterraneum*) with an average B score of 0.3 (se 0.27). The most tolerant among the cultivars of *T. subterraneum* tested was Dwalganup, which is also a *subterraneum* ssp., with a B score of 0.8 (se 0.20) (**Figure 1** and **Supplementary Table S1**).

In terms of susceptibility, L74 (ssp. *subterraneum*) showed the most severe B toxicity symptom with an average B score of 4.1 (se 0.27). This genotype was collected from an area with clayey soil texture and pH = 7.3 (**Figure 1** and **Supplementary Table S1**). Among the cultivars, Nungarin (ssp. *subterraneum*) was the most susceptible with a B score of 3.9 (se 0.21) (**Figure 1** and **Supplementary Table S1**).

Associations between B score and continuous variables of origin (latitude, longitude, altitude, soil pH, and 19 BioClim variables) were tested by Pearson's correlation (**Supplementary Figure S6** and **Supplementary Table S4**). We anticipated a significant correlation between B score and soil pH. However, the strongest correlation with B score was longitude (*P* value <0.1, *R*<sup>2</sup> = 0.029). The analysis of B scores compared among discontinuous variables (soil texture, country of origin, subspecies, and category) (**Supplementary Figures S7A-D**) indicated a significant difference (*P* value <0.05) existed between B toxicity symptoms of different subspecies (**Supplementary Figure S7C**). The *post hoc* LSD test showed subspecies *brachycalycinum* was significantly more susceptible than ssp. *subterraneum* or *yanninicum* (*P*-value <0.05) (**Supplementary Table S5**). Soil pH at collection sites ranged from 6 to 9 (mean value = 7.3) at ssp. *brachycalycinum* sites and from 5 to 9 (mean value = 6.3) in ssp. *subterraneum* sites (**Supplementary Figure S8** and **Supplementary Table S1**). Soil pH data were not available for ssp. *yanninicum* accessions (**Supplementary Figure S8**). Comparing the 28 cultivars with the 97 wild accessions, the means for B toxicity tolerance of the groups were similar as were the ranges (**Supplementary Figure S7B**).

#### TABLE 1 | Effect of four different concentrations of boron for 10 genotypes of *T. subterraneum* in preliminary experiment.

*B, Boron (0, 15, 30, 45 mg B L<sup>-1</sup>).*

*<sup>abcdef</sup>Means within columns not sharing a common letter are significantly different (P < 0.05).*

GRC identity using Supplementary Table S1.

#### Associating SNPs to Gene Models and PCR-Ready Markers to Track Haplotype Variation

Potential genes were proposed for each significant MTA by extracting the genes upstream, downstream or overlapping with GWAS candidate SNPs. BLASTP was used to search for homologues of proteins encoded by the candidate genes within the *M. truncatula* Mt4.0v2 and *Arabidopsis thaliana* (TAIR10) databases.

The GWAS identified eight markers associated with the B trait that reached a suggestive *P* value below 1e-5 on chromosomes 1, 2, 3, 5, 6, and 7 (**Figure 2** and **Table 2**). The QQ plot suggested a relatively weak effect due to the small population size (**Supplementary Figure S4**). The SNPs located on Chr 1, 2, and 3 were mapped in haplotype blocks containing 21, 13, and 5 other SNPs with a total length of 366.62, 240.97, and 13.43 kbp, respectively (**Figure 3** and **Table 3**). The significant SNP identified on chromosome 1 was located in the region of candidate gene TSub\_g2235, positioned between 32,860,391 and 32,866,822 with a total exon length of 2548 bp (**Table 2**) (Kaur et al., 2017a). Two significant SNPs were identified on Chr 2. For the first of these, an upstream endonuclease/exonuclease/phosphatase family protein (Tsub\_g4776) and a downstream subtilisin-like serine protease (Tsub\_g4777) were identified at a distance of 28878 and 20045 bp, respectively, from the suggestive SNP. For the second, an upstream calcium-binding EF-hand protein (Tsub\_g7559) and a downstream pinoid-binding protein 1 (Tsub\_g7560) were identified at a distance of 23628 and 5415 bp, respectively, from the suggestive SNP (**Table 2**).

An upstream leucine-rich repeat receptor-like protein kinase (Tsub\_g9589) and a downstream ribosomal protein L16p/L10e family (Tsub\_g9590) were identified for the significant SNP on Chr 3 at distances of 2943 and 289 bp, respectively (**Table 2**). On Chr 5, an upstream transcription factor-like protein (Tsub\_g16463) and a downstream cytochrome P450 family protein (Tsub\_g16464) were detected at distances of 1804 and 43166 bp, respectively, from the suggestive SNP. Two significant SNPs were found on Chr 6, both in the region of a potential gene (TSub\_19611) which aligned with signal recognition particle 54-kDa protein in the *M. truncatula* database (**Table 2**). On Chr 7, we identified an upstream rare lipoprotein A-like double-psi beta-barrel protein (TSub\_g22842) and a downstream gene with no significant hits (Tsub\_g22843), at distances of 3094 and 57 bp, respectively, from the suggestive SNP (**Table 2**).

The haplotype blocks containing the MTA SNPs on Chr 1, 2, and 3, with total lengths of 366,616 bp, 109,074 and 32,973 bp, and 13,434 bp, respectively, were used to design PCR-ready markers for MAS for this B toxicity tolerance trait (**Table 3**).

#### TABLE 2 | Significant genomic associations identified using the phenotyping data for the boron toxicity tolerance in *T. subterraneum*.

*\*Haplotype block.*

*\*\*Alleles: G/T,A means that G is the reference and T,A are alternative alleles in the population.*

*Three SNPs on chromosome 2 that behaved identically to 2\_56134002 were omitted (2\_56134932, 2\_56138979, 2\_56139128).*

#### DISCUSSION

The present study was designed to estimate variation in B stress tolerance and to identify potential B stress-responsive genes in *Trifolium* using *T. subterraneum* as a model. The study is the first report to demonstrate that substantial useful variation in B toxicity tolerance exists in *T. subterraneum*. Furthermore, the high broad-sense heritability (*H*<sup>2</sup> = 0.92) indicates that the trait is little influenced by environmental conditions. Variation in B toxicity tolerance has previously been reported for cereals (barley and wheat) and legumes (medics, peas and lentils) (Nable, 1988; Paull et al., 1988; Paull et al., 1992; Bagheri et al., 1994; Yau and Erskine, 2000; Schnurbusch et al., 2008). Among *T. subterraneum* cultivars tested, Dwalganup and Nungarin were the most tolerant and susceptible to B toxicity, respectively. Clearly response to selection for B stress tolerance can be anticipated.
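For reference, broad-sense heritability is conventionally defined as the share of phenotypic variance attributable to genotypic variance. The estimator used to obtain the value of 0.92 is not stated in this section, so the expression below is the textbook definition rather than the authors' exact calculation, and the variance symbols are introduced here only for illustration.

$$
H^2 = \frac{\sigma^2_G}{\sigma^2_P} = \frac{\sigma^2_G}{\sigma^2_G + \sigma^2_E}
$$

Here $\sigma^2_G$ is the genotypic variance, $\sigma^2_E$ the environmental (residual) variance, and $\sigma^2_P$ their sum. Under this definition, a value of 0.92 would mean that about 92% of the observed variation in B score is attributable to differences among genotypes.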

Significant MTAs for B toxicity tolerance were detected through GWAS analysis, with the most significant discovery being the SNPs located on chromosomes 1, 2, and 3, which mapped into haplotype blocks. The potential gene on Chr 1 (TSub\_g2235) aligned with *M. truncatula* respiratory burst oxidase-like protein and respiratory burst oxidase homolog (RBOH) protein B in *A. thaliana*. Respiratory burst NADPH oxidases in plants produce reactive oxygen species (ROS) as a defence mechanism (Suzuki et al., 2011). Respiratory burst oxidase homologues in plants are plasma membrane enzymes which produce ROS. They participate in a variety of mechanisms, such as cell elongation and abiotic stress signaling pathways, hormonal signaling, and pathogen response (Montiel et al., 2016; Arthikala et al., 2017). Recent studies have revealed that RBOHs participate in legume–rhizobia interaction (Montiel et al., 2016). Ozhuner et al. (2013) described B toxicity symptoms as disruption of cell wall biosynthesis, inhibition of cell division and elongation, and metabolic decline caused by B binding to the ribose component of ATP, NADH, and NADPH. Our results suggest that RBOHs may also be involved in B toxicity tolerance in *T. subterraneum*.

The BLAST search revealed that some of the other SNPs identified in MTAs have high sequence similarities with potential genes known for plant stress responses (**Table 2**). Based on the results of the current study, these derived proteins may also be expressed in B toxicity conditions in *T. subterraneum*. Being in haplotype blocks, the genes identified on Chr 1, 2, and 3 are the most stable candidates and have the greatest potential for designing molecular markers to track haplotype variation for this trait (**Table 3**). We now plan to functionally validate the genes found to be associated with B toxicity in subterranean clover using the CRISPR-Cas system in a follow-up study.

Although the corresponding proteins for the B transporter/channel genes AtBOR1 and AtNIP5;1 in *Arabidopsis* are responsible for B uptake in B-deficient conditions (Takano et al., 2006), similar proteins in barley and *M. truncatula* have been found to be linked to B toxicity tolerance (Reid, 2007; Sutton et al., 2007; Bogacki et al., 2013). However, in the present study, no linkage was found between B toxicity tolerance and AtBOR1 or AtNIP5;1 in *T. subterraneum*.

Molecular markers have been identified for selection of B toxicity tolerance in other plant species. Tolerance to excess B is controlled by a single gene in the model legume *M. truncatula* (Bogacki et al., 2013), barley (Sutton et al., 2007), and lentil (Kaur et al., 2014). Based on our phenotypic and genotypic results, this trait could be controlled by more than one gene in *T. subterraneum*.

The hydroponic system developed herein provides an efficient, rapid (21 days) method to screen nutrient toxicity and deficiency in breeding studies. This is the first report of hydroponic screening for B toxicity tolerance in *T. subterraneum*. Hydroponics have previously been used as a rapid method for B toxicity tolerance screening in barley, *Brassica rapa* L., wheat, rice (*Oryza sativa* L.), and field pea (Jefferies et al., 1999; Kaur et al., 2006; Schnurbusch et al., 2007; Pallotta et al., 2014; de Abreu Neto et al., 2017; Bennett et al., 2017), B toxicity and salinity tolerance screening in field pea (Javid et al., 2015), and aluminum tolerance screening in barley and wheat (Baier et al., 1995; Ma et al., 1997). Screening for abiotic stress tolerance in the field is difficult due to environmental heterogeneity and variation of mineral content in soil (Stoddard et al., 2006). This research has provided a robust, high-throughput hydroponic protocol for screening B toxicity which could be readily applied to other plant species and/or other abiotic stresses.

In previous hydroponic B toxicity screening studies, the B concentrations used ranged from 162 mg L<sup>−1</sup> in wheat (Schnurbusch et al., 2007) to 8 mg L<sup>−1</sup> in rice (de Abreu Neto et al., 2017). In the latter study, 8 mg B L<sup>−1</sup> induced severe toxicity in many varieties of rice (de Abreu Neto et al., 2017). Wheat appears to be more tolerant of high B concentrations, and rice more susceptible, than *T. subterraneum*.

Boron toxicity is problematic in soils with high pH (Yau and Ryan, 2008). Our expectation was that ssp. *brachycalycinum*, which is commonly found on alkaline soils, was more likely to demonstrate B tolerance than the other two subspecies of *T. subterraneum*. However, B toxicity tolerance was not significantly correlated with any passport data or BIOCLIM variables, including soil pH (**Supplementary Table S4**) and, among the three subspecies, *brachycalycinum* was the most susceptible to B toxicity. Therefore, our expectation was not met. A possible explanation for these results is that the *brachycalycinum* genotypes tested here did not come from highly alkaline soils. B has relatively high availability in soils of pH 5 to 6.5, with availability then dropping as pH increases to 8.5. In soils above pH 8.5, B once again becomes highly available (Dwivedi et al., 1992). This reduced availability of B in neutral to moderately alkaline soils is particularly prevalent in soils with high calcium content, as B has a tendency to bind with Ca in the soil (Dwivedi et al., 1992). Some studies have demonstrated that adding lime to acidic soil increased soil pH to a more moderate level and, consequently, could result in lower concentrations of B in pea and barley plant tissue (Gupta and Macleod, 1981; Dwivedi et al., 1992). The *brachycalycinum* genotypes tested here were collected from soils with pH ranging from 6 to 9, with most (48%) collected from soil of pH 6.5 to 7.5, where B would have poor availability. In contrast, the ssp. *subterraneum* genotypes tested in the current study were mostly (70%) collected from soils with pH less than 6.5. As previously highlighted, B is readily available in soil of pH 5 to 6.5, consistent with our results indicating that the most tolerant genotypes were found in ssp. *subterraneum*.

In conclusion, this study demonstrated substantial variation in tolerance to B toxicity in *T. subterraneum* germplasm, which was genetically dissected by GWAS. Potential genes associated with B toxicity tolerance were identified through GWAS and merit further investigation. The high-throughput hydroponic system developed here could be applied to other plants for abiotic stress screening. Furthermore, tolerant cultivars, such as Dwalganup and Napier, would be priorities for use in soil types with potential for B toxicity. The results from this study provide valuable new information for both plant breeding and gene validation studies using CRISPR technology in *T. subterraneum*.

#### AUTHOR CONTRIBUTIONS

HT performed the boron phenotyping research under the guidance of PK, WE, RB, PB, and MP-N and wrote the article with contributions from all. PK designed and performed the sequencing experiments and conducted the bioinformatics analysis with PB. All authors read the article and approved the content.

#### FUNDING

This study was conducted by the Centre for Plant Genetics and Breeding (PGB) at The University of Western Australia (UWA). Funding for this work was also provided by an Australian Research Council Linkage Grant (LP100200085), UWA RMF Grant, Meat and Livestock Australia (MLA grant B.PBE.037) and the Department of Agriculture and Food Western Australia (DAFWA).

#### ACKNOWLEDGMENTS

The authors gratefully acknowledge the Department of Primary Industry and Regional Development (DPIRD) for the provision of seeds for this study. This work was also supported by resources provided by the Pawsey Supercomputing Centre with funding from the Australian Government and the Government of Western Australia. The support to HT from the Australian Wool Education Trust (AWET) and to PB from the Forrest Research Foundation is gratefully acknowledged.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01043/full#supplementary-material.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Tahghighi, Erskine, Bennett, Bayer, Pazos-Navarro and Kaur. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Evaluation of Protein and Micronutrient Levels in Edible Cowpea (Vigna Unguiculata L. Walp.) Leaves and Seeds

Felix D. Dakora<sup>1</sup> \* and Alphonsus K. Belane<sup>2</sup>

<sup>1</sup> Chemistry Department, Tshwane University of Technology, Pretoria, South Africa, <sup>2</sup> Department of Crop Sciences, Tshwane University of Technology, Pretoria, South Africa

#### Edited by:

Jose C. Jimenez-Lopez, Consejo Superior de Investigaciones Científicas (CSIC) Granada, Spain

#### Reviewed by:

Abe Shegro Gerrano, Agricultural Research Council of South Africa (ARC-SA), South Africa Ainong Shi, University of Arkansas, United States

> \*Correspondence: Felix D. Dakora DakoraFD@tut.ac.za

#### Specialty section:

This article was submitted to Crop Biology and Sustainability, a section of the journal Frontiers in Sustainable Food Systems

> Received: 23 May 2019 Accepted: 16 August 2019 Published: 04 September 2019

#### Citation:

Dakora FD and Belane AK (2019) Evaluation of Protein and Micronutrient Levels in Edible Cowpea (Vigna Unguiculata L. Walp.) Leaves and Seeds. Front. Sustain. Food Syst. 3:70. doi: 10.3389/fsufs.2019.00070

Cowpea is the most important seed legume in Africa. Its leaves and seed are consumed to meet dietary protein and micronutrient requirements in rural African communities. In this study, leaf protein of 32 cowpea genotypes was 23–40% at Taung (South Africa), 28–40% at Wa and 24–35% at Manga (Ghana). Seed protein level was also up to 40% in landrace Bengpla and more than 30% in nine other genotypes planted at Taung. Trace elements in cowpea leaves showed markedly high concentrations of Fe (2,011 µg.g<sup>−1</sup>), Zn (150 µg.g<sup>−1</sup>), Mn (325 µg.g<sup>−1</sup>), and B (43 µg.g<sup>−1</sup>) in genotype Apagbaala, in contrast to the very low levels of Fe (273 µg.g<sup>−1</sup>), Zn (40 µg.g<sup>−1</sup>), Mn (219 µg.g<sup>−1</sup>), and B (32 µg.g<sup>−1</sup>) in genotype Encore. Leaf Fe concentration was highest in genotype Apagbaala (2,011 µg.g<sup>−1</sup>), followed by Fahari (2,004 µg.g<sup>−1</sup>), Iron Gray (1,302 µg.g<sup>−1</sup>), Line 2020 (944 µg.g<sup>−1</sup>), Bensogla (927 µg.g<sup>−1</sup>), Omondaw (605 µg.g<sup>−1</sup>), IT96D-1951 (591 µg.g<sup>−1</sup>), IT93K-452-1 (574 µg.g<sup>−1</sup>), Ngonji (569 µg.g<sup>−1</sup>), and Mchanganyika (566 µg.g<sup>−1</sup>), and lowest in Bechuana white (268 µg.g<sup>−1</sup>). Cowpea seed also showed greater concentrations of Fe in genotype Soronko (67 µg.g<sup>−1</sup>), IT93K-452-1 (67 µg.g<sup>−1</sup>), Brown Eye (65 µg.g<sup>−1</sup>), Bensogla (61 µg.g<sup>−1</sup>), and TVU11424 (62 µg.g<sup>−1</sup>).
Trace elements in cowpea seed differed among genotypes, and ranged from 45.1 to 67.0 µg.g<sup>−1</sup> for Fe, 33.9 to 69.2 µg.g<sup>−1</sup> for Zn, 10.1 to 17.4 µg.g<sup>−1</sup> for Mn, 14.7 to 21.4 µg.g<sup>−1</sup> for B, and 5.2 to 8.1 µg.g<sup>−1</sup> for Cu. Genotypes Apagbaala, Fahari, Iron Gray, and Line 2020, respectively, exhibited 34.2-, 34.0-, 22.5-, and 18.3-fold higher Fe concentration in leaves than seed, and 3.5-, 2.0-, 2.0-, and 3.5-fold greater Zn in leaves than seed (in that order). The genotypes that accumulated significantly high levels of protein and trace elements in cowpea leaves and seed were generally high N<sub>2</sub>-fixers, thus suggesting a link between N<sub>2</sub> fixation and cowpea's ability to synthesize protein and accumulate nutrient elements in leaves and seed. Therefore, identifying cowpea genotypes that can enhance protein accumulation and micronutrient density in edible leaves and seed through breeding has the potential to overcome protein-calorie malnutrition and trace element deficiency in rural Africa.

Keywords: cowpea, food, sustainability, breeding, micronutrients

# INTRODUCTION

Food and nutritional insecurity remain a major problem facing Africa, as about 239 million people are suffering from protein-calorie malnutrition (Fanzo, 2012), and another 232 million from micronutrient deficiency (Andrea and Rose, 2015). In Africa, hunger is the result of food insecurity due to low crop yields stemming from soil moisture deficit from low rainfall, farmer use of nutrient-poor soils for agriculture and unimproved crop varieties, as well as the effects of biotic stress such as insect pests and diseases (Dakora and Keya, 1997). Although N fertilizers can be used to overcome soil infertility and increase crop yields, they are expensive and inaccessible in Africa. On average, only about 8.8 kg NPK fertilizer is applied per hectare by smallholder farmers in Africa (Henao and Baanante, 2006).

Because nutritious food production is low, protein-calorie malnutrition, the outcome of low protein and calorie intake, is highly prevalent in African children. Although the consumption of meat, dairy products and seafood can overcome protein-calorie malnutrition, these foods are too expensive for resource-poor households in rural Africa. The use of protein-rich plant foods has therefore been the main option for many poor African communities. Leafy vegetables, for example, are a good source of dietary protein (Aletor et al., 2002); however, nodulated legumes are even better due to their ability to fix N<sub>2</sub> when in symbiosis with soil bacteria termed "rhizobia." Here, N<sub>2</sub>-fixing bacteroids in root nodules are able to reduce atmospheric N<sub>2</sub> to NH<sub>3</sub>, which is incorporated into amino acids and protein, and stored in leaves and seeds. This explains why the edible leaves and seed of legumes (or pulses) are a very rich source of dietary protein. Of the cultivated legumes used as food, seed protein is as high as 40% in soybean (Zarkadas et al., 2007), 33% in cowpea (Ddamulira and Santos, 2015), 20–25% in common bean (Broughton et al., 2003), 20.6% in Bambara groundnut (Mazahib et al., 2013), 21.3% in Kersting's bean (Ayenan and Ezin, 2016), 27–29% in pigeonpea (Saxena et al., 1987), 21–31% in mungbean (Yi-shen et al., 2018), 21.8–25.8% in chickpea (Xu et al., 2016) and 20–30% in groundnut (Toomer, 2018). Additionally, cowpea also contains 34.9% protein in edible leaves (Enyiukwu et al., 2018).

In addition to protein, the edible leaves and seeds of legumes also contain high levels of dietarily-important mineral nutrients, which are needed for human nutrition and health, especially for overcoming trace element deficiency and promoting brain development. For example, mineral concentrations are reported to be 142–626 and 60–99 mg.kg<sup>−1</sup> for Fe, 49–104 and 44–65 mg.kg<sup>−1</sup> for Zn, 196–394 and 5–32 mg.kg<sup>−1</sup> for Mn, 8.6–19.7 and 8.3–14.7 mg.kg<sup>−1</sup> for Cu and 42–55 and 10–22 mg.kg<sup>−1</sup> for B in cowpea leaves and seeds, respectively (Belane and Dakora, 2011a). Other studies have reported 22.6 mg.kg<sup>−1</sup> Fe, 33.1 mg.kg<sup>−1</sup> Zn, 6.7 mg.kg<sup>−1</sup> Mn, and 7.5 mg.kg<sup>−1</sup> Cu for groundnut seed (Toomer, 2018), as well as 500.0 mg.kg<sup>−1</sup> Fe, 405.0 mg.kg<sup>−1</sup> Zn, 480.0 mg.kg<sup>−1</sup> Mn, and 85.0 mg.kg<sup>−1</sup> Cu for chickpea seed (Xu et al., 2016).

Given the inherently low fertility of African soils, as well as the high cost of chemical fertilizers and their polluting effect on the environment, there is a need to develop sustainable, green, and affordable technologies for increasing the nutritional quality of food legumes for use by resource-poor, smallholder farmers in Africa. The aim of this study was to assess protein level and trace element density in edible leaves and seed of 30–32 cowpea genotypes grown in the field at Wa and Manga in Ghana, and at Taung in South Africa.

# MATERIALS AND METHODS

# Site Description

Field trials were conducted in Ghana and South Africa in 2005 and 2006. In Ghana, the field experiments were carried out at Dokpong and Bamahu near Wa in the Upper West Region, and at Manga in the Upper East Region, while in South Africa the trials were conducted at Taung. Details of the experimental environments in the countries (altitude, longitude, mean annual rainfall, soil characteristics, cropping history, etc.) have been described elsewhere (Belane and Dakora, 2009, 2010, 2011b; Belane et al., 2011).

# Origin of Cowpea Genotypes

The cowpea genotypes used for this study were collected from Ghana, Tanzania, South Africa, and the International Institute of Tropical Agriculture (IITA) in Nigeria, as indicated by Belane and Dakora (2010). The 30 genotypes exhibited different useful biological traits ranging from number of days to 50% flowering and number of days to physiological harvest, to levels of N<sub>2</sub> fixation, pest resistance, and seed yield (Belane and Dakora, 2010).

# Field Design, Planting, and Pest Management

The experimental design used in this study has been described elsewhere (Belane and Dakora, 2009, 2010, 2011b; Belane et al., 2011), and involved the use of a randomized complete block design with four replicate plots per cowpea genotype in all the experiments. Each plot measured 3 m × 5 m (i.e., 15 m<sup>2</sup>). The experiments were planted in mid-July each year, with a row-to-row spacing of 60 cm and a within-row spacing of 20 cm. Weeds were controlled with a hoe. Two low-dose sprays of lambda cyhalothrin (Karate 2.5 EC) insecticide were applied at flowering and at pod formation to control pests.

# Plant Harvest and Processing

Healthy young trifoliate leaves were harvested from 12 plants per plot at 46 and 72 DAP in 2005 and 2006 to assess for any changes in mineral density close to physiological maturity. The leaf samples were oven-dried (60°C), weighed, and ground to fine powder (0.85 mm) for mineral analysis. At physiological maturity, cowpea seeds were harvested and similarly processed for analysis of nutrient elements.

### Protein Analysis in Cowpea Leaves and Seed

The percent N in cowpea leaves and seeds was determined using mass spectrometry, as described by Belane and Dakora (2010). The protein in leaves and seeds was estimated as %N of organ × 6.25 (Jones, 1941; Mariotti et al., 2008).
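The conversion described above is a single multiplication, and can be sketched as follows. This is a minimal illustration only: the function name and the example N values are hypothetical and are not taken from the study's data; only the factor 6.25 comes from the text.

```python
# Nitrogen-to-protein conversion factor stated in the text (Jones, 1941).
N_TO_PROTEIN = 6.25


def crude_protein_percent(percent_n: float, factor: float = N_TO_PROTEIN) -> float:
    """Estimate crude protein (%) of an organ from its N concentration (%)."""
    if percent_n < 0:
        raise ValueError("percent_n must be non-negative")
    return percent_n * factor


# Hypothetical organ N concentrations (%), chosen only to show the arithmetic.
example_n = {"genotype_A_leaf": 4.8, "genotype_B_leaf": 6.4}

for name, n in example_n.items():
    print(f"{name}: {crude_protein_percent(n):.1f}% protein")
```

With the general factor of 6.25, a leaf containing 6.4% N corresponds to about 40% crude protein, the upper end of the range reported in this study.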

# Determination of Micronutrients in Cowpea Leaves and Seeds

Trace elements such as Fe, Zn, Cu, Mn, and B in cowpea leaves and seeds were measured as described by Belane and Dakora (2011a). Briefly, 1 g of ground cowpea leaf or seed sample was ashed in a porcelain crucible at 500°C overnight, the ash was dissolved in 5 ml of 6 M HCl (analytical grade) and placed in an oven at 50°C for 30 min, after which 35 ml of de-ionized water was added. The mixture was filtered through Whatman No. 1 filter paper, and mineral concentrations were determined in leaf and seed extracts from four replicate samples using inductively coupled plasma mass spectrometry (IRIS/AP HR DUO Thermo Electron Corporation, Franklin, Massachusetts, USA) (Ataro et al., 2008).
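The step from an instrument reading on the extract to a tissue concentration in µg.g<sup>−1</sup> is not spelled out in the text. The sketch below shows one plausible back-calculation, under assumptions that are ours rather than the authors': the reading is in µg per ml of extract, the final extract volume is the sum of the acid and water volumes (5 ml + 35 ml = 40 ml), and no volume is lost during ashing, heating, or filtration. The function name is hypothetical.

```python
def tissue_concentration_ug_per_g(
    extract_ug_per_ml: float,
    final_volume_ml: float = 40.0,  # assumed: 5 ml HCl + 35 ml de-ionized water
    sample_mass_g: float = 1.0,     # sample mass stated in the text
) -> float:
    """Convert an extract concentration (ug/ml) to a dry-tissue concentration (ug/g)."""
    if sample_mass_g <= 0 or final_volume_ml <= 0:
        raise ValueError("sample mass and final volume must be positive")
    return extract_ug_per_ml * final_volume_ml / sample_mass_g


# Hypothetical extract reading of 1.5 ug/ml, used only to show the arithmetic.
print(tissue_concentration_ug_per_g(1.5))
```

Under these assumptions each µg.ml<sup>−1</sup> in the extract corresponds to 40 µg.g<sup>−1</sup> in the dried sample; a laboratory's actual dilution scheme may differ.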

# Correlation Analysis

Correlation analyses were performed for the levels of micronutrients in leaves and seeds of cowpea genotypes to ascertain any relationships that may exist in the translocation of trace elements between the two organs.
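The text does not state which correlation coefficient was used. As an illustration, the sketch below computes Pearson's product-moment coefficient, which is an assumption on our part, for leaf and seed Fe. The three data pairs are the only genotypes for which both values are quoted in the text of this article (Bensogla, TVU11424, and IT93K-452-1); three points are far too few for inference and serve only to show the calculation.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Leaf and seed Fe (ug/g) as quoted in the text for three genotypes:
# Bensogla, TVU11424, IT93K-452-1. Illustration only.
leaf_fe = [927, 313, 574]
seed_fe = [61, 62, 67]

print(f"r = {pearson_r(leaf_fe, seed_fe):.3f}")
```

A full analysis would use the replicate-level or genotype-mean values for all 32 genotypes and report significance alongside each coefficient.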

# Statistical Analysis

The data on protein and micronutrient levels in cowpea leaves and seed were subjected to analysis of variance (ANOVA) using the STATISTICA software program, version 7.1. A one-way ANOVA was used to compare protein and micronutrient levels among genotypes. Where significant differences were found, Duncan's multiple range test was used to separate treatment means at p ≤ 0.05 or p ≤ 0.001.
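The analysis itself was run in STATISTICA; the sketch below only illustrates how the one-way ANOVA F statistic is formed from between-genotype and within-genotype variation. The replicate values are hypothetical (four plots per genotype, mirroring the field design), and the sketch stops at the F ratio: obtaining a p-value requires the F distribution, and Duncan's multiple range test is not implemented here.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA over lists of replicates."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and more observations than groups")
    group_means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    # Variation of replicates around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, group_means) for v in g)
    df_between = k - 1
    df_within = n_total - k
    if ss_within == 0:
        raise ValueError("no within-group variation; F is undefined")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical leaf protein (%) for three genotypes, four replicate plots each.
leaf_protein = {
    "genotype_A": [30.1, 31.4, 29.8, 30.9],
    "genotype_B": [35.2, 36.0, 34.7, 35.5],
    "genotype_C": [27.9, 28.6, 28.1, 27.5],
}

f_stat, df_b, df_w = one_way_anova_f(list(leaf_protein.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F relative to the critical value for the given degrees of freedom indicates that genotype means differ, after which a mean-separation test such as Duncan's is applied.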

# RESULTS

#### Leaf Protein Levels of Cowpea Genotypes

The leaf protein of cowpea genotypes used in this study varied markedly among genotypes irrespective of location. The leaf protein of 30 cowpea genotypes grown at Wa in the Guinea savanna of the Upper West Region in Ghana also differed significantly, and ranged from about 28% for genotype ITH98-46 to 40% for Soronko (**Figure 2**). Of the 30 cowpea genotypes tested at Wa, 29 recorded more than 30% protein in their leaves (**Figure 2**).

Leaf protein was also assessed for 30 cowpea genotypes planted at Manga in the Sudano-Sahelian savanna in the Upper East Region of Ghana. Leaf protein levels also differed among the cowpea genotypes at Manga, and were found to vary from 24 to 35% (**Figure 3**). Some 12 out of the 30 genotypes studied recorded more than 30% protein in their leaves at Manga, Ghana.

# Seed Protein of Cowpea Genotypes

The concentration of protein in cowpea seed was determined for only the 32 genotypes planted at the Taung site in South Africa. The data revealed marked differences in seed protein, which ranged from about 20% in Soronko to 40% in Bengpla (**Figure 4**). Ten cowpea genotypes, including Bengpla, recorded more than 30% protein in their seed when planted at Taung in South Africa (**Figure 4**).

#### Micronutrient Density in Cowpea Leaves

The levels of micronutrients in edible leaves of cowpea genotypes were assessed using ICP-MS analysis for only the 32 cowpea genotypes planted at Taung (**Table 1**), but not Wa or Manga in Ghana. The concentration of micronutrients in the leaves varied widely among genotypes. As shown in **Table 1**, the level of Fe in cowpea leaves ranged from 268 µg.g<sup>−1</sup> in Bechuana white to 2,011 µg.g<sup>−1</sup> in the Apagbaala landrace. Other genotypes with markedly high leaf Fe levels included Fahari (2,005 µg.g<sup>−1</sup>), Iron Gray (1,302 µg.g<sup>−1</sup>), Line 2020 (945 µg.g<sup>−1</sup>), and Bensogla (927 µg.g<sup>−1</sup>). In contrast, the genotypes which showed the lowest leaf Fe concentrations were Bechuana white (268 µg.g<sup>−1</sup>), Encore (273 µg.g<sup>−1</sup>), IT94D-437-1 (314 µg.g<sup>−1</sup>), and TVU11424 (313 µg.g<sup>−1</sup>).

The distribution of Zn in cowpea leaves also differed markedly, and ranged from 37 µg.g<sup>−1</sup> in Vallenga to 150 µg.g<sup>−1</sup> for Apagbaala (which also recorded the highest Fe concentration; **Table 1**). Other cowpea genotypes with high levels of Zn in leaves included Line 2020 (132 µg.g<sup>−1</sup>), Iron Gray (90 µg.g<sup>−1</sup>), Fahari (80 µg.g<sup>−1</sup>), and Bensogla (73 µg.g<sup>−1</sup>), which incidentally also recorded high Fe concentrations in cowpea leaves. However, the genotypes with the least Zn concentration in leaves included Vallenga (37 µg.g<sup>−1</sup>), Bechuana white (38 µg.g<sup>−1</sup>), IT82D-889 (39 µg.g<sup>−1</sup>), and Encore (40 µg.g<sup>−1</sup>).

The density of Mn in edible cowpea leaves similarly differed among the genotypes, and ranged from 165 µg.g<sup>−1</sup> in IT86D-2075 to 404 µg.g<sup>−1</sup> in Line 2020 (**Table 2**). Other genotypes with increased Mn in leaves included Iron Gray (364 µg.g<sup>−1</sup>) and Apagbaala (325 µg.g<sup>−1</sup>). Leaf concentration of B also differed with cowpea genotype, with levels ranging from 31 µg.g<sup>−1</sup> in Bechuana white to 50 µg.g<sup>−1</sup> in Bengpla (**Table 2**). The concentration of Cu in cowpea leaves was similar for all 32 genotypes (**Table 2**).

### Micronutrient Density in Cowpea Seed

The concentrations of trace elements (Fe, Zn, Cu, Mn, and B) in cowpea seed were generally lower relative to leaves. As shown in **Table 2**, Fe levels in seed differed among the genotypes tested, and ranged from 45 µg.g<sup>−1</sup> for Bengpla to 67 µg.g<sup>−1</sup> in Soronko and IT93K-452-1. Other genotypes with high levels of Fe in seed included Brown Eye (65 µg.g<sup>−1</sup>), IT98-46 (64 µg.g<sup>−1</sup>), TVU11424 (62 µg.g<sup>−1</sup>), IT86D-2075 (62 µg.g<sup>−1</sup>), and Bensogla (61 µg.g<sup>−1</sup>). In contrast, the genotypes with low levels of Fe in seed were Bengpla (45 µg.g<sup>−1</sup>), followed by Mamlaka (50 µg.g<sup>−1</sup>).

As shown in **Table 2**, the Cu levels in cowpea seed varied from 5.20 µg.g<sup>−1</sup> in genotype IT82D-889 to 8.11 µg.g<sup>−1</sup> for genotype IT96D-1951 and 8.06 µg.g<sup>−1</sup> for Vallenga. Other genotypes with increased levels of Cu in cowpea seed included Bensogla (7.90 µg.g<sup>−1</sup>), Iron Gray (7.86 µg.g<sup>−1</sup>), Brown Eye (7.76 µg.g<sup>−1</sup>), and Pan 311 (7.49 µg.g<sup>−1</sup>). Similarly, Zn concentration in cowpea seed was different for the 32 genotypes tested (**Table 3**). Genotype TVU11424 recorded the highest levels of Zn (69.15 µg.g<sup>−1</sup>), followed by Soronko (53.88 µg.g<sup>−1</sup>), and IT90K-59 (49.78 µg.g<sup>−1</sup>). In contrast, the lowest Zn concentration was found in Bengpla (33.89 µg.g<sup>−1</sup>), followed by Mamlaka (34.64 µg.g<sup>−1</sup>), and Line 2020 (37.86 µg.g<sup>−1</sup>).

The Mn distribution in cowpea seed ranged from 10.05 µg.g<sup>−1</sup> in Bechuana white to 17.43 µg.g<sup>−1</sup> in CH14 (**Table 2**). The highest Mn concentrations in cowpea seed were recorded by genotypes CH14 (17.43 µg.g<sup>−1</sup>), Iron Gray (17.06 µg.g<sup>−1</sup>), Bechuana white (16.85 µg.g<sup>−1</sup>), Fahari (16.46 µg.g<sup>−1</sup>), and IT86D-2075 (16.21 µg.g<sup>−1</sup>). By contrast, the lowest Mn levels were produced by Bechuana white (10.05 µg.g<sup>−1</sup>) and IT82D-889 (10.09 µg.g<sup>−1</sup>). The B levels in cowpea seed also differed among genotypes, and varied from 14.71 µg.g<sup>−1</sup> for IT82D-889 to 21.44 µg.g<sup>−1</sup> for Brown Eye. The highest concentration of B was found in Brown Eye (21.44 µg.g<sup>−1</sup>), followed by IT94D-437-1 (21.30 µg.g<sup>−1</sup>), Encore (19.81 µg.g<sup>−1</sup>), IT90K-59 (19.20 µg.g<sup>−1</sup>), Bechuana white (19.11 µg.g<sup>−1</sup>), and IT93K-2045-29 (19.00 µg.g<sup>−1</sup>). However, the lowest B levels were recorded by IT82D-889 (14.71 µg.g<sup>−1</sup>), followed by Bensogla (15.57 µg.g<sup>−1</sup>) and Bengpla (15.71 µg.g<sup>−1</sup>).

#### Correlation Analysis of Micronutrients in Cowpea Leaves and Seeds

Leaf Fe was positively correlated with seed Fe, leaf Zn, leaf Mn, and seed Mn (**Table 4**). Seed Fe was also correlated positively with seed Zn and leaf B, but negatively with seed Cu and seed B. Seed Zn correlated positively with leaf Cu, seed Cu and seed B, but negatively with leaf Mn. Similarly, leaf Mn correlated positively with seed Mn but negatively with seed Cu, leaf B and seed B, and seed Cu correlated with seed B (**Table 4**).

# DISCUSSION

### Leaf and Seed Protein of Cowpea Genotypes

Food and nutritional insecurity remain a major problem facing Sub-Saharan Africa, as about 239 million people are currently suffering from protein-calorie malnutrition (Fanzo, 2012; Andrea and Rose, 2015). In rural Africa, food and nutritional needs are met, and micronutrient deficiency is addressed, through the consumption of leafy vegetables and seed legumes (Belane and Dakora, 2011a), as animal protein is too expensive for resource-poor households. In Sub-Saharan Africa, cowpea is the major food grain legume, cultivated and consumed by the majority of smallholder farming communities, and is very important as a food crop for meeting dietary protein requirements and overcoming micronutrient deficiency.

In this study, we evaluated 32 field-grown cowpea genotypes at Taung in South Africa, and 30 each at Wa and Manga in Ghana for leaf and seed protein, as well as for micronutrient density in the two organs at Taung. The results revealed marked differences in the levels of protein in cowpea leaves independent of location (**Figures 1**–**4**), as well as of seed protein and micronutrient density in plant parts at Taung (**Tables 1**, **2**). The leaf protein of 32 cowpea genotypes grown at Taung (South Africa) ranged from 23% for genotype IT96D-1951 to 40% for Bengpla, with nine genotypes recording more than 30% leaf protein (**Figure 1**). At Wa in the Guinea savanna of Ghana, cowpea leaf protein ranged from 28 to 40% for ITH98-46 and Soronko, respectively, with 29 genotypes accumulating more than 30% protein in their leaves (**Figure 2**). Similarly, at Manga in the Sudano-Sahelian savanna of Ghana, leaf protein levels varied from 24 to 35%, with 12 out of 30 genotypes recording more than 30% protein in their leaves (**Figure 3**). The leaves of N<sub>2</sub>-fixing legumes such as cowpea are very rich in N due to the species' ability to reduce N<sub>2</sub> into NH<sub>3</sub> and subsequently into nitrogenous solutes for plant use (Belane et al., 2014). In plants, N is required for the synthesis of macromolecules such as chlorophyll, needed for harvesting light energy during photosynthesis, and for formation of the enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase (Rubisco), which reduces CO<sub>2</sub> during photosynthesis. Because Rubisco accounts for over 90% of leaf N (Belane and Dakora, 2015), most of the protein found in green leaves of monocots and dicots consists of Rubisco. Thus, a culturable form of this protein could be a biotech spin-off for enhanced nutritional security.

Values (Means ± S.E) followed by dissimilar letters are significant at P ≤ 0.05.

Estimates of N<sub>2</sub> fixation by cowpea plants sampled from the same field experiments as the materials used in this study showed 43 to 93% N derived from atmospheric N<sub>2</sub> fixation at Taung (Belane et al., 2011), 8 to 60% at Manga (Belane and Dakora, 2009) and 64 to 87% at Wa (Belane and Dakora, 2010). Cowpea from farmers' fields could also derive about 30 to 99% of their N nutrition from symbiotic fixation at Wa in the Upper West Region of Ghana (Naab et al., 2009). Similar results from Botswana have shown that field-grown cowpea plants obtained between 19 and 92% of their N nutrition from symbiosis (Pule-Meulenberg et al., 2010). Clearly, the levels of N<sub>2</sub> fixation reported in those studies (Belane and Dakora, 2009, 2010, 2011b; Naab et al., 2009) can help to explain the strong variation in protein concentration found in the edible leaves and seed of the cowpea material used in this study.

At Taung in South Africa, cowpea genotypes Fahari, Glenda, IT93K-2045-29, Mamlaka, Pan311, and TVu11424 were among those with the highest amounts of N-fixed (Belane et al., 2011). Notably, the same genotypes also showed more protein in cowpea seed in this study (**Figure 4**), indicating a direct link between protein concentration in cowpea seed and cowpea symbiotic efficiency, as well as the levels of N-fixed. In the same manner, N<sub>2</sub> fixation and photosynthesis are metabolically interlinked at the level of the plant's N and C economy, especially where N<sub>2</sub> reduced to NH<sub>3</sub> by the enzyme nitrogenase is incorporated with de novo photosynthate into amino acids needed for protein biosynthesis. In this study, the relationship between percent N derived from fixation and protein levels was, however, not as clear for cowpea leaves as for the seeds. For example, cowpea genotypes Ngonji (56%), Mamlaka (51%), IT90K-59 (48%), IT93K-452-1 (46%), and Mchanganyiko (44%), which derived relatively higher N from fixation at Manga in Ghana (Belane and Dakora, 2009), also produced significantly greater leaf protein in this study (**Figure 3**). In contrast, Bechuana white, which obtained the highest N from symbiosis (60%) at Manga, was the fourth lowest in leaf protein production, while Fahari, which derived only 25% of its N from fixation, produced the highest leaf protein (**Figure 3**). The observed anomaly in the relationship between percent N derived from fixation and leaf protein concentration in cowpea genotypes appears to depend on the traffic and pathways of symbiotic N transported to leaves from N<sub>2</sub>-fixing bacteroids in root nodules, and the subsequent incorporation of fixed-N into protein. The variation in seed protein found among the cowpea genotypes tested in this study is consistent with a recent report by Gerrano et al. (2019).

TABLE 2 | Micronutrients in seed of field-grown cowpea varieties harvested at 150 DAP in Taung.

Values (Mean ± S.E) followed by dissimilar letters are significantly different at P ≤ 0.05.

### Trace Element Density in Cowpea Leaves and Seeds

In Africa, about 232 million people are suffering from trace element deficiency (Andrea and Rose, 2015), a problem that can be addressed through studies of nodulated legumes that have the ability to accumulate micronutrients in organs. In this study, there were marked differences in the uptake and accumulation of the micronutrients Fe, Zn, Cu, Mn, and B in leaves of the 32 cowpea genotypes grown in the field at Taung in South Africa (**Table 1**). Of the five trace elements (Fe, Zn, Cu, Mn, and B) that were measured in cowpea leaves, Fe concentration was highest in genotype Apagbaala (2,011 µg.g<sup>−1</sup>), followed by Fahari (2,004 µg.g<sup>−1</sup>), Iron Gray (1,302 µg.g<sup>−1</sup>), Line 2020 (944 µg.g<sup>−1</sup>), Bensogla (927 µg.g<sup>−1</sup>), Omondaw (605 µg.g<sup>−1</sup>), IT96D-1951 (591 µg.g<sup>−1</sup>), IT93K-452-1 (574 µg.g<sup>−1</sup>), Ngonji (569 µg.g<sup>−1</sup>), and Mchanganyika (566 µg.g<sup>−1</sup>), and lowest in Bechuana white (268 µg.g<sup>−1</sup>). In that order, the 10 cowpea genotypes were 7.51-, 7.48-, 4.86-, 3.52-, 3.46-, 2.26-, 2.21-, 2.14-, 2.12-, and 2.11-fold greater in leaf Fe concentration than Bechuana white, the genotype with the lowest Fe accumulation (**Table 2**).

\*, \*\*, and \*\*\* denote statistical significance at 0.05, 0.01 and 0.001 levels.

TABLE 4 | Amount of edible cowpea leaf to consume to meet the recommended daily dietary intake of Fe and Zn.

Calculated using leaf micronutrient concentration and leaf dry matter.

However, the leaf concentrations of Zn, Mn and B in cowpea were much lower than that of Fe, with Zn distribution ranging from 37.2 µg.g<sup>−1</sup> for Vallenga to 150 µg.g<sup>−1</sup> for Apagbaala (**Table 1**). Nine out of the 32 cowpea genotypes studied showed a very high concentration of Zn in edible leaves. These included Apagbaala (150.0 µg.g<sup>−1</sup>), Line 2020 (131.8 µg.g<sup>−1</sup>), Iron Gray (89.9 µg.g<sup>−1</sup>), Fahari (80.0 µg.g<sup>−1</sup>), Omondaw (75.3 µg.g<sup>−1</sup>), Bensogla (73.4 µg.g<sup>−1</sup>), Ngonji (63.1 µg.g<sup>−1</sup>), IT90K-59 (62.2 µg.g<sup>−1</sup>), and TVU3236 (58.9 µg.g<sup>−1</sup>), which (in that order) were 4.0-, 3.5-, 2.4-, 2.2-, 2.0-, 2.0-, 1.7-, 1.7-, and 1.6-fold higher in leaf Zn concentration than Vallenga, the genotype with the lowest Zn accumulation. Similarly, leaf Mn concentration was markedly greater in Line 2020 (403 µg.g<sup>−1</sup>), followed by Iron Gray (365 µg.g<sup>−1</sup>), Apagbaala (325 µg.g<sup>−1</sup>), IT93K-421-1 (320 µg.g<sup>−1</sup>), IT94D-437-1 (315 µg.g<sup>−1</sup>), and IT96D-1951 (311 µg.g<sup>−1</sup>); and these were, respectively, 2.5-, 2.2-, 2.0-, 1.9-, 1.9-, and 1.9-fold higher than genotype IT86D-2075, which recorded the least Mn in leaves (165 µg.g<sup>−1</sup>). Boron concentration in leaves was also much greater in Bengpla (50.1 µg.g<sup>−1</sup>), followed by Line 2020 (47.7 µg.g<sup>−1</sup>), Pan 311 (46.9 µg.g<sup>−1</sup>), Glenda (46.3 µg.g<sup>−1</sup>), TVU3236 (45.9 µg.g<sup>−1</sup>), Iron Gray (45.8 µg.g<sup>−1</sup>), IT90K-59 (45.4 µg.g<sup>−1</sup>), IT94D-437-1 (45.0 µg.g<sup>−1</sup>), Omondaw (43.9 µg.g<sup>−1</sup>), and Apagbaala (43.2 µg.g<sup>−1</sup>), and lowest in Bechuana white (31.2 µg.g<sup>−1</sup>). As a result, these genotypes were, respectively, 1.60-, 1.53-, 1.50-, 1.48-, 1.47-, 1.47-, 1.45-, 1.44-, 1.40-, and 1.38-fold higher in B than Bechuana white.

Taken together, the results of this study have shown that the concentrations of Fe and Zn (the two most important trace elements) were highest in the leaves of genotypes Apagbaala, Fahari, Iron Gray, Line 2020, Bensogla, and Omondaw. Furthermore, Fe, but not Zn, was also higher in the leaves of genotypes IT96D-1951 and IT93K-452-1, while conversely Zn, but not Fe, showed increased distribution in Ngonji, IT90K-59, and TVU3236. Interestingly, the leaves of cowpea genotypes Apagbaala, Line 2020, Iron Gray, IT94D-437-1, and Omondaw were similarly very rich in Mn and B, as found for Fe and Zn. These results agree with the findings of Belane and Dakora (2012), and suggest that, with little effort, breeders can identify cowpea genotypes with the ability to accumulate high levels of the micronutrients Fe, Zn, Mn, and B in edible leaves for use by farmers to overcome trace element deficiency in Africa.

In this study, the distribution of micronutrients in cowpea seed also differed markedly among the genotypes studied, with a range of 45.1 to 67.0 µg.g<sup>−1</sup> for Fe, 33.9 to 69.2 µg.g<sup>−1</sup> for Zn, 10.1 to 17.4 µg.g<sup>−1</sup> for Mn, 14.7 to 21.4 µg.g<sup>−1</sup> for B, and 5.2 to 8.1 µg.g<sup>−1</sup> for Cu (**Table 2**). The strong variation in cowpea micronutrient distribution found in this study is consistent with a report by Gerrano et al. (2019). Furthermore, we found that the cowpea genotypes with higher micronutrient accumulation recorded much greater concentrations in leaves than in seed. In fact, genotypes Apagbaala, Fahari, Iron Gray, and Line 2020, which showed an ability to increase micronutrient density, respectively, exhibited 34.2-, 34.0-, 22.5-, and 18.3-fold higher Fe concentration in leaves than seed, just as the same genotypes (in that order) revealed 3.5-, 2.0-, 2.0-, and 3.5-fold greater Zn distribution in leaves than seed. These results are consistent with the findings of an earlier study which showed that trace element concentration was much greater in cowpea leaves than seed (Belane and Dakora, 2011a). Our data also suggest that the assimilation and translocation of mineral nutrients from leaves to developing ovules to form seed differed among the cowpea genotypes, probably as a result of traffic barriers to solute transport. Whatever the case, the observed differences in leaf micronutrient density seem to suggest that, depending on the cowpea genotype, a greater or lesser amount of leaf material must be consumed in order to meet the recommended daily intake of trace elements such as Fe and Zn (**Table 4**). Thus, a higher concentration of the micronutrients in cowpea leaves generally led to a lower amount (on a dry matter basis) of the leaf material needed for consumption in order to meet the daily dietary intake of each trace element, and vice versa.
Another factor that seems to define the level of mineral accumulation in nodulated legumes is the symbiotic efficiency of N2-fixing bacteria in root nodules. It has been shown that high N2-fixing legumes generally tend to accumulate more nutrient elements in shoots than their low-fixing counterparts (Belane et al., 2014). In this study, genotypes such as Apagbaala, Fahari, Iron Gray, Line 2020, Bensogla, and Omondaw, which accumulated significantly high concentrations of trace elements in leaves and seed, were earlier reported to be high N2-fixers, and to accumulate large amounts of symbiotic N in their biomass (Belane and Dakora, 2009, 2010).
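The inverse relationship between leaf micronutrient concentration and the amount of leaf needed can be illustrated with a short calculation. The sketch below is a simplification under stated assumptions: it uses the recommended intakes of 8 mg per day for Fe and 11 mg per day for Zn cited in this article (Ross et al., 2011), treats leaves as the sole dietary source on a dry matter basis, and ignores bioavailability and cooking losses. The function name is hypothetical and the calculation is not necessarily identical to the one behind the article's table.

```python
# Recommended daily intakes (mg/day) as cited in the article (Ross et al., 2011).
RDI_MG_PER_DAY = {"Fe": 8.0, "Zn": 11.0}

# Leaf concentrations (µg per g dry matter) quoted in the text for the
# highest- and lowest-accumulating genotypes.
LEAF_CONC_UG_PER_G = {
    ("Apagbaala", "Fe"): 2011.0,
    ("Bechuana white", "Fe"): 268.0,
    ("Apagbaala", "Zn"): 150.0,
    ("Vallenga", "Zn"): 37.2,
}


def leaf_dry_matter_needed(rdi_mg_per_day, conc_ug_per_g):
    """Grams of leaf dry matter per day that would supply the given intake."""
    if conc_ug_per_g <= 0:
        raise ValueError("concentration must be positive")
    # Convert mg to µg, then divide by µg of element per g of leaf.
    return rdi_mg_per_day * 1000.0 / conc_ug_per_g


for (genotype, element), conc in LEAF_CONC_UG_PER_G.items():
    grams = leaf_dry_matter_needed(RDI_MG_PER_DAY[element], conc)
    print(f"{genotype:15s} {element}: {grams:6.1f} g dry leaf per day")
```

Under these assumptions, a genotype with a roughly 7.5-fold higher leaf Fe concentration needs roughly 7.5-fold less leaf dry matter to supply the same intake.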

Correlation analysis revealed some physiological relationships between micronutrients in leaves and seeds (**Table 3**). During organ development, ovules are generally regarded as sinks for nutrients stored in leaves as sources. The significant correlations (**Table 3**) between leaf Fe and seed Fe (r = 0.41∗∗∗), and between leaf Mn and seed Mn (r = 0.37∗∗∗), attest to this source/sink relationship between leaves and seeds (developed ovules) when it comes to nutrient loading into the phloem and its translocation to ovules that are developing into seeds. It seems, however, that there was cotransport of Zn and Mn from xylem to leaves, just as there was synergy in the translocation of Fe and Mn to seeds. This was evidenced by the positive correlation between leaf Fe and leaf Zn (r = 0.55∗∗∗), leaf Fe and leaf Mn (r = 0.25<sup>∗</sup>), and/or leaf Mn and seed Mn (r = 0.37∗∗∗). Conversely, seed Fe was negatively correlated with seed Cu and seed B, and leaf Mn with seed Cu, seed B, and leaf B. This inverse relationship implies that when seed Fe was increasing, seed Cu and B were decreasing; just as when leaf Mn was accumulating, seed Cu, seed B, and leaf B were decreasing (**Table 3**). However, whether these synergies and/or antagonisms can be usefully exploited during cowpea breeding remains to be seen.

The recommended daily dietary intake of the micronutrients Fe and Zn is 8 and 11 mg.day<sup>−1</sup>, respectively (Ross et al., 2011). Assuming cowpea leaves to be the sole source of dietary Fe and Zn (on a dry weight basis), the estimated amount to consume in order to meet the daily intake of 8 and 11 mg.day<sup>−1</sup> for Fe and Zn was found to vary with cowpea genotype (**Table 4**). A smaller amount of leaf material was needed from the genotypes with higher leaf concentrations of Fe and Zn to meet the daily intake, relative to their counterparts with low levels of Fe and Zn. Coincidentally, however, the genotypes with increased Fe and Zn in leaves were reported to be high N2-fixers in different studies (Belane and Dakora, 2009, 2010; Belane et al., 2011). Clearly, the cowpea/rhizobia symbiosis seems to be a major determinant of leaf and seed protein biosynthesis in cowpea, as well as the accumulation of dietary mineral nutrients in edible parts of this species. Therefore, enhancing these traits in cowpea genotypes through breeding has the potential to overcome protein-calorie malnutrition and trace element deficiency in rural Africa.

#### DATA AVAILABILITY

All datasets generated for this study are included in the manuscript/supplementary files.

#### AUTHOR CONTRIBUTIONS

FD designed the experiment and wrote the manuscript. AB undertook the field experiments, collected and analyzed the data, and was a doctoral student of FD.

#### FUNDING

This study was supported with funds from the South African Research Chair in Agrochemurgy and Plant Symbioses, the National Research Foundation, and the Tshwane University of Technology to FD, as well as a competitive grant from the McKnight Foundation.

#### REFERENCES

Aletor, O., Oshodi, A., and Ipinmoroti, K. (2002). Chemical composition of common leafy vegetables and functional properties of their leaf protein concentrates. Food Chem. 78, 63–68. doi: 10.1016/S0308-8146(01)00376-4

Andrea, F., and Rose, M. (2015). Food insecurity and hunger: a review of FAO's annual report on state of food insecurity in the world. Int. J. Multidiscip. Allied Stud. 2, 1–5. Available online at: https://thescholedge.org/index.php/sijmas/issue/view/30

Ataro, A., McCrindle, R. I., Botha, B. M., McCrindle, C. M. E., and Ndibewu, P. P. (2008). Quantification of trace elements in raw cow's milk by inductively coupled plasma mass spectrometry (ICP-MS). Food Chem. 111, 243–248. doi: 10.1016/j.foodchem.2008.03.056

based on grain mineral and total protein content. Acta Agric. Scand. Sect. B Soil Plant Sci. 69, 155–166. doi: 10.1080/09064710.2018.1520290


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Dakora and Belane. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Utilization of Interspecific High-Density Genetic Map of RIL Population for the QTL Detection and Candidate Gene Mining for 100-Seed Weight in Soybean

Benjamin Karikari, Shixuan Chen, Yuntao Xiao, Fangguo Chang, Yilan Zhou, Jiejie Kong, Javaid Akhter Bhat\* and Tuanjie Zhao\*

Soybean Research Institution, National Center for Soybean Improvement, Key Laboratory of Biology and Genetics and Breeding for Soybean, Ministry of Agriculture, State Key Laboratory of Crop Genetics and Germplasm Enhancement, Nanjing Agricultural University, Nanjing, China

#### Edited by:

Sergio J. Ochatt, INRA, UMR1347 Agroécologie, France

#### Reviewed by:

Juan Jose Ferreira, Servicio Regional de Investigación y Desarrollo Agroalimentario (SERIDA), Spain Zhaoming Qi, Northeast Agricultural University, China

> \*Correspondence: Javaid Akhter Bhat javid.akhter69@gmail.com Tuanjie Zhao tjzhao@njau.edu.cn

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 08 May 2019 Accepted: 17 July 2019 Published: 04 September 2019

#### Citation:

Karikari B, Chen S, Xiao Y, Chang F, Zhou Y, Kong J, Bhat JA and Zhao T (2019) Utilization of Interspecific High-Density Genetic Map of RIL Population for the QTL Detection and Candidate Gene Mining for 100-Seed Weight in Soybean. Front. Plant Sci. 10:1001. doi: 10.3389/fpls.2019.01001

Seed-weight is one of the most important traits determining soybean yield. Hence, a detailed understanding of the genetic basis regulating seed-weight is a prerequisite for the development of improved cultivars. In this regard, the present study used a high-density interspecific linkage map of the NJIR4P recombinant inbred population, evaluated in four different environments, to detect stable quantitative trait loci (QTLs) as well as to mine candidate genes for 100-seed weight. In total, 19 QTLs distributed on 12 chromosomes were identified in all individual environments plus the combined environment, of which seven were novel and eight were stable, i.e., identified in more than one environment. However, all the novel QTLs were minor (R<sup>2</sup> < 10%). The remaining 12 QTLs detected in this study co-localized with earlier reported QTLs, within narrow genomic regions, and of these only two, viz., qSW-17-1 and qSW-17-4, were major (R<sup>2</sup> > 10%). Beneficial alleles of all identified QTLs were derived from the cultivated soybean parent (Nannong493-1). Based on Protein ANalysis THrough Evolutionary Relationships, gene annotation information, and literature search, 29 genes within 5 stable QTLs were predicted to be possible candidate genes that might regulate seed-weight/size in soybean. However, further validation is needed to confirm their role in seed development. In conclusion, the present study provides a better understanding of trait genetics and candidate gene information through the use of a high-density interspecific bin map, and also reveals considerable scope for genetic improvement of 100-seed weight in soybean using marker-assisted breeding.

Keywords: soybean, seed-weight, QTL, candidate gene, marker-assisted breeding

# INTRODUCTION

Soybean (Glycine max L. Merr.) is one of the most economically important crops, being a rich source of both edible oil and protein and having a significant role in health, biofuel, and soil fertility improvement (Kulkarni et al., 2016). In China, soybean production has continuously declined, with a considerably low yield increase over the past 50 years (Liu et al., 2018).

Moreover, China imports >80% of the soybean for its total domestic use; hence, increasing domestic soybean production is a prerequisite for making the country self-sufficient (Liu et al., 2018). Different yield-related traits are targeted by plant breeders to increase soybean production. In this context, seed-weight is one of the most important yield-related traits for increasing seed yield in soybean; however, it is a complex quantitative trait governed by polygenes and highly influenced by the environment, which makes its selection difficult for plant breeders (Yao et al., 2015). Furthermore, seed weight/size determines the specific soy-based food product that can be made from soybean (Cui et al., 2004; Gandhi, 2009). For instance, small-seeded cultivars are suitable for fermented soybean (natto) and sprout production, whereas large-seeded cultivars are used for boiled soybean (nimame), green soybean (edamame), soymilk, and soybean curd (tofu) (Liang et al., 2016; Teng et al., 2017; Wu et al., 2018). In addition, seed weight/size influences germination ability and seedling vigor, which in turn determine the competitive ability of the seedling for light and nutrient resources, as well as its stress tolerance (Coomes and Grubb, 2003; Gomez, 2004; Haig, 2013).

Seed weight is one of the traits that was altered during domestication (Lee et al., 2011; Han et al., 2016). During the domestication process from wild species to cultivated soybean, selection for desirable agronomic traits to achieve high yield caused many genes to be either directly selected or filtered out, resulting in a significant reduction of genetic diversity in the soybean gene pool (Guo et al., 2010; Tang et al., 2010). Hyten et al. (2006) suggested that 50% of the genetic diversity and 81% of the rare alleles have been lost during domestication and that 60% of the genes show significant changes in allele frequency as a result of soybean domestication. It has been reported that wild soybean (Glycine soja) is an important source of genes for higher yield and related traits, quality, as well as tolerance to biotic and abiotic stresses (Zhou Z. et al., 2015). Thus, it is necessary to broaden the gene pool in soybean breeding from diverse sources, especially from wild soybean (G. soja). The seed of cultivated soybean (G. max) is heavier and bigger than that of the wild accessions (Yu et al., 2017). Both wild and cultivated soybean belong to the same genus Glycine (Kim M.Y. et al., 2010), with the former having a higher level of genetic diversity, as well as better adaptation to harsh environments (Stupar, 2010; Qiu et al., 2013; Zhang et al., 2017). Thus, G. soja holds great potential to improve its agriculturally important domesticated relative (G. max), beyond what is currently known (Kofsky et al., 2018). For example, comparative genomics, transcriptomics, and bioinformatics applications have revealed the role of domestication in the seed weight of soybean (Lu et al., 2016; Zhao et al., 2016; Yu et al., 2017).

Quantitative trait loci (QTL) mapping using domesticated and wild progenitors has been reported to be a useful means for identifying genomic regions involved in morphological and physiological changes that distinguish crops from their wild relatives (Paterson, 2010). Wild soybean has recently been reported to be an important source of QTLs contributing to the increase in seed size in soybean. For example, Lu et al. (2017) identified a phosphatase 2C protein (PP2C-1) allele from wild soybean underlying a QTL that enhances the 100-seed weight in soybean. Many genetic studies have been carried out in the past decades to identify QTLs for seed weight/size using different types of DNA markers through QTL mapping analyses. Currently, there are a total of 325 QTLs identified for seed-weight/size available on SoyBase<sup>1</sup>, but most of them are minor and not validated (Liu et al., 2018). In addition, knowledge of the molecular mechanism of soybean seed weight is very limited compared to other crops like rice (Wang et al., 2015; Liu et al., 2018). To date, only two genes related to seed weight/size have been isolated from soybean, viz., ln (Jeong et al., 2012) and PP2C-1 (Lu et al., 2017). Hence, it is a prerequisite to identify stable QTLs for seed-weight as well as to mine the candidate genes underlying them to facilitate understanding of the molecular mechanisms regulating seed-weight in soybean (Kato et al., 2014). Furthermore, only a few mapping populations derived from wild and domesticated soybean crosses have been used for QTL detection of seed-weight in soybean<sup>1</sup>. Also, most of the previous studies have used low-throughput markers (such as SSR) for QTL identification of seed-weight in soybean (Panthee et al., 2005; Gai et al., 2007; Kato et al., 2014; Kulkarni et al., 2016; Wu et al., 2018). These marker systems have low resolution and larger confidence intervals compared with high-density SNP markers (Hyten et al., 2010; Xu et al., 2013; Lu et al., 2017), which were revealed to be useful for high-throughput QTL mapping. Also, most of the published reports did not mine the candidate genes for seed-weight (Zhang et al., 2004; Kato et al., 2014; Kulkarni et al., 2016; Wu et al., 2018).

Therefore, in view of the above, the present study used a high-density interspecific genetic map of the recombinant inbred line (RIL) population (NJIR4P), derived from a cross between Nannong493-1 (G. max) and PI 342618B (G. soja) and evaluated in multiple environments, to map stable QTLs as well as to mine possible candidate genes underlying 100-seed weight in soybean. Using an interspecific RIL population with a wide range of variation in 100-seed weight has greatly assisted in the detection of a larger number of major and minor QTLs regulating 100-seed weight in soybean. The use of this RIL population could enhance our understanding of the molecular mechanism, evolution, and genetic regulation of seed weight in soybean. The results of the present study will be helpful in marker-assisted breeding (MAB) for developing soybean varieties with improved seed-weight.

#### MATERIALS AND METHODS

### Plant Materials and Experimental Conditions

An interspecific RIL population consisting of 161 lines was derived through the single seed descent (SSD) method by crossing a soybean cultivar, Nannong493-1 (G. max), with the wild soybean line PI 342618B (G. soja), and this RIL population was named NJIR4P. The Nannong493-1 parent has a higher 100-seed weight, with an average value of 18.02 ± 2.60 g, whereas PI 342618B is an annual wild soybean with a low 100-seed weight (1.4 g) (Xie et al., 2014). The RILs (F6:9–F6:11) along with their parents were planted in four different environments, viz., Fengyang Experimental Station, Chuzhou, Anhui Province (Latitude 32°87′ N; Longitude 117°56′ E), in 2012 (FY2012), and Jiangpu Experimental Station, Nanjing, Jiangsu Province (JP) (Latitude 33°03′ N; Longitude 118°63′ E) in 2012, 2013, and 2014 (JP2012, JP2013, and JP2014). Soybean lines were planted in a single-line plot of 1 m in length and 0.5 m in width in a randomized complete block design (RCBD) with three replications. Standard cultural and agronomic practices were followed in each environment (Lihua, 1982; Liu et al., 2008).

<sup>1</sup>http://www.soybase.org/

#### Phenotypic Analysis of 100-Seed Weight

Each row of the RILs and their parents was harvested, threshed, and dried to a suitable moisture content. Four hundred healthy dried seeds from each row were selected randomly for measurement of 100-seed weight. The 100-seed weight, i.e., the weight of 100 seeds at 13% moisture content, was measured with an electronic balance, and the measurement was repeated four times. Seed-weight was calculated for all three replications, and the mean value was used for analysis. Analysis of variance (ANOVA) in each environment and in the combined environments (CEs) was conducted using the general linear model (GLM) and mixed procedures, respectively, in SAS (SAS Institute, 2010. SAS/STAT software version 9.2; SAS Institute Inc., Cary, NC, United States). The broad-sense heritability (H<sup>2</sup>) was calculated for the individual environments as well as the CE following the procedure of Hanson et al. (1956). The genotypic coefficient of variation (GCV) was also calculated using the following formula proposed by Singh (1985): $\mathrm{GCV} = \sqrt{\sigma_g^2}/\mu$, where $\sqrt{\sigma_g^2}$ is the genotypic standard deviation in each environment and $\mu$ is the mean value of 100-seed weight.
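As a minimal sketch of the GCV calculation described above, the function below divides the genotypic standard deviation by the trait mean. The multiplication by 100 is our assumption, made because GCV values are reported as percentages in this article; the function name and the example numbers are purely illustrative and are not taken from the study's data.

```python
import math


def gcv_percent(genotypic_variance, trait_mean):
    """Genotypic coefficient of variation, expressed as a percentage.

    genotypic_variance: estimate of the genotypic variance (sigma_g squared)
    trait_mean: mean 100-seed weight in the same environment
    """
    if genotypic_variance < 0:
        raise ValueError("variance cannot be negative")
    if trait_mean <= 0:
        raise ValueError("mean must be positive")
    # Scaling to percent is an assumption; the formula in the text has no factor of 100.
    return math.sqrt(genotypic_variance) / trait_mean * 100.0


# Hypothetical example: genotypic variance of 4.0 g^2 and a mean of 8.0 g.
example_gcv = gcv_percent(4.0, 8.0)
print(f"GCV = {example_gcv:.1f}%")
```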

#### QTL Mapping Analysis

In the present study, an interspecific high-density bin map, developed earlier for this population by Wang et al. (2016) using the RAD-sequencing approach, was used for QTL mapping. This bin map consisted of 4,354 bin markers derived from 80,995 single-nucleotide polymorphisms (SNPs) distributed on all 20 soybean linkage groups/chromosomes, and has a total length of 2,136.717 cM. The average number of markers per linkage group and the average length of a linkage group were 218 and 106.84 cM, respectively, with a mean distance between bins of 0.49 cM (**Supplementary Table 1**). Among the NJRI4P RILs, 46.07% of the genetic background was inherited from Nannong493-1, 50.06% was from PI 342618B, and the remaining 3.87% consisted of heterozygous genotypes. The segregation ratios of each bin marker were calculated, and only a few significant segregation distortion regions were identified. In NJRI4P, out of 4,354 bin markers only 1 bin showed extreme segregation distortion at P < 0.0001 on chromosome 2, and 2 bins exhibited segregation distortion at P < 0.0005 on chromosomes 7 and 19, whereas the remaining bin markers did not show significant segregation distortion (Wang et al., 2016).
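The summary statistics of the bin map follow directly from the totals given above. The short check below is only an illustration (the function name `summarize_map` is ours); it divides the total map length by the number of bins, which is a close approximation of the mean spacing between adjacent bins.

```python
def summarize_map(n_bins, n_linkage_groups, total_length_cm):
    """Derive per-linkage-group averages and mean bin spacing from map totals."""
    return {
        "bins_per_group": n_bins / n_linkage_groups,
        "mean_group_length_cm": total_length_cm / n_linkage_groups,
        # Total length divided by bin count approximates the spacing between bins.
        "mean_bin_spacing_cm": total_length_cm / n_bins,
    }


# Totals reported for the bin map in the text.
stats = summarize_map(n_bins=4354, n_linkage_groups=20, total_length_cm=2136.717)
for key, value in stats.items():
    print(f"{key}: {value:.2f}")

# Genetic background proportions (%) reported for the RILs should sum to 100.
background_total = 46.07 + 50.06 + 3.87
print(f"background total: {background_total:.2f}%")
```

The values agree with the reported averages of about 218 markers and 106.84 cM per linkage group and a mean bin spacing of 0.49 cM.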

The QTL analysis was performed with WinQTLCart 2.5 software (Wang et al., 2007), using the composite interval mapping (CIM) model with a 10 cM window and a walking speed of 1 cM. The LOD threshold was calculated using 1,000 permutations for an experiment-wise error rate of P = 0.05 to determine whether a QTL was significantly associated with the trait (Churchill and Doerge, 1994). The CIM model was also used to identify the main QTLs in the CE with the same parameters as used in the individual environments. Mapping for the CE was done using the Best Linear Unbiased Prediction (BLUP) values for each independent environment and across all environments, obtained with the lme4 package in R (Bates et al., 2014). QTLs detected in different environments at the same, adjacent, or overlapping marker intervals were considered the same QTL (Palomeque et al., 2009, 2010; Qi et al., 2017). QTL naming followed the nomenclature of McCouch et al. (1997), thus starting with "q," followed by an abbreviation of the trait name (SW, seed weight) and the name of the chromosome, followed by the number of the QTL detected on the same chromosome. The QTL genetic and physical positions based on the flanking markers with known positions were used to retrieve earlier reported QTLs available on SoyBase<sup>2</sup> (Williams 82.a1.v.1.1). QTLs that did not overlap with reported QTLs in both genetic and physical positions were considered new in this study. The QTLs identified in the individual environments were presented in a Venn diagram using an online tool<sup>3</sup> (Oliveros, 2007).
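The two bookkeeping rules described above, treating QTLs at overlapping or adjacent intervals as one QTL and naming QTLs as "q" plus trait, chromosome, and index, can be sketched as follows. This is an illustrative sketch, not the authors' procedure: the interval coordinates are hypothetical, "adjacent" is approximated here as intervals that touch, and both function names are ours.

```python
def merge_overlapping(intervals):
    """Merge (start_cM, end_cM) intervals on one chromosome that overlap or touch."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Same QTL: extend the previous interval to cover this one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def qtl_name(trait_abbrev, chromosome, index):
    """Build a name such as qSW-17-1: 'q' + trait + chromosome + QTL number."""
    return f"q{trait_abbrev}-{chromosome}-{index}"


# Hypothetical intervals (cM) detected on chromosome 17 in different environments.
detections = [(10.0, 20.0), (18.0, 25.0), (40.0, 45.0)]
unique_qtls = merge_overlapping(detections)
names = [qtl_name("SW", 17, i) for i, _ in enumerate(unique_qtls, start=1)]
print(list(zip(names, unique_qtls)))
```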

#### Candidate Gene Prediction Analysis

In this study, a QTL was considered stable when detected in at least two environments. Model genes within the physical genomic positions of the stable QTLs on the soybean genome (Williams 82.a1.v.1.1), available at SoyBase<sup>2</sup>, were downloaded. Gene ontology (GO) enrichment analysis was conducted for all the genes within each QTL region using the online GO tool<sup>4</sup>. Gene classification was then carried out using the Web Gene Ontology (WeGO) Annotation Plotting tool, Version 2.0<sup>5</sup> (Ye et al., 2018). The predicted candidate genes were further subjected to the Protein ANalysis THrough Evolutionary Relationships (PANTHER) Classification System, which classifies proteins (and their genes) according to family and subfamily, molecular function, biological process, and pathway in order to facilitate high-throughput analysis<sup>6</sup>. Structure analysis of the selected candidate genes was carried out using http://gsds.cbi.pku.edu.cn/ (Hu et al., 2014).

# RESULTS

#### Phenotypic Variation of 100-Seed Weight

Mean, range, standard deviation, skewness, kurtosis, H<sup>2</sup>, and GCV among the RILs and their parents across the four different environments (FY2012, JP2012, JP2013, and JP2014) and the CE are presented in **Table 1**. The average 100-seed weights of Nannong493-1, PI 483460B, and the RILs were 16.49–19.09, 1.23–1.40, and 1.37–11.84 g, respectively, across all the studied environments (**Table 1**). However, there was no clear transgressive segregation among the RILs (**Figure 1**). Furthermore, ANOVA was performed to evaluate the effects of genotypes/lines (G), environment (E), and their interaction (GE) on 100-seed weight. The RILs showed highly significant differences (P < 0.01) for 100-seed weight in the individual environments. ANOVA for the CE showed that G, E, and GE contributed significant variation to seed weight among the RILs of the NJIR4P population (**Supplementary Table 2**). Hence, the significant influence of E and GE on the 100-seed weight of soybean suggests that seed-weight is a complex quantitative trait governed by polygenes. Moreover, high H<sup>2</sup> values in the individual environments as well as the CE, varying from 88.27 to 97.23%, coupled with high GCV (>20%), suggest that a considerable proportion of the phenotypic variation of 100-seed weight is due to genotype.

<sup>2</sup>www.soybase.org

<sup>3</sup>http://bioinfogp.cnb.csic.es/tools/venny\_old/venny.php

<sup>4</sup>http://geneontology.org/

<sup>5</sup>http://wego.genomics.org.cn/

<sup>6</sup>http://pantherdb.org/

TABLE 1 | Descriptive statistics, broad-sense heritability (H<sup>2</sup>), and genotypic coefficient of variation (GCV) of 100-seed weight in NJIR4P RIL population and two parental lines viz., Nannong493-1 and PI 483460B.

<sup>a</sup>Standard deviation. <sup>b</sup>Combined environment (average from the four different environments).

#### QTL Mapping of 100-Seed Weight Using CIM

A total of 19 QTLs associated with seed-weight were identified in all the individual environments plus the CE, distributed on 12 of the 20 chromosomes of soybean and explaining 4.22–13.20% of the phenotypic variation (R<sup>2</sup>) (**Figure 2** and **Table 2**). Out of these 19 QTLs, 7 were identified for the first time, viz., qSW-2-1, qSW-2-2, qSW-2-3, qSW-6-1, qSW-19-1, qSW-19-2, and qSW-19-3, and the remaining 12 QTLs have been previously reported in reference to the soybean genome GmComposite2003 (SoyBase) (**Table 2**). The highest number of QTLs (four) was present on Chr17, followed by three each on Chr2 and Chr19, and the remaining chromosomes contained one or two QTLs each. Of the 19 QTLs identified, only two were major (R<sup>2</sup> > 10%), viz., qSW-17-1 and qSW-17-4, both located on Chr17, and the remaining 17 QTLs were minor (R<sup>2</sup> < 10%). Notably, the most prominent QTL, with the highest LOD score (7.28), was identified in a 23.01 cM region on Chr17; it was named qSW-17-1 and explained 13.20% of the phenotypic variation. Five QTLs, viz., qSW-2-1, qSW-2-2, qSW-4-2, qSW-14-1, and qSW-17-4, were identified in more than one individual environment (**Figure 3**), and three more QTLs, viz., qSW-4-1, qSW-17-1, and qSW-17-3, were detected in one individual environment plus the CE. Interestingly, both major QTLs located on Chr17 (qSW-17-1 and qSW-17-4) were detected in more than one environment, suggesting the stability and consistency of these QTLs (**Table 2**). The remaining 11 QTLs were environment-specific, identified in only one environment (**Table 2**). Out of the eight stable QTLs, two were novel QTLs identified for the first time (qSW-2-1 and qSW-2-2). All the QTLs identified for 100-seed weight in the RIL population displayed positive additive effects, with positive alleles from the higher seed-weight parent (Nannong493-1). Moreover, all the novel QTLs identified were minor (R<sup>2</sup> < 10%); thus, none of the novel QTLs detected in this study was major.
However, most of the previously detected QTLs were identified in a narrowed physical genomic region (**Table 2**). That the highest numbers of QTLs for 100-seed weight were identified on Chr17, Chr2, and Chr19 suggests an important role of these chromosomes in governing the inheritance of seed-weight in soybean.
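The classification rules used above (major if R<sup>2</sup> > 10%, stable if detected in at least two environments, with the CE counted as an environment) can be written down compactly. The sketch below is illustrative only: the class and field names are ours, the R<sup>2</sup> for qSW-17-1 is the value quoted in the text, and the environment labels and the second record are hypothetical placeholders rather than entries from the study's tables.

```python
from dataclasses import dataclass


@dataclass
class QTL:
    name: str
    r2_percent: float        # phenotypic variation explained (R^2, %)
    individual_envs: tuple   # individual environments where the QTL was detected
    in_combined: bool = False  # also detected in the combined environment (CE)?

    def is_major(self, threshold=10.0):
        # The text defines major as R^2 > 10% and minor as R^2 < 10%;
        # a value of exactly 10% is treated as not major here.
        return self.r2_percent > threshold

    def is_stable(self):
        # Stable: detected in at least two environments, counting the CE.
        return len(self.individual_envs) + int(self.in_combined) >= 2


qtls = [
    # R^2 from the text; the environment label is a placeholder.
    QTL("qSW-17-1", 13.20, ("JP2012",), in_combined=True),
    # Entirely hypothetical environment-specific minor QTL.
    QTL("qSW-example", 5.00, ("FY2012",)),
]

for qtl in qtls:
    size = "major" if qtl.is_major() else "minor"
    stability = "stable" if qtl.is_stable() else "environment-specific"
    print(f"{qtl.name}: {size}, {stability}")
```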

### Gene Ontology and Candidate Gene Prediction Within Stable QTLs

Based on the number of individual environments in which the QTLs were detected, we selected the five stable QTLs identified in more than one individual environment, viz., qSW-2-1, qSW-2-2, qSW-4-2, qSW-14-1, and qSW-17-4, for GO and candidate gene prediction analysis. Within the physical genomic intervals of qSW-2-1, qSW-2-2, qSW-4-2, qSW-14-1, and qSW-17-4, there were 91, 100, 92, 137, and 70 model genes, respectively, and these genes as well as their annotations were downloaded from SoyBase<sup>7</sup>. After GO enrichment analysis, we employed the WeGO web-based tool to visualize the main categories of biological process, molecular function, and cellular component (**Figure 4**). In all five stable QTLs, viz., qSW-2-1, qSW-2-2, qSW-4-2, qSW-14-1, and qSW-17-4, a higher percentage of genes was associated with the terms cell part, cell, organelle, catalytic activity, binding, metabolic process, and cellular process (**Figure 4**). This suggests an important role of these terms in the seed development of soybean.

However, to identify the possible candidate genes underlying the above five stable QTLs responsible for 100-seed weight in soybean, we used PANTHER analysis, gene annotation information, and literature search. PANTHER is a comprehensive system that combines gene function, ontology, pathways, and statistical analysis tools, and enables biologists to analyze large-scale, genome-wide data from sequencing, proteomics, or gene expression experiments (Huaiyu et al., 2013). Based on the PANTHER analysis, gene annotation, as well as available literature, 29 genes out of the total 490 model genes within the physical regions of the five stable QTLs were considered as possible candidate genes regulating seed-weight in soybean. Out of these 29 genes, 5 belong to the ubiquitin-protein ligase (PC00234) class, 4 to carbohydrate transporter (PC00067), 3 are transporters (PC00227), 2 are involved in vesicle coat protein (PC00235), 1 in the SNARE protein (PC00034), and the remaining 18 belong to one or two other protein classes (**Table 3**). Furthermore, the genes Glyma02g13350, Glyma14g12220, and Glyma17g13000 had no protein class according to PANTHER analysis, and were therefore further analyzed using the gene expression data (RNA-seq) from the Phytozome database<sup>8</sup>; their expression data revealed that these genes were highly expressed in the seed, and thus they were also included as potential candidate genes. For instance, Glyma14g12220 has a PP2C domain that is homologous to the PP2C demonstrated to enhance 100-seed weight by Lu et al. (2017).

<sup>7</sup>www.soybase.org

### DISCUSSION

#### Phenotypic Analysis of Seed-Weight

Seed-weight is an important economic trait controlling yield in soybean. Therefore, developing soybean cultivars with improved seed-weight has been a prime objective of soybean breeders. However, to develop such cultivars, it is necessary to understand the genetic mechanisms and to identify the genetic elements associated with 100-seed weight. Seed-weight is a polygenic quantitative trait

<sup>8</sup>https://phytozome.jgi.doe.gov/

governed by multiple genes, and is highly sensitive to the environment. Over the past decades, many QTLs related to soybean seed-weight/size have been reported, and ∼325 QTLs for seed weight/size are documented in the USDA Soybean Genome Database (SoyBase<sup>9</sup>). However, most of these QTLs were neither stable nor confirmed, owing to small mapping populations and low-density genetic maps, and hence have not been used for breeding improved seed-weight in soybean. Therefore, the aim of the present study was to use the interspecific high-density linkage map of the NJIR4P RIL population, evaluated in four different environments, to identify stable QTLs and to mine possible candidate genes for 100-seed weight in soybean. In the present study, ANOVA revealed that 100-seed weight was significantly affected by G, E, and G × E, similar to earlier reports by Fasoula et al. (2004). The RILs did not show clear transgressive segregation in any environment, which might be due to unwanted linkages between beneficial and undesirable alleles contributed by exotic germplasm (Concibido et al., 2003; Wang et al., 2014). Furthermore, the cultivated and wild parents of the RIL population showed a clear and large difference in seed-weight/size, confirming earlier reports that 100-seed weight is a domestication-related trait (Zhou L. et al., 2015; Zhou Z. et al., 2015). This wide difference in 100-seed weight between the two parents of the inter-specific RIL population allowed the detection of a larger number of QTLs, including some novel QTLs. The maximum 100-seed weight of the RILs in each environment was more than three times higher than that of the wild parent (PI 483460B), and even the RILs with the minimum seed-weight were heavier than

<sup>9</sup>http://www.soybase.org

PI 483460B, indicating the usefulness of wild soybean in breeding programs for specific seed size (Concibido et al., 2003; Hyten et al., 2006; Kim M.Y. et al., 2010; Lam et al., 2010; Yu et al., 2017; Kofsky et al., 2018). The high H<sup>2</sup> value observed for seed-weight in both the individual and combined environments (CE) suggests that a large proportion of trait variation is under genetic control; these findings are similar to those reported earlier by Kulkarni et al. (2016).

#### Genetic Control of Seed-Weight

As discussed above, many QTLs have been reported for seed-weight in soybean<sup>10</sup>. However, the majority of these previous studies used low-density genetic maps based on SSR or other low-throughput markers (Specht et al., 2001; Hyten et al., 2004; Panthee et al., 2005; Liu et al., 2011; Yao et al., 2015), which have low resolution and yield QTLs with large confidence intervals that are not suitable for candidate gene detection (Hyten et al., 2010; Xu et al., 2013). The quality of genetic maps has a great influence on the accuracy of QTL detection (Gutierrez-Gonzalez et al., 2011). In this context, a high-density genetic map can identify more recombination events in a population and will increase the accuracy of QTL mapping (Xie et al., 2010). In the present study, we used the high-density inter-specific bin map of the NJIR4P RIL population, consisting of 4,354 bin markers distributed across all 20 soybean chromosomes, with an average of 218 markers and an average length of 106.84 cM per chromosome. The average distance between two markers was 0.49 cM (Wang et al., 2016). In addition, a high-density genetic map assists in identifying tightly linked markers associated with QTLs and provides a good foundation for analyzing quantitative traits. Moreover, the use of an interspecific population would also enhance identification of genomic region(s) that were altered during domestication (Liu et al., 2018).
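The map statistics quoted above are mutually consistent, which can be verified with simple arithmetic. The sketch below assumes that the mean marker spacing is approximated as total map length divided by the number of bin markers; the function names are illustrative and do not come from the mapping software used in the study.

```python
# Map statistics as stated in the text (Wang et al., 2016).
N_BIN_MARKERS = 4354
N_CHROMOSOMES = 20
MEAN_CHROMOSOME_LENGTH_CM = 106.84


def markers_per_chromosome(n_markers: int, n_chromosomes: int) -> float:
    """Average number of bin markers per chromosome."""
    return n_markers / n_chromosomes


def mean_marker_spacing_cm(
    n_markers: int, n_chromosomes: int, mean_length_cm: float
) -> float:
    """Approximate mean spacing: total map length divided by marker count.

    Assumption: one interval per marker. Subtracting one interval per
    chromosome changes the result only in the third decimal place here.
    """
    total_length_cm = n_chromosomes * mean_length_cm
    return total_length_cm / n_markers


if __name__ == "__main__":
    per_chr = markers_per_chromosome(N_BIN_MARKERS, N_CHROMOSOMES)
    spacing = mean_marker_spacing_cm(
        N_BIN_MARKERS, N_CHROMOSOMES, MEAN_CHROMOSOME_LENGTH_CM
    )
    print(f"Markers per chromosome: {per_chr:.1f}")
    print(f"Mean marker spacing: {spacing:.2f} cM")
```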

The QTLs associated with seed-weight in soybean have been mapped on all soybean linkage groups/chromosomes. In the present study, we identified a total of 19 QTLs associated with 100-seed weight using the inter-specific genetic map of the NJIR4P population, and these QTLs contributed significantly to the

<sup>10</sup>www.soybase.org

seed-weight. Comparison of our QTL results with the SoyBase database<sup>11</sup> showed that 12 QTLs have been previously reported in the same physical genomic region, and 7 were novel, identified for the first time (**Table 2**). The seven novel QTLs indicate the distinct genetic architecture of the NJIR4P population and suggest the need to use more germplasm to reveal the complex genetic basis of 100-seed weight in soybean. The physical intervals of qSW-2-1, qSW-2-2, qSW-2-3, qSW-6-1, qSW-19-1, qSW-19-2, and qSW-19-3 did not overlap with any of the previously reported seed-weight QTLs, and hence these were considered novel QTLs. qSW-1-1 was identified in a genetic interval (82.6–84.1 cM) that overlaps with the seed-weight QTLs Seed weight 15-2 and Seed weight 18.1-2, identified in the same genetic and physical position in earlier reports (Hyten et al., 2004; Panthee et al., 2005). Similarly, two QTLs identified on Chr4, viz., qSW-4-1 and qSW-4-2, overlapped with seed weight 47-1, corresponding to a physical position of 96,434–51,252,852 bp (Li et al., 2010), and seed weight per plant 6-2, corresponding to a physical position of 486,057–526,777 bp (Yao et al., 2015), respectively. qSW-9-1 was detected in the same physical genomic interval as the previously

<sup>11</sup>www.soybase.org

TABLE 2 | Main QTLs identified in an interspecific RIL population (NJIR4P) in four different environments (FY2012, JP2012, JP2013, and JP2014) and combined environment (CE).


<sup>a</sup>QTLs detected in different environments at the same, adjacent, or overlapping marker intervals were considered the same QTL. <sup>b</sup>Position of the QTL. <sup>c</sup>Chromosome on which QTL was detected. <sup>d</sup>The log of odds (LOD) value at the peak likelihood of the QTL. <sup>e</sup>Phenotypic variance (%) explained by the QTL. <sup>f</sup>Additive shows beneficial alleles from parent Nannong 493-1. <sup>g</sup>The physical position of QTL relation to soybean cultivar W82.a1.v.1.1. <sup>h</sup>1-LOD support confidence intervals (confidence interval length). <sup>i</sup>Environment where CE represents combined environments using the BLUP values and others refer materials and methods. <sup>j</sup>Overlapping references.

reported QTL Seed weight 35-6 (Han et al., 2012). Likewise, qSW-11-1 was located in the genomic position of Seed weight 10-3 (Specht et al., 2001), Seed weight 32-1 (Li et al., 2008), and Seed weight 36-11 (Han et al., 2012). qSW-14-1 could be the same QTL as Seed weight 36-14 (Han et al., 2012). Lu et al. (2017) identified a QTL on Chr15 at the same physical interval (1,901,425–2,855,666 bp) as qSW-15-1 reported in the present study. The major and stable QTL qSW-17-1 overlapped with the earlier reported QTLs Seed weight 21-1 (Gai et al., 2007), Seed weight 22-3 (Zhang et al., 2004), and Seed weight 47-2 (Li et al., 2010). Moreover, qSW-17-2 and qSW-17-3 overlapped with seed-weight QTLs previously reported by Li et al. (2010) and Wang et al. (2015), respectively. Another major and stable QTL identified on Chr17, qSW-17-4, has also been reported by a number of earlier studies (Kim H.-K. et al., 2010; Kato et al., 2014; Wang et al., 2015; Zhou Z. et al., 2015; Liu et al., 2018). The seven novel QTLs identified for 100-seed weight together explained ∼46% of the phenotypic variation, which suggests the potential importance of these loci for seed-weight. The QTLs identified in this study had narrow genetic and physical regions. For instance, qSW-17-4, which overlapped with Seed weight 47-2 (Li et al., 2010), was detected at genetic and physical positions of 37.7–42.3 cM and 9,420,885–10,095,969 bp, respectively, compared with 24.52–124.30 cM and 5,788,551–40,525,673 bp for Seed weight 47-2. In plant breeding, the stability of QTLs is essential for their use in MAB. Besides the two novel stable QTLs (qSW-2-1 and qSW-2-2) identified in the present study, the 12 QTLs for 100-seed weight have been previously co-localized in the same physical interval by earlier studies (see references in **Table 2**).
Of the 12 QTLs previously reported, two are major QTLs with R<sup>2</sup> values > 10%, both located on Chr17, viz., qSW-17-1 and qSW-17-4 (see references in **Table 2**). Hence, these QTLs might also be considered stable QTLs; major stable QTLs can be used for further fine mapping and map-based cloning to unravel the mechanisms of seed-weight in soybean, and might also be useful for MAB. All the beneficial/positive alleles in the NJIR4P RIL population were derived from the cultivated soybean (Nannong 493-1), indicating that seed-weight was altered during

domestication (Zhou L. et al., 2015; Zhou Z. et al., 2015; Lu et al., 2016). Similar to our findings, Lu et al. (2017) also reported that all the beneficial alleles for 100-seed weight were inherited from the cultivated soybean, except one beneficial QTL allele, PP2C-1, that was derived from the wild soybean parent. Wild soybean has been shown to be a potential source for improving cultivated soybean in terms of yield-related traits, seed quality, and biotic and abiotic stress tolerance (Tuyen et al., 2010; Kim et al., 2011). Nevertheless, in accordance with earlier studies (Wang et al., 2016; Xin et al., 2016; Liu et al., 2018), our study revealed that alleles derived from wild soybean contribute to a reduction in seed weight in all 19 seed-weight QTLs. Soybean breeders do not always aim to increase seed weight/size; sometimes a breeding program requires a suitable/optimized combination of yield-related parameters such as seed size, the number of seeds per pod, and the number of pods per plant. Hence, the QTLs detected in our study would be valuable for controlling seed size via genomic breeding by design and positional cloning of the relevant genes. Furthermore, most of the QTLs detected in this study overlapped with earlier reported QTLs, indicating the accuracy of our mapping results. Moreover, those confirmed in this study with narrow regions

TABLE 3 | Thirty-one possible candidate genes predicted within five stable QTL regions identified in this study based on PANTHER analysis, gene annotation, and available literature.


Protein with <sup>∗</sup> indicates these proteins are selected from literature only.

could be integrated into breeding programs via marker-assisted selection (MAS).
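The comparison of qSW-17-4 with Seed weight 47-2 made earlier in this subsection rests on two simple operations: testing whether two intervals overlap and comparing their lengths. The following is a minimal sketch using the positions quoted in the text; the `Interval` class and `narrowing_factor` helper are hypothetical names introduced for illustration and were not used in the original study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Closed interval on a chromosome (bp) or linkage group (cM)."""

    start: float
    end: float

    @property
    def length(self) -> float:
        return self.end - self.start

    def overlaps(self, other: "Interval") -> bool:
        # Two closed intervals overlap if each starts before the other ends.
        return self.start <= other.end and other.start <= self.end


def narrowing_factor(new: Interval, old: Interval) -> float:
    """How many times shorter the new interval is than the old one."""
    return old.length / new.length


# Positions on Chr17 as quoted in the text.
QSW_17_4_BP = Interval(9_420_885, 10_095_969)
SEED_WEIGHT_47_2_BP = Interval(5_788_551, 40_525_673)
QSW_17_4_CM = Interval(37.7, 42.3)
SEED_WEIGHT_47_2_CM = Interval(24.52, 124.30)

if __name__ == "__main__":
    print("Overlap (bp):", QSW_17_4_BP.overlaps(SEED_WEIGHT_47_2_BP))
    print("qSW-17-4 physical length (bp):", QSW_17_4_BP.length)
    print(
        "Physical narrowing factor:",
        round(narrowing_factor(QSW_17_4_BP, SEED_WEIGHT_47_2_BP), 1),
    )
    print(
        "Genetic narrowing factor:",
        round(narrowing_factor(QSW_17_4_CM, SEED_WEIGHT_47_2_CM), 1),
    )
```

Under these numbers, the physical interval of qSW-17-4 is roughly 0.68 Mb, compared with about 34.7 Mb for Seed weight 47-2.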

#### Candidate Gene Analysis for Seed-Weight


Identifying the actual candidate genes underlying a QTL region is of great interest for both theoretical study and practical breeding programs. Most earlier QTL mapping studies on seed-weight did not mine for candidate genes (Zhang et al., 2004; Kato et al., 2014; Wu et al., 2018), and to date only two seed weight/size-related genes have been isolated from soybean: the ln gene, which has a large effect on the number of seeds per pod and seed size (Jeong et al., 2012), and, more recently, the PP2C-1 (protein phosphatase type-2C) allele from wild soybean accession ZYD7, which was found to contribute to an increase in seed size (Lu et al., 2017). Hence, based on the information available in the current literature, gene annotation, and bioinformatics tools, the present study identified possible candidate genes regulating 100-seed weight in soybean that underlie the stable QTLs. A total of 490 model genes were mined from the physical regions of the five stable QTLs, viz., qSW-2-1, qSW-2-2, qSW-4-2, qSW-14-1, and qSW-17-4, and of these, 29 were considered possible candidate genes based on the PANTHER analysis, gene function, and available literature (Huaiyu et al., 2013). Based on the WeGO analysis, most of the genes underlying these five stable QTLs belong to the terms cell component, catalytic activity, binding, transporting, metabolic process, and cellular process, and these elements were reported to be vital in seed development (Fan et al., 2006; Mao et al., 2010; Li and Li, 2014). For example, the Glyma02g13210 gene underlying qSW-2-2 belongs to the oxygenase (PC00177) protein class, which has been demonstrated to regulate seed size in soybean (Zhao et al., 2016).
Similarly, the E3 ubiquitin-protein ligase protein family (PC00234) is involved in the ubiquitin-proteasome pathway. This family includes members from various species, such as DA1, DAR1, DA2, and EOD1/BB (Arabidopsis), GW2 (rice), TaGW2 (wheat), ZmGW2 (maize), and UBP15/SOD2 (Arabidopsis), all of which have been reported to have a significant effect on seed development (Li and Li, 2014, 2016; Ge et al., 2016). Thus, Glyma02g11570, Glyma02g11850, Glyma02g11960, Glyma04g10610, and Glyma14g13011, belonging to E3 ubiquitin-protein ligase (PC00234), were considered possible candidate genes in the present study. Furthermore, Xian-Jun et al. (2007) reported that a gene underlying a QTL for rice grain width and weight (GW2) encodes a previously unknown RING-type E3 ubiquitin ligase. They demonstrated that loss of GW2 function increased cell numbers, resulting in a larger spikelet hull, and accelerated the grain milk filling rate, resulting in enhanced grain width, grain weight, and yield. Seed development is regulated by the source (leaf) and sink (seed) relationship in plants (Schnyder, 1993), which is influenced by assimilate translocation/transportation. Therefore, the genes Glyma02g13730, Glyma04g10590, Glyma04g11060, Glyma04g11120, Glyma04g11130, Glyma04g11140, and Glyma14g11780, belonging to the carbohydrate transporter (PC00067) or transporter (PC00227) protein classes, might be possible candidate genes for seed-weight. Legume seed development is closely related to metabolism and nutrient (sucrose) transport (Borisjuk et al., 2003). The candidate genes Glyma02g12351, Glyma04g10590, Glyma04g10600, and Glyma14g12405 belong to vesicle coat protein (PC00235). This protein family has been reported to be involved in protein–protein interaction and transport (Harley and Beevers, 1989; Anantharaman and Aravind, 2002).
Two candidate genes, Glyma02g13401 and Glyma02g13420, are members of the K-box and MADS-box protein family, which has been reported to regulate flower development in plants (Bowman et al., 1991; Ditta et al., 2004). The flower as an organ acts as either source or sink and determines the seed number, which indirectly affects seed size and shape (Stanton, 1984; Jia et al., 2016). Glyma04g11080 belongs to several protein classes, namely amino acid transporter (PC00046), calmodulin (PC00061), mitochondrial carrier protein (PC00158), and transfer/carrier protein (PC00219), which could possibly be involved in seed weight regulation. For example, in rice, Asano et al. (2002) showed that a calmodulin-like domain protein kinase is required for storage product accumulation during seed development. Moreover, Glyma02g12030, Glyma04g12120, and Glyma14g12120 belong to one or more protein classes, such as acyltransferase (PC00038), glycosyltransferase (PC00111), and transfer/carrier protein (PC00219), and these protein classes were demonstrated to play a role in seed development (Rehman et al., 2016). The Glyma17g12910 gene underlying the major stable QTL qSW-17-4 belongs to the ATP-binding cassette (ABC) transporter class (PC00003), which could possibly be involved in seed development (David et al., 2010). In addition, Glyma17g13000 belongs to histone deacetylase 15 (PTHR45634:SF12), which might be involved in regulating seed weight (Shahbazian and Grunstein, 2007; Peserico and Simone, 2011); Yang et al. (2016) demonstrated the function and regulatory mechanism of the maize histone deacetylase HDA101 during seed development. Also, Glyma17g13050 and Glyma17g13210, which code for a DNA-binding protein (PC00009) and a leucine-rich repeat-containing protein, respectively, play a significant role in seed development (Li and Li, 2016; Li et al., 2019).
Among the predicted candidate genes, the minimum number of exons and introns was two, and the longest gene sequence was 13,670 bp for Glyma17g13050 (**Supplementary Figure 1**). A few of the 29 possible candidate genes predicted in this study for 100-seed weight have been included in our ongoing projects for functional validation. Lastly, the major and stable QTLs identified in the present study will be the main focus of soybean breeders for fine mapping and MAB of soybean cultivars with improved 100-seed weight.

#### CONCLUSION

In conclusion, the present study used a high-density bin map of an interspecific RIL population (NJIR4P) evaluated in multiple environments to detect QTLs and to mine possible candidate genes controlling 100-seed weight. A total of 19 QTLs were found to be associated with 100-seed weight, of which 7 were novel

(reported for the first time). In addition, of the 19 QTLs, 8 were considered stable QTLs, identified either in more than one individual environment or in one individual environment plus the CE, and two of them, qSW-17-1 and qSW-17-4, were major (R<sup>2</sup> > 10%). Moreover, most of the previously reported QTLs validated in the present study had narrow physical genomic intervals. All the beneficial/positive alleles of the 19 QTLs were derived from the cultivated soybean (Nannong 493-1). Twenty-nine possible candidate genes were mined within the five stable QTLs, and most of them belong to the ubiquitin-protein ligase class (PC00234), which has been reported to play a significant role in seed/organ size development and regulation. However, further validation is needed to determine their actual role in seed weight and development; a few of them have been included in our ongoing projects for functional validation. After proper functional validation, these candidate genes can be used for improving 100-seed weight of soybean through transgenic approaches or MAB. Lastly, our study provides detailed information for accurate QTL localization and candidate gene discovery, and these findings will be of great use for MAS of soybean varieties with improved seed-weight.
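The criteria for "stable" and "major" QTLs summarized above can be written as two small predicates. This is a sketch under stated assumptions: the environment labels follow the Table 2 caption (FY2012, JP2012, JP2013, JP2014, and CE), the function names are hypothetical, and the example records in the usage section are invented for illustration and are not results from this study.

```python
COMBINED = "CE"  # label for the combined environment


def is_stable(environments: set[str]) -> bool:
    """Stable: detected in more than one individual environment,
    or in one individual environment plus the combined environment."""
    individual = environments - {COMBINED}
    if len(individual) > 1:
        return True
    return len(individual) == 1 and COMBINED in environments


def is_major(r2_percent: float, threshold: float = 10.0) -> bool:
    """Major: phenotypic variance explained (R^2, in %) above the threshold."""
    return r2_percent > threshold


if __name__ == "__main__":
    # Hypothetical detection records, for illustration only.
    examples = {
        "qtl_a": ({"JP2012", "JP2013"}, 12.5),
        "qtl_b": ({"FY2012", "CE"}, 4.2),
        "qtl_c": ({"JP2014"}, 3.1),
    }
    for name, (envs, r2) in examples.items():
        print(name, "stable:", is_stable(envs), "major:", is_major(r2))
```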

# DATA AVAILABILITY

All datasets generated for this study are included in the manuscript and/or the **Supplementary Files**.

#### REFERENCES


# AUTHOR CONTRIBUTIONS

TZ conceived and designed the experiments. BK, SC, YX, FC, YZ, and JK performed the experiments. BK and JB analyzed the data. BK and JB drafted the manuscript. TZ and JB revised the manuscript.

# FUNDING

This work was supported by the National Key R&D Program of China (2016YFD0100201), the National Natural Science Foundation of China (Grant Nos. 31571691 and 31871646), the Fundamental Research Funds for the Central Universities (KYT201801), the MOE Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT\_17R55), the MOE 111 Project (B08025), and the Jiangsu Collaborative Innovation Center for Modern Crop Production (JCIC-MCP) Program.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01001/full#supplementary-material



Rehman, H. M., Nawaz, M. A., Le, B., Shah, Z. H., Krishnamurthy, P., Ahmad, M. Q., et al. (2016). Genome-wide analysis of Family-1 UDP-glycosyltransferases in soybean confirms their abundance and varied expression during seed development. J. Plant Phys. 206, 87–97. doi: 10.1016/j.jplph.2016.08.017

SAS Institute (2010). SAS/STAT software version 9.2. Cary, NC: SAS Institute Inc.


map based on population sequencing. Proc. Natl. Acad. Sci. U.S.A. 107, 10578–10583. doi: 10.1073/pnas.1005931107


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Karikari, Chen, Xiao, Chang, Zhou, Kong, Bhat and Zhao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Haplotypes at the *Phg-2* Locus Are Determining Pathotype-Specificity of Angular Leaf Spot Resistance in Common Bean

#### *Michelle M. Nay1, Clare M. Mukankusi2, Bruno Studer 1\*† and Bodo Raatz3†*

*1 Molecular Plant Breeding, Institute of Agricultural Sciences, ETH Zurich, Zurich, Switzerland, 2 Bean Program, International Center for Tropical Agriculture (CIAT), Kampala, Uganda, 3 Bean Program, International Center for Tropical Agriculture (CIAT), Cali, Colombia*

#### *Edited by:*

*Karam B. Singh, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia*

#### *Reviewed by:*

*Dongying Gao, University of Georgia, United States Ken Chalmers, University of Adelaide, Australia*

> *\*Correspondence: Bruno Studer*

*Bruno.studer@usys.ethz.ch*

*†These authors share senior authorship*

#### *Specialty section:*

*This article was submitted to Legumes for Global Food Security, a section of the journal Frontiers in Plant Science*

*Received: 14 May 2019 Accepted: 15 August 2019 Published: 12 September 2019*

#### *Citation:*

*Nay MM, Mukankusi CM, Studer B and Raatz B (2019) Haplotypes at the Phg-2 Locus Are Determining Pathotype-Specificity of Angular Leaf Spot Resistance in Common Bean. Front. Plant Sci. 10:1126. doi: 10.3389/fpls.2019.01126*

Angular leaf spot (ALS) is one of the most devastating diseases of common bean (*Phaseolus vulgaris* L.) and causes serious yield losses worldwide. ALS resistance is reportedly pathotype-specific, but little is known about the efficacy of resistance loci against different pathotypes. Here, we report on ALS resistance evaluations of 316 bean lines under greenhouse and field conditions at multiple sites in Colombia and Uganda. Surprisingly, genome-wide association studies revealed only two of the five previously described resistance loci to be significantly associated with ALS resistance. *Phg-2* on chromosome 8 was crucial for ALS resistance in all trials, while the resistance locus *Phg-4* on chromosome 4 was effective against one particular pathotype. Further dissection of *Phg-2* uncovered an unprecedented diversity of functional haplotypes for a resistance locus in common bean. DNA sequence-based clustering identified eleven haplotype groups at *Phg-2*. One haplotype group conferred broad-spectrum ALS resistance, six showed pathotype-specific effects, and the remaining seven did not exhibit clear resistance patterns. Our research highlights the importance of ALS pathotype-specificity for durable resistance management strategies in common bean. Molecular markers co-segregating with resistance loci and haplotypes will increase breeding efficiency for ALS resistance and allow breeders to react faster to future changes in pathogen pressure and composition.

Keywords: food security, plant breeding, plant pathology, genome-wide association studies (GWAS), pathotype-specificity, common bean (*Phaseolus vulgaris* L.), angular leaf spot (ALS), *Pseudocercospora griseola*

# INTRODUCTION

Plant diseases can cause substantial loss of crop yields with detrimental effects on food security (Oerke, 2006; Savary et al., 2012). In Latin America and Africa, for example, common bean (*Phaseolus vulgaris* L.) is one of the most important crops and particularly valued for its protein and micronutrient content. However, common bean production is frequently reduced by pathogen attacks with angular leaf spot (ALS), caused by *Pseudocercospora griseola* (Sacc.) Crous and Braun (Crous et al., 2006), being one of the most devastating common bean diseases in the tropics and subtropics. ALS has been reported to cause yield losses of up to 80% (Schwartz et al., 1981; Correa-Victoria et al., 1989; Saettler, 1991; Wortmann et al., 1998; Stenglein et al., 2003; Sartorato, 2004). In the tropics and subtropics, common beans are mostly cultivated by smallholder farmers with limited possibilities to protect their crops from diseases or adverse climatic conditions and, therefore, depend on resistant common bean varieties to maintain stable yields (Schwartz and Pastor-Corrales, 1989).

Common bean germplasm can be divided into two gene pools, the Andean and the Mesoamerican gene pool (Gepts and Debouck, 1991; Mamidi et al., 2013). The latter, the genetically more diverse Mesoamerican gene pool, has been reported to contain more and stronger ALS resistance sources (Nay et al., 2019). Breeding for ALS resistance is challenged by the high genetic diversity of the pathogen and the recurrent appearance of new *P. griseola* pathotypes (Pastor-Corrales et al., 1998; Busogoro et al., 1999; Mahuku et al., 2002a). To categorize pathotypes, they are tested for their ability to infect six Andean and six Mesoamerican common bean lines with distinct resistance patterns (also referred to as differentials), in order to determine their race (Pastor-Corrales and Jara, 1995; Nay et al., 2019). The ALS pathogen co-evolved within the two common bean gene pools into Andean races, only causing disease on Andean beans, and Mesoamerican races, showing a higher specificity for Mesoamerican beans but also attacking beans of the Andean gene pool (Gepts and Debouck, 1991; Guzmán et al., 1995; Pastor-Corrales et al., 1998; Mamidi et al., 2013; Schmutz et al., 2014). Resistance in common bean has been reported to be pathotype-specific, with large differences in effectiveness between locations and continents (Pastor-Corrales and Jara, 1995; Pastor-Corrales et al., 1998; Mahuku et al., 2002b; Silva et al., 2008).

Previous ALS resistance studies defined five repeatedly characterized resistance loci, in addition to several minor resistance sources (reviewed in Nay et al., 2019): *Phg-1* was found in the line AND 277, closely linked to the anthracnose resistance locus *Co-14* at the lower end of chromosome (Chr) 1 (Carvalho et al., 1998; Gonçalves-Vidigal et al., 2011). *Phg-2* was found on Chr 8 in the Mesoamerican line Mexico 54, with potential resistance alleles in Cornell 49–424, BAT 332, MAR 2, G10474, and G10909 (Sartorato et al., 1999; Ferreira et al., 2000; Nietsche et al., 2000; Caixeta et al., 2003; Mahuku et al., 2004; Mahuku et al., 2011). The *Phg-3* locus was found in Ouro Negro on the lower arm of Chr 4 and *Phg-4* in G5686 on the upper arm (Corrêa et al., 2001; Mahuku et al., 2009; Keller et al., 2015). *Phg-5* was found in the lines CAL 143 and G5686 on Chr 10 (Oblessuc et al., 2012; Keller et al., 2015). Besides these well-characterized major resistance loci, indications of quantitative resistance were reported (Teixeira et al., 2005; Oblessuc et al., 2012; Keller et al., 2015; Bassi et al., 2017).

All the above-mentioned studies were conducted in bi-parental mapping populations, limiting the allelic diversity in the population to the two parental alleles. The establishment of such mapping populations is laborious, and the resistance loci found in such experiments may only be effective in the original background due to epistatic effects (Holland, 2004; Kumar et al., 2014). In addition, bi-parental mapping populations were often tested for ALS resistance with a single pathotype or at a single field location, even though the pathotype-specific resistance reaction to *P. griseola* is well described (Pastor-Corrales et al., 1998; Corrêa et al., 2001; Mahuku et al., 2002b; Mahuku et al., 2009; Gonçalves-Vidigal et al., 2011; Mahuku et al., 2011; Oblessuc et al., 2012; Ddamulira et al., 2014; Bassi et al., 2017). Hence, little is known about the range of effectiveness and the interaction of different ALS resistance loci in common bean in different environments with possibly different pathotypes. Furthermore, all previous mapping studies were conducted with Latin American pathotypes, and it is unknown whether the same resistance loci are effective against pathotypes from Africa.

Genome-wide association studies (GWAS) in panels specifically assembled to contain breeding germplasm with phenotypic variability for the trait of interest can overcome the above-mentioned limitations of bi-parental mapping populations. This type of analysis became possible through technological advancements, particularly in next generation sequencing, which allows hundreds of individuals to be genotyped at a marker density sufficiently high to cover the linkage disequilibrium blocks and to find trait-specific single nucleotide polymorphisms (SNPs) for breeding. By testing a diversity panel with different pathotypes, GWAS enables the identification of pathotype-specific resistance loci, as has recently been demonstrated for anthracnose (Zuiderveen et al., 2016).

The main objective of this study was to gain a broader understanding of ALS resistance sources, the resistance loci they contain, and their effectiveness against different pathotypes on two continents. Specifically, we aimed at i) assembling a panel consisting of the currently available ALS resistance sources, ii) evaluating its resistance against multiple ALS pathotypes under greenhouse and field conditions, and iii) identifying pathotype- and field location-specific resistance loci and haplotypes through genotyping by sequencing (GBS) and GWAS.

#### MATERIALS AND METHODS

#### Plant Material

An association mapping panel of 316 common bean lines, named extended BALSIT (extBALSIT), was used for ALS resistance evaluations and GWAS. ExtBALSIT included the Bean ALS International Trial (BALSIT) panel consisting of 55 lines, complemented with previously characterized resistance sources (Carvalho et al., 1998; Sartorato et al., 1999; Ferreira et al., 2000; Nietsche et al., 2000; Corrêa et al., 2001; Caixeta et al., 2003; Mahuku et al., 2003; Mahuku et al., 2004; Mahuku et al., 2009; Gonçalves-Vidigal et al., 2011; Mahuku et al., 2011; Oblessuc et al., 2012; Keller et al., 2015), CIAT breeding material with phenotypic variability for ALS response and susceptible checks. The panel included 124 large-seeded Andean beans, 129 small-seeded Mesoamerican, and 63 lines from inter-gene pool crosses. The 316 common bean lines of the extBALSIT panel were multiplied, out of which 264 lines received phytosanitary certificates and were shipped from Colombia to Uganda for ALS-resistance evaluation.

#### Evaluation of Angular Leaf Spot Disease Resistance

The extBALSIT panel was evaluated for ALS resistance in the greenhouse with single-spore *P. griseola* isolates and in the field with mixes of isolates. Highly pathogenic Mesoamerican and Andean races were chosen for the greenhouse experiments. Isolates belonging to races 63–63, 63–47, 30–0, and 13–63 were used in Colombia and race 61–63 in Uganda. In the field, inoculations were conducted with pathogen isolates previously collected at the respective field sites in Colombia and different districts in Uganda (**Supplementary Table 1**). Disease severity was evaluated with the CIAT standard scale ranging from 1 (no disease symptoms) to 9 (very severe disease symptoms and defoliation) (van Schoonhoven and Pastor-Corrales, 1987).
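Race designations such as 63–63 or 30–0 can be unpacked programmatically. The sketch below assumes the binary nomenclature conventionally used for *P. griseola* races, which is not spelled out in the text: the first number refers to the six Andean differentials and the second to the six Mesoamerican differentials, the i-th differential in each set carries the value 2^(i−1), and a race number is the sum of the values of the differentials on which the isolate is virulent. The function names are illustrative.

```python
def decode_race_value(value: int, n_differentials: int = 6) -> list[int]:
    """Return the 1-based indices of differentials infected for one gene pool.

    Assumes binary coding: differential i contributes 2**(i - 1).
    """
    if not 0 <= value < 2 ** n_differentials:
        raise ValueError(f"race value {value} out of range")
    return [i + 1 for i in range(n_differentials) if value & (1 << i)]


def decode_race(race: str) -> dict[str, list[int]]:
    """Split a race name such as '63-47' into Andean and Mesoamerican parts.

    Accepts either a hyphen or an en dash (U+2013) as the separator.
    """
    andean, mesoamerican = race.replace("\u2013", "-").split("-")
    return {
        "andean": decode_race_value(int(andean)),
        "mesoamerican": decode_race_value(int(mesoamerican)),
    }


if __name__ == "__main__":
    # Races used in the greenhouse experiments described above.
    for race in ["63-63", "63-47", "30-0", "13-63", "61-63"]:
        print(race, decode_race(race))
```

Under this convention, a value of 63 means that all six differentials of the respective gene pool are infected, and a value of 0 means that none are.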

Greenhouse experiments were conducted at CIAT headquarters in Colombia (Cali) and at CIAT in Uganda (Kawanda). Three and five seeds of each common bean line were planted per pot under well-watered conditions in Colombia and Uganda, respectively. In Colombia, primary leaves were treated with Elosal (Bayer Crop Science, Monheim am Rhein, Germany) eight days after sowing, to prevent powdery mildew infections, and urea was added before inoculations. For each pathotype, two replicates in time were screened, with each replicate containing one pot per line of the extBALSIT panel. Pathogen isolates were grown in V8 medium (Castellanos et al., 2016) for 8–20 days before inoculation, depending on the growth rate of the isolate. Inoculum was prepared according to the CIAT manual (Castellanos et al., 2016) and spray-inoculated on trifoliate leaves of 17-day-old plants in Colombia and 21-day-old plants in Uganda. After inoculation, plants were transferred to a humidity chamber for four days in Colombia, while in Uganda, they were covered with a plastic bag for three days to increase humidity. Starting ten days after inoculation, plant disease scores were evaluated four times within a week, usually on days 10, 12, 14, and 17. Because of the slow disease progression in Uganda, an additional evaluation was conducted 21 days after inoculation.

Field experiments were conducted during the rainy season in October 2016 and 2017 in Darien (N3°53'31" W76°31'0", 1,491 m a.s.l.) and Quilichao (N3°04'22" W76°29'55", 991 m a.s.l.), Colombia, and in May 2018 in Kawanda (N0°24'11" E32°31'54", 1,178 m a.s.l.), Uganda. Common bean lines were evaluated as single rows in Colombia and in a randomized complete block design with two replicates in Uganda. The rows measured 2.5–3 m in Colombia and 5 m in Uganda, the distance between rows measured 0.6 m, and seeds were sown with a density of 10 seeds/m. Susceptible and resistant checks were added every eight rows, and a border of susceptible checks was planted to favor spread of the disease. Plants were inoculated three times at weekly intervals using a backpack sprayer, starting approximately 20 days after planting when the third trifoliate leaf of most plants was fully extended (stage V4, according to van Schoonhoven and Pastor-Corrales (1987)). ALS symptoms on leaves were also evaluated three times at weekly intervals, starting at the appearance of the first disease symptoms approximately 40 days after inoculation. Pods were evaluated at the mid-pod fill stage, approximately 3 weeks after the last foliar evaluation (exact dates are given in **Supplementary Table 2**). Phenotypic data of the extBALSIT panel are available on dataverse.org (https://doi.org/10.7910/DVN/U2BAWN).

Inoculum was prepared according to Castellanos et al. (2016), as a mixture of five, six, and five single-spore pathogen isolates in Darien, Quilichao, and Kawanda, respectively (**Supplementary Table 1**). The isolates in Uganda did not sporulate well and a precise adjustment of the spore concentration was not possible. Therefore, fungal mycelium of 70 petri dishes was scraped off and diluted in water for the first inoculation and 35 petri dishes for the subsequent inoculations.

#### DNA Extraction and Genotyping

For genotyping, three emerging trifoliate leaves were sampled and used for DNA extraction following a urea–phenol–chloroform–isoamylalcohol protocol reported by Chen et al. (1992). DNA quality was checked by agarose gel electrophoresis and quantified by fluorescence using PicoGreen to stain double-stranded DNA (Molecular Probes Inc., Eugene OR, USA). The common bean lines of the extBALSIT panel were subjected to GBS according to Elshire et al. (2011) with the following modifications: adapter concentrations were 6 ng/μl, digestion per reaction was conducted with 0.5 μl restriction enzyme ApeKI (50 U/μl, New England Biolabs [NEB], Ipswich MA, USA), ligation with 0.5 μl ligase (20 U/μl, Promega, Madison WI, USA) and 3 μl buffer per sample, filled up with ddH2O to reach the target reaction volume. After adapter ligation, the 96 samples were pooled and cleaned with a PCR Clean-Up System (Promega), according to the manufacturer's protocol. For each pool, PCR was conducted in duplicate and merged afterwards. Each PCR reaction with a total volume of 50 μl contained 1x buffer (10 mM Tris-HCl pH 8.8, 50 mM KCl, 0.8% [v/v] Nonidet P40 [Fermentas, Waltham MA, USA]), 2 mM MgCl2, 0.1% bovine serum albumin, 1% polyvinylpyrrolidone, 0.016 μM of each primer, 0.4 mM dNTP, 0.3 μl TAQ polymerase (Sigma-Aldrich, St. Louis MO, USA), and 2 μl DNA template. Primers used for amplification were the following: forward PCR\_Primer1\_Short: AATGATACGGCGACCACCGAGATCTACACTCTTTCCCTACACGACGC and reverse PCR\_Primer2.1.i7: AAGCAGAAGACGGCATACGAGATGTCGATTGTGACTGGAGTTCAGATGTGTG. Each library containing 96 individually barcoded genotypes was sequenced by 150-bp single-end sequencing on a single lane of the Illumina HiSeq Instrument (Illumina, San Diego CA, USA) at Hudson Alpha sequencing facility (Huntsville AL, USA).
For SNP calling, the NGSEP pipeline (Perea et al., 2016) was used with the following quality criteria: a minimum quality score of Q40, scores in at least 220 of the 316 common bean lines, a minor allele frequency exceeding 5%, and a heterozygosity rate below 6%. Subsequently, heterozygous data points were removed. Genomic positions of SNPs and candidate genes were inferred according to the v2.1 of the *P. vulgaris* reference genome (available at https://phytozome.jgi.doe.gov, accessed 11 Nov. 2018). Genotypic information of the extBALSIT panel is available on dataverse.org (https://doi.org/10.7910/DVN/U2BAWN).
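
The SNP filtering criteria above can be illustrated with a minimal sketch. It is not the NGSEP implementation: the genotype encoding (0 and 2 for the two homozygous classes, 1 for heterozygous calls, -1 for missing data), the function name `filter_snps`, and the toy matrix are assumptions made for illustration, and the Q40 quality filter is not modeled.

```python
import numpy as np

def filter_snps(geno, min_called=220, min_maf=0.05, max_het=0.06):
    """Return a boolean mask of SNPs (columns) that pass the three filters.

    geno: lines x SNPs integer array with the hypothetical encoding
    0/2 = homozygous, 1 = heterozygous, -1 = missing.
    """
    called = geno >= 0
    n_called = called.sum(axis=0)
    safe_n = np.maximum(n_called, 1)  # avoid division by zero
    alt_freq = np.where(called, geno, 0).sum(axis=0) / (2 * safe_n)
    maf = np.minimum(alt_freq, 1 - alt_freq)
    het_rate = (geno == 1).sum(axis=0) / safe_n
    return (n_called >= min_called) & (maf > min_maf) & (het_rate < max_het)

# Toy panel of 316 lines and 4 SNPs
geno = np.zeros((316, 4), dtype=int)
geno[:166, 0] = 2        # SNP 0: common allele, fully scored
geno[:166, 1] = 2
geno[200:, 1] = -1       # SNP 1: scored in only 200 lines
geno[:10, 2] = 2         # SNP 2: minor allele frequency of about 3%
geno[:100, 3] = 2
geno[100:130, 3] = 1     # SNP 3: about 9.5% heterozygous calls

mask = filter_snps(geno)
# Remaining heterozygous calls are set to missing, as described in the text
cleaned = np.where(geno == 1, -1, geno)[:, mask]
print(mask)
```

In this toy example only the first SNP satisfies all three criteria; the others fail on call rate, minor allele frequency, and heterozygosity, respectively.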

#### Genome-Wide Association Studies

Genotype to phenotype associations were identified with TASSEL 5 (Bradbury et al., 2007). For greenhouse and field trials, mean ALS scores from the last evaluation of the trial were used. A mixed linear model was implemented using principal component analysis (PCA) with the first two principal components to correct for population structure and the K matrix to correct for kinship (Bradbury et al., 2007). Within TASSEL, the kinship was calculated using the centered identity-by-state (IBS) method, P3D was implemented for variance component analysis, and no compression was used (Zhang et al., 2010; Endelman and Jannink, 2012). The significance threshold was adjusted with the Bonferroni correction. TASSEL output and phenotypic data were analyzed and plotted using RStudio (version 3.4.4) with the packages qqman, ggplot2, reshape2, and psych (R Core Team, 2008).
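
As a worked illustration of the Bonferroni adjustment, the sketch below divides a family-wise error rate by the number of tested markers. The error rate of 0.05 is an assumption (the text does not state it), and the marker count is the number of high-quality SNPs reported in the Results.

```python
import math

alpha = 0.05      # assumed family-wise error rate, not stated in the text
n_snps = 22765    # high-quality SNPs reported for the extBALSIT panel

p_threshold = alpha / n_snps
neg_log10_threshold = -math.log10(p_threshold)

print(f"P-value threshold: {p_threshold:.2e}")
print(f"-log10(P) threshold: {neg_log10_threshold:.2f}")
```

Under these assumptions, the threshold is roughly 2.2 × 10⁻⁶, or about 5.66 on the −log10 scale used in Manhattan plots.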

#### Haplotype Analysis at the *Phg-2* Locus

In order to group the haplotypes at the *Phg-2* locus on Chr 8, SNPs located in the interval of significant associations (i.e., from position 61,150,549–62,934,224 bp in the reference genome sequence) were clustered using a hierarchical clustering method implemented in R. The 276 common bean lines with less than 50% missing SNP data in the interval were retained for analysis. The genotype matrix was translated to numeric values, the Euclidean distance between the common bean lines was calculated, and hierarchical clustering according to the Ward.D2 method was performed (Ward, 1963; Murtagh and Legendre, 2014). The resulting dendrogram was cut to group the haplotypes into eleven groups. The haplotype groups were named Andean or Mesoamerican, according to the gene pool of the lines from which the haplotypes originated. To evaluate the effect of the haplotypes, the disease scores of each haplotype for each experiment were plotted in R.
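
The clustering steps can be sketched as follows. The analysis in this study was performed in R; the Python version below is only an illustration on toy data. It assumes that SciPy's `ward` linkage on raw observations (Euclidean distance) corresponds to the Ward.D2 criterion, and it imputes missing calls with the SNP mean, which is an assumption because the handling of missing data during clustering is not described in the text.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Toy data: three haplotype templates over 30 SNPs, ten lines each
templates = np.zeros((3, 30))
templates[0, :10] = 2.0
templates[1, 10:20] = 2.0
templates[2, 20:] = 2.0
geno = np.repeat(templates, 10, axis=0)
geno[0, 3] = geno[12, 15] = geno[25, 28] = np.nan  # sporadic missing calls

# One additional line with more than 50% missing data
poor_line = np.full((1, 30), np.nan)
poor_line[0, :5] = 2.0
geno = np.vstack([geno, poor_line])

# Retain lines with less than 50% missing SNP data in the interval
keep = np.isnan(geno).mean(axis=1) < 0.5
geno = geno[keep]

# Assumption: impute remaining missing calls with the SNP mean
col_mean = np.nanmean(geno, axis=0)
geno = np.where(np.isnan(geno), col_mean, geno)

# Ward clustering on Euclidean distances, then cut the tree into groups
tree = linkage(geno, method="ward")
groups = fcluster(tree, t=3, criterion="maxclust")
print(groups)
```

With the real data, cutting the tree with `t=11` would correspond to the eleven haplotype groups described above.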

#### RESULTS

#### Angular Leaf Spot Resistance Is Highly Location- and Pathotype-Specific

Evaluation of the extBALSIT panel for ALS resistance revealed trial-specific frequency distributions of ALS scores (**Figure 1**, **Supplementary Figure 1**). Differences were observed between continents, locations, greenhouse and field experiments, and different pathotypes. For most trials, a continuous distribution of disease scores was found; only in the greenhouse experiment with pathotype COL 30–0 did the histogram clearly differentiate resistant and susceptible lines, indicating major gene resistance. Twenty-seven lines were found resistant (ALS score ≤3 on a 1 to 9 scale) in all 6 trials conducted in Colombia, 43 were resistant in the 2 trials conducted in Uganda, and 2 (AAB 8–2, G6727) were resistant in all experiments. The differences between the continents were also notable: of the 46 most resistant lines in Colombia (average ALS leaf score over all experiments ≤3), only 15 had an average score of ≤3 against the Ugandan pathotypes tested.

Out of the 55 pairwise correlations between phenotypic data of the trials, 43 (78%) were significant (Pearson correlation, *P* < 0.05), ranging from 0.12 to 0.73 (**Supplementary Figure 2**). Highest correlations were observed between the replicates of the field experiment in Kawanda, Uganda, and the comparison of field data between years in Darien and Quilichao, Colombia (**Supplementary Table 3**).
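
The pairwise comparison can be sketched as follows. Fifty-five pairs correspond to 11 data sets (11 × 10 / 2 = 55); the simulated scores below are toy data and do not reproduce the reported correlations.

```python
from itertools import combinations

import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(0)
n_lines, n_trials = 316, 11

# Toy scores: a shared component per line plus trial-specific noise
shared = rng.normal(size=n_lines)
scores = np.column_stack(
    [shared + rng.normal(size=n_lines) for _ in range(n_trials)]
)

results = []
for i, j in combinations(range(n_trials), 2):
    r, p = pearsonr(scores[:, i], scores[:, j])
    results.append((i, j, r, p))

n_significant = sum(p < 0.05 for _, _, _, p in results)
print(f"{len(results)} pairs, {n_significant} significant at P < 0.05")
print(f"Reported share: {43 / 55:.0%}")
```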

FIGURE 1 | Frequency distributions of disease scores for angular leaf spot (ALS), evaluated in greenhouse (blue) and field trials (green) using the extBALSIT panel containing 316 common bean lines. ALS was scored on a scale from 1 to 9, where 1 is resistant and 9 is highly susceptible. The greenhouse trials were conducted with five different pathotypes, determined by their origin (COL and UG) and race (63–63, 63–47, 61–63, 13–63, and 30–0). Field trials in Colombia (Darien and Quilichao) and Uganda (Kawanda) were inoculated with mixtures of pathotypes previously collected at the corresponding sites. For Darien and Quilichao, the average ALS score from both evaluation years is shown.

#### Genome-Wide Association Studies Confirm ALS Resistance Loci on Chromosomes 4 and 8 of Common Bean

Genotyping by sequencing of the extBALSIT panel revealed 22,765 high-quality SNPs distributed over the eleven chromosomes of common bean (**Supplementary Figure 3**). The population structure of the extBALSIT panel was analyzed with PCA, on the basis of the SNP marker data (**Supplementary Figure 4**). The first PC explained 45% of the genotypic variance and clearly distinguished Andean and Mesoamerican lines, with lines that originated from inter-gene pool crosses clustering between them. The second PC explained 4% and distinguished lines that originated from a cross between G10474 and G5687 (referred to as RAI lines) from the remaining inter-gene pool crosses. The second PC further separated the Mesoamerican lines G10613, G10474, G10909, G18970, G855, Mexico 54, G1805, Flor de Mayo, MAR 2, and G5653. The first six of these accessions were collected in Guatemala or neighboring Oaxaca and likely belong to the highly ALS-resistant subpopulation previously characterized in Guatemala (Beebe et al., 2000; Mahuku et al., 2003; CIAT Genebank, 2018; Lobaton et al., 2018).
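
How a principal component can separate two gene pools and how the explained variance is obtained may be illustrated with a small sketch. It uses a plain singular value decomposition on simulated genotypes; the data, the two-pool structure, and the 10% noise rate are assumptions, and the procedure is not necessarily identical to the PCA implemented in TASSEL.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy panel: 100 lines x 300 SNPs, two pools that differ at the first 200 SNPs
geno = np.zeros((100, 300))
geno[50:, :200] = 2.0
flips = rng.random(geno.shape) < 0.1       # 10% random allele flips as noise
geno = np.where(flips, 2.0 - geno, geno)

# PCA via singular value decomposition of the column-centered matrix
centered = geno - geno.mean(axis=0)
u, s, _ = np.linalg.svd(centered, full_matrices=False)
explained = s**2 / np.sum(s**2)            # fraction of variance per PC
pcs = u * s                                # line coordinates on each PC

print(f"PC1 explains {explained[0]:.0%}, PC2 explains {explained[1]:.0%}")
```

In this toy setting, the sign of the first principal component separates the two simulated pools, analogous to the separation of Andean and Mesoamerican lines described above.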

Genotype to phenotype associations were investigated by GWAS. In all but one trial, foliar ALS resistance was significantly associated with a region on Chr 8 (**Figure 2**). For the field trial in Uganda (Kawanda), a peak is clearly visible in the Manhattan plot, but it does not pass the stringent Bonferroni threshold. Manhattan plots indicate the same resistance locus on Chr 8 to be effective in Colombia as well as in Uganda. The interval where significant associations were found in this study on Chr 8 coincides with the genomic region where molecular markers linked to the *Phg-2* resistance locus in the common bean lines Mexico 54 and G10474 were found (Sartorato et al., 1999; Sartorato et al., 2000; Gil et al., 2019); hence, it is referred to as the *Phg-2* locus. GWAS analyses of ALS symptoms on pods at one of the field locations in Colombia (Darien) resulted in the same resistance locus on Chr 8. Pod evaluations at the other field locations in Colombia (Quilichao) and Uganda (Kawanda), where phenotypic variability was low, did not result in significant associations to markers in the GWAS analysis (**Supplementary Figure 5**). In addition to the predominant signal on Chr 8, another resistance locus on Chr 4 was effective against the pathotype COL 30–0. This resistance locus coincided with the mapping interval of the *Phg-4* locus (Keller et al., 2015).

Over all experiments, significantly associated SNPs were found in the interval spanning 61,150,549–62,934,224 bp (total length of 1,784 kbp) on Chr 8 and 46,703,147–46,934,061 bp (total length of 231 kbp) on Chr 4. In the interval on Chr 8, 265 annotated genes were identified, of which two (Phvul.008G284500, Phvul.008G285300) were NB-ARC domain-containing disease resistance genes (PF00931), another two (Phvul.008G267600, Phvul.008G267700) were of the TIR-NBS-LRR class (PF13676, PF01582), and 20 contained leucine-rich repeats. On Chr 4, 28 annotated genes were found in the interval, but no putative resistance genes were among them. Significant SNPs on Chr 8 explained the highest percentages of phenotypic variance, between 8.6 and 31.4%, in line with the very dominant role of this resistance locus seen in these experiments. Markers associated with the resistance locus on Chr 4 explained 9.3–11.4% of the variance.
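
The interval lengths quoted above follow directly from the reported boundary positions, as the short check below shows. The dictionary layout and variable names are illustrative only.

```python
# Boundary positions (bp) of the significant intervals, taken from the text
intervals = {
    "Chr 8": (61_150_549, 62_934_224),
    "Chr 4": (46_703_147, 46_934_061),
}

lengths_kbp = {
    chrom: (end - start) / 1000 for chrom, (start, end) in intervals.items()
}

for chrom, length in lengths_kbp.items():
    print(f"{chrom}: {length:,.0f} kbp")
```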

FIGURE 2 | Manhattan plots of the genome-wide association studies (GWAS) for angular leaf spot (ALS) resistance in the extBALSIT panel. The greenhouse trials were conducted with five different pathotypes, determined by their origin (COL and UG) and race (63–63, 63–47, 61–63, 13–63, and 30–0). Field trials in Colombia (Darien and Quilichao) and Uganda (Kawanda) were inoculated with mixtures of pathotypes, previously collected at the corresponding sites. On the x-axis, the genomic position of the markers is given. On the y-axis, the negative logarithm to the base 10 of the *P*-value, representing the significance value, is given. In order to correct for multiple testing, the significance threshold was adjusted through the Bonferroni method, and the new significance threshold is depicted by the black horizontal line.

#### Haplotypes of the Resistance Locus on Chromosome 8 Explain ALS Pathotype-Specificity

Haplotypes at the *Phg-2* locus, identified through cluster analysis of the SNP data in the *Phg-2* region, were categorized into eleven groups (M1 to M5, M/A, A1 to A5, **Figure 3**) and associated with trial-specific ALS resistance scores (**Figure 4**, **Supplementary Figure 6**). The haplotype groups M1 to M5, originating from the Mesoamerican gene pool, were resistant against the pathotype COL 30–0, as indicated by its race code. Common bean lines from the Mesoamerican haplotype group M1 were resistant in nearly all experiments but showed intermediate resistance in the trial with the Ugandan pathotype UG 61–63. Lines from the haplotype groups M2 and M3 were resistant against COL 13–63, UG 61–63, and the pathotypes present in the field in Quilichao and Kawanda but were susceptible to pathotypes present in the field in Darien and the most aggressive race COL 63–63. Lines from the haplotype group M4 showed increased resistance against UG 61–63 and COL 13–63 but were less effective compared to M2 and M3. Lines from the haplotype group M5 were largely resistant against pathogen races in Darien and Kawanda, but no clear trend was observed in the other experiments.

FIGURE 3 | Dendrogram of hierarchical clustering at the *Phg-2* locus. The common bean lines of the extBALSIT panel were clustered according to similarity of their SNP data in the 61.15–62.93 Mbp interval on Chr 8 and divided into eleven haplotype groups. Haplotype groups were named according to the gene pool of the lines (M, Mesoamerican; A, Andean; and M/A = mixed) and numbered. Below the haplotype names, the number of common bean lines in each haplotype group is given and well-known ALS resistant common bean lines as well as the reference genome line (G19833) contained in the haplotype groups are indicated. On the y-axis, the Euclidean distance between clusters is shown.

FIGURE 4 | Haplotype groups at the *Phg-2* locus and their ALS response, as evaluated in greenhouse and field trials using the extBALSIT panel. For each trial, the ALS response, scored on a scale from 1 (resistant) to 9 (susceptible), is shown for each of the eleven haplotype groups.

Andean haplotype groups at the *Phg-2* locus were mostly associated with susceptibility to ALS. A1 and A2 only displayed effective resistance against COL 30–0, and A1 and A3 appeared resistant against UG 61–63. Lines from the haplotype groups A4, A5, and M/A were mostly susceptible in all experiments. The haplotypes at the *Phg-2* locus were able to explain a much larger fraction of the total phenotypic variability in ALS resistance (R² = 0.40–0.85, **Supplementary Table 4**) compared to significant single SNP markers.
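
The fraction of phenotypic variability explained by haplotype groups can be illustrated with a one-way decomposition of variance. This is a sketch under the assumption that R² is the between-group share of the total sum of squares; the exact model behind **Supplementary Table 4** is not described here, and the scores below are invented toy values.

```python
import numpy as np

def r_squared_by_group(scores, groups):
    """Share of the total sum of squares explained by group membership."""
    scores = np.asarray(scores, dtype=float)
    groups = np.asarray(groups)
    grand_mean = scores.mean()
    ss_total = ((scores - grand_mean) ** 2).sum()
    ss_between = sum(
        (groups == g).sum() * (scores[groups == g].mean() - grand_mean) ** 2
        for g in np.unique(groups)
    )
    return ss_between / ss_total

# Toy ALS scores (1 = resistant, 9 = susceptible) for two haplotype groups
scores = [1, 2, 2, 3, 7, 8, 8, 9]
groups = ["M1"] * 4 + ["A4"] * 4

r2 = r_squared_by_group(scores, groups)
print(f"R2 = {r2:.2f}")
```

In this toy example the group means are 2 and 8 around a grand mean of 5, so the between-group sum of squares is 72 of a total of 76.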

#### Haplotype-Specific SNPs to Advance Resistance Breeding by Marker-Assisted Selection

Seven haplotype groups (M1–M3, M5, A1–A3) were identified as potentially interesting for breeding because of the resistance they displayed in multiple experiments. For example, the SNP marker specific for M1, the haplotype group associated with the strongest resistance against most pathotypes, offers unique opportunities to trace this effective resistance allele in advanced breeding germplasm (**Figure 5A**). Similarly, the SNP markers tagging M2 (**Figure 5B**) and M3 (**Supplementary Table 5**) can be employed for breeding to provide resistance against UG 61–63 (and the region of its occurrence). The SNP markers specific for the Andean haplotype groups A1 and A2 can be used to improve ALS resistance in the Andean gene pool, although their effectiveness is limited to a few pathotypes only. Genomic positions of the SNPs specific for all but one of the seven haplotype groups as well as for the resistance locus on Chr 4 are provided in **Supplementary Table 5**.

#### DISCUSSION

This study is the first to thoroughly evaluate the pathotype-specific response of ALS in common bean on the genetic level. Through GWAS in the largest yet assembled diversity panel segregating for ALS resistance, a pathotype-specific resistance locus, likely *Phg-4*, and a broad-spectrum resistance locus coinciding with *Phg-2* were effective against a variety of ALS pathotypes from Colombia and Uganda. For the latter locus, a high haplotype diversity was found, with at least seven different haplotype groups providing resistance in a pathotype-specific manner. Molecular markers specific for resistance loci and haplotype groups will facilitate breeding for pathotype-specific ALS resistance through marker-assisted introgression strategies.

#### No Effect of *Phg-1*, *Phg-3*, and *Phg-5* Against the ALS Pathotypes Tested

In common bean, ALS resistance is reportedly controlled by five major resistance loci, named *Phg-1* to *Phg-5* (Souza et al., 2016). Our study revealed a preeminent role of *Phg-2*, representing the unmatched source of resistance in effectively all experiments, while *Phg-1*, *Phg-3*, and *Phg-5* did not appear to be relevant. This is unexpected as the resistance loci *Phg-1* and *Phg-5*, originating from the resistance sources AND 277, CAL 143, and G5686 that were extensively used as progenitors in the CIAT breeding program, were present in the extBALSIT panel at frequencies sufficiently high to be detected by GWAS. Our observation may be a consequence of the strong pathotype-specificity of *P. griseola* and the differences in pathotypes prevalent within regions, countries, and continents. Experiments that led to the discovery of *Phg-1* and *Phg-3* were conducted with ALS evaluation protocols comparable to ours using pathotypes of the races 63–23 and 63–39 from Brazil (Gonçalves-Vidigal et al., 2011; Gonçalves-Vidigal et al., 2013). *Phg-5* was discovered in CAL 143 using a pathotype of race 0–39 and natural field evaluations in Brazil, and in G5686 using a pathotype of race 31–0 from Colombia (Oblessuc et al., 2012; Oblessuc et al., 2013; Keller et al., 2015). Brazilian pathotypes are known to be very aggressive on the current differentials (Balbi et al., 2009; Nietsche et al., 2001; Sartorato and Alzate-Marin, 2004; Silva et al., 2008), and it is possible that specific resistance genes are effective against these pathotypes. Future experiments should involve resistance evaluations of the extBALSIT panel with additional pathotypes, particularly from Brazil, where the resistance loci *Phg-1*, *Phg-3*, and *Phg-5* were observed to be effective.

FIGURE 5 | Candidate SNPs for marker-assisted selection of *Phg-2* haplotypes. Shown is the phenotypic distribution of ALS scores of the two alleles at the SNPs, which are specific for the functional haplotypes M1 (A) and M2 (B). The SNPs on chromosome 8 at position 61,901,182 bp and 62,188,623 bp of the Pv2.1 reference genome that co-segregated with the haplotype groups M1 and M2, respectively, were used. On the y-axis, ALS response scored on a scale from 1 to 9 is shown, where scores below 3 (dashed line) are considered resistant. On the x-axis, greenhouse and field trials are indicated, and for each trial, the ALS-resistance response of the two alleles of the SNP is plotted.

In a similar study on resistance to anthracnose in common bean, an Andean bean diversity panel was tested with eight different pathotypes. In contrast to our study that only revealed a small subset of previously reported ALS resistance loci, GWAS for anthracnose resistance found the majority of the known resistance loci to be involved (Zuiderveen et al., 2016). Our findings underline the importance of the pathotypes for the efficacy of disease resistance in common bean and call for an increased understanding of the pathogen population structure and virulence to allow prediction of effectiveness of resistance loci. Once the population structure of the pathogen is better known, established GWAS panels can be used to study the pathotype-specificity within and between sub-populations.

#### *Phg-2* Is an Important ALS Resistance Locus With Functional Haplotypes From the Mesoamerican and the Andean Background

The *Phg-2* locus is one of the most important ALS resistance loci in common bean and was originally described in the Mesoamerican cultivar Mexico 54 (Sartorato et al., 1999; Sartorato et al., 2000). In the meantime, several additional Mesoamerican common bean lines were found to contain ALS resistance, either at or in close proximity to *Phg-2* on Chr 8 (Nietsche et al., 2000; Mahuku et al., 2004; Namayanja et al., 2006; Mahuku et al., 2011). This led to the hypothesis that *Phg-2* originated from the Mesoamerican gene pool, and hence, several breeding efforts aimed at its introgression into the Andean gene pool. Our study revealed that ALS resistance at *Phg-2* can also be found in the Andean gene pool: through cluster analysis on the basis of the genotypic data in the *Phg-2* region, we were able to classify eleven haplotype groups, at least seven of which appeared to be functionally different, leading to distinct patterns of resistance against the tested ALS pathotypes. Not only were resistance- and susceptibility-associated haplotypes found in both gene pools, but also within each gene pool different haplotype groups of this resistance locus provided resistance to some, but not all, evaluated ALS pathotypes.

#### Genetic Determination of Pathotype-Specificity at *Phg-2* on Chromosome 8

The different haplotype groups at *Phg-2* largely explained pathotype-specificity for ALS response. For further understanding of the detailed interaction on the molecular level, the underlying genetic determinants need to be identified. To date, the causal genes of the ALS resistance loci, including *Phg-2*, have yet to be determined. Based on our data, it remains difficult to resolve whether the resistance at *Phg-2* is conferred by an allelic series at one resistance gene or by several resistance genes arrayed in clusters within the haplotype region defined by the significantly associated SNP markers.

Both resistance gene clusters and allelic series occur commonly in plants (Keller et al., 2000). In common bean, several allelic series have been reported for anthracnose resistance: five alleles each have been described for the *Co-1* and *Co-3* loci, three alleles for the *Co-4* locus, and two alleles for the *Co-5* locus (BIC Genetics Committee, 2017). In wheat, up to 17 alleles have been found for the powdery mildew resistance gene *Pm3* that showed different pathotype-specific reactions (Bhullar et al., 2010). The *Pm3* gene encodes a classical nucleotide-binding leucine-rich repeat (NB-LRR) receptor, and alleles were highly similar, with usually only single amino acid changes differing between the alleles (Yahiaoui et al., 2009). This pattern reflects the evolutionary mechanisms that promote the genetic diversification of resistance loci (Ellis et al., 2000; Bergelson et al., 2001).

Although the presence of allelic series may be a plausible explanation for ALS pathotype-specificity, the large extent of the *Phg-2* region, spanning 1.78 Mbp and including 265 annotated genes, as well as the pathotype-specific significance peaks at distinct positions within this interval, indicates the involvement of multiple genes. Indeed, the presence of several candidate NB-LRR resistance genes at *Phg-2* in the Andean reference genome and their distinct expression in leaf tissue, as is the case with Phvul.008G284500 and Phvul.008G285300 (Phytozome, 2018), strengthens this hypothesis.

While these are probable candidate genes, it should be noted that the most effective haplotype groups at *Phg-2* originated from the Mesoamerican gene pool, while the reference genome used for SNP discovery and gene identification is derived from the Andean gene pool (Schmutz et al., 2014). Resistance gene clusters are repetitive arrays of highly similar gene sequences that are often difficult to assemble correctly (Belser et al., 2018). Moreover, they usually differ in the number of repeats between common bean lines and gene pools, and hence, one reference genome might not be fully representative of the structural diversity at resistance loci. In recent years, novel genome assembly strategies including long-read sequencing technologies have been developed to assemble such regions more accurately. With the increased availability of pan-genomes, it will be possible to take into account even the genetic rearrangements between common bean lines.

#### Implications for ALS Resistance Breeding

ALS is one of the most devastating common bean diseases, particularly affecting smallholder farmers in low-input agricultural systems. The results are production losses to the poorest, who depend most on the harvest from their fields for food security. Breeding for resistance to ALS and other biotic and abiotic stresses has been ongoing for a long time in common bean breeding programs of the tropics (Beebe, 2012), but in the future, breeding needs to respond more quickly than in the past to assure food security and adequate nutrition. Globalization and the increased human mobility have led to a globalization of plant pathogens and will continue to facilitate the exchange of genetic pathogen diversity (Anderson et al., 2004; Jeger et al., 2011; Fisher et al., 2012; Santini et al., 2018). An additional process that is expected to heavily affect plant pathogen dynamics is climate change. The increased warming and occurrence of extreme weather events will have effects on prevalence and plant-pathogen interactions (Anderson et al., 2004; Elad and Pertot, 2014). In the case of the tropical disease ALS, global warming will likely expand its range, and global mobility will lead to a mixing of pathogen populations previously separated by distance. More effective breeding methods are therefore urgently required to develop the varieties that will feed the growing future populations in developing countries. The research presented here will increase breeding efficiency for ALS resistance by providing a screening panel that can be used to find effective resistance loci in different areas. The molecular markers linked to resistance loci and resistance haplotypes will allow development of resistant lines without direct phenotypic screening in the region.

Resistance gene pyramiding is usually the suggested strategy to ensure durable disease resistance for highly virulent pathogens such as *P. griseola* (Pastor-Corrales et al., 1998). The fact that ALS resistance in nearly all trials was conferred by the different haplotypes at *Phg-2* renders pyramiding difficult or impossible, depending on whether the causal genes are different genes within a resistance gene cluster or an allelic series, respectively. Until the causal genes are known, the haplotype groups with very high effectiveness on both continents, M1 for Colombia and M2 and M3 for Uganda, provide the most sustainable strategy to control ALS by marker-assisted selection in one of the globally most important food security crops. However, given the risk that resistance sources become ineffective, it is crucial to seek new ALS resistant common bean lines and elucidate the genetics of their resistance (Nay et al., 2019).

#### DATA AVAILABILITY

The datasets analyzed and generated for this study can be found on dataverse at https://doi.org/10.7910/DVN/U2BAWN.

#### AUTHOR CONTRIBUTIONS

BR and BS conceived the study. MN conducted the experiments and performed data analysis and interpretation. BR, BS, and CM assisted in the experimental setup and data analysis. MN drafted the manuscript, which was improved by BS, BR, and CM. All authors read and approved the final manuscript.

#### FUNDING

This project was funded by ETH Global, the Sawiris Foundation for Social Development and supported by the Molecular Plant Breeding group of ETH Zurich. We would also like to acknowledge funding support from the Tropical Legumes III – Improving Livelihoods for Smallholder Farmers: Enhanced Grain Legume Productivity and Production in Sub-Saharan Africa and South Asia (OPP1114827) project funded by the Bill & Melinda Gates Foundation.

#### ACKNOWLEDGMENTS

We would like to acknowledge the help of H. F. Buendia, A. E. Portilla and V. M. Mayor for technical assistance during field trials and seed handling, D. Ariza and S. Yates for help with bioinformatics analyses and E. Macea and L. M. Diaz for lab assistance. Furthermore, we would like to acknowledge CIAT pathology unit for assistance with pathogen handling. In addition, we would like to thank the staff at CIAT Uganda, specifically F. Kato, S. Musoke, S. Sulaiman and C. Acam for help in conducting field and greenhouse trials.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01126/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Nay, Mukankusi, Studer and Raatz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Overexpression of Medicago *MtCDFd1\_1* Causes Delayed Flowering in Medicago *via* Repression of *MtFTa1* but Not *MtCO-Like* Genes

*Lulu Zhang1, Andrew Jiang1, Geoffrey Thomson1, Megan Kerr-Phillips1, Chau Phan1, Thorben Krueger1, Mauren Jaudal1, Jiangqi Wen2, Kirankumar S. Mysore2 and Joanna Putterill1\**

*1 The Flowering Lab, School of Biological Sciences, University of Auckland, Auckland, New Zealand, 2 Noble Research Institute, Ardmore, OK, United States*

#### *Edited by:*

*Matthew Nicholas Nelson, Commonwealth Scientific and Industrial Research Organisation, Australia*

#### *Reviewed by:*

*Jim Weller, University of Tasmania, Australia Takato Imaizumi, University of Washington, United States*

#### *\*Correspondence:*

*Joanna Putterill j.putterill@auckland.ac.nz*

#### *Specialty section:*

*This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science*

*Received: 08 April 2019 Accepted: 22 August 2019 Published: 19 September 2019*

#### *Citation:*

*Zhang L, Jiang A, Thomson G, Kerr-Phillips M, Phan C, Krueger T, Jaudal M, Wen J, Mysore KS and Putterill J (2019) Overexpression of Medicago MtCDFd1\_1 Causes Delayed Flowering in Medicago via Repression of MtFTa1 but Not MtCO-Like Genes. Front. Plant Sci. 10:1148. doi: 10.3389/fpls.2019.01148*

Optimizing flowering time is crucial for maximizing crop productivity, but gaps remain in the knowledge of the mechanisms underpinning temperate legume flowering. *Medicago*, like winter annual *Arabidopsis*, accelerates flowering after exposure to extended cold (vernalization, V) followed by long-day (LD) photoperiods. In *Arabidopsis*, photoperiodic flowering is triggered through CO, a photoperiodic switch that directly activates the *FT* gene encoding a mobile florigen and potent activator of flowering. In *Arabidopsis*, several CYCLING DOF FACTORs (CDFs), including AtCDF1, act redundantly to repress *CO* and thus *FT* expression, until their removal in LD by a blue-light-induced F-BOX1/GIGANTEA (FKF1/GI) complex. *Medicago* possesses a homolog of *FT*, *MtFTa1*, which acts as a strong activator of flowering. However, the regulation of *MtFTa1* does not appear to involve a *CO-like* gene. Nevertheless, work in pea suggests that CDFs may still regulate flowering time in temperate legumes. Here, we analyze the function of *Medicago MtCDF* genes with a focus on *MtCDFd1\_1* in flowering time and development. *MtCDFd1\_1* causes strong delays to flowering when overexpressed in *Arabidopsis* and shows a cyclical diurnal expression in *Medicago* with peak expression at dawn, consistent with *AtCDF* genes like *AtCDF1*. However, *MtCDFd1\_1* lacks predicted GI or FKF1 binding domains, indicating possible differences in its regulation from *AtCDF1*. In *Arabidopsis*, CDFs act in a redundant manner, and the same is likely true of temperate legumes as no flowering time phenotypes were observed when *MtCDFd1\_1* or other *MtCDF*s were knocked out in *Medicago Tnt1* lines. Nevertheless, overexpression of *MtCDFd1\_1* in *Medicago* plants resulted in late flowering relative to wild type in inductive vernalized long-day (VLD) conditions, but not in vernalized short days (VSDs), rendering them day neutral.
Expression of *MtCO-like* genes was not affected in the transgenic lines, but LD-induced genes *MtFTa1*, *MtFTb1*, *MtFTb2*, and *MtSOC1a* showed reduced expression. Plants carrying both the *Mtfta1* mutation and *35S:MtCDFd1\_1* flowered no later than the *Mtfta1* plants. This indicates that *35S:MtCDFd1\_1* likely influences flowering in VLD *via* repressive effects on *MtFTa1* expression. Overall, our study implicates *MtCDF* genes in photoperiodic regulation in *Medicago* by working redundantly to repress *FT-like* genes, particularly *MtFTa1*, but in a *CO*-independent manner, indicating differences from the *Arabidopsis* model.

Keywords: *CYCLING DOF FACTOR*, *MtCDFd1\_1*, *MtFTa1*, *MtFTb*, *CO*, *Medicago*, flowering time, primary axis elongation

### INTRODUCTION

Plants integrate several molecular pathways to control when they flower to maximize reproductive fitness and successful development of seeds and fruit (Fornara et al., 2010; Srikanth and Schmid, 2011; Andrés and Coupland, 2012). One of these pathways involves the responsiveness to changes in day length (photoperiod), which plays a vital role in the plant's ability to synchronize flowering time with favorable seasonal conditions (Putterill et al., 2004). For example, in temperate plants such as winter annual *Arabidopsis thaliana* (*Arabidopsis*) and the legume *Medicago truncatula* (*Medicago*), extended winter cold (vernalization, V) followed by exposure to long-day (LD) photoperiods, a feature of spring and early summer, promotes flowering.

The well-characterized *Arabidopsis* LD pathway promotes flowering *via* the accumulation of CONSTANS (CO) protein in the leaves, which directly activates the expression of the potent floral activator *FLOWERING LOCUS T* (*FT*) in the late afternoon of LD, but not in short days (SDs). *FT* encodes a mobile florigen that moves to the shoot apical meristem and initiates the transition to flowering *via* activation of genes such as *SUPPRESSOR OF OVEREXPRESSION OF CONSTANS 1* (*SOC1*; Andrés and Coupland, 2012). Several factors converge to facilitate the accumulation of CO protein in LD including releasing the *CO* gene from transcriptional repression by CYCLING DOF FACTOR (CDF) transcription factors. This occurs *via* the light-induced formation of the FLAVIN-BINDING, KELCH REPEAT, F-BOX1/GIGANTEA (FKF1/GI) complex which targets the CDFs for degradation *via* the proteasome, which in turn enables the transcription of *CO* (Imaizumi et al., 2005; Sawa et al., 2007; Fornara et al., 2009; Song et al., 2012; Goralogia et al., 2017). In addition, there is direct regulation of *FT* by AtCDF1 (Song et al., 2012).

The acceleration of flowering by *FT-like* genes is conserved in a diverse range of species (Wickland and Hanzawa, 2015; Putterill and Varkonyi-Gasic, 2016) including the *FTa1* gene in the temperate legumes *Pisum sativum* (pea) and *Medicago* (Hecht et al., 2011; Laurie et al., 2011). Temperate legumes are of particular interest as many serve as important agricultural crops with flowering time playing a significant role in annual production yields (Graham and Vance, 2003; Weller and Ortega, 2015).

However, increasing evidence suggests that temperate legume species operate with a *CO*-independent mechanism for the regulation of *FT-like* genes and thus flowering (Putterill et al., 2013; Weller and Ortega, 2015). Analysis of *Medicago CO-like* (*COL*) genes revealed that they were unable to complement the *Arabidopsis co* null mutant and did not promote flowering when overexpressed (Wong et al., 2014). *Medicago col* null mutant lines did not have a flowering phenotype under LD and therefore were unlikely to be involved in the *Medicago* photoperiodic response (Wong et al., 2014). An additional difference is that there are three LD-induced *FT* genes in *Medicago*, but none have the same diurnal pattern of expression as *Arabidopsis FT*, suggesting a different regulatory mechanism (Laurie et al., 2011). Thus, there is a substantial knowledge gap in our understanding of photoperiodic flowering in these species (Hecht et al., 2005, 2011; Laurie et al., 2011; Putterill et al., 2013; Weller and Ortega, 2015; Ridge et al., 2016).

Despite the apparent lack of a functional *CO* in temperate legumes, legume *CDF*s appear to still participate in photoperiodic flowering. Specifically in garden pea, the dominant late-flowering *LATE2* mutant was recently mapped to a *CDF* homolog, *PsCDFc1*. Yeast two-hybrid assays indicate that the mutation disrupts the binding of PsFKF1 to PsCDFc1, indicating that increased PsCDFc1 protein stability may be the basis of the dominant phenotype (Ridge et al., 2016). Plants carrying the *late2*/*Pscdfc1* mutation have reduced expression of LD-induced *FT-like* genes, but not *PsCOL* genes. This indicates that CDFs participate in the photoperiodic regulation of flowering in pea but that the mechanism differs to that of *Arabidopsis* (Ridge et al., 2016).

*CDFs* were first characterized in *Arabidopsis* and are a subset of the plant-specific DNA-binding One Zinc Finger (DOF) gene family of transcription factors (Yanagisawa, 2002; Noguero et al., 2013). They are distinguished by their cyclical diurnal transcript levels, with the majority of genes showing peak transcript accumulation early in the day. In *Arabidopsis*, CDFs have an overlapping role in photoperiodic flowering control as single *AtCDF* mutants have either no or only weak flowering time phenotypes, but a quadruple *Atcdf1–3,5* mutant has day-neutral early flowering (Imaizumi et al., 2005; Fornara et al., 2009).

In *Medicago*, phylogenetic analysis has revealed a total of 42 *Medicago* DOF proteins clustered into four phylogenetic clades (Shu et al., 2015). One of these clades, MCOGD, contains all of the 13 MtCDF-like proteins, which in turn group into several subclades (Shu et al., 2015; Ridge et al., 2016). These are expressed predominantly in leaf blades, nodules, and buds (Shu et al., 2015), with expression in leaves consistent with a role in photoperiodic flowering (Turck et al., 2008).

Here, we analyze the function of *MtCDF* genes in the regulation of *Medicago* flowering, with a focus on *MtCDFd1\_1*. We analyzed the gene expression patterns of *MtCDF*s in VLD and VSD RNA-Seq morning time courses and surveyed plants carrying transposon insertions in *MtCDF* genes. While flowering time phenotypes were not observed in individual *Medicago* mutants, overexpressing the genes in *Arabidopsis* identified five genes, including *MtCDFd1\_1*, which caused strong delays to flowering. We then examined the effect of overexpressing *MtCDFd1\_1* in *Medicago* on plant development, flowering time, and the expression of known flowering time genes. Collectively, our results implicate *MtCDF* genes as regulators of photoperiodic flowering and plant architecture *via* the repression of *FT-like* genes, such as *MtFTa1*.

### MATERIALS AND METHODS

#### Bioinformatics

Legume and other plant CDF protein sequences were obtained from the literature (Shu et al., 2015; Ridge et al., 2016) and by BLASTP searches with AtCDF1 of the J. Craig Venter Institute (JCVI) *Medicago* genome (Mt4.0 http://www.jcvi.org/medicago/) and National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/). The *Medicago MtCDF* gene identifiers and names are listed in **Table S1**. The phylogenetic tree of CDF-like proteins from *Medicago*, other legumes, tomato, potato, and *Arabidopsis* was constructed by aligning full-length amino acid sequences using MUSCLE (version 3.8.425; Edgar, 2004) as implemented in Geneious (version 11.1.5) and using the neighbor-joining algorithm implemented in PAUP\* (version 4.0; Swofford, 2003). An existing RNA-Seq dataset (Thomson et al., 2019) comprising three biological replicates was consulted to obtain the mean abundance of *MtCDF*-*like* gene transcripts in leaf tissue at 0, 2, and 4 h after dawn in transcripts per million (TPMs) in SD and LD photoperiods. *Medicago Tnt1* retroelement insertion lines were identified by screening the FST database (https://medicago-mutant.noble.org/mutant/blast/blast.php) and are listed in **Table S1**.

#### Plant Materials and Growth Conditions

*Medicago truncatula* (*Medicago*) wild type Jester (Hill, 2000) and R108-1\_C3 (R108; Trinh et al., 1998) used in this study belong to *Medicago truncatula* Gaertn (barrel medic), ssp. *truncatula* and ssp. *tricycla*, respectively. All *Tnt1* insertion mutants in the R108 background listed in **Table S1** were obtained from the Noble Research Institute, LLC (Ardmore, OK, USA). *Arabidopsis thaliana* (*Arabidopsis*) wild type Columbia was used. The *Mtfta1* mutant utilized was NF1634 (Jaudal et al., 2019).

*Medicago* and *Arabidopsis* plants were grown in controlled environments under ~200 μM m−2 s−1 cool white fluorescent light at 22°C or 24°C and under ~140 μM m−2 s−1 at 22°C, respectively, in LDs (16 h light/8 h dark) or SDs (8 h light/16 h dark), with or without prior vernalization of germinated seeds at 4°C for 21 days, as previously described (Laurie et al., 2011; Yeoh et al., 2013; Jaudal et al., 2015). *Medicago* flowering time was measured in days to when the first floral bud was observed by eye and the number of nodes on the primary axis at flowering. *Arabidopsis* flowering time was measured in days to when the first floral buds were observed by eye and the total number of rosette and cauline leaves at flowering.

*CaMV 35S* overexpression constructs were made by inserting complementary DNA (cDNA) sequences into vector pB2GW7 (Karimi et al., 2007) using Gateway® Technology (Invitrogen®, CA, USA). Forward and reverse primers used for Gateway cloning are shown in **Table S2**. Transgenic *Arabidopsis* plants overexpressing *MtCDF* genes were generated using *Agrobacterium tumefaciens* GV3101 containing overexpression constructs *via* floral dipping and Basta selection of the T1 population as previously described (Martinez-Trujillo et al., 2004; Jaudal et al., 2015).

Transgenic R108 *Medicago* plants overexpressing *MtCDFd1\_1* were generated using *A. tumefaciens* EHA105 with the *35S:MtCDFd1\_1* construct *via* somatic embryogenesis and subsequent BASTA selection in soil as previously described (Cosson et al., 2006; Laurie et al., 2011).

*35S:MtCDFd1\_1* plants and *Mtfta1* heterozygous plants were crossed together (Chabaud et al., 2006) and then bred and genotyped to identify F2 *35S:MtCDFd1\_1*/*Mtfta1* homozygous mutant plants.

#### Quantitative Reverse Transcriptase PCR (qRT-PCR) for Gene Expression

RNA extraction and cDNA synthesis using an oligo dT primer was carried out as previously described (Laurie et al., 2011; Yeoh et al., 2013; Jaudal et al., 2015). qRT-PCR was performed using SYBR® green chemistry on Applied Biosystems® 7900HT Sequence Detection System (Applied Biosystems®, CA, USA) or QuantStudio™ 5 Real-Time PCR System (Applied Biosystems®, CA, USA). Each data point is derived from three biological replicates harvested in parallel. Each replicate consisted of a pool of leaf tissue from either two or three independent plants. Primer sequences used for qRT-PCR are listed in **Table S2**. Gene expression was calculated using the comparative Ct method (Livak and Schmittgen, 2001) with modifications (Bookout and Mangelsdorf, 2003). Samples were normalized to *PROTEIN PHOSPHATASE 2A* (*PP2A*; *Medtr6g084690*).
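As a rough illustration of the comparative Ct calculation described above, the following sketch computes target expression relative to a reference gene (here standing in for *PP2A*) as 2^−ΔCt, averages three biological replicates, and scales each mean to the highest value, as is done for the figures. The Ct values, function names, and the assumption of perfect amplification efficiency are hypothetical, and the modifications of Bookout and Mangelsdorf (2003) are not reproduced.

```python
from math import sqrt
from statistics import mean, stdev


def relative_expression(ct_target, ct_reference):
    """Target expression relative to the reference gene: 2^-(dCt).

    Assumes an amplification efficiency of exactly 2 (an idealization).
    """
    return 2.0 ** -(ct_target - ct_reference)


def summarize(replicates):
    """Return (mean, standard error) for a list of (ct_target, ct_reference) pairs."""
    values = [relative_expression(t, r) for t, r in replicates]
    return mean(values), stdev(values) / sqrt(len(values))


def scale_to_max(means):
    """Express each mean relative to the highest mean in the set."""
    top = max(means.values())
    return {name: value / top for name, value in means.items()}


# Hypothetical Ct values (target, reference) for three biological replicates.
ct_data = {
    "R108": [(24.0, 22.0), (24.1, 22.0), (23.9, 22.0)],
    "35S:MtCDFd1_1": [(27.0, 22.0), (27.2, 22.1), (26.9, 22.0)],
}

means = {genotype: summarize(reps)[0] for genotype, reps in ct_data.items()}
scaled = scale_to_max(means)

for genotype, value in scaled.items():
    print(f"{genotype}: {value:.3f} relative to highest")
```

In this made-up example the target transcript in the transgenic line is roughly eightfold lower than in R108, because its ΔCt is about three cycles larger.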

The statistical testing for the gene expression data presented in **Figures 4** and **6** was performed using a one-way analysis of variance (ANOVA) test between the means (α = 0.05). The Shapiro–Wilk normality assumption test was performed on all data presented. Multiple pairwise comparisons adjusted for false discovery rate (FDR) were utilized to highlight statistically significant differences in the data presented.
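The testing steps above can be sketched in a few lines. The block below computes a one-way ANOVA F statistic and applies a Benjamini-Hochberg adjustment, which is one common way to control the false discovery rate. The source does not state which FDR procedure or software was used, so the choice of method, the function names, and all numbers are assumptions for illustration only; the pairwise tests that would produce the raw p-values are not shown.

```python
from statistics import mean


def one_way_anova_f(groups):
    """F statistic for a one-way ANOVA across a list of groups of values."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical relative expression values: three genotypes, three replicates each.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat = one_way_anova_f(groups)

# Hypothetical raw p-values from pairwise comparisons against the control.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [p <= 0.05 for p in adj_p]

print(f"F = {f_stat:.2f}")
print([round(p, 4) for p in adj_p], significant)
```

With these invented inputs, only the comparison with the smallest raw p-value remains below α = 0.05 after adjustment, which shows how the correction can change which differences are reported as significant.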

#### Yeast Two-Hybrid Assays

Full-length coding sequences of *MtCDFd1\_1*, *MtCDFc1*, and *AtCDF1* and the KELCH-repeat region of *AtFKF1* (amino acids 284 to 619; Imaizumi et al., 2005; Ridge et al., 2016) were used for the yeast two-hybrid assay. Gene fragments were cloned into Invitrogen destination vectors pDEST22 (AD, prey) and pDEST32 (BD, bait). The prey and bait constructs were transformed into the haploid yeast strains PJ69-4A and PJ69-4α (James et al., 1996), respectively, and selected on synthetic defined (SD) medium lacking tryptophan (Trp; prey) or leucine (Leu; bait). PJ69-4A and PJ69-4α strains were then mated, and diploid clones with both constructs were selected on medium lacking Trp and Leu (SD −Trp −Leu). Haploids containing empty pDEST22 and pDEST32 were also included to test autoactivation. Two independent diploid clones from each mating were diluted in 100 µl of water and plated on nonselective medium (SD −Trp −Leu) and selective medium [SD −Trp −Leu −histidine (His)] with different 3-amino-1,2,4-triazole (3-AT) concentrations (0, 1, 2, 10, 25, 50, and 100 mM). Colonies developed over 11 days at 28°C. Photos were taken on days 4, 7, 9, and 11. Similar results were obtained for each of the two independent clones. The positive control interactors were AtCDF1 and AtFKF1.

### RESULTS

#### Initial Characterization of 14 *MtCDF* Genes

To investigate the role of *MtCDF* genes in *Medicago* flowering time, we selected 14 *MtCDF* genes for initial analysis. These were 13 *MtCDF*s identified previously (Shu et al., 2015; Ridge et al., 2016) and a 14th related gene (*MtCDF1*) that we previously observed to have cyclical diurnal expression with an afternoon peak (Thomson et al., 2019). **Table S1** lists the *MtCDF* gene identifiers (JCVI Medicago genome Mt4.0) and corresponding gene names following the nomenclature in Ridge et al. (2016). The phylogenetic groupings of the predicted proteins along with AtCDFs are shown in **Figure 1A**, with a more comprehensive phylogenetic tree containing additional CDF proteins from legumes, tomato, and potato shown in **Figure S1**.

Protein sequence alignments of *Medicago* and *Arabidopsis* CDFs (**Figure S2**) highlighted the highly conserved DOF domain in all the MtCDF proteins and MtCDF1. However, five proteins (MtCDF1, MtCDFd1\_1, MtCDFd1\_2, MtCDFd1\_3, and MtCDFe) and two *Arabidopsis* CDFs (AtCOG1 and AtCDF4) lacked the two C-terminal regions that in *Arabidopsis* function as FKF1- and GI-binding domains. Two MtCDFs (MtCDF1 and MtCDFe) also lacked the predicted N-terminal TOPLESS (TPL)-binding domain. Recently, CDFs in *Arabidopsis* have been shown to form a complex with TPL (Goralogia et al., 2017); hence, the lack of TPL domains in these MtCDFs may indicate a functional divergence.

We analyzed expression of the 14 *MtCDF* genes (**Figure 1**) in an RNA-Seq dataset (Thomson et al., 2019) derived from leaves of plants grown in LD and SD after vernalization (V) and harvested at three time points: dawn and 2 and 4 h after dawn. We detected reads mapping to all 14 *MtCDF* genes, confirming that they are expressed in leaves as previously observed (Shu et al., 2015) and consistent with a potential role in photoperiodic flowering.

Transcript abundance varied >70-fold between the genes (**Figures 1B**–**O**). The four most abundant were *MtCDFa2*, *MtCDFc1*, *MtCDFb2*, and *MtCDFd1\_1*. Most genes (11/14; *MtCDF1*, *MtCDFa2*, *MtCDFb1*, *MtCDFb2*, *MtCDFc1*, *MtCDFc2\_1*, *MtCDFc2\_2*, *MtCDFc2\_4*, *MtCDFd1\_1*, *MtCDFd1\_2*, and *MtCDFd2*) were significantly differentially expressed between the two photoperiods. These included three genes, *MtCDFd1\_1*, *MtCDFb2*, and *MtCDFc1*, that were differentially expressed between the photoperiods at all three time points.
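To make the abundance comparison concrete, the sketch below averages replicate TPM values per photoperiod and time point and then takes the ratio of the highest to the lowest per-gene peak. How the fold difference was actually derived is not stated in the text, so this peak-to-peak definition, the placeholder gene names, and all TPM values are assumptions.

```python
from statistics import mean

# Hypothetical TPM values: gene -> photoperiod -> hours after dawn -> replicates.
tpm = {
    "geneA": {
        "LD": {0: [140, 150, 160], 2: [90, 100, 110], 4: [40, 50, 60]},
        "SD": {0: [190, 200, 210], 2: [120, 130, 140], 4: [70, 80, 90]},
    },
    "geneB": {
        "LD": {0: [1, 2, 3], 2: [2, 2, 2], 4: [1, 1, 1]},
        "SD": {0: [1, 1, 1], 2: [1, 2, 3], 4: [1, 1, 1]},
    },
}


def mean_tpm(gene_data):
    """Mean TPM across replicates for every (photoperiod, time point) pair."""
    return {
        (photoperiod, hour): mean(replicates)
        for photoperiod, by_hour in gene_data.items()
        for hour, replicates in by_hour.items()
    }


def peak_abundance(gene_data):
    """Highest mean TPM reached by a gene in any condition."""
    return max(mean_tpm(gene_data).values())


def fold_range(dataset):
    """Ratio of the highest to the lowest per-gene peak abundance."""
    peaks = [peak_abundance(gene_data) for gene_data in dataset.values()]
    return max(peaks) / min(peaks)


print(f"Abundance varies {fold_range(tpm):.0f}-fold between genes")
```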

Further analysis of *MtCDFd1\_1* by qRT-PCR over a full day (**Figure S3**) indicated that the transcript of this gene has a diurnal cycle that is modulated by LD and SD photoperiods similar to the *Arabidopsis CDFs* (*AtCDF1-3,5*; Imaizumi et al., 2005; Fornara et al., 2009).

#### No Altered Flowering Time Phenotypes Were Observed in Medicago *MtCDF Tnt1* Insertion Lines

To investigate the function of the *MtCDF* genes, we screened the *Medicago Tnt1* flanking sequence database for candidate mutant *Medicago* plant lines with knockout *Tnt1* retroelement insertions in *MtCDF* genes. The results are summarized in **Table S1**. Lines homozygous for *Tnt1* insertions in 13 out of the 14 genes (the exception was *MtCDFd2*) were found.

In total, we identified 27 candidate plant lines, genotyped them for the presence of the *Tnt1* insertion, examined their gene expression, and scored their flowering time in VLD, LD, and VSD. Knockout, or knockdown, of gene expression was confirmed by qRT-PCR in 11/13 homozygous lines, except *MtCDFb1* and *MtCDFb2*, where the insertions were located in introns. However, no altered flowering time phenotypes were observed in any single mutant, which may be attributable to functional redundancy between some of the genes, as observed in *Arabidopsis* (Fornara et al., 2009).

#### Overexpression of *MtCDFd1\_1* and Four Other *MtCDF* Genes Causes Delayed Flowering in Arabidopsis

In previous work, overexpression of *AtCDF* genes, including *AtCDF1*, caused delayed *Arabidopsis* flowering (Imaizumi et al., 2005; Fornara et al., 2009). On the other hand, overexpression of wild-type pea *PsCDFc1* in *Arabidopsis* did not give late-flowering transgenic plants (Ridge et al., 2016). Only overexpression of the mutant version of *PsCDFc1* from the *late2* mutant resulted in late-flowering *Arabidopsis* plants (Ridge et al., 2016).

Here, having not observed mutant phenotypes in *Medicago MtCDF* knockout lines (**Table S1**), we turned to *Arabidopsis* to use as a rapid heterologous system for testing if any of the *MtCDFs* might regulate *Arabidopsis* flowering time. If such *MtCDF* genes were to be identified in this screen, then one would be selected for the overexpression functional analysis in *Medicago*.

We constitutively expressed 11 genes (*MtCDF1*, *MtCDFa2*, *MtCDFb1*, *MtCDFb2*, *MtCDFc1*, *MtCDFc2\_1*, *MtCDFc2\_4*, *MtCDFd1\_1*, *MtCDFd1\_2*, *MtCDFd1\_3*, and *MtCDFe*) from across different subclades in wild-type *Arabidopsis* and measured flowering time (**Figure 1A** and **Figure S1**). Expression constructs were made by fusing the *MtCDF*s to the 35S promoter and then introduced into wild-type Columbia plants. The flowering time of T1 *Arabidopsis* transformants and photographs of selected T2 and T3 progeny are presented in **Figure 2**.

Overexpression of five of the genes tested (*MtCDFa2*, *MtCDFc1*, *MtCDFd1\_1*, *MtCDFd1\_3*, and *MtCDFe*) resulted in strong delays to flowering in multiple independent T1 lines in LD, compared to Columbia (**Figures 2A**, **B**). Interestingly, these genes arise from different *MtCDF* subclades (**Figure 1A** and **Figure S1**). Overexpression of two other genes (*MtCDFb1* and *MtCDFb2*) produced several transgenic plants that showed a slight delay in flowering time, while overexpression of four genes (*MtCDF1*, *MtCDFc2\_1*, *MtCDFc2\_4*, and *MtCDFd1\_2*) had little to no effect on Columbia flowering time (**Figures 2A**, **B**).

In addition to late flowering, the transgenic plants showed unusual aerial architectural phenotypes compared to *Arabidopsis* Columbia plants (**Figure 2C**). Specifically, an abnormal late-flowering phenotype characterized by aerial rosettes and poor fertility was observed in several independent transgenic lines carrying either of two transgenes, *35S:MtCDFd1\_1* or *35S:MtCDFc1*. The aerial rosette phenotype is also seen in some *Arabidopsis* plants where the floral transition is delayed, including through disruptions of floral transition genes such as *SOC1*, *AGAMOUS-like 42* (*AGL42*), *AGL71*, *AGL72* (Dorca-Fornell et al., 2011), *FT*, *TWIN SISTER OF FT* (*TSF*; Hiraoka et al., 2013), *FLOWERING LOCUS C* (*FLC*), *FRIGIDA* (*FRI*), and *AERIAL ROSETTE 1* (*ART1*; Poduska et al., 2003).

derived from 11 *35S:MtCDF* expression vectors and Columbia wild-type *Arabidopsis* in LD conditions. The gray line represents the average leaves at flowering for Columbia; 11.2 ± 0.63 leaves (t.SE 0.05; *n* = 19). (B) Photographs of selected T2- and T3-generation *35S:MtCDF* plants at the time of flowering. (C) Several *35S:MtCDFd1\_1* and *35S:MtCDFc1* transgenic plants displayed aerial rosette phenotypes (white boxes) and poor fertility. Multiple *35S:MtCDFd1\_3* and *35S:MtCDFe* transgenic plants had an upright rosette leaf stature with rigid long-handle spoon-shaped leaves. Additionally, these plants were darker in color with purple abaxial surfaces but had light-colored spots on the older leaves (white boxes in the last panel) and had poor fertility. Age of the plants indicated in days. Yellow scale bars = 2 cm.

In addition, multiple independent lines carrying either of two transgenes, *35S:MtCDFe* or *35S:MtCDFd1\_3*, displayed an upright rosette leaf stature with rigid, long-handled spoon-shaped leaves (**Figure 2C**). These plants also were smaller than wild type, infertile with a lack of primary inflorescence bolting, and darker in color. In addition, in some *35S:MtCDFd1\_3* lines, the older leaves of some plants developed spotty lesions (**Figure 2C**).

In summary, among the *MtCDF*s, *MtCDFa2*, *MtCDFc1*, *MtCDFd1\_1*, *MtCDFd1\_3*, and *MtCDFe* were able to cause strong delays to flowering in multiple transgenic lines when overexpressed in wild-type *Arabidopsis*. The remaining *MtCDF* genes we tested did not appear to have much effect on flowering time in *Arabidopsis* in our experiments, but this may be due to factors such as transgene expression level.

#### Constitutive Expression of *MtCDFd1\_1* in Medicago Causes Late Flowering in VLD

We selected *MtCDFd1\_1* for further functional analysis by overexpression in *Medicago*. This was because its transcript was relatively abundant in *Medicago* leaves and exhibited diurnal cycling in VLD and VSD similar to *AtCDF*s that regulate flowering time redundantly in *Arabidopsis*. Additionally, it caused a strong delay to flowering in multiple independent lines when overexpressed in *Arabidopsis*. However, it was interesting also because its predicted protein sequence differs from these AtCDF proteins and from PsCDFc1/LATE2, which has already been characterized in pea (Ridge et al., 2016), falling into a different subclade (d1, **Figure S1**). It lacks the predicted GI- and FKF1-binding domains (**Figure S2**) and appears not to interact with AtFKF1 in yeast two-hybrid assays (**Figure S4**).

We overexpressed *MtCDFd1\_1* in *Medicago* to assay the effect this would have on flowering time. After co-cultivation of *Medicago* wild-type R108 leaf disks with *Agrobacterium* carrying the *35S:MtCDFd1\_1* transgene, we selected six independent T0 transformants. T1 or T2 progeny was scored for flowering time in two photoperiodic conditions, with and without prior vernalization (VLD, LD, and VSD; **Figure 3A**).

As expected, VLD most strongly accelerated the flowering of R108 wild-type plants, out of the three conditions tested (**Figure 3A**). In contrast, most of the transgenic lines (four of six lines: 4.17, 13.24, 17.34, and 2.2) showed delayed flowering in VLD, in both days and nodes at flowering (**Figure 3A**). In LD, the same four lines showed later flowering than R108 in days to flowering. However, only line 2.2 flowered marginally later in nodes, indicating overall a much weaker flowering time phenotype in LD.

Line 4.17 was then chosen as the representative transgenic line to test in VSD conditions. It had previously shown no phenotypic differences in VLD conditions from three other independent transgenic lines (13.24, 17.34, and 2.2) that also strongly overexpressed *MtCDFd1\_1* (**Figure 4A**). Line 4.17 flowering time was not statistically significantly different to R108 in VSD, indicating that *35S:MtCDFd1\_1* did not confer late flowering relative to wild type in VSD conditions in this line. Additionally, we observed that line 4.17 flowered at a similar time in VSD and VLD. In summary, while *35S:MtCDFd1\_1* caused late flowering in VLD compared to wild type, it had no significant effect in VSD in line 4.17, resulting in day-neutral flowering. Thus, flowering time analysis in VSD was not pursued further.

Wild-type R108 plants grown in VLD conditions also typically show elongation of the primary shoot axis at the time of flowering. Therefore, as might be expected from their late-flowering phenotype, the four late-flowering transgenic lines had a shorter primary axis in VLD compared to R108. This was observed at the flowering of R108 and the *35S:MtCDFd1\_1* plants (**Figures 3B**, **C**).

In addition, the leaves of later-flowering transgenic plants were sometimes paler in color than R108 and the transgenic plants that did not flower late (**Figure 3D**). In the later stage of plant growth, they had trifoliate leaves that curved down (epinastic) while R108 leaves curved upwards (**Figure 3E**). Some late-flowering transgenic plants displayed sterility. This was likely because the top of the pistil was curled down, causing the stigma to be away from anthers, leading to failure in pollination (**Figure 3F**).

In summary, four of the six independent lines carrying the *35S:MtCDFd1\_1* transgene showed delayed flowering and changes to architecture including shorter primary stems, leaf curling, and infertility in VLD conditions.

#### *MtCDFd1\_1* Overexpression Is Negatively Correlated With Transcript Levels of *MtFT-Like* Genes but Not *MtCOL* Genes

To investigate the basis of the late-flowering phenotypes observed in the *35S:MtCDFd1\_1* transgenic lines (**Figure 3A**), we analyzed gene expression by qRT-PCR (**Figure 4**). The genes assayed were *MtCDFd1\_1* and the three LD-induced *MtFT* genes, which are expressed at higher levels in VLD than in VSD: *MtFTa1*, *MtFTb1*, and *MtFTb2* (Laurie et al., 2011). *MtFTa1* has been shown to accelerate flowering when overexpressed in *Medicago*, while loss-of-function mutants show late flowering compared to wild type, particularly in VLD conditions (Laurie et al., 2011; Jaudal et al., 2019).

In VLD, *35S:MtCDFd1\_1* transcript levels in the four late-flowering lines (4.17, 13.24, 17.34, and 2.2) were significantly higher compared to those in R108 controls (**Figure 4A**). However, *MtCDFd1\_1* expression in the fifth line, 19.30, was only very weakly elevated, while the sixth line, 1.1, was not significantly different from R108. These latter two lines, 19.30 and 1.1, also flowered at a similar time to R108 (**Figure 3A**).

The increased expression of *MtCDFd1\_1* in VLD observed in the four transgenic lines 4.17, 13.24, 17.34, and 2.2 (**Figure 4A**) correlated with significantly lower abundance of *MtFTa1*, *MtFTb1*, and *MtFTb2* transcripts (**Figures 4B**–**D**) and late flowering (**Figure 3A**) in those lines compared to wild-type R108 plants.

In contrast, qRT-PCR analysis of five *MtCOL* genes (*MtCOLa*–*MtCOLd* and *MtCOLh*; **Figure 5**) indicates that there is no consistent change to the expression of these genes in the four *MtCDFd1\_1* overexpression lines (4.17, 13.24, 17.34, and 2.2) compared to R108 and the two remaining transgenic lines that do not overexpress *MtCDFd1\_1* (19.30 and 1.1).

lines 4.17, 13.24, 17.34, 19.30, and 1.1, while R108-2 was planted alongside line 2.2. (D) Photographs of 63-day-old fully expanded trifoliate leaves from different T1 plants and R108 in VLD. Trifoliate leaves photographed, from the top and from the side (E), and flower (F) comparisons between R108 and *35S:MtCDFd1\_1* line 2.2. Photographs were taken when VLD R108 and *35S:MtCDFd1\_1* plants were 71 and 86 days old, respectively. The white arrow indicates the abnormal curled-down pistil in the *35S:MtCDFd1\_1* line compared to wild-type plants.

In LD, a subset of lines was tested for gene expression. As in VLD, overexpression of *MtCDFd1\_1* correlated with significantly reduced expression of *MtFTa1*, *MtFTb1*, and *MtFTb2* (**Figure 4**).

In VSD, no significant difference could be seen in the expression of *MtFTa1* in representative *MtCDFd1\_1* overexpressing line 4.17 relative to R108 (**Figure 4B**). This is consistent with the absence of a flowering time phenotype in this transgenic line relative to R108 in VSD. *MtFTb1* and *MtFTb2* transcript levels were barely detectable in VSD in the transgenic line or R108 (**Figures 4C**, **D**) as expected (Laurie et al., 2011). Thus, gene expression analysis in VSD was not pursued further.

FIGURE 4 | *MtCDFd1\_1* overexpression in *Medicago* reduces *MtFTa1*, *MtFTb1*, and *MtFTb2* transcript levels. (A–D) Expression of *MtCDFd1\_1*, *MtFTa1*, *MtFTb1*, and *MtFTb2* in the *35S:MtCDFd1\_1 Medicago* R108 transgenic lines in vernalized long day (VLD), long day (LD), and vernalized short day (VSD). Data were derived from fully expanded trifoliate leaves harvested on days 14 and 15 (VLD), day 46 (LD), and day 43 (VSD) at ZT4. Gene expression levels are means ± SE of three biological replicates, normalized to *PP2A*. Data are presented relative to the highest value of a gene across the three growth conditions. In VLD, R108-1 was grown at the same time as lines 4.17, 13.24, 17.34, 19.30, and 1.1, while R108-2 was grown with line 2.2. In LD, R108-1 was grown at the same time as lines 17.34 and 1.1, while R108-2 was grown with lines 4.17 and 2.2. All plants grown in VLD were T1 generation, while T2 populations were grown in LD and VSD. Asterisks indicate transgenic lines with significantly different expression from R108 [multiple pairwise comparisons adjusted for false discovery rate (FDR); α = 0.05].

#### Flowering Time and Gene Expression in *35S:MtCDFd1\_1/Mtfta1* Homozygous Lines

*MtFTa1* is a strong promoter of *Medicago* flowering, particularly in VLD conditions (Laurie et al., 2011). This suggests that the delayed flowering in the *35S:MtCDFd1\_1* plants in VLD might be due to the reduced average *MtFTa1* expression we observed. Therefore, to analyze the interaction between *35S:MtCDFd1\_1* and *MtFTa1*, two late-flowering *35S:MtCDFd1\_1* lines, 4.17 and 2.2, were crossed with the late-flowering *Mtfta1* mutant and the resulting F2 populations scored in VLD (**Figure 6A**).

*35S:MtCDFd1\_1*/*Mtfta1* homozygous F2 plants flowered ~1 month later than *35S:MtCDFd1\_1* lines homozygous for wild-type *MtFTa1*, but at a similar time to *Mtfta1* homozygous mutant plants. Thus, no additive effect was observed in *35S:MtCDFd1\_1* on the late flowering already conferred by the *Mtfta1* homozygous mutation in VLD.

As previously observed in the four late-flowering *35S:MtCDFd1\_1* transgenic plants (lines 4.17, 13.24, 17.34, and 2.2, **Figure 4**), the presence of the *35S:MtCDFd1\_1* transgene correlated with significantly lower transcript levels of *MtFTa1*, *MtFTb1*, and *MtFTb2* compared to R108 (**Figures 6B**–**E**).

We also analyzed the expression of *MtSOC1a* (**Figure 6F**), a *SOC1-like* gene which promotes flowering and primary stem growth and whose expression is partly dependent on *MtFTa1* (Fudge et al., 2018; Jaudal et al., 2018). Plants with the *35S:MtCDFd1\_1* transgene and wild type for *MtFTa1* showed a statistically significant, moderate decrease (~2.7-fold) in average *MtSOC1a* transcript levels compared to wild-type R108 plants.

### DISCUSSION

While the photoperiodic pathways in *Medicago* and pea promote flowering through LD-induced *FT* genes such as *FTa1*, in contrast to *Arabidopsis*, they appear to act in a *CO*-independent manner. To test whether *MtCDF* genes regulate *Medicago* photoperiodic flowering time, we analyzed the expression and function of members of the *MtCDF* clade with a focus on *MtCDFd1\_1*. Our work on the *MtCDF*s has revealed similarities and differences between *Medicago* and the well-characterized *Arabidopsis* system and indicates how *MtCDF*s may contribute to *Medicago* flowering time control.

*MtCDF* genes, *MtCDFd1\_1* (here) and *MtCDFc2-1* and *MtCDFb2* (Thomson et al., 2019), showed a diurnal cycle of expression, with peak transcript levels at or near dawn, which was similar to the best characterized *AtCDF*s that regulate flowering time (*AtCDF1*-*3*,*5*). We also observed that overexpression of *MtCDFd1\_1* in *Medicago* caused VLD plants to flower late, as if they had been grown in VSD, rendering the transgenic plants day neutral. These results are similar to those reported for the dominant pea mutation *late2*/*Pscdfc1* (Ridge et al., 2016) and for overexpression of *AtCDFs* in *Arabidopsis*. Thus, *MtCDF*s may normally function in wild-type plants predominantly to delay flowering in VSD.

*35S:MtCDFd1\_1* appears to regulate flowering in *Medicago via* repressing *MtFTa1*, a known strong promoter of flowering in VLD (Laurie et al., 2011), but not *via MtCOL* genes. The transcript levels of the LD-induced genes *MtFTa1*, *MtFTb1*, *MtFTb2*, and *MtSOC1a* were significantly reduced in the *35S:MtCDFd1\_1* transgenic plants, while five *MtCOL* genes were unaffected. Genetic analysis showed that *35S:MtCDFd1\_1*/*Mtfta1* plants flowered no later than the later-flowering parent, *Mtfta1*. Thus, in VLD, *35S:MtCDFd1\_1* influenced flowering in the same pathway as *MtFTa1*, and the late flowering of *35S:MtCDFd1\_1* plants in VLD likely results from reduced *MtFTa1* gene expression. The short primary stem phenotype observed is also consistent with the repressive effect of *35S:MtCDFd1\_1* on expression of *MtFTa1* and *MtSOC1a*, previously indicated to be important for stem elongation in VLD and LD conditions (Laurie et al., 2011; Jaudal et al., 2018).

*35S:MtCDFd1\_1* transgene and wild type for *MtFTa1*) was obtained. It flowered at 26 days and eight nodes, similar to wild-type R108. (B–F) Relative gene expression in 35S:MtCDFd1\_1 F2 plants, with or without the Mtfta1 mutation of MtCDFd1\_1, MtFTa1, MtFTb1, MtFTb2 and MtSOC1a in vernalized long day (VLD). Data were derived from fully expanded trifoliate leaves harvested on day 23 at ZT4. Three biological samples each consisting of leaves from three plants were harvested per genotype. Gene expression levels were means of the three biological replicates ± SE, normalized to *PP2A*. Data were presented relative to the highest value of that specific gene. Asterisks indicate genotypes with significantly different expression from R108 [multiple pairwise comparisons adjusted for false discovery rate (FDR); α = 0.05].
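The Figure 6 caption reports expression as the mean of three biological replicates ± SE, presented relative to the highest value for each gene. A minimal sketch of that arithmetic is given here. The function names and the replicate values are hypothetical and are not taken from the study; the values are assumed to be already normalized to the reference gene.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_se(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    return mean(values), stdev(values) / sqrt(len(values))


def relative_to_highest(means_by_genotype):
    """Rescale genotype means so that the highest mean equals 1.0."""
    top = max(means_by_genotype.values())
    return {genotype: m / top for genotype, m in means_by_genotype.items()}


# Hypothetical reference-normalized expression values, three replicates each.
replicates = {
    "R108": [0.90, 1.00, 1.10],
    "35S:MtCDFd1_1": [0.20, 0.25, 0.30],
}

summary = {g: mean_and_se(v) for g, v in replicates.items()}
relative = relative_to_highest({g: m for g, (m, _) in summary.items()})
```

Scaling to the highest mean places every gene on a 0 to 1 axis, which makes panels comparable but hides differences in absolute transcript abundance between genes.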

What might be the role of the other two *MtFT-like* genes, *MtFTb1* and *MtFTb2*, whose expression is also strongly reduced by *35S:MtCDFd1\_1*? The *35S:MtCDFd1\_1*/*Mtfta1* plants show no additional delay to flowering time, beyond that conferred by the *Mtfta1* mutation in VLD conditions, and as previously reported (Laurie et al., 2011), *MtFTb1* and *MtFTb2* expression is not affected by the single *Mtfta1* mutation. Overall, these results indicate that neither *MtFTb1* nor *MtFTb2* has non-redundant roles in *Medicago* flowering time in VLD. It is possible they may affect flowering *via* regulating *MtFTa1*, but testing this awaits the identification of single and double *MtFTb1*/*2* mutant plants.

While no *MtCDF Tnt1* insertion mutant plants had a flowering time phenotype, this is overall consistent with *Arabidopsis CDF* single mutants and is likely due to redundancy in function between the genes (Fornara et al., 2009). On the other hand, five genes (*MtCDFd1\_1*, *MtCDFa2*, *MtCDFc1*, *MtCDFd1\_3*, and *MtCDFe*), out of the 11 tested, stood out for their ability to cause late flowering when overexpressed in *Arabidopsis*. It is possible that sequence variation within key MtCDF functional domains, or their absence, could affect the other MtCDFs' ability to interact with potential binding partners or target genes and regulate flowering time. For example, differential susceptibility to the *Arabidopsis* FKF1/GI protein degradation system may affect MtCDFs' ability to repress flowering and could help explain some of the variation in flowering times observed between the different genes (Kloosterman et al., 2013; Ridge et al., 2016). On the other hand, it is possible that the inability of the other *MtCDF* genes tested to affect *Arabidopsis* flowering time was due to the differences in transgene expression levels.

In our case, *35S:MtCDFd1\_1* strongly represses flowering, and its predicted protein lacks the predicted GI- and FKF1-binding domains. This provides some indication that MtCDFd1\_1 protein may not be targeted for degradation by the endogenous FKF1/GI system in *Arabidopsis* or *Medicago*, suggesting an alternative method of regulation of its activity in LD from the AtCDF system. On the other hand, MtCDFc1 and its predicted pea ortholog PsCDFc1 (Ridge et al., 2016) do interact with AtFKF1 in yeast two-hybrid assays but have different effects on flowering in *Arabidopsis*. In our experiments, *35S:MtCDFc1* strongly delayed *Arabidopsis* flowering, while *35S:PsCDFc1* was reported not to (Ridge et al., 2016). This indicates that other differences in sequence may be important, or perhaps differences in cultivation or levels of expression in the transgenic plants may be responsible.

Apart from a delayed transition to flowering, other phenotypes were seen in multiple *35S:MtCDF* transgenic *Arabidopsis* plants, implicating *MtCDF*s in a variety of plant processes that extend beyond involvement in photoperiodic regulation (Corrales et al., 2014, 2017). In plants such as *Arabidopsis* and tomato, *CDF* genes also modulate other processes such as abiotic stress tolerance (Corrales et al., 2014, 2017). In addition, a different photoperiodic process, namely, SD-induced tuber development, is regulated by *StCDF1* in *Solanum tuberosum* L. (potato; Kloosterman et al., 2013). The abnormal phenotypes we observed included an upright rosette leaf stature with rigid, long-handled, spoon-shaped curved leaves, which may indicate effects on hormone homeostasis (e.g., Sun et al., 2010). Interestingly, in addition to late flowering in some independent *35S:MtCDFd1\_3* lines, the older leaves of some of the plants developed spotty lesions, perhaps indicative of effects on senescence or cell death and/or disease resistance processes (Lorrain et al., 2003).

Overall, our results expand the understanding of the features and functions of members of the MtCDF clade. *MtCDF* genes are implicated as regulators of the *Medicago* photoperiodic pathway, where they are likely to have overlapping functions in wild-type plants probably by repressing flowering in VSD conditions. In terms of mechanism, the absence of an effect of overexpression of *MtCDFd1\_1* in transgenic lines (4.17, 13.24, 17.34, and 2.2) on the expression of five *MtCOL* genes (**Figure 5**), but strong repression of LD-induced *MtFT* genes compared to R108 (**Figure 4**), adds further support to the idea that *MtCDF*s may function in a photoperiod pathway that is independent of *CO*. This is consistent with work in pea (Ridge et al., 2016). Future work to determine the function of *MtCDF*s and to overcome the challenges of functional redundancy will focus on generating plants carrying mutations in multiple *MtCDF* genes using the CRISPR–Cas9 system in *Medicago* (Meng et al., 2017; Curtin et al., 2018). In addition, since there is direct regulation of *FT* by AtCDF1 (Song et al., 2012), direct interactions of MtCDFs with the LD-induced *MtFT-like* genes could be tested to examine if this is conserved in legumes.

# DATA AVAILABILITY

All datasets generated for this study are included in the manuscript and the supplementary files.

# AUTHOR CONTRIBUTIONS

LZ, AJ, MK-P, GT, TK, CP, and MJ performed the experiments. LZ, AJ, GT, and JP prepared the figures. JW and KM provided the *Medicago Tnt1* insertion lines. JP conceived the project and wrote the article with contributions from AJ, LZ, MJ, and GT.

# FUNDING

The research was funded by the New Zealand Foundation for Research Science and Technology (www.msi.govt.nz/) contract number C10X0816 MeriNET and the New Zealand Marsden Fund (www.royalsociety.org.nz/programmes/funds/marsden/) contract 14-UOA-125. The development of the *Medicago Tnt1* insertion lines and reverse genetics screenings were supported by the National Science Foundation, USA (DBI 0703285 and IOS-1127155), and Noble Research Institute, LLC.

# ACKNOWLEDGMENTS

We thank Davide Zazarro and Nathan Deed for glasshouse support.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01148/full#supplementary-material

# REFERENCES


Geneious 11.1.5 from Biomatters. Available at: https://www.geneious.com.


Hill, J. R. (2000). Jester—application no: 98/201. *Plant Varieties J.* 13, 40.


flowering time in *Medicago*," in *The model legume Medicago truncatula*. Ed. F. de Bruijn (Chichester, UK: Wiley-Blackwell).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Zhang, Jiang, Thomson, Kerr-Phillips, Phan, Krueger, Jaudal, Wen, Mysore and Putterill. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Exploring the Genetic Cipher of Chickpea (Cicer arietinum L.) Through Identification and Multi-environment Validation of Resistant Sources Against Fusarium Wilt (Fusarium oxysporum f. sp. ciceris)

#### Edited by:

Jose C. Jimenez-Lopez, Consejo Superior de Investigaciones Científicas (CSIC) Granada, Spain

#### Reviewed by:

Teresa Millan, Universidad de Córdoba, Spain Agnieszka Barbara Najda, University of Life Sciences of Lublin, Poland Cengiz Toker, Akdeniz University, Turkey

#### \*Correspondence:

Mamta Sharma mamta.sharma@cgiar.org

#### Specialty section:

This article was submitted to Crop Biology and Sustainability, a section of the journal Frontiers in Sustainable Food Systems

> Received: 22 May 2019 Accepted: 06 September 2019 Published: 20 September 2019

#### Citation:

Sharma M, Ghosh R, Tarafdar A, Rathore A, Chobe DR, Kumar AV, Gaur PM, Samineni S, Gupta O, Singh NP, Saxena DR, Saifulla M, Pithia MS, Ghante PH, Mahalinga DM, Upadhyay JB and Harer PN (2019) Exploring the Genetic Cipher of Chickpea (Cicer arietinum L.) Through Identification and Multi-environment Validation of Resistant Sources Against Fusarium Wilt (Fusarium oxysporum f. sp. ciceris). Front. Sustain. Food Syst. 3:78. doi: 10.3389/fsufs.2019.00078

Mamta Sharma<sup>1</sup>\*, Raju Ghosh<sup>1</sup>, Avijit Tarafdar<sup>1</sup>, Abhishek Rathore<sup>1</sup>, Devashish R. Chobe<sup>1</sup>, Anil V. Kumar<sup>1</sup>, Pooran M. Gaur<sup>1</sup>, Srinivasan Samineni<sup>1</sup>, Om Gupta<sup>2</sup>, Narendra Pratap Singh<sup>3</sup>, D. R. Saxena<sup>4</sup>, M. Saifulla<sup>5</sup>, M. S. Pithia<sup>6</sup>, P. H. Ghante<sup>7</sup>, Deyanand M. Mahalinga<sup>8</sup>, J. B. Upadhyay<sup>9</sup> and P. N. Harer<sup>10</sup>

<sup>1</sup> Legumes Pathology, Integrated Crop Management, International Crops Research Institute for the Semi-arid Tropics, Patancheru, India, <sup>2</sup> Plant Pathology Division, Jawaharlal Nehru Krishi Vishwa Vidyalaya, Jabalpur, India, <sup>3</sup> Crop Protection Division, Indian Institute of Pulse Research, Kanpur, India, <sup>4</sup> Department of Plant Pathology, RAK College of Agriculture, Sehore, India, <sup>5</sup> Department of Plant Pathology, University of Agricultural Sciences, GKVK, Bangalore, India, <sup>6</sup> Pulse Research Station, Gujarat Agriculture University, Junagadh, India, <sup>7</sup> Agriculture Research Station, Badnapur, India, <sup>8</sup> Agriculture Research Station, Gulberga, India, <sup>9</sup> Department of Plant Pathology, Tirhut Collage of Agriculture, Samastipur, India, <sup>10</sup> Pulse Improvement Project, Mahatma Phule Krishi Vidyapeeth, Rahuri, India

Fusarium wilt of chickpea, caused by Fusarium oxysporum f. sp. ciceris, is the major limitation to chickpea production worldwide. As the pathogen is soil borne, exploitation of host plant resistance is the most suitable and economical way to manage this disease. The present study was therefore conducted to find new, stable, and durable sources of resistance in chickpea against Fusarium wilt through multi-environment and multi-year screening. During the 2007/2008 crop season, 130 promising genotypes having <10% wilt incidence were selected from an initial evaluation of 893 chickpea genotypes in a wilt sick plot at ICRISAT, Patancheru. Of these, 61 highly resistant lines were selected through further evaluation in the 2008/2009 and 2009/2010 crop seasons. Finally, a set of 31 genotypes was selected to constitute a Chickpea Wilt Nursery (CWN) and tested at 10 locations in India for three cropping seasons (2010/2011, 2011/2012, and 2012/2013), coordinated through the Indian Council of Agricultural Research (ICAR) and ICRISAT collaboration. The genotype and genotype × environment interaction (GGE) analysis indicated significant variation (p ≤ 0.001) due to genotype × environment (G × E) interaction. Most of the genotypes were resistant at two locations, ICRISAT (Patancheru) and Badnapur. In contrast, most of them were susceptible at Dholi and Kanpur, indicating variability in the pathogen. GGE biplot analyses allowed the selection of six genotypes (ICCVs 98505, 07105, 07111, 07305, 08113, and 93706) with high resistance and stability across most of the locations, and eight moderately resistant (<20% mean incidence) genotypes, viz., ICCVs 08123, 08125, 96858, 07118, 08124, 04514, 08323, and 08117. As chickpea is grown in diverse agro-ecological zones and environments, these stable and durable sources can be used in future resistance breeding programs to develop Fusarium wilt resistant cultivars.

Keywords: chickpea, GGE biplot, Fusarium wilt, multi-year, multi-environment

#### INTRODUCTION

Among the legumes, chickpea (Cicer arietinum L.) occupies a foremost place due to its high nutritional value. However, the average global productivity of chickpea is hampered by various biotic stresses (Reddy et al., 1990; Tarafdar et al., 2017, 2018). Among them, Fusarium wilt caused by Fusarium oxysporum f. sp. ciceris (FOC) is one of the most widely distributed diseases of chickpea and causes yield losses of 10–100% depending on varietal susceptibility and climatic conditions (Jimenez-Diaz et al., 1989; Patil et al., 2015). The disease is more predominant in the Indian subcontinent, Spain, Ethiopia, Mexico, Tunisia, Turkey, and the United States (Westerlund et al., 1974; Halila and Strange, 1996; Ghosh et al., 2013). Since the disease is soil borne, chemical control is neither effective nor practical to implement (Sharma et al., 2017). Exploitation of host plant resistance is therefore the most reliable way to manage the disease. Numerous sources of resistance to Fusarium wilt in chickpea have been identified previously (Pande et al., 2006; Mirzapour et al., 2014; Chobe et al., 2016), and several are being utilized in resistance breeding programs at ICRISAT and National Agricultural Research Stations (NARS), which has contributed to a substantial increase in chickpea productivity in semi-arid regions of Africa and Asia (Sharma et al., 2012; Fikre et al., 2018). However, the durability of resistance in these sources is affected by G × E interaction and high genetic variability in the pathogen.

Eight races of the pathogen have been reported worldwide (Haware and Nene, 1982; Sharma et al., 2012). Races 0, 1A, 1B/C, 5, and 6 have been reported from the United States and Spain, and races 1A, 2, 3, and 4 from India. Although a gene-for-gene relationship has been demonstrated for a few FOC Avr genes and chickpea R genes, the interaction between chickpea and FOC at the molecular level is yet to be understood (Sharma et al., 2016b). The present distribution of FOC races is not very clear owing to the large exchange of germplasm, climate variability (Sharma et al., 2014), and the existence of multiple races in one region. Therefore, to develop effective strategies for wilt management through host plant resistance, it is essential to obtain information on the resistance stability of genotypes across multiple environments.

Several methods have been used to analyze G × E interaction (Moore et al., 2019), and a number of multivariate techniques such as the GGE biplot technique have been used by various researchers (Yan et al., 2000; Yan, 2001; Sharma et al., 2013, 2015). Biplot analysis of G × E data has advanced such that many important questions, such as stability of genotypes, mean performance, discriminating ability, mega-environment investigation, representativeness of environments, and the who-resistant-where pattern, can be addressed graphically for better understanding.

In the above context, the present work was undertaken to identify stable, durable, and broad based sources of resistance to wilt under ICAR-ICRISAT collaboration in chickpea through multi-environment and multi-year field testing across the wilt hot spot locations in India.

#### MATERIALS AND METHODS

#### Plant Materials

During the 2007/2008 cropping season, a total of 893 chickpea genotypes including breeding lines and germplasm accessions were screened for Fusarium wilt resistance in a sick plot at ICRISAT, Patancheru under artificial epiphytotic conditions (**Figure 1**). Of these, 130 promising lines (genotypes) having ≤10% wilt incidence were selected for further evaluation. The selected genotypes were evaluated in the next two consecutive years, 2008/2009 and 2009/2010, in a randomized complete block design (RCBD) with two replications. Each genotype was sown in 4 m long rows (2 rows/replication) with 60 cm row-to-row distance. Susceptible check ICC 4951 (JG 62) and resistant check ICC 11322 (WR 315) were sown after every eight rows. Based on disease reaction in 2008/2009 and 2009/2010, a total of 61 highly wilt resistant genotypes were selected. Finally, a set of 31 genotypes (18 desi and 13 kabuli) including susceptible and resistant checks was selected based on consistent resistant reaction to constitute a Chickpea Wilt Nursery (CWN) for multi-environment and multi-year evaluation. The details of days to 50% flowering, pod maturity duration, and pedigree of the selected lines are described in **Table 1**.

FIGURE 1 | Screening of chickpea genotypes for Fusarium wilt in wilt sick plot.


\*Check genotypes; –, Data not available.

#### Multi-environment Screening

The CWN was evaluated for Fusarium wilt resistance at 10 different locations in India [Dholi, Bangalore, ICRISAT (Patancheru), Rahuri, Sehore, Gulberga, Kanpur, Junagadh, Jabalpur, and Badnapur] for 3 years (2010/11, 2011/12, and 2012/13) in wilt sick plots. These locations encompass a wide diversity of agro-climatic zones, with latitude from N 17.3297° at Gulberga to N 26.4499° at Kanpur, longitude from E 70.4579° at Junagadh to E 85.5895° at Dholi, and altitude from 52.2 m (Dholi) to 920 m (Bangalore). The tested locations represent 27 environments and three soil types during three cropping seasons (**Table 2** and **Figure 2**). Seeds of the test genotypes were supplied to the respective collaborators for multi-location testing against wilt. At each location, the nursery was evaluated in an RCBD with two replications as described above.

#### Data Collection and Statistical Analysis

Wilt incidence data were recorded for each replication across all locations during the years of evaluation. Percent disease incidence in each test genotype was calculated using the following formula:

$$\% \text{ disease incidence} = \frac{\text{No. of infected plants}}{\text{Total no. of plants}} \times 100 \qquad (1)$$

Based on mortality of plants to Fusarium wilt, the test genotypes were divided into four categories, resistant (<10.0% plant mortality), moderately resistant (10.1–20.0% plant mortality), susceptible (20.1–40.0% plant mortality), and highly susceptible (>40.0% plant mortality).
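Equation (1) and the four reaction categories can be sketched as follows. This is a minimal illustration and not the authors' software: the function names are hypothetical, and because the published class limits leave small gaps (for example between <10.0% and 10.1%), the sketch assumes that each class is closed at its upper bound.

```python
def percent_disease_incidence(infected_plants, total_plants):
    """Equation (1): percentage of plants showing wilt."""
    if total_plants <= 0:
        raise ValueError("total_plants must be positive")
    return infected_plants / total_plants * 100.0


def wilt_category(incidence):
    """Assign a percent incidence to one of the four reaction classes.

    Assumption: values are classified by upper bound (<=10, <=20, <=40),
    which closes the small gaps between the published class limits.
    """
    if incidence <= 10.0:
        return "resistant"
    if incidence <= 20.0:
        return "moderately resistant"
    if incidence <= 40.0:
        return "susceptible"
    return "highly susceptible"


# Hypothetical plot: 10 wilted plants out of 40.
incidence = percent_disease_incidence(10, 40)
category = wilt_category(incidence)
```

Applied to mean incidences quoted later in the text, this rule places 8.47% in the resistant class, 14.88% in the moderately resistant class, and 42.03% in the highly susceptible class.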

TABLE 2 | Details of test environments.

† Environment is denoted as first three letters of each location followed by year of screening.

\*SZ, South Zone; NEPZ, North Eastern Plain Zone; CZ, Central Zone.

To normalize the residuals, the percentage data were arcsine transformed prior to analysis (Gomez and Gomez, 1984), and the transformed data were used to test the significance of genotype (G), environment (E), and genotype × environment (G × E) interaction using the MIXED procedure of SAS 9.4 (SAS Institute Inc, 2017), considering genotype, environment, and replication as random effects. Individual environment variances were modeled in the combined analysis. BLUPs (Best Linear Unbiased Predictors) were estimated for G, E, and G × E interaction from the combined analysis. Multiple comparisons were performed among test locations.
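The arcsine transformation of percentage data mentioned above can be sketched as follows. The text does not state the exact form used, so this sketch assumes the common angular form, arcsin of the square root of the proportion, expressed in degrees. The function names are hypothetical.

```python
import math


def arcsine_transform(percent):
    """Angular transformation: arcsin(sqrt(p / 100)), returned in degrees."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percent must lie in [0, 100]")
    return math.degrees(math.asin(math.sqrt(percent / 100.0)))


def back_transform(angle_degrees):
    """Invert the angular transformation to recover a percentage."""
    return math.sin(math.radians(angle_degrees)) ** 2 * 100.0


# Percentages near 0 or 100 are stretched relative to mid-range values,
# which helps stabilize the variance of binomial-type data.
transformed = [arcsine_transform(p) for p in (0.0, 25.0, 50.0, 100.0)]
```

Means estimated on the transformed scale are usually back-transformed before being reported as percentages.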

To identify relationships between environments, Spearman's rank correlation was computed using the SAS CORR procedure (SAS Institute Inc, 2017). To detect crossover interactions, the performance of all possible pairs of genotypes in two environments was compared to determine whether the difference in performance was significantly "<0" in one environment and significantly ">0" in the other (Yang et al., 2009; Ponnuswamy et al., 2018).
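Spearman's rank correlation between two environments is the Pearson correlation of the genotype ranks in each environment. The study used SAS for this step; the sketch below is an independent pure-Python illustration with hypothetical function names and hypothetical incidence values, with tied values given the mean of their ranks.

```python
import math


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def spearman(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical wilt incidence (%) of five genotypes in two environments.
env_a = [5.0, 12.0, 30.0, 48.0, 85.0]
env_b = [10.0, 8.0, 52.0, 45.0, 90.0]
rho = spearman(env_a, env_b)
```

A coefficient near 1 means that the two environments rank the genotypes in nearly the same order; a coefficient near 0 or below means that the rankings change, which is one sign of crossover interaction.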

The GGE biplot based on the site regression model (Yan and Kang, 2002) was used to visualize the G × E interaction patterns and to (1) identify genotype performance and stable genotypes across all environments, (2) assess environment effects and the ability of environments to discriminate among genotypes, and (3) identify patterns whereby specific genotypes can be recommended for specific environment(s). Further, to understand the disease incidence and its distribution pattern among the test genotypes across the locations, boxplot analysis of environment × incidence and genotype × incidence was carried out (Wiik and Rosenqvist, 2010).
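In a site regression (GGE) analysis, the genotype × environment table is centered on each environment mean, so that only genotype and G × E effects remain, and the centered table is then decomposed into principal components. The sketch below illustrates this under stated assumptions: it is not the authors' software, the function names and the incidence table are hypothetical, and power iteration with deflation is used in place of a library singular value decomposition to keep the example free of dependencies.

```python
import math


def center_by_environment(table):
    """Subtract each environment (column) mean, leaving G + GE effects."""
    n_gen, n_env = len(table), len(table[0])
    means = [sum(row[j] for row in table) / n_gen for j in range(n_env)]
    return [[row[j] - means[j] for j in range(n_env)] for row in table]


def leading_eigenpair(matrix, iterations=500):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    n = len(matrix)
    vec = [1.0 / math.sqrt(n)] * n  # assumed not orthogonal to the answer
    value = 0.0
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(v * v for v in nxt))
        if norm == 0.0:
            return 0.0, vec
        vec = [v / norm for v in nxt]
        value = norm
    return value, vec


def gge_principal_components(table, n_components=2):
    """Return (share of variation, environment vector, genotype scores) per PC."""
    y = center_by_environment(table)
    n_gen, n_env = len(y), len(y[0])
    # Cross-product matrix between environments (n_env x n_env).
    cross = [[sum(y[g][a] * y[g][b] for g in range(n_gen))
              for b in range(n_env)] for a in range(n_env)]
    total = sum(cross[j][j] for j in range(n_env))
    components = []
    for _ in range(n_components):
        value, vec = leading_eigenpair(cross)
        scores = [sum(y[g][j] * vec[j] for j in range(n_env))
                  for g in range(n_gen)]
        components.append((value / total if total else 0.0, vec, scores))
        # Deflate so that the next pass finds the following component.
        cross = [[cross[a][b] - value * vec[a] * vec[b]
                  for b in range(n_env)] for a in range(n_env)]
    return components


# Hypothetical wilt incidence (%): rows are genotypes, columns environments.
incidence_table = [
    [5.0, 10.0, 8.0],
    [20.0, 35.0, 30.0],
    [60.0, 50.0, 90.0],
    [95.0, 100.0, 98.0],
]
shares = [pc[0] for pc in gge_principal_components(incidence_table)]
```

Plotting the genotype scores and environment vectors of the first two components against each other gives the biplot. The sign of each component is arbitrary, so which side of the plot corresponds to high incidence has to be checked against the data.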

## RESULTS

# Identification of Resistant Genotypes Through Preliminary Screening

Preliminary screening of 893 genotypes at ICRISAT, Patancheru showed a broad range of responses to wilt (resistant to susceptible). This allowed removal of susceptible materials and selection of the 61 most promising genotypes having ≤10% wilt incidence. Subsequent screening of these 61 genotypes enabled the constitution of a CWN consisting of 28 promising genotypes with <10% wilt incidence for multi-environment and multi-year screening at 10 diverse locations (**Tables 1**, **2**).

# Response of Genotypes to Wilt in Multi-environment

The wilt incidence in most of the chickpea genotypes varied across the locations and years. Among the genotypes, the variability in wilt incidence is evident from the frequency distribution of genotypes across locations over the years (**Figure 3**). For instance, most of the genotypes were resistant at ICRISAT, Patancheru (24 genotypes) followed by Badnapur (21 genotypes). In contrast, none of the genotypes were found resistant in Kanpur and Sehore (**Figure 3**). Analysis of variance for wilt incidence showed that all three sources of variation were highly significant (P < 0.0001), with a higher proportion of variation due to G × E (40.35%) followed by G (35.6%) and E (24.05%) (**Table 3**).

Adequate disease pressure was found in wilt sick plots at all the locations, as evident from the wilt incidence of susceptible check JG 62 (85.35–100%) (**Table 4**). Among the locations, the highest mean wilt incidence irrespective of genotypes (excluding checks) over 3 years was observed in Kanpur (46.42%) followed by Dholi (45.75%), whereas the lowest mean wilt incidence was observed in Badnapur (8.45%) followed by ICRISAT (Patancheru) (9.52%) (**Table 4**). Significant differences in wilt resistant/moderately resistant genotypes were observed in different locations (**Tables S1, S2**).

Performance of genotypes differed over the years and locations, as evident from the distribution pattern of disease incidence among the test genotypes across the locations (**Figure 4**). For example, three genotypes, the susceptible check JG 62 and the genotypes ICCV 93706 and ICCV 96854, were stable across the environments, whereas the performance of four genotypes, ICCV 07304, ICCV 08116, ICCV 08311, and the late wilt susceptible check ICC 5003, varied among environments. Also, the magnitude of wilt incidence varied depending on the environmental variables at a particular location. The analysis indicated that the location-wise performance of the genotypes was highly stable in Badnapur, followed by ICRISAT, over the years, whereas the performance of genotypes varied highly in Jabalpur followed by Dholi and Bangalore in the same cropping seasons (**Figure 5**). Mean wilt incidence was highest at DHO\_2012-13 (57.6%) followed by DHO\_2010-11 (54.76%) and KAN\_2011-12, whereas the lowest average wilt incidence (4.86%) was observed in ICR\_2012-13. The maximum range of average wilt incidence (2.44–100%) was recorded in JAB\_2012-13.

TABLE 3 | Combined analysis of variance for wilt incidence of 31 genotypes across the locations during 2010/2011, 2011/2012, and 2012/2013 cropping seasons.

During Fusarium wilt screening, the overall assessment through mean wilt incidence (%) classified the tested genotypes into different resistant and susceptible phenotypic groups on the basis of their disease reaction, even when they were derived from the same parents. For example, the kabuli chickpea ICCVs 07304, 07305, and 07306, derived from the multiple cross (ICCV 98502 × ICCV 98004) × ICCV 92311 (**Table 1**), showed considerably varying disease reactions from moderately resistant to highly susceptible, with mean wilt incidence ranging from 13.10 to 42.03%. Similarly, the desi chickpea ICCVs 93706 and 98505 were derived from a cross of the same parent lines, ICCC 42 × ICC 1069, but little variation was observed in their mean wilt incidence, 14.88 and 8.47%, respectively.

#### Correlation Between Environments

A significant positive correlation for the levels of wilt incidence was found between some of the test environments using Spearman's rank correlation analysis (P < 0.0001). For instance, a significant positive correlation (r = 0.60) was observed between the environments BAN\_2010-11 (3) and SEH\_2010-11 (22) with respect to wilt incidence, while in the same cropping season a negative correlation (r = −0.14) was found between the environments BAD\_2010-11 (1) and GUL\_2010-11 (9) (**Table 5**). Crossover interactions indicated diverse environments among the test locations. For instance, the high entry rank value (24.09%) between the environments JAB\_2010-11 (15) and BAD\_2010-11 (1) indicated diverse environments at these locations. Conversely, the low entry rank value (3.84%) indicated less diversity between BAD\_2010-11 (1) and GUL\_2010-11 (9) (**Table 5**).

#### Stability of Genotypes and Environments

GGE biplot analysis explained 66.11% of total variation, where PC1 (wilt incidence) and PC2 (resistance stability) accounted for 51.64 and 14.47% of the variation, respectively (**Figure 6**). The performance of the test genotypes over the years within a specific location was nearly similar, as indicated by the proximity of the environment vectors. For instance, the proximity of vectors for the environments JAB\_2010-11, JAB\_2011-12, and JAB\_2012-13 indicated that the performance of the tested genotypes against wilt at the Jabalpur location was stable. Further, the longer vectors for the environments KAN\_2010-11, KAN\_2011-12, JAB\_2011-12, and JAB\_2012-13 indicated that those environments were the most discriminating among genotypes. Conversely, the environments BAD\_2010-11, BAN\_2011-12, BAN\_2012-13, DHO\_2011-12, and DHO\_2012-13 had shorter vectors, indicating that those environments were the least discriminating among the tested genotypes.

\*Excluding three check lines ICCV 11322, ICC 4951, and ICC 5003.

An angle of nearly 90° between the environments ICR\_2010-11, ICR\_2011-12, SEH\_2010-11, and SEH\_2012-13 and the environments KAN\_2010-11 and KAN\_2011-12 indicated moderate correlations among them. However, the higher PC1 scores and lower PC2 scores of the environments JAB\_2010-11, JAB\_2011-12, JAB\_2012-13, KAN\_2010-11, and KAN\_2011-12 indicated better discriminating ability of these environments (**Figure 6**).

A six-sided polygon on the biplot indicated that the genotypes positioned at the vertices of the polygon contributed most to the interaction (highest or lowest wilt incidence). Genotypes placed to the right of the Y-axis were susceptible to wilt irrespective of environment, whereas genotypes placed to the left were resistant to wilt across the environments. Of the 31 genotypes, ICCV 07304 (9), ICCV 08319 (26), ICC 4951 (2), and ICC 5003 (3) were consistently the most susceptible to wilt, being farthest from the origin of the biplot on the right side.

However, genotype ICCV 98505 (31), located farthest from the origin on the left side, showed resistance to wilt across the locations with high stability. Further, four genotypes, namely ICCV 07105 (5), ICCV 07111 (7), ICCV 07305 (10), and ICCV 93706 (29), placed on the right side of the origin showed a moderate level of stability across the tested environments with a low level of wilt incidence (**Figure 6**). Details of the genotypes found suitable and adaptable for location-specific breeding programs are provided in **Table 6**.

FIGURE 5 | Distribution of residuals for genotypes across ten locations. Box edges represent the upper and lower quartiles with the median value shown in the middle of the box. Whiskers represented by "o" symbol. Individuals falling outside the range of whiskers shown as numbers.

TABLE 5 | Spearman's rank correlations (r) and cross over interactions (%) showing stability and comparison of the genotypes across the location.

# DISCUSSION

Deploying wilt resistant chickpea cultivars is one of the sustainable strategies adopted by breeders as a part of integrated disease management. The success of any breeding programme generally depends on the stable performance of the target traits within the genotypes. Therefore, the present study of G × E interaction for Fusarium wilt incidence may play a crucial role in enhancing chickpea productivity, as it has enabled us to identify donors with stable and broad-based resistance and to reveal existing variability in the pathogen population.

A multi-environment evaluation revealed significant differences (P < 0.0001) in G, E, and G × E interaction for wilt. The differential reaction of chickpea to Fusarium wilt across multiple environments can be attributed to variation in virulence in the pathogen population (Sharma et al., 2014). The presence of different virulence genes within the pathogen and their varied responses in different geographical locations may be responsible for the varied wilt incidence across the environments, although the differential response of the genotypes to different environmental conditions cannot be ruled out (Kulkarni and Chopra, 1982).

The high level of wilt incidence in susceptible genotypes at all locations indicated adequate disease pressure in the sick plots. Average wilt incidence during the years of evaluation was higher at Kanpur and Dholi than at the other test locations, irrespective of genotype. In contrast, average wilt incidence was lowest at ICRISAT (Patancheru) and Badnapur. Location-wise variation in wilt incidence might be attributed to differences in virulence of the pathogen, to the random distribution of the resistance gene(s) within the chickpea genotypes, or to the influence of both factors (Sharma et al., 2014). Four pathogenic races (races 1A, 2, 3, and 4) have been reported from India (Haware and Nene, 1982), where race 2 from Kanpur was found to be more virulent than race 1 from ICRISAT, Patancheru.

In our study, genotypes derived from the same parents differed in their disease reaction, e.g., kabuli chickpea ICCVs 07304, 07305, and 07306, indicating segregation of the resistance genes. However, the small variation in mean wilt incidence between the desi chickpea ICCVs 93706 and 98505, which were derived from the same parental lines, might be due to tight linkage among the multiple genes contributing to the resistance response. Previous genetic analyses reported that resistance to FOC race 1 is governed by one or two genes (Brinda and Ravikumar, 2005) or by three genes (Singh et al., 1987), whereas resistance to FOC race 2 is conferred by a single recessive gene (Sharma and Muehlbauer, 2007) and resistance to FOC is also monogenic (Sharma et al., 2004). Therefore, it is clear that segregation of genes has a major role in controlling the disease. However, when such resistance genes are absent from the genotypes, or when the genotypes are challenged by a complex mixture of the different races of FOC in a sick plot, a varied disease response of the genotype is expected.

TABLE 6 | Detail of the genotypes suitable and adaptable for specific location.

Multi-environment evaluation of genotypes assisted in the selection of stable and resistant genotypes [ICCV 98505 (31), ICCV 07105 (5), ICCV 07111 (7), ICCV 07305 (10), ICCV 08113 (14), and ICCV 93706 (29)]. These genotypes could be used as resistance donors for wilt in chickpea breeding programs at different locations. Identification of highly stable genotypes with low disease incidence is the foundation of resistance breeding programs. GGE biplot analysis has been widely used in resistance breeding programs to select genotypes that combine high stability with low disease incidence, for example for spot blotch in wheat (Sharma and Duveiller, 2007), rust in soybean (Twizeyimana et al., 2008), Ascochyta blight in faba bean (Rubiales et al., 2012), sterility mosaic disease in pigeonpea (Sharma et al., 2015), Fusarium wilt in chickpea and pigeonpea (Sharma et al., 2012, 2016a), and rust of field pea (Das et al., 2019). Further, the variance for G × E interaction was greater than the genotypic variance in this study, indicating that disease incidence was affected by both genotypes and environments. Significant positive as well as negative correlations were found between test locations; however, these were not in accordance with agro-ecological zones, again indicating the interactive effects of genotypes and environment.

Multi-environment and multi-year screening against diseases can possibly be used as a model for future selection of genotypes, and the identified genotypes could be the prime source in resistance breeding programs for specific adaptation to a particular agro-climatic zone. The varying response of the genotypes across environments reflected the influence of environment on the instability of wilt incidence. In this study, we identified broad-based and stable resistant chickpea genotypes for future resistance breeding programmes. Badnapur, ICRISAT, and Rahuri were found to be ideal test locations for selecting superior and stable wilt-resistant chickpea genotypes. For optimum resource utilization, unnecessary testing locations should be eliminated.

#### DATA AVAILABILITY STATEMENT

All datasets generated for this study are included in the manuscript/**Supplementary Files**.

#### AUTHOR CONTRIBUTIONS

MSh conceived and planned the work with significant inputs from RG. MSh and RG coordinated the experiment. AT and DC compiled the data. AR and AK analyzed the data statistically. OG, NS, DS, MSa, MP, PHG, DM, JU, and PH contributed to the multi-location trials, including data collection at their respective locations. PMG and SS provided the pedigree details and seeds for the trial. RG, AT, and DC drafted the manuscript. MSh finally edited the manuscript. All authors read the manuscript and agree with its content.

#### FUNDING

The funding support from CGIAR Research Program on Grain Legumes and Dryland Cereals (CRP–GLDC), ICRISAT, and SPLICE-Climate Change Program under Department of Science and Technology, Govt of India [DST/CCP/CoE/142/2018 (G)] is gratefully acknowledged.

#### ACKNOWLEDGMENTS

We are thankful to the All India Coordinated Research Project (AICRP) on Chickpea for undertaking this activity under ICAR-ICRISAT collaboration. We also acknowledge the partnership with NARS pathologists of respective locations for evaluating the disease nursery over the years.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fsufs.2019.00078/full#supplementary-material



**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Sharma, Ghosh, Tarafdar, Rathore, Chobe, Kumar, Gaur, Samineni, Gupta, Singh, Saxena, Saifulla, Pithia, Ghante, Mahalinga, Upadhyay and Harer. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Expression Patterns of Key Hormones Related to Pea (*Pisum sativum* L.) Embryo Physiological Maturity Shift in Response to Accelerated Growth Conditions

#### *Federico M. Ribalta1\*†, Maria Pazos-Navarro1†, Kylie Edwards1, John J. Ross2, Janine S. Croser1 and Sergio J. Ochatt3*

#### *Edited by:*

*Stephan Pollmann, National Institute of Agricultural and Food Research and Technology, Spain*

#### *Reviewed by:*

*Wilco Ligterink, Wageningen University & Research, Netherlands*

*Helene S. Robert, Brno University of Technology, Czechia*

*\*Correspondence:*

*Federico M. Ribalta federico.ribalta@uwa.edu.au*

*†These authors share first authorship*

#### *Specialty section:*

*This article was submitted to Plant Physiology, a section of the journal Frontiers in Plant Science*

*Received: 25 March 2019 Accepted: 23 August 2019 Published: 27 September 2019*

#### *Citation:*

*Ribalta FM, Pazos-Navarro M, Edwards K, Ross JJ, Croser JS and Ochatt SJ (2019) Expression Patterns of Key Hormones Related to Pea (Pisum sativum L.) Embryo Physiological Maturity Shift in Response to Accelerated Growth Conditions. Front. Plant Sci. 10:1154. doi: 10.3389/fpls.2019.01154*

*1 Centre for Plant Genetics and Breeding, School of Agriculture and Environment, The University of Western Australia, Crawley, WA, Australia, 2 School of Biological Sciences, University of Tasmania, Hobart, TAS, Australia, 3 Agroécologie, AgroSup Dijon, INRA, Univ. Bourgogne Franche-Comté, Dijon, France*

Protocols have been proposed for rapid generation turnover of temperate legumes under conditions optimized for day-length, temperature, and light spectra. These conditions act to compress time to flowering and seed development across genotypes. In pea, we have previously demonstrated that embryos do not efficiently germinate without exogenous hormones until physiological maturity is reached at 18 days after pollination (DAP). Sugar metabolism and moisture content have been implicated in the modulation of embryo maturity. However, the role of hormones in regulating seed development is poorly described in legumes. To address this gap, we characterized hormonal profiles (IAA, chlorinated auxin [4-Cl-IAA], GA20, GA1, and abscisic acid [ABA]) of developing seeds (10–22 DAP) from diverse pea genotypes grown under intensive conditions optimized for rapid generation turnover and compared them to profiles of equivalent samples from glasshouse conditions. Growing plants under intensive conditions altered the seed hormone content by advancing the auxin, gibberellins (GAs) and ABA profiles by 4 to 8 days, compared with the glasshouse control. Additionally, we observed a synchronization of the auxin profiles across genotypes. Under intensive conditions, auxin peaks were observed at 10 to 12 DAP and GA20 peaks at 10 to 16 DAP, indicative of the end of embryo morphogenesis and initiation of seed desiccation. GA1 was detected only in seeds harvested in the glasshouse. These results were associated with an acceleration of embryo physiological maturity by up to 4 days in the intensive environment. We propose auxin and GA profiles as reliable indicators of seed maturation. The biological relevance of these hormonal fluctuations to the attainment of physiological maturity, in particular the role of ABA and GA, was investigated through the study of precocious *in vitro* germination of seeds 12 to 22 DAP, with and without exogenous hormones. 
The extent of sensitivity of developing seeds to exogenous ABA was strongly genotype-dependent. Concentrations between 5 and 10 µM inhibited germination of seeds 18 DAP. Germination of seeds 12 DAP was enhanced 2.5- to 3-fold with the addition of 125 µM GA3. This study provides further insights into the hormonal regulation of seed development and *in vitro* precocious germination in legumes and contributes to the design of efficient and reproducible biotechnological tools for rapid genetic gain.

Keywords: abscisic acid, auxins, embryo physiological maturity, generation turnover, gibberellins, hormone regulation, legumes, precocious seed germination

#### INTRODUCTION

Recent advances in LED light technology have enabled the development of protocols for rapid generation turnover of temperate legumes under conditions optimized for day-length, temperature, and light spectra (recently reviewed by Croser et al., 2018). These conditions act to compress time to flowering and seed development across diverse genotypes, but their effect on the hormone profile of developing embryos remains unknown. In pea (*Pisum sativum* L.), we have demonstrated that embryos do not efficiently germinate until maturity is reached at *c.* 18 days after pollination (DAP; Ribalta et al., 2017). However, application of exogenous hormones under *in vitro* culture conditions can lead to germination of immature embryos 10 to 12 DAP (Gallardo et al., 2006; Ochatt, 2011). Sugar metabolism and moisture content have been implicated in the modulation of embryo physiological maturity (Obendorf and Wettlaufer, 1984; Le Deunff and Rachidian, 1988; Weber et al., 2005). At 18 DAP, pea germinates when seed moisture content is below 60% and sucrose level is less than 100 mg g−1 dry weight (DW) (Ribalta et al., 2017). While sucrose and moisture are good indicators of readiness to germinate, questions remain about the hormonal regulation of the embryo maturation process in pea, particularly the role of abscisic acid (ABA) and gibberellins (GAs). We reason that exposure to intensive conditions optimized for rapid generation turnover will alter the hormone content and relationships within the developing seed, compressing the time to physiological maturity of the embryo.

Abscisic acid and GAs are well-known key regulators of seed maturation, dormancy, and germination (Finch-Savage and Leubner-Metzger, 2006). Abscisic acid mediates plant response to environmental conditions (Weber et al., 2005; Nakashima and Yamaguchi-Shinozaki, 2013) and is involved in the inhibition of precocious germination, in reserve mobilization (Bewley, 1997; Raz et al., 2001), and the regulation of mRNA transcription for storage proteins (Mc Carty, 1995; Bewley, 1997; Verdier et al., 2008; Ochatt, 2011). Abscisic acid biosynthesis takes place in both maternal and embryo tissues during seed maturation (Weber et al., 2005). Maternal ABA, synthesized in the seed coat of *Arabidopsis* and *Nicotiana* and translocated to the embryo, promotes seed growth and prevents abortion (Frey et al., 2004). In *Medicago truncatula* Gaertn., it has been suggested that ABA regulates germination through the control of radicle emergence by inhibiting cell-wall loosening and expansion (Gimeno-Gilles et al., 2009). In addition, ABA has been implicated in the regulation of starch biosynthesis and degradation pathways of developing seeds (Seiler et al., 2011). Gibberellins are known antagonists of ABA function in seed development and act primarily to promote germination-associated processes and seedling growth (Swain et al., 1995; Lee et al., 2002). Bioactive GAs (GA1, GA4, and GA7) are involved in determining the rate of seed coat growth and sink strength during the early stages of seed development (Nadeau et al., 2011). From 8 to 12 DAP, a transition in the seed GA biosynthesis and catabolism pathways occurs to produce sufficient bioactive GA for continued seed tissue growth and development, with a shift to the production of GA20 (precursor of GA1) and minimal bioactive GA in the embryo as the seed enters into its maturation phase (Ozga et al., 2009). 
In *Arabidopsis*, optimal germination requires the induction of GA biosynthesis to counteract the negative regulation imposed by DELLA proteins (Locascio et al., 2013; Resentini et al., 2015). Auxins play a key role during the early stages of seed development in processes such as cell division and elongation, nutrient accumulation, and water uptake (Pless et al., 1984; Vernoux et al., 2011; Ochatt, 2011; Atif et al., 2013). The chlorinated auxin (4-Cl-IAA), a hormone restricted to the *Fabaceae* but not present in *Cicer* species (Lam et al., 2015), is thought to have a growth regulatory role in pea through the induction of GA biosynthesis and inhibition of ethylene action (Johnstone et al., 2005; Ozga et al., 2009). Hormone levels have been shown to fluctuate substantially according to the stage of seed development (Weber et al., 2005; Slater et al., 2013; Ochatt, 2015) and environmental conditions (Seiler et al., 2011; Yuan et al., 2011; Shu et al., 2016), although the influence of these changes on germination competence in legumes remains unclear.

In recent years, *in vitro* techniques have facilitated the study of the fundamental physiological mechanisms underlying seed development and germination (Le et al., 2010; Finkelstein, 2013; Ochatt, 2015; Gatti et al., 2016). Examples include studies of the kinetics of seed protein accumulation (Gallardo et al., 2006; Verdier et al., 2008), acquisition of stress tolerance (Elmaghrabi et al., 2018), and morphogenesis (Ochatt, 2011; Ochatt, 2013; Atif et al., 2013; Ribalta et al., 2017), as well as flowering and fruiting induced *in vitro* (Ochatt and Sangwan, 2008; Ochatt, 2011; Ribalta et al., 2014; Mobini et al., 2015). The use of plant growth regulators *in vitro* has also been explored as a means to elucidate hormonal regulation during embryo development in a number of species (Myers et al., 1990; Jimenez, 2005; Zhao et al., 2011; Abe et al., 2014), including legumes (Ozcan et al., 1993; Lakshmanan and Taji, 2000; Blöchl et al., 2005; Ochatt, 2011; Atif et al., 2013; Pazos-Navarro et al., 2017; Ochatt et al., 2018). Slater et al. (2013) studied the seed hormone profiles of developing *in vivo* seeds of four legume species in an effort to determine the optimal time for embryo rescue, although these predictions were not validated. Despite these efforts, little is known about the interactions between auxins, ABA, and GAs on the control of seed precocious *in vitro* germination in legumes.

Ribalta et al. Precocious Seed Germination Hormonal Regulation

In this research, we report hormonal profiles (IAA, 4-Cl-IAA, GA20, GA1, and ABA) of developing seeds at 10 to 22 DAP from phenologically diverse pea genotypes grown under intensive conditions optimized for rapid generation turnover and compare these profiles to those of equivalent samples from glasshouse conditions. To elucidate the biological relevance of these hormonal fluctuations to attainment of physiological maturity, in particular the GA-ABA interaction, we precociously germinated developing seeds *in vitro* with and without the use of plant growth regulators. The results from this research will provide further insights regarding hormonal regulation of seed development and *in vitro* precocious germination and thus contribute to the design of efficient and reproducible methodologies for accelerated breeding in legumes.

#### MATERIALS AND METHODS

This research was undertaken in the controlled plant growth facilities at the University of Western Australia, Perth (latitude: 31°58′49″ S; longitude: 115°49′7″ E). Pea (*P. sativum* L.) cultivars representing early (PBA Twilight), mid (PBA Pearl), and late (Kaspa) field flowering phenology were selected for this research. Plants were grown in two environments: Environment 1 (E1) optimized for rapid growth and development as per Croser et al. (2016): far-red enriched LED lighting–AP67, B series Valoya lights (Helsinki, Finland), and Environment 2 (E2) glasshouse under natural light conditions (February/March period) (**Table 1**).

Seeds were sown in 0.4 L pots filled with steam pasteurized potting mix (UWA Plant Bio Mix–Richgro Garden Products Australia Pty Ltd). Plants were watered daily and fertilized weekly with a water-soluble N:P:K fertilizer (19:8.3:15.8) with micronutrients (Poly-feed, Greenhouse Grade; Haifa Chemical Ltd.) at a rate of 0.3 g per pot. Flowers were individually tagged at anthesis (when petals extended beyond the sepals).

#### Effect of Growing Conditions on Seed Development and Its Effect on Precocious *in Vitro* Germination Ability

To study the effect of growing conditions on the rate of seed development, the fresh weight (mg seed−1) of seeds between 12 and 30 DAP produced in environments E1 and E2 was determined. For this study, the mid-flowering cultivar PBA Pearl was selected as a representative type, with a minimum of five seeds measured per developmental stage. Additionally, developing seeds around the embryo physiological maturity stage (between 14 and 22 DAP) produced in both environments were cultured *in vitro* to determine their ability for robust precocious germination as per Ribalta et al. (2017). Pods were surface-sterilized in 70% ethanol for 1 min, followed by 5 min in sodium hypochlorite (21 g/L), and three rinses in sterile deionized water. Pods were opened under sterile conditions, and 10 immature seeds, with and without integuments removed, were cultured per Petri dish containing 20 mL MS medium (Murashige and Skoog, 1962) modified by the addition of 20% sucrose and 0.6% agar (Sigma, Type M) and adjusted to pH 5.7. Germination percentage was recorded after 4 days of *in vitro* culture. Embryos were considered germinated when both radicle and shoot emergence were observed.

*\*AP67, B series Valoya.*

#### Study of Hormone Profiles of Developing Seeds of Phenologically Diverse Pea Genotypes Produced in Different Environments

The aim of this experiment was to study the effect of plant growth conditions on the hormone profile of developing seeds from the end of morphogenesis to the beginning of embryo physiological maturity. Seeds of PBA Twilight, PBA Pearl, and Kaspa were harvested every 2 days in environment E1 from 10 to 22 DAP and in environment E2 from 14 to 22 DAP. For the hormone analysis, samples were formed from a pool of at least five seeds from different pods at each developmental stage. Seed integuments were removed, and samples stored at −80°C, before being freeze-dried at 20 µbar and 22°C using a VirTis® Bench Top™ K series freeze dryer (Gardiner, NY, USA). The hormone extraction procedure was completed as per Lam et al. (2015). Quantification was performed by mass spectrometry with labeled internal standards. For auxin, details are provided by Lam et al. (2015) and Mc Adam et al. (2017) and for ABA by Mc Adam and Brodribb (2012). Gibberellins were analyzed without derivatization. For GA1, the transitions monitored for quantification were m/z 347 to 273 for endogenous GA1 and m/z 349 to 275 for the di-deuterated internal standard. For GA20, the transitions monitored were m/z 331 to 287 for endogenous and m/z 333 to 289 for the di-deuterated standard. The labeled GA internal standards were kindly provided by Prof. Lewis Mander of the Australian National University, Canberra. Hormone content levels were calculated based on DW (ng g−1 DW).
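The quantification step described above can be illustrated with a minimal Python sketch of isotope-dilution arithmetic. The transition values are those listed in the text; the function name, the peak areas, the amount of standard added, and the assumption that labeled and endogenous compounds give an equal detector response are all hypothetical and are not taken from the cited protocols.

```python
# Transitions monitored (precursor m/z, product m/z), as listed in the text.
TRANSITIONS = {
    "GA1": {"endogenous": (347, 273), "standard": (349, 275)},
    "GA20": {"endogenous": (331, 287), "standard": (333, 289)},
}


def hormone_content_ng_per_g(area_endogenous, area_standard, standard_ng, dry_weight_g):
    """Estimate hormone content (ng per g DW) by isotope dilution.

    Assumption (hypothetical): endogenous and labeled compounds respond 1:1,
    so the peak-area ratio equals the amount ratio.
    """
    if area_standard <= 0 or dry_weight_g <= 0:
        raise ValueError("standard peak area and dry weight must be positive")
    endogenous_ng = (area_endogenous / area_standard) * standard_ng
    return endogenous_ng / dry_weight_g


# Hypothetical example values, for illustration only (not data from the study).
example_content = hormone_content_ng_per_g(
    area_endogenous=2000.0, area_standard=1000.0, standard_ng=10.0, dry_weight_g=0.05
)
print(f"Example content: {example_content:.0f} ng g-1 DW")
```

Note that each di-deuterated standard is monitored at a precursor and product m/z two units above the endogenous compound, which is what allows the two signals to be separated in the same sample.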

#### Role of Hormones on Precocious *in Vitro* Germination

To study the role of endogenous ABA in preventing precocious *in vitro* germination, seeds of the three phenologically diverse genotypes grown in E1 were collected at 18 DAP (embryo physiological maturity stage). Seeds were cultured as described above but with the addition of different ABA concentrations (0, 1, 2.5, 5, and 10 µM; A4906; Sigma-Aldrich, Australia). Seed coats were removed in all samples before culture. To determine the origin of endogenous ABA, seeds of the intermediate field flowering genotype PBA Pearl were also cultured with intact, nicked, and removed integuments on modified MS medium.

To study the promoting effect of GAs on precocious *in vitro* germination, seeds of the three genotypes grown in E1 were cultured at 12 and 14 DAP as previously described on modified MS medium with the addition of different concentrations of GA3 (0, 100, 125, and 150 µM; G7645; Sigma-Aldrich).

In all experiments, germination percentage was recorded after 4 days of *in vitro* culture for the ABA treatments and after 7 days for the GA3 treatments. Embryos were considered germinated when both radicle and shoot emergence was observed.

#### Statistical Analysis

The effect of the environment on fresh weight of developing seeds was analyzed by Student *t* test (*P* ≤ 0.05). For the hormone profile analysis, data represent hormone content from a pool of at least five seeds from different plants, providing an average result of five individual plants. Data were analyzed by analysis of variance (*P* ≤ 0.05) to determine differences in hormone content between cultivars, seed developmental stages (DAP), and environments (n = 3). Two tests were run focusing on the period between the end of morphogenesis and initiation of seed dehydration in E1 (10–22 DAP) and on the period comprising the attainment of embryo physiological maturity in both environments (16–22 DAP). The environmental effect on seed hormone levels at the physiological maturity stage was analyzed by Student *t* test (*P* ≤ 0.05) by pooling hormone concentration data across genotypes, where no genotypic effect was observed (n = 3).

All *in vitro* precocious germination experiments were repeated at least three times with a minimum of 30 seeds per genotype and treatment. The experimental design was completely randomized, and the statistical analysis was performed using a χ2 test for homogeneity of the binomial distribution. A proportion test analysis was performed when significant differences between treatments were observed. Statistical tests were considered significant when *P* ≤ 0.05. All statistical analyses were performed using RStudio software.
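The χ2 test for homogeneity described above can be sketched in Python for the simplest case of two treatments. This is an illustration only: the analyses in the study were run in R, the function name is hypothetical, the germination counts are invented, and no continuity correction is applied here.

```python
import math


def chi2_homogeneity_two_groups(germinated_a, n_a, germinated_b, n_b):
    """Pearson chi-square test of homogeneity for two binomial samples (df = 1).

    Returns (chi2 statistic, p value). No continuity correction is applied.
    Both outcomes (germinated, not germinated) must occur at least once overall.
    """
    observed = [
        [germinated_a, n_a - germinated_a],
        [germinated_b, n_b - germinated_b],
    ]
    row_totals = [n_a, n_b]
    col_totals = [observed[0][0] + observed[1][0], observed[0][1] + observed[1][1]]
    total = n_a + n_b
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper-tail probability of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: 28 of 30 seeds germinated in one treatment, 12 of 30 in another.
chi2_stat, p_val = chi2_homogeneity_two_groups(28, 30, 12, 30)
print(f"chi2 = {chi2_stat:.1f}, significant at 0.05: {p_val <= 0.05}")
```

With more than two treatments the same statistic is summed over all rows and the degrees of freedom increase accordingly, which is why a follow-up proportion test is needed to locate the differing treatments.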

#### RESULTS

#### Effect of Growing Conditions on Seed Growth and Development

The kinetics of development of PBA Pearl seeds in the environment optimized for rapid growth and development (E1) and the glasshouse environment (E2) are shown in **Figure 1A**. The largest difference in seed fresh weight occurred at 28 DAP (*P* ≤ 0.001), most likely attributable to seeds in E1 entering the desiccation phase at an earlier time point, as documented previously (Ribalta et al., 2017). This variation in development was also evidenced by the differential ability for robust *in vitro* germination of seeds harvested at equivalent time points from the two environments and without the use of plant growth regulators. In E1, *in vitro* germination levels greater than 91% were achieved by culturing immature seeds from 16 DAP, while in E2 similar levels of response were achieved only 4 days later (from 20 DAP, **Figure 1B**).

FIGURE 1 | (A) Effect of growing conditions on fresh weight (mg seed−1) of developing seeds of PBA Pearl grown in environments E1 (controlled environment room) versus E2 (glasshouse). Data represent mean ± SE, n = 5. Analysis was performed by Student *t* test (*P* ≤ 0.05). (B) Precocious *in vitro* germination of PBA Pearl seeds 14 to 22 days after pollination (DAP) produced in environments E1 and E2. Seed coat was removed before culture. Results represent the percentage of germination 4 days after *in vitro* culture. Statistical analysis was performed using χ2 test for homogeneity of the binomial distribution (n = 30; *P* ≤ 0.05). Asterisks indicate significant differences between treatments.

#### Effect of Growing Conditions on the Hormone Profiles of Developing Seeds Around Embryo Physiological Maturity of Diverse Pea Genotypes

Experiments were undertaken to study the effect of plant growth conditions on the hormone profiles of developing seeds of phenologically diverse genotypes, from an approximate period between the end of embryo morphogenesis and the attainment of embryo physiological maturity stage, i.e. the period when the seed acquires the capacity for *in vitro* precocious germination. In our previous research, we demonstrated that embryo physiological maturity is achieved in pea under intensive conditions at 16 to 18 DAP (Ribalta et al., 2017). Therefore, for the hormone analysis in E1, we selected developing seeds between 10 and 22 DAP. To enable the comparison of the seed hormone profiles between environments, in E2 we selected immature seeds at equivalent developmental stages. The results from the experiment presented in the section above indicate a delay in seed development of approximately 2 to 4 days in the glasshouse environment (E2) compared to the optimized environment (E1). Based on these results, we estimate embryo physiological maturity is achieved in E2 at around 20 DAP, leading us to select immature seeds at 14 to 22 DAP for the hormone analysis.

*Auxins*. A strong environmental effect on endogenous 4-Cl-IAA and IAA content was observed when comparing the seed profiles during the period comprising the achievement of embryo physiological maturity under intensive (E1) and glasshouse (E2) conditions (16–22 DAP). This is clearly demonstrated by the statistical analysis shown in **Table S1D** (*P* < 0.001). Similar 4-Cl-IAA profile patterns were observed in E2 across genotypes with the highest concentrations, between 15,000 and 25,000 ng g DW−1, at 16 to 18 DAP. Across genotypes, the peak in 4-Cl-IAA content occurred much earlier in E1, typically at 10 to 12 DAP, so that by 16 to 18 DAP, seeds in E2 contained significantly higher hormone levels than E1 seeds (**Figures S1A**, **S2A**, and **S3A**). For example, at 16 DAP, in E2 the mean content of 4-Cl-IAA across genotypes was 17,190 ± 3,545 ng g DW−1 (n = 3), and in E1, 584 ± 218 ng g DW−1 (n = 3). This difference is significant at the *P* < 0.03 level (**Table S2**). In E1, the highest concentrations of IAA were observed at 10 DAP in the three genotypes and 4 to 8 days later in E2 (**Figures S1B**, **S2B**, and **S3B**; **Tables 2** and **S1**). Again, consistently higher concentrations of IAA were detected in seeds from E2 compared to those from E1 at 16 to 18 DAP (**Tables 2** and **S1D**). For example, at 16 DAP, the mean content of IAA across genotypes in E2 was 1,020 ± 229 ng g DW−1 (n = 3), while in E1 it was 34 ± 14 ng g DW−1 (n = 3), a difference significant at the *P* < 0.03 level (**Table S2**).
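As a rough check on the 4-Cl-IAA comparison above, the Student *t* statistic can be recomputed from the reported means and standard errors. This is a minimal sketch, assuming equal group sizes (n = 3 per environment) and pooled variance; the function name is hypothetical, and the authors' own analysis may have been run on the underlying replicate values, so it need not reproduce their exact *P* value.

```python
import math

# Two-sided 5% critical value of Student's t distribution with 4 degrees of freedom.
T_CRIT_DF4 = 2.776


def t_from_summary(mean_a, se_a, mean_b, se_b, n):
    """Student t statistic for two groups of equal size n, from means and standard errors.

    With equal n, the pooled standard error of the difference is sqrt(se_a^2 + se_b^2).
    Returns (t statistic, degrees of freedom).
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return (mean_a - mean_b) / se_diff, 2 * (n - 1)


# 4-Cl-IAA at 16 DAP (ng per g DW): E2 = 17,190 +/- 3,545; E1 = 584 +/- 218; n = 3 each.
t_stat, df = t_from_summary(17190.0, 3545.0, 584.0, 218.0, n=3)
print(f"t = {t_stat:.2f}, df = {df}, exceeds 5% critical value: {abs(t_stat) > T_CRIT_DF4}")
```

Under these assumptions the statistic lies well above the critical value, which is consistent with the significant difference reported in the text.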

*Gibberellins.* Clear differences were observed in the GA20 profiles between environments. At the time points comprising the attainment of embryo physiological maturity in both environments (16–22 DAP), seeds in E2 contained significantly higher levels of GA20 than those in E1 (**Figures S1C**, **S2C**, and **S3C**; **Table S1D**). For example, at 16 DAP, seeds in E2 contained 7,469 ± 1,254 ng g DW−1 (n = 3) of GA20, while in E1 the level was 1,927 ± 223 ng g DW−1 (n = 3), a difference significant at the *P* < 0.03 level (**Table S2**). Furthermore, the well-defined peak in GA20 level observed in E2 was less apparent in E1. Also, when comparing GA20 concentrations between environments at the point of complete attainment of embryo physiological maturity in E1 (18 DAP), levels up to 20-fold higher were detected in E2 compared with E1. GA1 was only detected in E2 during the period studied (10–22 DAP; **Table 2**; **Figure S4**).

*Abscisic acid.* The statistical analysis in **Table S1D** showed a clear effect of controlled environment growth conditions on ABA concentrations between 16 and 22 DAP, with consistently lower levels detected in seeds in E2 compared to those grown in E1 (*P* < 0.01; **Figures S1D**, **S2D**, and **S3D**; **Table 2**). For example, across the three genotypes studied, at 16 DAP, the mean level of ABA in E2 was 1,100 ± 204 ng g DW−1 (n = 3), while in E1 it was 2,805 ± 482 ng g DW−1 (n = 3), a difference significant at the *P* < 0.03 level (**Table S2**). At the time point when full embryo physiological maturity is attained in E1 (18 DAP), ABA levels were again up to threefold higher under intensive conditions compared to the glasshouse.
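The size of the environmental effect at 16 DAP can be summarized as a fold difference between the mean values quoted in the three paragraphs above. The short sketch below uses only those reported means; the dictionary layout and function name are illustrative choices, not part of the original analysis.

```python
# Mean hormone content at 16 DAP (ng per g DW) across genotypes, as reported in the text.
MEANS_16_DAP = {
    "4-Cl-IAA": {"E1": 584.0, "E2": 17190.0},
    "IAA": {"E1": 34.0, "E2": 1020.0},
    "GA20": {"E1": 1927.0, "E2": 7469.0},
    "ABA": {"E1": 2805.0, "E2": 1100.0},
}


def fold_difference(means):
    """Return (environment with the higher mean, ratio of higher to lower mean)."""
    higher = max(means, key=means.get)
    lower = min(means, key=means.get)
    return higher, means[higher] / means[lower]


fold_by_hormone = {hormone: fold_difference(m) for hormone, m in MEANS_16_DAP.items()}
for hormone, (env, ratio) in fold_by_hormone.items():
    print(f"{hormone}: {ratio:.1f}-fold higher in {env}")
```

On these figures, both auxins are roughly 30-fold higher in E2 and GA20 nearly 4-fold higher in E2, whereas ABA is about 2.5-fold higher in E1, matching the direction of the differences described in the text.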

#### Role of Hormones on Precocious *in Vitro* Germination

To determine the origin of endogenous ABA, seeds of the cultivar PBA Pearl were cultured at the embryo physiological maturity stage (18 DAP) with intact, nicked, and removed integuments. The removal of the seed coat resulted in faster germination compared to the culture of intact or nicked seeds. After 4 days of culture, 100% germination was recorded in seeds with the seed coat removed, 70% with nicked seeds, and 9.1% with intact seeds (**Table S5**). All cultured seeds, independently of the treatment, germinated within 10 days of *in vitro* culture.

Precocious *in vitro* germination of immature seeds at 12 DAP was enhanced in all three genotypes by the addition of GA3 to the culture medium. In general, culturing seeds 12 DAP in media with GA3 concentrations of 100 to 125 µM resulted in a 2.5- to 3.5-fold increase in germination percentage compared to the control (*P* < 0.05; **Figure 2A**, **Table S3**). The addition of GA3 to the culture media had no effect on precocious germination of seeds 14 DAP in PBA Twilight and Kaspa. On the other hand, for PBA Pearl, exogenous GA3 at concentrations between 100 and 150 µM greatly enhanced the germination rate of 14-DAP seeds compared to the control (*P* < 0.001; **Figure 2B**, **Table S3**). Precocious *in vitro* germination rate was not significantly enhanced by increasing the concentration of GA3 to 125 and 150 µM at either 12 or 14 DAP.

Physiologically mature seeds (18 DAP) of the three genotypes tested showed different sensitivity to the addition of ABA to the culture medium (*P* < 0.001; **Figure 3**, **Table S4**). The addition of 1 µM of ABA reduced the germination rate in PBA Twilight to levels below 10%. Significantly higher levels of exogenous ABA were required to achieve similar levels of germination blockage in cultivars Kaspa (5 µM) and PBA Pearl (10 µM).

TABLE 2 | Effect of growing conditions on hormone content (ng g DW−1) of developing seeds produced in environments E1 [10–22 days after pollination (DAP)] and E2 (14–22 DAP) for diverse pea genotypes.

*nd, not detected; —, not measured. Data represent mean hormone content from a pool of at least five seeds from different plants ± SE. Analysis of variance tests presented in* Table S1 *show differences in seed hormone content between cultivars, developmental stages, and environments during the period between the end of morphogenesis and initiation of seed dehydration (10–22 DAP), and the period comprising the attainment of embryo physiological maturity in both environments (16–22 DAP; P ≤ 0.05; n = 3).*

# DISCUSSION

Hormones are known to regulate seed development, and their effect has been extensively studied in the model species *Arabidopsis* (Finkelstein, 2013; Binenbaum et al., 2018) and to some extent in *M. truncatula* (Ochatt, 2015). Despite this, little is known about the hormonal regulation of *in vitro* precocious seed germination in legumes (Weber et al., 2005; Ochatt, 2015; Croser et al., 2018). Here, we characterized and compared the hormone profiles of developing seeds harvested from three phenologically diverse pea genotypes from the end of morphogenesis to the attainment of embryo physiological maturity (10–22 DAP) and grown under two different controlled environments. The first environment (E1) was designed to promote rapid generation turnover for single seed descent (artificial light with a 20 h photoperiod). The second environment (E2) was a glasshouse used for normal plant growth and seed production activities (natural light and a photoperiod of 13–14 h). Growing plants under E1 conditions altered the seed hormone content by advancing the auxin, GA, and ABA profiles by 4 to 8 days compared to those of seeds grown under E2 conditions. We observed a synchronization of the IAA and 4-Cl-IAA profiles in E1 across the three genotypes. This was associated with an acceleration of the time to embryo physiological maturity by up to 4 days. In addition, we confirmed the antagonistic effect between exogenous ABA and GA on *in vitro* precocious seed germination.

FIGURE 2 | Effect of the addition of exogenous GA3 to the culture media on the percentage of *in vitro* germination of immature pea seeds (A) 12 days after pollination (DAP) and (B) 14 DAP from phenologically diverse genotypes. Statistical analysis was performed using χ2 test for homogeneity of the binomial distribution and proportional test (*P* ≤ 0.05; n = 30). Different letters indicate a difference at *P* < 0.05. Statistical data are presented in Supplementary Table S3.

The manipulation of key *in vivo* growth conditions, including photoperiod, light, and temperature, has enabled the substantial shortening of time to maturity in a broad range of species (reviewed by Croser et al., 2018). Our previous research in pea demonstrated sugar and moisture content of the developing seed varies in response to environmental conditions, and the resulting composition is linked to the achievement of embryo
physiological maturity (Ribalta et al., 2017). By demonstrating that embryos did not efficiently germinate *in vitro* without exogenous hormones until physiological maturity was reached, we were able to propose sugar and moisture contents as a reliable indicator of readiness for precocious *in vitro* germination. In the present study, when comparing the kinetics of seed development between E1 and E2, the largest differences in seed fresh weight were detected at 28 DAP. This is in line with our previous report where we showed that seeds in the optimized environment reach the dehydration phase at an earlier time point (Ribalta et al., 2017). Nonsynchronous seed development across the two environments was further evidenced by the success rate of *in vitro* germination of seeds harvested at equivalent time points from the two environments and without the addition of exogenous plant growth regulators. For E1, *in vitro* germination levels greater than 91% were achieved by culturing immature seeds from 16 DAP, while for E2 similar levels of response were achieved from 20 DAP. The *in vitro* germination results informed the selection of seed developmental windows for hormone profiling. The focus of the hormone profile analyses was the study of the developmental period between the end of embryo morphogenesis and attainment of embryo physiological maturity, corresponding to the timeframe in which the seed acquires the capacity for *in vitro* precocious germination (Croser et al., 2016; Ribalta et al., 2017; Croser et al., 2018). Thus, under rapid generation turnover conditions, we undertook profile analysis on seeds between 10 and 22 DAP and under glasshouse conditions on seeds harvested 14 to 22 DAP.

Auxins are known to play a major role during the early stages of seed development. Evidence suggests an auxin-mediated promotion of GA synthesis is triggered by fertilization, driving early fruit growth (Dorcey et al., 2009). Recent evidence shows that in pea seeds auxins are also important during later stages of seed development for the determination of embryo structure and size, including starch accumulation (Locascio et al., 2014; McAdam et al., 2017). In the present research, clear differences across environments were detected at the time points comprising the period of attainment of embryo physiological maturity in both environments (16–22 DAP; *P* ≤ 0.001). Growing plants under conditions optimized for rapid generation turnover (E1) resulted in the acceleration of the 4-Cl-IAA profile by 6 to 8 days and of the IAA profile by 4 to 8 days compared to the glasshouse environment (E2). Depending on the genotype, we observed the highest 4-Cl-IAA and IAA levels at 10 to 12 DAP in E1 and at 16 to 20 DAP in E2, with a substantial lowering in concentration after that time point. Auxins have also been implicated in the onset and length of endoreduplication (Ochatt, 2015). Endoreduplication is a progressive phenomenon in storage-accumulating organs during the transition between cell division and seed maturation phases (Kowles et al., 1990). This is concomitant with an increase in DNA synthesis and the accumulation of storage proteins, and there is considerable agronomic interest in understanding the control of this phenomenon (Ochatt, 2011). In our research, the peak auxin concentrations observed in seeds produced in E1 (10–12 DAP) coincide with the peak endoreduplication observed in *M. truncatula* seeds (Ochatt, 2011; Atif et al., 2013). Our findings indicate that under E1 conditions, the end of morphogenesis and the concurrent initiation of embryo maturation and onset of endoreduplication occur between 10 and 12 DAP when the auxin peak is observed. Tivendale et al. (2012) provide further support for this association, reporting the relationship between decreasing concentrations of 4-Cl-IAA and IAA and the completion of seed development. Likewise, in E2, high auxin concentrations at later stages of seed development and for an extended period indicate that the cell division phase and endoreduplication are prolonged. Should this be the case, it is expected that the delay in seed development observed in E2 will translate into the production of seeds with a higher number of cotyledonary cells (Ochatt, 2011; Ochatt, 2015), a larger surface area (Atif et al., 2013), and probably a higher storage protein content (Gallardo et al., 2006; Verdier et al., 2013). Further studies in this area would be required to confirm these ideas.
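
The notion of a profile being "advanced" by a number of days can be made concrete by comparing the time point of peak hormone content in each environment. The sketch below assumes hypothetical 4-Cl-IAA values sampled every 2 days; the numbers and the helper names `peak_dap` and `profile_advance` are illustrative only and do not reproduce the measurements in Table 2.

```python
# Hypothetical 4-Cl-IAA contents (ng per g DW) by days after pollination (DAP).
# Values are invented for illustration; they are NOT measurements from this study.
profile_e1 = {10: 950.0, 12: 1200.0, 14: 600.0, 16: 300.0, 18: 150.0, 20: 80.0, 22: 40.0}
profile_e2 = {14: 400.0, 16: 700.0, 18: 1100.0, 20: 900.0, 22: 350.0}


def peak_dap(profile: dict[int, float]) -> int:
    """Return the sampling time (DAP) at which hormone content is highest."""
    return max(profile, key=profile.get)


def profile_advance(fast: dict[int, float], slow: dict[int, float]) -> int:
    """Days by which the peak in the `fast` environment precedes the `slow` one."""
    return peak_dap(slow) - peak_dap(fast)


advance_days = profile_advance(profile_e1, profile_e2)
print(f"E1 peak: {peak_dap(profile_e1)} DAP, E2 peak: {peak_dap(profile_e2)} DAP")
print(f"Peak advanced by {advance_days} days in E1")
```

Comparing peaks is only one possible summary; because sampling was done at discrete time points, any such estimate is limited to the resolution of the sampling interval.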

Gibberellins have been recognized as regulators in numerous aspects of plant physiology, including embryo and seed development, induction of seed germination, root development, leaf expansion, stem elongation, and flowering (Salazar-Cerezo et al., 2018). Plants metabolize GAs through the early 13-hydroxylation pathway: GA12 → GA53 → GA44 → GA19 → GA20 → GA1 (Binenbaum et al., 2018), although there is evidence that in young pea seeds GA1 is produced from GA4 (MacKenzie-Hose et al., 1998). During the early stages of seed development, several peaks in the production of the bioactive gibberellin GA1
are observed that drive rapid seed coat and embryo growth. In pea, as the seeds enter their maturation phase, a shift to the production of GA20 occurs with very low levels of bioactive GA detected in the embryo (Ozga et al., 2009). By the time seeds are dry, virtually all their GA20 has been converted to GA29 and then to GA29-catabolite (Ross et al., 1993). However, the biological significance of the later peaks of inactive GAs on the completion of seed development and subsequent germination is still not clear (Davidson et al., 2005; Ayele et al., 2006). In our experiment, seed GA levels were significantly affected by growth conditions (*P* ≤ 0.001). Using immature seeds around embryo physiological maturity, GA1 was only detected in seeds grown in E2, while its precursor, GA20, was detected in seeds from both environments. GA20 was observed at significantly lower levels (up to 20 times lower) at 18 DAP in Kaspa, and up to 8 days earlier in the profiles of seeds harvested in E1 compared to those from E2. The highest GA20 concentrations were detected between 10 and 16 DAP in E1 and 16 and 18 DAP in E2, with a sharp drop from this point of development. The reduced level of GA20 in E1, compared with E2, is one of the more dramatic effects on hormone content in this study. Mature dry pea seeds are known to contain very little GA20 (Ross et al., 1993), and the low level in E1 is consistent with the evidence from auxin levels indicating that these seeds are physiologically mature at an earlier stage than in E2. In our study, *in vitro* germination of seeds at 12 and 14 DAP was 2.5- to 3-fold more successful with the addition of 125 µM GA3 to the culture medium. At 14 DAP, a clear improvement of *in vitro* precocious germination was observed only in PBA Pearl with the addition of 100 µM GA3.
The enhancing effect of exogenous GA on precocious seed germination in pea is consistent with experiments showing promotion of α-amylase synthesis in germinating wheat seeds treated with GA3 (Hader et al., 2003; Kondhare et al., 2012).
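
The gibberellin conversions named in this paragraph can be summarized as a small directed graph. The sketch below encodes only the steps stated in the text (the early 13-hydroxylation pathway and the conversion of GA20 to GA29 and GA29-catabolite); it omits the possible production of GA1 from GA4 in young pea seeds, and the names `GA_STEPS` and `routes` are our own illustrative choices.

```python
# Conversion steps named in the text: the early 13-hydroxylation pathway
# (GA12 -> ... -> GA1) and the catabolism of GA20 in maturing seeds.
GA_STEPS = {
    "GA12": ["GA53"],
    "GA53": ["GA44"],
    "GA44": ["GA19"],
    "GA19": ["GA20"],
    "GA20": ["GA1", "GA29"],  # bioactive GA1, or inactivation via GA29
    "GA29": ["GA29-catabolite"],
}


def routes(start: str, end: str) -> list[list[str]]:
    """List every chain of conversions leading from `start` to `end`."""
    if start == end:
        return [[end]]
    found = []
    for product in GA_STEPS.get(start, []):
        for tail in routes(product, end):
            found.append([start] + tail)
    return found


to_bioactive = routes("GA12", "GA1")
to_catabolite = routes("GA12", "GA29-catabolite")
print(" -> ".join(to_bioactive[0]))
print(" -> ".join(to_catabolite[0]))
```

Laid out this way, GA20 is the branch point: it is either converted to bioactive GA1 or inactivated through GA29, which is why its level is informative about the stage of seed maturation.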

Abscisic acid accumulates during seed maturation and in some species controls seed development and germination (Nakashima and Yamaguchi-Shinozaki, 2013). In *Arabidopsis*, reduced levels of ABA affect the induction of maturation genes, leading to defective synthesis of storage proteins and anthocyanins, failed chlorophyll degradation, precocious germination, and intolerance to desiccation (Finkelstein, 2013). In developing seeds, ABA can be synthesized locally, originating from the embryo proper during seed maturation and showing peaks at the onset and at the end of the maturation phase (Frey et al., 2004; Weber et al., 2005), or imported from the mother plant through the seed coat (Quatrano et al., 1997). In the present study, as expected, ABA levels strongly fluctuated in response to the environmental conditions (*P* ≤ 0.001). In general, higher concentrations of ABA were detected in E1 compared to E2. Abscisic acid concentrations tended to decrease after reaching the highest levels at 10 to 14 DAP in E1 and 16 to 18 DAP in E2. To further understand the relevance of the ABA fluctuations observed in the hormone profiles to precocious seed germination, we applied exogenous ABA to seeds harvested at embryo physiological maturity (18 DAP) and cultured *in vitro*. Allowing the embryo to reach physiological maturity enables vigorous *in vitro* germination and faster seedling development with no requirement for plant growth regulators in the culture medium (Ribalta et al., 2017). In this experiment, ABA inhibited *in vitro* germination of physiologically mature embryos, with genotypic variations. A 5 µM ABA concentration was sufficient to completely block germination in PBA Twilight and Kaspa, while for PBA Pearl concentrations higher than 10 µM were required to achieve similar results. Additionally, to determine the origin of endogenous ABA, seeds of PBA Pearl were cultured at 18 DAP with intact, nicked, and removed integuments.
The removal of the seed coat resulted in 100% germination compared to 70% germination with nicked seeds and 9.1% germination with intact seeds. This suggests germination is slowed by mechanical impedance of the integuments rather than a hormonal barrier caused by the endogenous levels of ABA in the seed coat. This contrasts with results in *Arabidopsis* that indicate ABA produced in the seed coat affects precocious seed germination (Wang et al., 1998; Raz et al., 2001; Piskurewicz and Lopez-Molina, 2009). Also, the fact that all seeds from this experiment (intact seeds, nicked seeds, and seeds with coat removed) germinated within 10 days of *in vitro* culture highlights that seeds at 18 DAP are mature enough to complete germination.

A dynamic balance between ABA and GAs controls the progression of seed maturation to germination; therefore, there is ecological and commercial value in understanding this physiological regulation (Weber et al., 2005; Rodriguez-Gacio et al., 2009; Liu et al., 2010; Ochatt, 2015). Abscisic acid levels increase during the late phase of seed maturation and are maintained until germination, while GA concentrations remain relatively low during this period until seed imbibition (Ogawa et al., 2003; Locascio et al., 2014). The *FUS3* gene plays an essential role in the coordination of ABA:GA levels during the late stages of seed development and germination. The *FUS3* gene regulates ABA and GA synthesis, and these two hormones in turn determine the stability of the FUS3 protein (Gazzarrini et al., 2004). Gibberellin is negatively regulated by *FUS3*, while ABA is a positive regulator of many *FUS3*-regulated embryonic functions including storage reserve accumulation, desiccation tolerance, and dormancy (Keith et al., 1994; Bäumlein et al., 1994; Leung and Giraudat, 1998; Gazzarrini et al., 2004). Hence, the ABA:GA ratio is crucial for the completion of seed maturation and the initiation of germination (Liu et al., 2010; Locascio et al., 2014; Ochatt, 2015). As previously indicated, in this study we confirmed the antagonistic effect between ABA and GA on pea seed germination through the *in vitro* culture of seeds at embryo physiological maturity (18 DAP) with the addition of exogenous hormones to the media. The ratio of ABA to GA was proposed as an indicator of embryo maturation for *in vitro* culture studies in legumes (Slater et al., 2013). Seed endogenous ABA concentrations are known to increase as the seed matures; however, being a stress-response hormone, ABA levels also fluctuate during the day in response to environmental signals.
Therefore, the ratio of ABA to GA is not a reliable indicator of embryo physiological maturity when growing plants under conditions for rapid generation turnover. On the other hand, the earlier peak in auxin and GA production in E1 compared with E2 is likely to contribute to the earlier attainment of physiological maturity and earlier competence to germinate in E1, since auxin-deficient seeds do not develop normally, and their germination rate is low (McAdam et al., 2017). This suggests auxin and GA profiles can act as reliable indicators of the end of morphogenesis and the initiation of seed maturation.

Developmental and environmental signals (such as water potential, temperature, and light quality) influence endogenous hormone levels in developing seeds and the complex signaling connections between hormones and sugars, which ultimately control seed size, starch and protein accumulation, dormancy, and germination (Piskurewicz et al., 2009; Rodriguez-Gacio et al., 2009; Locascio et al., 2014). In the present study, we provide new information regarding the influence of growing conditions on the progress of seed development and maturation and on endogenous hormone accumulation across diverse genotypes of the model legume species pea. These results will provide further insights into the hormonal regulation of legume seed development and *in vitro* precocious germination and contribute to the design of efficient and reproducible biotechnological tools that support genetic gain.

#### DATA AVAILABILITY STATEMENT

The raw data supporting the conclusions of this manuscript will be made available by the authors, without undue reservation, to any qualified researcher.

# AUTHOR CONTRIBUTIONS

FR, MP-N, JC, SO, and JR conducted experimental design, data analysis/interpretation, and manuscript writing. KE and FR conducted *in vitro* experiments and data collection. MP-N and JR conducted hormone analysis.

# FUNDING

The UWA authors acknowledge funding provided by the Grains Research and Development Corporation [UWA00175].

# ACKNOWLEDGMENTS

We thank Mr. R. Creasy, Mr. B. Piasini, and Mr. L. Hodgson for glasshouse expertise. We also thank Dr. David Nichols and Assoc. Prof. Noel Davies (Central Science Laboratory, University of Tasmania) for mass spectrometry, Toby Ling for sample preparation, Prof. Lewis Mander for providing labeled GAs, and Prof. Jerry Cohen for labeled 4-Cl-IAA.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01154/full#supplementary-material

### REFERENCES


enzymes and structural proteins in *Medicago truncatula* embryo axis. *Mol. Plant* 2, 108–119. doi: 10.1093/mp/ssn092


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Ribalta, Pazos-Navarro, Edwards, Ross, Croser and Ochatt. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Sustainability Dimensions of a North American Lentil System in a Changing World

Teresa Warne<sup>1</sup>, Selena Ahmed<sup>1</sup>\*, Carmen Byker Shanks<sup>1</sup> and Perry Miller<sup>2</sup>

<sup>1</sup> Montana State University Food and Health Lab, Department of Health and Human Development, Montana State University, Bozeman, MT, United States, <sup>2</sup> Cropping Systems Laboratory, Department of Land Resources and Environmental Science, Montana State University, Bozeman, MT, United States

Food production and consumption are among the largest drivers of global change. The adoption of lentil in production systems and in plant-based diets is a food system solution that can support the environmental, socio-economic, and human health dimensions of sustainability. The purpose of this study is to evaluate producer and consumer perceptions of the sustainability profile of the lentil system in Montana (USA), and the surrounding region that includes Idaho (USA), North Dakota (USA), Washington (USA), and Canada, in the context of global change. Surveys were conducted with lentil producers (n = 63; conventional n = 42, organic n = 15, and mixed management n = 6) and consumers (n = 138) in the rural state of Montana (USA). The most prevalent agronomic reason for including lentil in production systems reported by producers is to diversify crop rotation (92%). The most prevalent economic reasons for including lentil in rotation reported by producers is to capitalize on dryland production (95%) and to serve as a cash crop (87%). With respect to lentil consumption, the most prevalent health-related perceptions were that eating lentils helps to improve nutrition (88%), feel satiated or full (85%), and support a plant-based diet (81%). Consumers and non-consumers of lentils alike reported they would increase lentil consumption based on environmental (78%), economic (75%), and health and nutrition (72%) information contrasting lentils and animal-based protein sources. Overall, findings highlight how the lentil system supports multiple dimensions of sustainability based on the perspectives of study informants. Additionally, findings elucidate barriers and opportunities for promoting lentil in agricultural systems and diets. Impacts of market, policy, and climate change on lentil production, and lack of consumer knowledge on benefits of lentils to help meet food security through a sustainable diet, challenge sustainability dimensions of lentil in the food system.

Keywords: lentil, sustainability, food security, management practices, consumption

#### Edited by:

Alfonso Clemente, Consejo Superior de Investigaciones Científicas (CSIC) Granada, Spain

#### Reviewed by:

Juana Frias, Instituto de Ciencia y Tecnología de Alimentos y Nutrición (ICTAN), Spain; Tahira Fatima, Purdue University, United States

> \*Correspondence: Selena Ahmed selena.ahmed@montana.edu

#### Specialty section:

This article was submitted to Crop Biology and Sustainability, a section of the journal Frontiers in Sustainable Food Systems

> Received: 01 June 2019 Accepted: 20 September 2019 Published: 11 October 2019

#### Citation:

Warne T, Ahmed S, Byker Shanks C and Miller P (2019) Sustainability Dimensions of a North American Lentil System in a Changing World. Front. Sustain. Food Syst. 3:88. doi: 10.3389/fsufs.2019.00088

# INTRODUCTION

One of the greatest societal challenges of our times is to feed a growing population a healthy and nutritious diet in an environmentally, economically, and socially sustainable way (Tilman and Clark, 2014; Willett et al., 2019). Food production and consumption are among the largest drivers of environmental degradation (Meybeck and Gitz, 2017) and global change (Willett et al., 2019), with dietary patterns impacting numerous facets of society (Mason and Lang, 2017; Meybeck and Gitz, 2017). The food system is further challenged by population growth, food insecurity, and food justice (Popkin et al., 2012; Tilman and Clark, 2014). The expected rise in global population from ∼7.5 billion people to 9.7 billion people by 2050 will place increased pressure on ecosystems and society to ensure food security for all (Zhang et al., 2007; United Nations Department of Economic and Social Affairs (UN DESA), 2019). These food system challenges are exacerbated by climate change with notable implications for sustainability (Mason and Lang, 2017; Willett et al., 2019).

In response to these challenges, a sustainable food systems approach is increasingly recognized and promoted to support environmental and human wellbeing (Johnston et al., 2014; Herforth et al., 2017; Mason and Lang, 2017; Ahmed and Byker Shanks, 2019; Willett et al., 2019). A sustainable food systems approach seeks to enhance the environmental, socio-economic, and health aspects of sustainability from food production to consumption to waste, including the processing, distribution, preparation, marketing, and accessing of food (Herforth et al., 2017; Mason and Lang, 2017; Ahmed and Byker Shanks, 2019). For example, on the production side of food systems, producers can adopt agricultural practices including diversified crop rotations, cover crops, no-till, crop diversification, nutrient management, integrated pest management, and rotational grazing (Horrigan et al., 2002) to support ecosystem services including carbon sequestration, nutrient cycling, soil retention, increased water holding capacity, and soil fertility (Power, 2010). More specifically, including lentil in production diversifies crop rotation, provides nitrogen fixation, and helps break pest and disease cycles; lentil is also a dryland crop suitable to arid regions (Peoples et al., 2015). On the consumption side of food systems, consumers can change their dietary choices, including by adopting plant-based diets rich in pulse crops (Gonzalez Fischer and Garnett, 2016; Herforth et al., 2017), and reduce food waste (Ahmed et al., 2018). More specifically, lentils are a pulse and a relatively affordable, high-quality source of plant-based protein (∼24–26%), carbohydrate (∼60–64%), and dietary fiber (∼11–31%) (Ganesan and Xu, 2017).

There is a gap in research regarding the barriers and opportunities that producers and consumers face with respect to lentil production and consumption using a sustainable food system approach, specifically in the context of North America. This study addresses this research gap by examining the following research question: What are producer and consumer perceptions of the sustainability profile (environmental, socio-economic, and health dimensions) of the lentil system in Montana and the surrounding region in the context of global change (climate change, land-use change, and market demand), and what are the associated barriers and opportunities? Findings may inform future research on lentil production and consumption, may inform policy in favor of supporting lentil production through producer incentives, and may highlight education and outreach efforts on promoting lentils in plant-based diets to support sustainable food systems.

# BACKGROUND

The food system experiences environmental, socio-economic, and health challenges that have effects from global to local scales. Food systems and global agricultural production are responsible for 19–29% of total greenhouse gas emissions (Vermeulen et al., 2012), and account for 38% of land use (Foley et al., 2005) and 70% of freshwater use (Steffen et al., 2015). Livestock alone accounts for 14.5% of total greenhouse gas emissions (Gerber et al., 2013). Current dietary patterns across the globe, including the trend of increased consumption of animal-sourced foods in excess of dietary recommendations, burden ecosystems through pressures related to land use, resources used for feed production, and nutrient overload (Linseisen et al., 2002, 2009; Bouwman et al., 2013). In addition, roughly one-third of total food produced is lost or wasted along the food supply chain (Gustavsson et al., 2011; Fox and Fimeche, 2013; High Level Panel of Experts on Food Security and Nutrition (HLPE), 2014). Monetary estimates of global annual food loss are as high as USD 936 billion (Food Agriculture Organization of the United Nations, 2014). Finally, poor diets are a leading risk factor for the global burden of disease (Stanaway et al., 2018). More than 820 million people in the global food system are undernourished (Food Agriculture Organization of the United Nations, 2018) and more than 2 billion people are micronutrient deficient, despite global production of sufficient calories and nutrients to feed the world (Ritchie et al., 2018). At the same time, overweight and obesity afflict every country (Development Initiatives, 2018) and are associated with the rise in diet-related non-communicable diseases including coronary heart disease and cancer, risk of stroke, and type II diabetes (Aune et al., 2009; Popkin, 2009; Hu, 2011; Huang et al., 2012; Pan et al., 2012; Chen et al., 2013).

Transition from a modern Western diet to a plant-based diet has been found to have numerous benefits for environmental and human wellbeing including reductions in land use, greenhouse gas emissions, water use, and mortality risk and rates (Aleksandrowicz et al., 2016; Peters et al., 2016). Adoption of plant-based diets rich in pulse crops such as lentils is a food system solution that is being promoted to support the environmental, socio-economic, and human health dimensions of sustainability including enhancing biodiversity, farmer livelihoods, food security, and nutrition while contributing to climate change mitigation and adaptation (Kissinger and Lexeme Consulting, 2016).

On the production side of food systems, growing lentil serves as a livelihood strategy for many populations in arid regions, such as the northern Great Plains, while providing a drought-tolerant pulse crop that can be grown under relatively water-limited and rain-fed environments (Miller et al., 2002; Thornton and Cramer, 2012). Lentil and other pulse crops can reduce inorganic nitrogen fertilizer requirements in a crop rotation, both during crop growth and for subsequent crops, through their ability to fix nitrogen from the atmosphere by legume-rhizobia symbiosis in the soil (Lemke et al., 2007; Canfield et al., 2010; Burgess et al., 2012; Peoples et al., 2015). In addition, lentil may improve the productivity of the subsequent crop through increased availability of nitrogen (Burgess et al., 2012; Peoples et al., 2015). Lentil has a wide range of other production benefits including recycling of water and nutrients and helping with weed and pest control (Krupinsky et al., 2002; Lupwayi and Kennedy, 2007). The inclusion of lentil in production systems may increase soil-building capacity and stimulate nitrogen fixation, which could serve to create conditions conducive to reduced-tillage practices (Lafond et al., 1993; van Kessel and Hartley, 2000; Tanaka et al., 2010). These agricultural benefits may translate to environmental and economic savings with respect to use of nitrogen fertilizer (Burgess et al., 2012; MacWilliam et al., 2014) and pesticides (Krupinsky et al., 2002; Lupwayi and Kennedy, 2007). The integration of lentil in agricultural systems without tillage may also result in reduced labor and time as well as reduced use of machinery and fossil fuels (van Kessel and Hartley, 2000). Previous research has also highlighted challenges associated with the inclusion of lentil in rotation, including harvesting difficulties, increased soil erosion, and evaporative water loss due to sparse ground cover and short stubble height (Cutforth et al., 2002; Miller et al., 2002).

On the consumption side of food systems, lentils support food security as a dietary staple in many low to middle income countries such as India. The addition of lentils to diets is recognized for numerous health benefits. The nutritional profile of lentils includes iron (∼6.5–7.7 mg), magnesium (∼47–69 mg), potassium (∼677–943 mg), zinc (∼3.3–5.9 mg), and folate (∼479–555 µg) per 100 g raw lentils, which may help address micronutrient deficiencies and support healthy pregnancy (Mitchell et al., 2009; Sen Gupta et al., 2013; Ganesan and Xu, 2017; Singh, 2018; United States Department of Agriculture (USDA) Agricultural Research Service, 2018). In an in vitro experiment, lentil extract was shown to be a potential source of antioxidant phenolics that could be used in health promoting applications such as dietary supplements (Zou et al., 2011). Despite the potential health benefits of lentils, consumption of lentils and other pulses is relatively low in developed countries such as the United States of America (USA), where 7.9% of the population eats pulses on any given day (Mitchell et al., 2009). Due to socio-economic aspects, consumption of pulses is higher among lower income households and the Hispanic population (Lucier et al., 2000). While pulse consumption in the USA is low, current dietary advice recommends ∼100–300 g of pulses per week for each of the food groups "protein" and "vegetable," for a 2,000-kcal diet (United States Department of Health and Human Services and US Department of Agriculture, 2016).

As a relatively affordable and nutritious source of protein that can contribute to food security, lentil has seen increased production in the past few decades. In 2017, global lentil production and area harvested were ∼7.6 million tonnes and 6.6 million ha, respectively, compared to ∼2.8 million tonnes and 3.5 million ha in 1998 (Food and Agriculture Organization of the United Nations, 2017). The top five lentil-producing countries in 2017 were Canada (49%), India (16%), Turkey (6%), United States (4%), and Kazakhstan (4%) (Food and Agriculture Organization of the United Nations, 2017). The gap between lentil production in Canada and India has closed over the last 20 years (1998–2017), and since 2015 lentil production in Canada has surpassed production in India (Food and Agriculture Organization of the United Nations, 2017). Lentil production in Canada increased from around 480,000 tonnes (1998) to about 3.7 million tonnes (2017). While lentil production in the United States is much smaller than in Canada, it has steadily increased over the last 20 years from about 88,000 tonnes (1998) to about 340,000 tonnes (2017) (Food and Agriculture Organization of the United Nations, 2017).
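
The scale of these increases can be worked through with simple arithmetic. The sketch below uses only the approximate tonnages quoted above; the compound annual growth rate is our own illustrative summary (it assumes smooth exponential growth between 1998 and 2017, which the underlying annual data may not follow), and the function names are hypothetical.

```python
# Production figures (tonnes) as quoted in the text for 1998 and 2017.
production = {
    "World": (2.8e6, 7.6e6),
    "Canada": (480_000, 3.7e6),
    "United States": (88_000, 340_000),
}
YEARS = 2017 - 1998  # 19 annual steps between the two data points


def fold_change(start: float, end: float) -> float:
    """How many times larger the final value is than the initial value."""
    return end / start


def annual_growth_rate(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, assuming smooth exponential growth."""
    return (end / start) ** (1 / years) - 1


summary = {
    region: (fold_change(a, b), annual_growth_rate(a, b, YEARS))
    for region, (a, b) in production.items()
}
for region, (fold, rate) in summary.items():
    print(f"{region}: {fold:.1f}-fold increase, about {rate:.1%} per year")

# Cross-check: a 49% share of 7.6 million tonnes should be close to
# the 3.7 million tonnes quoted for Canada.
canada_from_share = 0.49 * 7.6e6
```

On these figures, production in Canada grew several times faster in relative terms than world production, and the quoted 49% share is consistent with the quoted Canadian tonnage.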

Despite the increase in lentil production in the United States, as well as the recognized benefits of lentil for sustainability on both the production and consumption sides of the food system, there have been relatively few studies examining the contribution of lentil to sustainability and associated barriers and opportunities for lentil production and consumption. The aim of this study is to examine producer and consumer perceptions of the environmental, socio-economic, and health dimensions of sustainability of lentil production and consumption in Montana and the greater region including Idaho (USA), North Dakota (USA), Washington (USA), and Canada. Montana was selected as a study site due to its status as the number one producer of lentil in the USA (Montana Department of Agriculture, 2015). Lentil production in Montana has increased in the last 5 years from about 88,000 tonnes in 2013 to about 198,000 tonnes in 2017 [United States Department of Agriculture (USDA) National Agricultural Statistics Service (NASS), 2018; **Figure 1**].

Previous studies on lentil production in Montana and North America more broadly highlight the agronomic, environmental, and economic benefits of lentil. A study on the addition of lentil into spring wheat rotations to replace summer fallow in the semiarid Canadian prairies found that lentil improved overall productivity and sustainability while economic benefits were only realized when the price of lentil was above a certain threshold (Zentner et al., 2001). The inclusion of lentil and other pulse crops in an oil seed rotation system in western Canada was found to reduce environmental impacts of production and improve farm-level return (MacWilliam et al., 2014). A meta-analysis by Miller et al. (2002) on pulse and lentil management highlighted the climate resiliency of lentil based on the finding that weather parameters could not be related to lentil yield, thus allowing for the broad adaptation of lentil in the semi-arid region of the northern Great Plains (NGP). Similarly, a review by Cutforth et al. (2007) identified pulses and lentil as "plastic" and adaptable to various weather conditions in the semi-arid NGP region. Lentil was further found to have lower energy intensity and reduce the energy intensity of the subsequent crop in a Montana-based study (Burgess et al., 2012). Carlisle (2014) found resilience in diversified organic agricultural systems including lentil in the NGP was largely due to producer flexibility and willingness to adapt. Furthermore, with respect to nitrogen cycling, lentil, and other pulse included in rotation in sites in the NGP resulted in either no change or a small reduction in greenhouse gas emissions to the atmosphere (Lemke et al., 2007).

# MATERIALS AND METHODS

#### Study Site

Surveys were carried out in the rural state of Montana (USA) and the greater lentil producing region of the northern Great Plains and the Pacific Northwest of North America. Montana is an expansive agricultural state with just under 27,000 farms operated on about 2.35 M ha. Ranch and rangeland accounts for about 81% of farm operated land area, with the remaining 19% of land dedicated to crop production (United States Department of Agriculture (USDA) National Agricultural Statistics Service (NASS), 2018). The region is suitable for numerous commodities including livestock and milk production, wheat and other cereal grains, oilseeds, pulse, hay and forage, sugarbeets, potatoes, and vegetables and fruits adapted to the semi-arid climate. Projections of climate change in Montana highlighted in the 2017 Montana Climate Assessment (Whitlock et al., 2017) include temperature increase between 2.5 and 3.3°C, increased overall precipitation with a decrease during summer months, longer growing season with an increase in frost-free days, and decreased mountain snowpack (Whitlock et al., 2017), all of which point to hotter, drier, and longer summers.

FIGURE 1 | Study region and change in Montana lentil production. (A) Study region that includes Montana (USA), Idaho (USA), North Dakota (USA), Washington (USA), and Canada; and change in lentil production over time including (B) 2013 Montana lentil production (total tonnes per year) by agriculture district, (C) 2014 Montana lentil production (total tonnes per year) by agriculture district, (D) 2015 Montana lentil production (total tonnes per year) by agriculture district, (E) 2016 Montana lentil production (total tonnes per year) by agriculture district, (F) 2017 Montana lentil production (total tonnes per year) by agriculture district. Data source: Shapefiles from Montana State Library Geographic Information Clearinghouse and United States Geological Survey; lentil production from United States Department of Agriculture National Agricultural Statistics Service, accessed March, 2019.

#### Producer Structured Questionnaire

Lentil production sites and/or key informants were identified through local field experts working on dryland and sustainable agriculture in the region. Site visits to lentil production systems and stakeholders in Montana (Gallatin, Hill, and Missoula counties) were completed and key informants were interviewed (n = 3). A structured survey questionnaire was designed from feedback by key informants, coupled with applicable material from review of the literature, to elucidate producer perceptions of the sustainability profile of the lentil system and associated barriers and opportunities (Thornton and Cramer, 2012; Villamil et al., 2012; Rejesus et al., 2013; Kissinger and Lexeme Consulting, 2016). The survey instrument was reviewed for face validity by content experts with expertise in agriculture, climate change, rural communities, sustainability, food systems, and sustainable diets. Revisions were made to the survey based on feedback from experts, and the survey was pilot-tested for validity with key informants at the Montana Pulse Day Conference (November 2018), where it was administered to lentil producers (n = 12) from Montana (USA) as well as Idaho (USA), North Dakota (USA), Washington (USA), and Canada (herein "greater region").

The final survey instrument (**Supplementary Material**) included 33 multiple choice and open-ended questions divided into four sections: (1) Background of production system, (2) Management of lentil production, (3) Social, economic, and health dimensions of the lentil production system, and (4) Global change, challenges, and opportunities. Specifically, Section One of the survey included general questions to understand the background of the lentil production system including overall farm size, location, and management methods. Section Two of the survey included questions to understand current management practices, management outcomes and challenges, and on-farm environmental observations related to lentil production. Section Three of the survey included questions to understand perceptions of social, economic, and health dimensions of lentil production including questions specific to the North American consumer. Section Four of the survey included questions to understand challenges and future concerns and opportunities regarding lentil production in the context of global change including bioenergy production and feasibility.

Approval for human subjects to participate in this study was received from the Institutional Review Board (IRB) at Montana State University prior to survey implementation. Informed consent was obtained from all survey participants prior to completing the survey. The survey took ∼15–20 min to complete. The final survey instrument was input and formatted in the SurveyMonkey (SurveyMonkey Inc., San Mateo, California, USA, www.surveymonkey.com) platform, and administered both in-person (n = 28) and in an online format (n = 51). The in-person survey was distributed at the Montana Grain Growers Association Conference (n = 11) and the Montana Organic Association Conference (n = 17) in Great Falls, MT (November and December 2018, respectively). While the attendants of these two venues may have overlapped, each survey participant was unique. The online survey was distributed through USA Dry Pea and Lentil Council, Northern Pulse Growers Association, and University of Idaho Extension newsletters and/or distribution lists (open online December 2018–January 2019). The distribution of the surveys through these multiple types of agricultural organizations was carried out in order to elicit responses from a range of both organic and conventional lentil producers in Montana (USA), and the greater lentil producing regions in Idaho (USA), North Dakota (USA), Washington (USA), and Canada. The researcher did not make successful connections with Canadian-based pulse and/or lentil organizations, thus eliminating the opportunity to utilize an online platform to distribute the lentil producer survey more broadly to informants in Canada. In addition, the researcher did not travel outside of Montana (USA) thereby eliminating the opportunity to distribute the survey in-person in Canada, and additionally Idaho (USA), North Dakota (USA), and Washington (USA). 
Therefore, with the majority of producer informants located in Montana (USA), reference to Idaho (USA), North Dakota (USA), Washington (USA), and Canada as the "greater lentil producing region" outside of Montana (USA) is not meant to minimize the perceptions and observations of producers from these areas, but rather meant to account for small sample size from these participating areas.

#### Consumer Structured Questionnaire

Two structured questionnaires were designed, piloted, and implemented with consumers in Montana including a questionnaire for consumers who eat lentils several times a year (**Supplementary Material**: Survey for Consumers of Lentils) and a questionnaire for consumers who generally do not eat lentils (**Supplementary Material**: Survey for Consumers Who Do Not Eat Lentils). The surveys were designed based on review of the applicable literature (Bickel et al., 2000; Thornton and Cramer, 2012; Gundersen et al., 2017; Palmer et al., 2018). The consumer surveys were reviewed for face validity by content experts with expertise in sustainability, food systems, sustainable diets, nutrition, and health. Revisions were made to both versions of the consumer survey based on feedback from experts.

The final survey instrument for consumers who eat lentils included 23 multiple-choice, Likert-scale, and open-ended questions divided into the following five sections: (1) Individual/household consumption patterns, (2) Consumer knowledge, (3) Food security status, (4) Market policy, and (5) Comparison of lentils and animal-based protein sources. The survey for consumers who do not eat lentils consisted of 10 multiple-choice, Likert-scale, and open-ended questions divided into the following three sections: (1) Consumer knowledge, (2) Food security status, and (3) Comparison of lentils and animal-based protein sources. The background section of both survey instruments included questions to elicit demographic information including age range and food security status as well as questions to elucidate consumer understanding and/or perceptions of sustainability aspects of lentil consumption and production. Each survey instrument included a separate lentil brochure (**Supplementary Material**: Lentil Survey Brochure) for informants to utilize when answering the final section of the survey. The brochure included information regarding the environmental, economic, and nutritional aspects of lentils and animal-based protein sources. Informants had the option to choose between the two types of lentil consumer surveys on the basis of self-identified level of lentil consumption.

As for the producer survey, approval for human subjects to participate in this study was received from the Institutional Review Board (IRB) at Montana State University and informed consent was obtained from all survey participants prior to completing the survey. The survey was administered at four locations (January–March 2019) in Gallatin and Park County (Montana, USA) that serve different types of consumers: (1) Bozeman Winter Farmers' Market (serves consumers that can be generalized as supporters of local foods), (2) Heebs Fresh Market (local grocery that caters to a wide variety of consumer demographics), (3) Livingston Food and Resource Center (a food pantry and community kitchen that serves an economically vulnerable population), and (4) Montana State University Family Science Night (serves Bozeman-area families). The distribution of the surveys through these multiple locations was carried out in order to elicit responses from a range of consumers in Montana (USA) including both consumers and non-consumers of lentils.

#### Data Analysis

#### Producer Structured Questionnaire

A total of 79 producers from lentil producing areas in the USA and Canada completed, or partially completed, the survey. Participants with over 30% missing/incomplete responses were removed from the sample, resulting in a final sample size of 63 informants. As not all informants responded to every question, sample size may vary among responses. Quantitative data were analyzed using the JMP statistical software program (JMP®, SAS Institute Inc., Cary, IL, USA). Analysis of Variance (ANOVA) and contingency analysis were carried out to compare differences among survey responses on the basis of the following three management practices: conventional management (n = 42), organic management (n = 15), and mixed management (n = 6; both conventional and organic management). The Pearson p-value is reported for significant differences in survey response among conventional, organic, and mixed management systems. Responses to open-ended questions were coded by two researchers following methods outlined in Saldana (2015). Coding involved development of a code book based on prevalent themes that emerged from responses. The coded responses were quantified and reported.
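The two quantitative steps described above, screening out respondents with more than 30% missing answers and testing a yes/no response against management type, can be sketched as follows. This is an illustration only: the study used JMP, whereas this sketch uses plain Python. The function names are hypothetical and the cell counts are invented for demonstration (only the group sizes 42, 15, and 6 come from the text). The closed-form p-value applies only to the 2 degrees of freedom of a 3 × 2 table.

```python
import math


def drop_incomplete(responses, max_missing=0.30):
    """Keep respondents whose share of unanswered questions (None) is at most max_missing."""
    kept = []
    for answers in responses:
        missing = sum(a is None for a in answers) / len(answers)
        if missing <= max_missing:
            kept.append(answers)
    return kept


def pearson_chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def p_value_df2(stat):
    """Upper-tail p-value for a chi-square statistic with exactly 2 degrees of freedom."""
    return math.exp(-stat / 2)


# Hypothetical counts for illustration only (NOT survey results).
table = [
    [38, 4],   # conventional (n = 42): reported practice, did not report
    [3, 12],   # organic (n = 15)
    [3, 3],    # mixed (n = 6)
]
stat, df = pearson_chi_square(table)
print(f"chi-square = {stat:.2f}, df = {df}, p = {p_value_df2(stat):.2g}")
```

With groups as small as the mixed management group (n = 6), expected cell counts can fall below the values usually recommended for the chi-square approximation, so p-values from such tables are best read with caution.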

#### Consumer Structured Questionnaire

A total of 138 informants completed the survey including those who consume lentils (n = 70) and those who do not generally consume lentils (n = 68). As not all survey informants responded to every question, the sample size may vary among responses. As with the producer survey, consumer survey responses were analyzed using JMP statistical software. An ANOVA and contingency analysis were completed for survey responses between consumers who eat lentils and consumers who generally do not eat lentils. Further, analysis between lentil consumers was completed among groups with low, medium, and high frequency of lentil consumption. The Pearson p-value is reported for significant differences in survey response among consumers and non-consumers, as well as among low, medium, and high lentil consumption groups.

# RESULTS

#### Producer Structured Questionnaire

#### Background of Lentil Production Systems

The majority of producers' farms (n = 62) were located in Montana (USA), accounting for 61% of informants, followed by 18% of producers located in Idaho (USA), 11% in Washington (USA), 3% in Saskatchewan (Canada), 2% in North Dakota (USA), 2% in Manitoba (Canada), and 3% with farms located in two or more states. Lentil production systems (n = 63) ranged among conventional management (n = 42; 67%), organic management (n = 15; 24%), and mixed management systems (n = 6; 10%). Producers reported total farm area (n = 59) ranging from ∼150 ha to about 10,000 ha with a mean farm size of 2,195 ha and standard deviation of 1,937 ha.

Producers' experience growing lentil (n = 61) ranged from 1 year to over 15 years. Producers reported they grew lentil for 1–5 years (36%), more than 15 years (33%), 6–10 years (20%), and 11–15 years (12%). Average land area dedicated to lentil production reported by producers (n = 62) ranged from <40 ha to >400 ha. Range of land area under lentil production reported by producers include 40–200 ha (29%), 200–400 ha (29%), >400 ha (23%), and 40 ha or less (19%). Producers reported they grow a variety of lentil including black, brown, French green, large/medium/small green, and red.

#### Environmental and Management Dimensions of Lentil Production Systems

The most prevalent management practices (**Figure 2**) reported by producers (n = 63) include dryland farming (83%), crop rotations (83%), and land rolling (76%). The least prevalent management practices reported by producers include cover cropping (16%), use of inorganic fertilizer (10%), and irrigation (3%). The practices that were significantly different among conventional, organic, and mixed management producers include use of chemical desiccant (p < 0.0001), no-tillage (p = 0.007), fungicide treated seed (p < 0.0007), tillage (p = 0.0176), swathing (p < 0.0001), organic certified (p < 0.0001), and cover cropping (p < 0.0001). Specifically, a greater number of conventional producers reported use of chemical desiccant (91%), no-tillage (86%), use of fungicide treated seed (93%), and tillage (46%) in contrast to organic and mixed management systems (**Figures 2E–H**). Alternately, a greater number of organic producers reported use of swathing (57%), organic certification (78%), and use of cover cropping (80%) in contrast to conventional and mixed management systems (**Figures 2J,L,M**).

The most prevalent perceptions of the agronomic effects (**Figure 3**) from including lentil in production reported by producers include that the addition of lentil helps transfer nitrogen to subsequent crops (68%), rhizobium inoculants are sufficient to ensure maximum nodulation in their lentil (68%), the addition of lentil helps increase nutrient availability for subsequent crops (65%), and helps increase overall food crop productivity (63%). The least prevalent perceptions of the agronomic effects reported by producers include the addition of lentil in production has decreased moisture availability for subsequent crops (17%), resulted from using no till management (16%), and producers have experienced inefficient nodulation in their lentil crop (16%). The differences in producers' perceptions of the agronomic effects from including lentil in their production system were not statistically significant among conventional, organic, and mixed management producers.

Agronomic rationale (**Figure 4A**) for including lentil in production systems was reported by producers. The most prevalent agronomic rationale reported by producers was to diversify crop rotation (92%), while the least prevalent rationale was to offset irrigation (0%). Agronomic rationale for including lentil in production systems that were significantly different among conventional, organic, and mixed management producers include lentil as green manure (p = 0.0274) and brown manure (p = 0.0407). Specifically, a greater number of organic producers reported use of lentil for green manure and brown manure compared to conventional and mixed management producers (**Figure 4A**).

producers that reported using management practices including (A) dryland farming, (B) crop rotations, (C) land rolling, (D) rhizobium inoculants, (E) chemical desiccant, (F) no-tillage, (G) fungicide treated seed, (H) tillage, (I) conservation tillage, (J) swathing, (K) precision agriculture, (L) organic certification, (M) cover cropping, (N) inorganic fertilizer, and (O) irrigation (n = 63).

#### Economic Dimension of Lentil Production Systems

The most prevalent range of on-farm income from lentil production and sales received over the past 10 years (2008–2017) reported by producers (n = 61) was between 6 and 15% (34%). Additionally, range of on-farm income received from lentil production reported by producers included <5% (28%), between 16 and 25% (25%), and >25% of on-farm income (13%). Significant differences in range of income were found among conventional, organic, and mixed management producers with conventional producers earning greater percentages of income from lentil production (p = 0.0369).

Producers reported (n = 61) their perceptions of market and policy factors that impacted lentil production during 2013–2017 (**Figure 5A**). The majority of producers reported tariffs and/or subsidies (72%) and market variability of lentil (67%) impacted their lentil production. The least prevalent perceived impacts of market and policy factors on lentil production during 2013–2017 reported by producers include cost of labor (16%) and fuel costs (13%). Producers' perceptions of effects of market and policy factors on lentil production that were significantly different among conventional, organic, and mixed management producers include tariffs and/or subsidies (p < 0.0001) and market variability (p = 0.0308). Specifically, a greater number of conventional producers reported that tariffs and/or subsidies (82%) and market variability (76%) impact lentil production in contrast to organic and mixed management producers.

Producers reported (n = 61) their perceptions of market access for lentil during 2013–2017 (**Figure 5B**). The majority of producers reported they had adequate access to a consistent market (79%), distribution channels (62%), and profitable market for lentil (59%). The differences in producers' perceptions regarding market access during 2013–2017 were not statistically significant among conventional, organic, and mixed management producers.

Producers reported their rationale and reasons for growing lentil related to economics (**Figure 4B**). The most prevalent economic rationale for growing lentil reported by producers include to capitalize on dryland production (95%) and to serve as a cash crop (87%). The economic rationale among conventional, organic, and mixed management producers that were significantly different include to grow lentil as a cash crop (p = 0.0178) and to offset herbicide cost or use (p = 0.0384). Specifically, a greater number of conventional producers reported they grow lentil as a cash crop (68%) and to offset herbicide costs (86%) in contrast to organic and mixed management producers.

#### Health Dimension of Lentil Production System

Producers reported their rationale for growing lentil related to health (**Figure 4C**). The majority of producers reported they grow lentil to support plant-based diets (52%). The least prevalent reason for growing lentil reported by producers was to support local food security (16%). The rationale for growing lentil related to health that were significantly different among conventional, organic, and mixed management producers include to support local food security (p = 0.001). Specifically, a greater number of organic producers reported they grow lentil to support local food security (70%).

With respect to the North American consumer (**Figure 6**), the most prevalent perception of consumer knowledge reported by producers (n = 61) include that consumers are generally knowledgeable regarding the nutrient benefits of lentils (34%). The least prevalent perception reported by producers include consumers are generally knowledgeable regarding how to incorporate lentils into their diets in a nutritionally balanced way (15%) and consumers are generally knowledgeable regarding how to cook with lentils (15%). Producers' perception of North American consumer knowledge that was significantly different include a greater number of conventional producers that reported consumers are generally knowledgeable regarding how to cook with lentils (p = 0.0323).

#### Global Change: Challenges and Opportunities

Producers reported environmental observations and weather effects that impact lentil production, and on-farm opportunities that include the potential for other crops. Environmental observations impacting lentil production reported by producers include drought stress (73%), extreme weather events (57%), pests and disease (46%), and increased temperatures (43%) (**Figure 7A**). Environmental observations impacting lentil production that were significantly different among conventional, organic, and mixed management producers include pests and disease (p = 0.002). Specifically, a greater number of conventional producers reported they observed pests and disease impact lentil production. Producers reported their perception of the effect of weather variation and extremes on their agricultural business on at least one occasion (**Figure 7B**). Half or more of producers reported El Niño or La Niña had an effect on their agricultural business (65%) and recent changes in climate due to normal weather cycles had an effect on their agricultural business (50%). A minority of producers reported they had not experienced the effects of weather variation and weather extremes on their agricultural business (8%). Producers' views regarding the effect of climate change were significantly different among conventional, organic, and mixed management producers (p = 0.0012). Specifically, a greater number of organic producers reported climate change had an effect on their agricultural business (50%) in contrast to conventional and mixed management producers.

Producers reported their perceptions of extreme weather patterns and/or climate change on future lentil crop yield and change in areal crop rotation over the next 20 years. The most prevalent perceptions reported by producers include they expect average lentil yield will stay the same (45%). The least prevalent perception reported by producers include they expect a decrease (16%) in lentil crop yield. Perceptions regarding whether or not other area producers would make a significant change in crop rotation due to extreme weather patterns and/or climate change in the next 20 years reported by producers include they are not sure (38%), there will be no change (32%), and yes, there will be a change (30%). The differences between producers' perception regarding weather impacts on future lentil crop yield and areal changes in crop rotation over the next 20 years were not statistically significant among conventional, organic, and mixed management producers.

Total percentage (center) and the proportion of conventional, organic, and mixed management producers that reported market access for lentil that includes adequate access to a consistent market, distribution channels, and profitable market (n = 61).

With respect to the rising cost of energy, producers reported they would consider making relatively few on-farm changes in the next season or near future specifically related to alternative energy sources such as biofuels and/or land-use change (**Supplementary Figure 1**). While only a minority of producers reported they would consider any of the select changes, the most prevalent response reported by producers was that they would change their management practices (28%). The least prevalent changes reported by producers include they would try to develop a local market for biofuels (2%) and use alternative fuels available on the market (0%). Producers' consideration of on-farm changes that were significantly different among conventional, organic, and mixed management producers include exploring alternative energy sources such as wind or solar (p = 0.0002). Specifically, a greater number of organic producers reported they would consider exploring alternative energy such as wind and solar (73%) compared to conventional and mixed management producers (**Supplementary Figure 1**).

FIGURE 7 | Environmental observations and perceptions of weather. (A) Total percentage (center) and the proportion of conventional, organic, and mixed management producers that reported environmental observations (n = 63) that have impacted lentil production including drought, extreme weather, pests and disease (p = 0.002), and increased temperature. (B) Total percentage (center) and the proportion of conventional, organic, and mixed management producers that reported effects of weather (n = 60) that have had an impact on their agricultural business including the cyclical weather patterns El Niño and/or La Niña, normal weather cycles and variation, climate change (p = 0.0012), and no effect of weather variation or extremes on their agricultural business.

With respect to the rising cost of energy, producers reported the feasibility of alternative crops and products they perceived as having potential for success to help meet local, regional, and/or national future energy needs (**Supplementary Figure 2**). Generally, the most prevalent crops perceived as feasible to help meet energy needs reported by producers include perennial grasses (33%) and cellulosic biomass (30%). The least prevalent alternatives perceived as feasible to help meet energy needs reported by producers include plant wastes (23%) and algal biofuels (17%). The crops and products perceived as feasible to help meet energy needs reported by producers that were significantly different among conventional, organic, and mixed management producers include corn for ethanol (p = 0.0149) and biodiesel from small grains (p = 0.0115). Specifically, in contrast to organic and mixed management producers, a greater number of conventional producers reported feasibility of corn for ethanol and biodiesel from small grains.

FIGURE 9 | Purchasing attributes and lentil characteristics. (A) Total percentage (center), and proportion of low, medium, and high lentil consumption groups that reported lentil characteristics that influence purchasing decisions based on lentil types including dried lentils, pre-made lentil meals, and canned lentils. (B) Total percentage (center), and proportion of low, medium, and high lentil consumption groups that reported lentil characteristics that influence purchasing decisions based on lentil varieties including red/orange, green, brown, black, and French green. (C) Total percentage (center), and proportion of low, medium, and high lentil consumption groups that reported characteristics that constitute a high-quality lentil including brightness of color, size, and percentage of splits. (D) Total percentage (center), and proportion of low, medium, and high lentil consumption groups that reported lentil attributes that influence purchasing decisions including locally grown or grown in Montana (USA), certified organic, color, cooking qualities, and grown in the United States or Canada (n = 70).

With respect to the rising cost of energy and the feasibility of alternative crops and products to help meet future energy needs, producers reported a variety of factors that would influence their decision to grow an energy crop in the next season or the near future (**Supplementary Figure 3**). The majority of producers reported the factors that would influence their decision to grow an energy crop include improving soil quality and/or building organic matter (72%) and market potential for the crop (70%). The least prevalent factors reported by producers include reducing carbon dioxide emissions (18%) and to create jobs in the community (18%). Factors that would influence producers' decisions to grow an energy crop that were significantly different among conventional, organic, and mixed management producers include concern about using resources for food vs. fuel (p = 0.034) and reducing carbon dioxide emissions (p = 0.0327). Specifically, a greater number of organic producers reported concern about using resources for food vs. fuel (47%) and reducing carbon dioxide emissions (55%).

Producers identified current management challenges for lentil production and concerns for future lentil production. The three most prevalent challenges reported by producers include challenges with (1) weeds and other pests, (2) lentil harvest, and (3) weather. The three most prevalent concerns identified regarding future lentil production reported by producers include (1) market demand and price of lentil, (2) weeds and pests, and (3) weather.

Producers identified the main agronomic reasons they value including lentil in their production system and opportunities for future lentil production. The three most prevalent agronomic reasons reported by producers include they value (1) the rotational benefits from lentil, (2) price of lentil, and (3) nitrogen fixation. The three most prevalent opportunities for future lentil production reported by producers include increase in (1) consumer knowledge and domestic demand, (2) market and price, and (3) research related to new plant varieties.

#### Consumer Structured Questionnaire Demographics

Informants (n = 138) participated in the lentil consumer survey at locations that included the Bozeman Winter Farmers' Market (41%), Heebs Fresh Market (26%), Livingston Food Resource Center (25%), and Montana State University Family Science Night (8%). Informants reported their age range was between 18–37 years (45%), 38–54 years (25%), 55–73 years (26%), and 74–92 years of age (4%).

Informants self-selected one of two surveys based on their own or their household's frequency of lentil consumption: (1) Survey for Consumers of Lentils (**Supplementary Material**) and (2) Survey for Consumers Who Do Not Eat Lentils (**Supplementary Material**). Informants who self-reported that they or their household do not generally eat lentils accounted for 49% of informants (herein "non-consumers"). Informants who self-reported that they or their household eat lentils (lentils consumed at least a few times to numerous times per year) accounted for 51% of informants (herein "consumers"). Of the self-reported lentil consumers, 20% reported they eat lentils several times a year (herein "low" frequency group), 56% reported they eat lentils around once per week (herein "medium" frequency group), and 24% reported they eat lentils as a regular part of their diet (herein "high" frequency group).

Two questions were selected from U.S. Household Food Security Survey Module: Six-Item Short Form of the Food Security Survey Module (US Department of Agriculture, Economic Service Research, 2012) based on their sensitivity, specificity, and accuracy to detect indication of food insecurity (Gundersen et al., 2017). Informants were asked to report their level of agreement (often, sometimes, or never true) with the following statements, (1) "We worried whether (my/our) food would run out before (I/we) got money to buy more" and (2) "The food that (I/we) bought just didn't last and (I/we) didn't have money to get more" (n = 137). Overall, 23% of informants reported affirmative responses that indicate food insecurity, with an approximately even split between lentil consumers (12%) and non-consumers (11%).
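As a rough illustration of how responses to the two screener items could be turned into the indicator reported above, the sketch below flags an informant when either statement is answered "often true" or "sometimes true". The function names, the response strings, and the any-affirmative threshold are assumptions made for illustration; the exact scoring rule applied in the study is not spelled out in this section, and the example responses are made up.

```python
# Hypothetical scoring of the two food-security screener items.
# "Often true" and "sometimes true" are treated as affirmative responses.
AFFIRMATIVE = {"often true", "sometimes true"}


def indicates_food_insecurity(item1: str, item2: str) -> bool:
    """Return True if either screener item received an affirmative response."""
    answers = (item1.strip().lower(), item2.strip().lower())
    return any(answer in AFFIRMATIVE for answer in answers)


def share_flagged(responses) -> float:
    """Fraction of informants whose responses indicate food insecurity."""
    flagged = sum(indicates_food_insecurity(a, b) for a, b in responses)
    return flagged / len(responses)


# Made-up responses for four informants (not study data).
example = [
    ("never true", "never true"),
    ("sometimes true", "never true"),
    ("never true", "often true"),
    ("never true", "never true"),
]
print(share_flagged(example))  # 0.5
```

Applying the same rule separately to consumers and non-consumers would give the kind of split reported above.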

#### Individual Consumption Patterns

The following section on individual consumption patterns includes responses from informants that consume lentils (consumers; n = 70). Change in household lentil consumption over the past 5 years reported by consumers include an increase (59%), no change (36%), and a decrease in lentil consumption (4%). The most prevalent rationale for eating lentils (**Figure 8**) reported by consumers include they eat lentils for their nutritional properties (93%), affordability (77%), and to support a plant-based diet (63%). The differences in consumers' change in household lentil consumption and rationale for eating lentils were not statistically significant among low, medium, and high lentil consumption groups.

Consumers reported they are interested in several attributes when purchasing lentils (**Figure 9**). The most prevalent type and variety of lentils purchased that was reported by consumers include dried lentils (99%) and red/orange lentils (64%). The most prevalent perception of what constitutes high-quality lentils reported by consumers was brightness of color (49%). Purchasing decisions related to social values and quality attributes of lentils reported by the majority of consumers include preference for locally grown (grown in Montana) (66%) and certified organic lentils (56%). Significant differences in lentil attributes were not found among low, medium, and high lentil consumption groups.

#### Consumer Knowledge

The following section on consumer knowledge includes responses from both consumer and non-consumer groups (n = 138). The most prevalent sources of lentil information reported by informants include family and friends (41%) followed by the internet (25%) (**Figure 10**). The least prevalent sources of lentil information reported by informants include supermarket (6%) and a doctor and/or dietician (4%). There were similarities and differences in sources of information on lentils between consumers and non-consumers. The sources of lentil information reported by informants that were significantly different between consumers and non-consumers of lentils include family and/or friends (p = 0.0143), the internet (p = 0.0022), health magazines (p = 0.0141), farmers market (p = 0.0244), and farmers (p < 0.0001). For all differences noted, consumers reported a greater prevalence of lentil information from the select sources compared to non-consumers.
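Group comparisons of this kind (the proportion of consumers vs. non-consumers citing a given source) are commonly tested with a Pearson chi-square test on a 2 × 2 table. The sketch below is a minimal standard-library version of that test, offered only as an illustration: this section does not state which test produced the p-values above, and the counts in the example are hypothetical, not the study's data.

```python
import math


def chi_square_2x2(a: int, b: int, c: int, d: int):
    """Pearson chi-square test (no continuity correction) for the table
        [[a, b],
         [c, d]]
    Returns (statistic, p_value) for 1 degree of freedom."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    stat = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function
    # reduces to erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical counts: rows = consumers / non-consumers,
# columns = cited the source / did not cite the source.
stat, p = chi_square_2x2(30, 40, 15, 53)
print(f"chi2 = {stat:.2f}, p = {p:.4f}")
```

With small expected cell counts, an exact test (for example Fisher's exact test) would usually be preferred over this approximation.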

Informants reported their agreement with statements regarding knowledge and perceptions of lentils (**Figure 11**). The majority of all informants agreed they find the taste of lentils desirable (73%) and they feel knowledgeable regarding the nutrient benefits of lentils (51%). Significant differences between consumers and non-consumers include that they find the taste of lentils desirable (p = 0.0028), they feel knowledgeable regarding the nutrients benefits of lentils (p < 0.0001), how to cook with lentils (p < 0.0001), how to incorporate lentils into their diet in a nutritionally balanced way (p < 0.0001), and how to use the different types of lentils into a variety of dishes (p < 0.0001). For all differences noted, lentil consumers reported a greater prevalence of agreement regarding knowledge and perceptions of lentils than non-consumers.

Informants reported their level of agreement with statements regarding the health and nutritional aspects of including lentils in the diet on a 5-point Likert scale from "strongly agree" to "strongly disagree" (**Figure 12**). The majority of informants reported they either agreed or strongly agreed that lentils can help to improve nutrition (88%), feel satiated or full (85%), support a plant-based diet (81%), promote a healthy digestive tract (77%), benefit weight loss efforts (59%), maintain healthy blood sugar (59%), and lower bad cholesterol (51%). Differences between consumers' and non-consumers' reported level of agreement on the health and nutritional aspects of including lentils in the diet include that lentils help to improve nutrition (p < 0.0001), feel satiated or full (p < 0.0001), support plant-based diets (p < 0.0001), promote a healthy digestive tract (p < 0.0001), maintain healthy blood sugar (p < 0.0001), benefit weight loss efforts (p = 0.0035), lower bad cholesterol (p = 0.0006), produce gas (p = 0.0395), and benefit the diet of those with diabetes (p = 0.0007). For all differences noted, lentil consumers reported a greater prevalence of agreement regarding knowledge and perceptions of lentils than non-consumers.

#### Market and Access

The following section on market and access includes responses from informants that consume lentils (consumers; n = 70). The most prevalent locations where consumers reported they purchase lentils were supermarkets (83%) and farmers' markets and/or cooperatives (74%). The least prevalent location where consumers reported they purchase lentils was big box stores (11%). The majority of consumers reported they agree lentils are generally available at the market of their choice (99%) and affordable or sold at a reasonable price (97%) (**Figure 13**). Consumers reported they agree they have adequate access to lentils of all types (black, brown, green, French green, and red/orange) in their community (82%). Additionally, consumers reported they agree the lentils they purchase generally meet their quality standards on the basis of taste, aroma, texture, and palatability (93%). The differences between consumers' rationale regarding market and access of lentils and frequency of consumer lentil consumption among low, medium, and high groups were not statistically significant.

FIGURE 11 | Consumer knowledge and perceptions of lentils. Percentage of all informants (center) and proportion of lentil consumers and non-consumers that reported they (A) find the taste of lentils desirable (p = 0.0028), (B) feel knowledgeable regarding the nutritional benefits of lentils (p < 0.0001), (C) feel knowledgeable regarding how to cook with lentils (p < 0.0001), (D) feel knowledgeable regarding how to include lentils in their diet in a nutritionally balanced way (p < 0.0001), (E) agree that lentils are classified as both a protein and a vegetable in the 2015 Dietary Guidelines for Americans, and (F) feel knowledgeable regarding how to use different lentils in a variety of dishes (p < 0.0001) (n = 138).

FIGURE 12 | Consumer knowledge and perceptions of health. Total informant level of agreement with health and nutritional statements regarding the effects of including lentils in their diet to help (A) improve nutrition (p < 0.0001), (B) feel satiated or full (p < 0.0001), (C) support plant-based diets (p < 0.0001), (D) promote a healthy digestive tract (p < 0.0001), (E) maintain healthy blood sugar (p < 0.0001), (F) benefit weight loss efforts (p = 0.0035), (G) lower bad cholesterol (p = 0.0006), (H) produce gas (p = 0.0395), (I) benefit diet of those with diabetes (p = 0.0007), and (J) reduce cancer risk (n = 138). Significant differences are between lentil consumers and non-consumers (n = 136).

#### Lentil Brochure and Willingness to Change Consumption Patterns

The following section on consumer knowledge and willingness to change amount and/or frequency of lentil consumption includes responses from both consumer and non-consumer groups (n = 138). Informants were presented with a lentil brochure (**Supplementary Material**: Lentil Survey Brochure) with information on the environmental, economic, and nutritional aspects of protein production that compared lentils and animal-based protein sources. Informants reported their willingness to change their lentil consumption frequency based on information from the lentil brochure (**Figure 14**). Regardless of lentil consumption, consumers and non-consumers reported they would increase the amount or frequency with which they consume lentils based on the environmental (78%), economic (75%), and nutrition (72%) information contrasting lentils and animal-based protein sources. The only significant difference between consumers' and non-consumers' willingness to change their frequency of lentil consumption concerned the nutritional information (p = 0.0103). A greater prevalence of non-consumers reported they were not sure, and a greater prevalence of consumers reported they would not change their frequency of consumption, based on the nutritional information.

#### DISCUSSION

#### Key Findings

This study elucidates perceptions of lentil producers and consumers and highlights the contribution of lentil production and consumption to the sustainability profile of lentil in Montana and the surrounding lentil-producing regions. On the production side of the lentil system, producers from all management types reported environmental, socio-economic, and health aspects related to lentil production that include they grow lentil to diversify crop rotations (92%), capitalize on dryland production (95%), and as a cash crop (87%), and half of producers reported they grow lentil to support a plant-based diet (52%). On the consumption side of the lentil system, lentil consumers generally were more knowledgeable about lentils, and eat lentils due to their nutritional properties (93%), affordability (77%), and to support plant-based diets (63%). Lentil consumers and non-consumers alike reported they would increase their lentil consumption based on environmental (78%), economic (75%), and nutritional (72%) information contrasting lentils and animal-based proteins.

#### Producers

Similarities among conventional, organic, and mixed management producers point to the overall sustainability of lentil production in the food system. For example, the majority of producers reported certain perceptions and practices regarding lentil production that contribute to the environmental, socio-economic, and health dimensions of sustainability. Environmental aspects include lentil helps diversify crop rotations (92%), nitrogen transfer to subsequent crop (68%), increase nutrient availability for subsequent crop (65%), and increase yield of subsequent food crops (63%). Socio-economic aspects include producer perceptions and practices that lentil production results in savings in input costs such as fertilizer (40%) and herbicide (33%), income as a cash crop (87%), adequate access to a consistent (79%) and profitable market (59%), and distribution channels (62%). In addition, lentil production contributes to the health aspects of sustainability through support of plant-based diets (52%) while providing consumers access to an affordable plant-based protein source. Alternately, very few producers reported use of inorganic fertilizer (10%) and irrigation (3%) that point to the low-input nature of lentil production in the study region.

Differences among conventional, organic, and mixed management producers in this study highlight areas where one management type may be more beneficial or resilient in certain aspects of sustainability than another. Specifically, of those that reported each respective management practice or perception, conventional producers more prevalently reported use of no-till (86%) and received greater on-farm income from lentil production (78%). In addition, of those that reported each respective perception of market effect and environmental observation on lentil, conventional producers more prevalently reported impacts of tariffs and/or subsidies (82%) and market variability (76%), and effects of drought (72%), extreme weather (69%), pests and disease (86%), and increased temperatures (70%). This points to both positive outcomes in soil carbon sequestration and on-farm income through lentil production, and barriers to lentil production through policy and market effects, and effects of weather and pests and disease experienced by conventional producers. Alternately, of those that reported each respective management practice, organic producers more prevalently reported swathing (57%), in contrast to use of chemical desiccant reported by conventional producers (91%). In addition, of those that reported each respective perception of market effect and environmental observation on lentil, organic producers less prevalently reported impacts of tariffs and/or subsidies (5%) and market variability (15%), and effects of drought (20%), extreme weather (19%), pests and disease (3%), and increased temperatures (22%). This leads to a potential "resilience effect" experienced by organic producers, shown by less prevalently reported impacts of policy and market effects, and effects of weather and pests and disease on their lentil crop (Carlisle, 2014).

At a local level, relatively few producers reported they grow lentil to support local food security; however, food security is supported at a regional and/or global level through lentil export. Producers perceive that North American consumers are not generally knowledgeable regarding health and nutritional aspects of lentils, as shown by the 15–34% of producers that reported they feel consumers are knowledgeable regarding specific aspects of lentil. This highlights an opportunity for producers to learn consumer perceptions and purchasing habits as well as barriers to local consumption, such as lack of consumer knowledge of lentil.

#### Consumers

Similarities in perceptions and knowledge of lentils between lentil consumers and non-consumers were relatively few. Among all informants, the least reported source of lentil information was from a doctor/dietician (4%), and relatively few informants reported they receive lentil information from dietary guidelines (15%). This highlights an opportunity for education efforts to include individuals in health professions to promote lentils as a part of a healthy eating pattern, as described in the dietary guidelines, in addition to promoting education efforts about the dietary guidelines in school health classes and other appropriate settings. Another similarity between consumers and non-consumers is their willingness to increase lentil consumption based on environmental (78%), economic (75%), and health and nutrition (72%) information on lentils. This points to educational opportunities to increase regional consumption through promoting sustainability dimensions that are supported through lentil production and consumption. Similarities among consumers that eat lentils were prevalent. For example, lentil consumers among low, medium, and high consumption groups reported they eat lentils for their nutritional properties (93%), affordability (77%), and to support a plant-based diet (63%). Although fewer than a majority, a notable share of consumers reported they eat lentils to support the environment (49%) and local farmers (43%). In addition, consumers reported they purchase locally grown (66%) and organic (56%) lentils. This points to the awareness of sustainability among lentil consumers in the food system and leads to the impression that awareness of sustainability principles may promote lentil consumption.

Differences in knowledge and perceptions between consumers and non-consumers of lentils were more prevalent. Of those that reported knowledge aspects regarding lentils, relatively few non-consumers reported they feel knowledgeable regarding the nutritional benefits of lentils (21%), how to cook with lentils (21%), and how to include lentils in their diet in a nutritionally balanced way (12%). Non-consumers also more prevalently reported uncertainty in agreement with health aspects regarding lentils such as lentils help improve nutrition, maintain healthy blood sugar, promote a healthy digestive tract, benefit diet of those with diabetes, lower bad cholesterol, and reduce cancer risk. This points to a gap in education and knowledge on the benefits of lentils that are available to consumers, and highlights opportunity to increase access and consumption through outreach efforts directed at consumers, and especially populations vulnerable to food insecurity such as those who participate in federally funded food programs and local food banks and food distribution centers.

#### Limitations

With respect to the lentil producer survey, limitations include sample size and distribution, response bias, and spatial scale. The results in this study apply to producers in the United States and, to a minimal extent, Canada. Specifically, the majority of producers that participated in the producer survey have farm locations in Montana (61%), followed by Idaho (18%), Washington (11%), Canada (5%), North Dakota (2%), and <4% of producers had locations in more than one state. Regional differences could not be elucidated due to small sample sizes across areas outside of Montana (USA). The researcher did not make successful connections with Canadian-based pulse and/or lentil organizations, thus eliminating the opportunity to utilize an online platform to distribute the lentil producer survey more broadly to producers in Canada. Additionally, the researcher did not travel outside of Montana (USA), thereby eliminating the opportunity to distribute the survey in-person in Canada, as well as Idaho (USA), North Dakota (USA), and Washington (USA). Another limitation with respect to sample size and distribution is that organic producers were oversampled by a magnitude several times greater than conventional producers, especially considering organic farmland in the United States is <1% of total farmland in USA (United States Department of Agriculture (USDA) Economic Research Service, 2011). Another limitation in the producer survey is response bias among producers that completed the survey. Additionally, the spatial scale is reported at the state/country level, whereas county-level regions may have further emphasized differences, such as climatic responses of lentil.

Limitations of the consumer survey include scale of survey distribution and sample bias, as well as response bias. The survey results may only apply at a localized level within Gallatin and Park County, Montana. However, the survey was distributed to diverse populations at a local grocery, farmers' market, food resource center, and a university-related family event with a wide range of informants with demographic differences. Another limitation of the consumer survey is the potential bias introduced by informants that completed the survey. If another region in Montana, or region in the United States or Canada was sampled, results may be similar or could substantially vary. For example, if the survey was distributed in other college towns and smaller urban centers located in rural states and provinces, results may be similar. In contrast, if the survey was completed on an American Indian reservation, or predominately rural area, results would likely differ from survey responses from a college town (Gallatin County) and a National Park gateway (Park County).

#### Integration to Current Understanding

Lentil is perceived as a successful crop to include in rotation for its environmental, economic, and health benefits. The most valued reasons for including lentil in rotations were (1) the rotational benefits from lentil, (2) price of lentil, and (3) nitrogen fixation. For lentil to become more widely adopted in production systems on the basis of rotational benefits, which may inadvertently promote sustainability within the food system, market demand and price need to be conducive to economic sustainability. In this study, producers reported adequate and consistent access to market and lentil distribution channels, though conventional producers reported that effects from policy and market variability impact production. As such, producers identified the three most prevalent concerns regarding future lentil production as (1) market demand and price of lentil, (2) weeds and pests, and (3) weather. Organic producers less prevalently reported impacts from drought, extreme weather, increased temperature, and pests and disease; however, impacts of weather variability on lentil production were experienced by most producers. A very low prevalence of producers reported they did not experience the effects of extreme weather on their agricultural business. This points to a need for climate adaptation strategies and policy measures to support lentil producers.

The sustainability of lentil and other pulses in food systems has been highlighted by the UN declaration of 2016 as the International Year of Pulses, and includes health benefits as well as support for food security (Kissinger and Lexeme Consulting, 2016). Other studies have highlighted lentil and pulse for their respective health benefits, though few studies have been completed on human subjects. Messina (1999) highlighted the nutrient composition of legumes to support a healthful diet, including lentils, and the limited availability of epidemiological studies on the health effects of legumes on humans. Ganesan and Xu (2017) completed a review of health effects of polyphenols in lentils, and high dietary fiber and prebiotic content in relation to their role in prevention of non-communicable diseases such as diabetes, obesity, cancers, and cardiovascular diseases. Their findings highlight that lentils contain slowly digestible starch that may maintain microbiota within the gut that could help to prevent diseases of the colon. Findings also highlighted that lentils contain polyphenols rich in antioxidant potential that may protect against diabetes, obesity, cancers, and cardiovascular diseases.

Relatively few countries include pulse or promote sustainable foods in their dietary guidelines. In addition to the United States, countries that include pulse within national dietary guidelines are Australia, Brazil, Bulgaria, Canada, Greece, India, Ireland, Nordic countries (Denmark, Finland, Sweden, Iceland, Norway), Spain, South Africa, and the United Kingdom (Marinangeli et al., 2017). Marinangeli et al. (2017) completed a review to examine national dietary guidelines that include lentil and other pulse in order to unify a target adult serving size, and found 100 g of cooked lentils were a "reasonable" serving to contribute dietary nutrients, with claims such as high in fiber, iron, phosphorus, zinc, folate, and thiamin in the USA. The recommended adult serving of 100 g (per day) falls within suggested recommendations of eating for human health and planetary boundaries, which recommends 0–100 g of dried beans, lentils, and peas per day for a standardized 2,500 kcal diet (Willett et al., 2019). Willett et al. (2019) included considerations of planetary health when creating possible ranges of pulse serving size, similar to sustainability dimensions within select national dietary guidelines. The USA does not currently include sustainability within the dietary guidelines, though Mike Hamm, Timothy Griffin, and the Dietary Guideline Advisory Committee have placed considerable efforts to promote sustainability and sustainable foods within the dietary guidelines (Dietary Guidelines Advisory Committee, 2015). This points to an opportunity to promote lentils as a sustainable food source within future iterations of the Dietary Guidelines for Americans.

Lentil consumption in the USA is not part of the cultural history and is only a recently and sparsely adopted food source. In this study, consumption similarly remains low among informants. Portion size of lentils was not elucidated from the consumer survey, and the highest frequency of lentil consumption was "several times per week" reported by 12% of informants, while 28% of informants reported they eat lentils once a week. This leads to the conclusion that lentil consumption may remain below current dietary guidance of 100–300 g pulse per week per protein and vegetable food group, within this sample (United States Department of Health and Human Services and US Department of Agriculture, 2016). However, almost 60% of lentil consumers reported they increased their lentil consumption in the last 5 years, highlighting a trend in overall increased consumption. Producers can learn from consumers with respect to their purchasing habits and rationale for eating lentils. Likewise, consumers can learn from the production of lentil in relation to sustainability and make informed purchasing decisions.
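The shares of all informants quoted above can be cross-checked against the frequency groups reported earlier for consumers only. The sketch below rescales each consumer-group share by the consumer and total sample sizes given in the text. It assumes that "once a week" corresponds to the "medium" group and "several times per week" to the "high" group; that mapping, like the variable and function names, is an assumption made for illustration.

```python
N_CONSUMERS = 70     # informants who eat lentils (from the text)
N_INFORMANTS = 138   # all informants (from the text)

# Share of each frequency group among lentil consumers, as reported.
GROUP_SHARE_OF_CONSUMERS = {"low": 0.20, "medium": 0.56, "high": 0.24}


def share_of_all_informants(share_of_consumers: float) -> float:
    """Convert a share of consumers into a share of all informants."""
    return share_of_consumers * N_CONSUMERS / N_INFORMANTS


for group, share in GROUP_SHARE_OF_CONSUMERS.items():
    print(f"{group}: {share_of_all_informants(share):.0%}")
# low: 10%
# medium: 28%
# high: 12%
```

Under that mapping, the medium and high groups come out at roughly 28% and 12% of all informants, in line with the figures in the paragraph above.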

#### Future Directions

Future directions with respect to lentil production include understanding and contrasting the different perceptions among producers in other lentil producing regions. Future research could integrate producer perceptions among conventional, organic, and mixed management systems in Idaho (USA), North Dakota (USA), Washington (USA), and Canada to understand the system more broadly and elucidate geographic and cultural differences. For example, results from the Palouse region in Idaho and Washington could highlight similarities and differences in management practices and perceptions from producers who have been producing lentil more historically. Representation from producers in Canada could highlight both similarities and differences and potential barriers and/or opportunities in future lentil production in similar landscapes across national borders.

With respect to consumers, results presented here indicate there are differences in knowledge of lentils between lentil consumers and non-consumers, and highlight opportunities for future research on social aspects surrounding lentil consumption, and educational outreach efforts to increase lentil consumption. Expanding on this research to understand perceptions among consumers in rural and urban areas would be important to highlight and contrast barriers and opportunities for lentil consumption among other demographics. It would be important to understand the perspectives among vulnerable populations in contrast to higher-income consumer groups, and among consumers with various levels of education. If barriers for lentil consumption are highlighted more broadly, targeted efforts can be made to promote lentil consumption.

Due to the sustainability of lentil as a food system solution to promote environmental and human well-being, policy measures should be implemented that support lentil producers and consumers. For example, federal funds made available through the Farm Bill should be disbursed to incentivize best management practices producers already use, such as low-input, dryland, and diversified farms that include lentil. In addition, federal funds for research and development in value-added applications of lentil could enable an additional market demand for producers. Beyond canned soups and pre-packaged dahl, more recent lentil applications in value-added products include lentil flours which can be used in gluten-free baked goods such as cookies, crackers, chips, and breads and pasta (USA Dry Pea Lentil Council, 2016). In addition to incentivizing practices and supporting value-added applications of lentil, federal programs should support national food security and local farmers through the purchase of lentil to disburse in food programs such as Child and Adult Care Food Program, Federal Distribution Program on Indian Reservations, and the National School Lunch Program. Further, these federal food programs follow the advice from the Dietary Guidelines for Americans with respect to meals and menu-planning. Adding a sustainability dimension to future iterations of the Dietary Guidelines for Americans that promote increased consumption of plant-based protein and sustainable foods such as lentils, could create a platform for other areas of change. Bridging use of locally and/or nationally grown lentils within these food programs would create an additional market for producers while simultaneously supporting food security.

#### CONCLUSION

Lentil is a food system solution that requires few inputs, contributes to the livelihood of regional producers, and provides a relatively low-cost, high-quality plant-based protein source that supports multiple dimensions of sustainability through both production and consumption. As found by this study, management practices, market, and supporting plant-based diets are key components in the sustainability profile of lentil on the side of production. On the side of consumption, consumer willingness to increase lentil intake based on environmental, socio-economic, and nutrition information could be a key component to increase market demand.

Due to the recent and less developed culture of eating lentils in the USA, policy actions should support and incentivize lentil production and support increased consumption through national dietary guidance and through federal food programs that serve vulnerable populations. Utilizing lentils from local and/or national sources simultaneously supports multiple dimensions of sustainability while promoting food security.

#### DATA AVAILABILITY STATEMENT

The datasets generated for this study are available on request to the corresponding author.

#### ETHICS STATEMENT

The studies involving human participants were reviewed and approved by Institutional Review Board (IRB) at Montana State University. The patients/participants provided their written informed consent to participate in this study.

#### AUTHOR CONTRIBUTIONS

TW, SA, CB, and PM contributed to the conception, design of the study, and wrote sections of the manuscript. TW performed the statistical analysis, created figures, and wrote the first draft of the manuscript. All authors contributed to manuscript revision and read and approved the submitted version.

#### REFERENCES


#### FUNDING

WAFERx was supported by the National Science Foundation under the EPSCoR Track II Cooperative Agreement No. OIA-1632810. Any opinions, findings, conclusions, or recommendations are those of the author(s) and do not necessarily reflect the views of the National Science Foundation.

#### ACKNOWLEDGMENTS

We are thankful to Montana Grain Growers Association, Northern Pulse Growers Association, Montana Organic Association, USA Dry Pea and Lentil Council, Timeless Foods, Joseph Kibowat, Doug Crabtree and Anna Jones-Crabtree, and Mark Van Dyke.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fsufs.2019.00088/full#supplementary-material



**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Warne, Ahmed, Byker Shanks and Miller. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Evaluation and Identification of Promising Introgression Lines Derived From Wild *Cajanus* Species for Broadening the Genetic Base of Cultivated Pigeonpea [*Cajanus cajan* (L.) Millsp.]

#### *Shivali Sharma<sup>1\*</sup>, Pronob J. Paul<sup>1</sup>, C.V. Sameer Kumar<sup>2</sup>, P. Jaganmohan Rao<sup>3</sup>, L. Prashanti<sup>4</sup>, S. Muniswamy<sup>5</sup> and Mamta Sharma<sup>6</sup>*

#### *Edited by:*

*Jose C. Jimenez-Lopez, Consejo Superior de Investigaciones Científicas (CSIC) Granada, Spain*

#### *Reviewed by:*

*Brigitte Uwimana, International Institute of Tropical Agriculture (IITA), Uganda; Subhojit Datta, Central Research Institute for Jute and Allied Fibres, India; Federico Martin Ribalta, University of Western Australia, Australia*

> *\*Correspondence: Shivali Sharma shivali.sharma@cgiar.org*

#### *Specialty section:*

*This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science*

*Received: 30 April 2019 Accepted: 11 September 2019 Published: 22 October 2019*

#### *Citation:*

*Sharma S, Paul PJ, Kumar CVS, Rao PJ, Prashanti L, Muniswamy S and Sharma M (2019) Evaluation and Identification of Promising Introgression Lines Derived From Wild Cajanus Species for Broadening the Genetic Base of Cultivated Pigeonpea [Cajanus cajan (L.) Millsp.]. Front. Plant Sci. 10:1269. doi: 10.3389/fpls.2019.01269*

*<sup>1</sup> Theme Pre-breeding, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India, <sup>2</sup> Regional Agricultural Research Station, Professor Jayashankar Telangana State Agricultural University, Palem, India, <sup>3</sup> Regional Agricultural Research Station, Professor Jayashankar Telangana State Agricultural University, Warangal, India, <sup>4</sup> Regional Agricultural Research Station, Acharya N. G. Ranga Agricultural University, Tirupati, India, <sup>5</sup> Regional Agricultural Research Station, University of Agricultural Sciences, Kalaburagi, India, <sup>6</sup> Legume Pathology, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India*

Pigeonpea [*Cajanus cajan* (L.) Millsp.], a multipurpose and nutritious grain legume crop, is cultivated for its protein-rich seeds mainly in South Asia and Eastern and Southern Africa. Despite large breeding efforts for pigeonpea improvement in India and elsewhere, genetic enhancement has been inadequate, largely because of the crop's narrow genetic base and its susceptibility to stresses. Wild *Cajanus* species are a novel source of genetic variation for the genetic improvement of pigeonpea cultivars. In the present study, 75 introgression lines (ILs), derived from crosses involving cultivated pigeonpea variety ICPL 87119 and wild *Cajanus cajanifolius* and *Cajanus acutifolius* from the secondary gene pool, were evaluated for yield and yield-attributing traits in diverse environments across locations and years. Restricted maximum likelihood (REML) analysis revealed large genetic variation for days to 50% flower, days to maturity, plant height, primary branches per plant, pods per plant, pod weight per plant, 100-seed weight, and grain yield per plant. Superior ILs with mid-early to medium maturity duration identified in this study are useful genetic resources for pigeonpea breeding. Additive main effects and multiplicative interaction (AMMI) analysis revealed a large influence of environment and genotype × environment interaction on variation in yield. A few lines with yield stability, such as ICPL 15023 and ICPL 15072, were identified, while a number of lines were completely resistant (0%) to sterility mosaic disease and/or *Fusarium* wilt. These lines are novel genetic resources for broadening the genetic base of pigeonpea and bringing yield stability and stress tolerance. High-yielding lines ICPL 15010, ICPL 15062, and ICPL 15072 have been included in the initial varietal trials (IVTs) of the All India Coordinated Research Project (AICRP) on pigeonpea for wider evaluation across different agro-ecological zones in India for possible release as variety(ies).

Keywords: pre-breeding, introgression lines, *Cajanus cajanifolius*, *Cajanus acutifolius*, AMMI, wild *Cajanus* species, pigeonpea

# INTRODUCTION

Pigeonpea [*Cajanus cajan* (L.) Millsp.], originating in India, is the sixth most important grain legume crop of the tropics and subtropics and is grown for multiple uses. It is an often cross-pollinated diploid (2*n* = 2*x* = 22) crop. Globally, 6.81 million t of pigeonpea grain was produced from 7.02 million ha with an average productivity of 0.97 t ha<sup>−1</sup> (FAOSTAT, 2017). Although its presence has been noted in many countries, India and Myanmar in South Asia and Kenya, Tanzania, Malawi, Uganda, and Mozambique in Eastern and Southern Africa are the major pigeonpea-producing countries (FAOSTAT, 2017). India contributed about 72% of global pigeonpea production. Disproportionate yield gaps were noted between potential (2.5–3.0 t ha<sup>−1</sup>) and average (~0.9 t ha<sup>−1</sup>) yields in India (Bhatia et al., 2006). The average yield in India has remained around 0.9 t ha<sup>−1</sup> for the past six decades (FAOSTAT, 2017). This yield gap is mainly due to the exposure of the crop to biotic stresses such as *Fusarium* wilt (FW; caused by *Fusarium udum* Butler), sterility mosaic disease (SMD; caused by pigeonpea sterility mosaic virus transmitted by the eriophyid mite, *Aceria cajani* Channabasavanna), phytophthora blight (*Phytophthora drechsleri* Tucker f. sp. *cajani*), pod borer (*Helicoverpa* sp.), and pod fly (*Melanagromyza obtusa*) and to abiotic stresses such as waterlogging, salinity, and frost/cold, as well as to its cultivation in marginal environments with limited inputs (Sharma and Upadhyaya, 2016).

As in other legumes, domestication bottlenecks also contributed to the narrow genetic base in pigeonpea (Kassa et al., 2012). Breeders often use their own working collection consisting of elite breeding and some germplasm lines as parents in crossing. This results in recirculating the same germplasm, leading to the narrow genetic base of the released cultivars. In pigeonpea, T-1 and T-90 were the germplasm most frequently used as parents in breeding programs in India (Kumar et al., 2004). A polymorphism survey of a set of *Cajanus* accessions has also indicated the lack of genetic diversity within the cultivated gene pool (Kumar et al., 2018). Furthermore, natural defense mechanisms in improved cultivars have been lost during intense selection for high yield, which may result in the genetic vulnerability of crop cultivars to a number of biotic and abiotic stresses (Tanksley and McCouch, 1997). Overall, the narrow genetic base of pigeonpea cultivars and the lack of high levels of resistance/tolerance to important biotic and abiotic stresses in the cultivated gene pool and/or breeders' working collections hinder its genetic improvement and result in low genetic gains.

Wild *Cajanus* species are the reservoir of many useful genes and hold great potential for pigeonpea improvement. The ICRISAT genebank has the global responsibility of collecting, conserving, and distributing pigeonpea germplasm comprising landraces, modern cultivars, genetic stocks, mutants, and wild *Cajanus* species. It holds over 13,200 accessions of cultivated pigeonpea and 555 accessions belonging to 66 species of six genera in genus *Cajanus* from 74 countries (Upadhyaya et al., 2013). This germplasm collection based on the crossability relationship between cultivated and wild pigeonpea has been grouped into three gene pools (Sharma, 2017) (**Table 1**).

TABLE 1 | Pigeonpea gene pool classification.


Multiple sources of resistance/tolerance to stress have been reported among wild *Cajanus* species—SMD (Kulkarni et al., 2003; Rao et al., 2003); phytophthora blight (Rao et al., 2003); alternaria blight (*Alternaria tenuissima*; Sharma et al., 1987); pod borer (*Helicoverpa armigera*) (Rao et al., 2003; Sujana et al., 2008; Sharma et al., 2009); pod fly (Saxena et al., 1990; Rao et al., 2003); root-knot nematodes (*Meloidogyne* spp.; Sharma et al., 1993a; Sharma et al., 1993b; Sharma et al., 1994; Rao et al., 2003); salinity (Subbarao, 1988; Subbarao et al., 1991; Rao et al., 2003; Srivastava et al., 2006); and drought (Rao et al., 2003). Pigeonpea, by nature, is a photosensitive crop. A few wild *Cajanus* accessions, however, were reported as insensitive to photoperiod (Rao et al., 2003).

Cultivated pigeonpea is believed to have originated in India (Vavilov, 1951; van der Maesen, 1980). In this study, two wild *Cajanus* species from the secondary gene pool, *Cajanus acutifolius* and *Cajanus cajanifolius*, belonging to different geographic origins, were crossed with a popular pigeonpea variety, ICPL 87119 (also known as 'Asha'), to generate interspecific populations following the advanced backcross approach. *C. acutifolius* accession ICPW 12 (syn. ICP 15613) is a native of Australia and reported to be resistant to *H. armigera* (Sujana et al., 2008), whereas *C. cajanifolius* accession ICPW 29 (syn. ICP 15630) is of Indian origin and the progenitor of cultivated pigeonpea (van der Maesen, 1980). The main aims of this investigation were to (a) create new genetic variability with minimum linkage drag by utilizing two wild *Cajanus* species of different geographic origins as donors and popular pigeonpea cultivars as recipients following the advanced backcross approach and (b) identify promising introgression lines (ILs) having good agronomic performance and disease resistance for ready use in pigeonpea breeding programs. These promising ILs will enrich variability in the primary gene pool, and their utilization in breeding programs will assist in developing new climate-resilient cultivars with a broad genetic base, which in turn will enhance the genetic gains in pigeonpea.

# MATERIALS AND METHODS

#### Development of Pre-Breeding Populations

Using two wild *Cajanus* accessions, ICPW 12 (*C. acutifolius*) and ICPW 29 (*C. cajanifolius*), natives of Australia and India, respectively, and the popular pigeonpea cultivar ICPL 87119, two pre-breeding populations were developed at ICRISAT, Patancheru, India. ICPL 87119 (Asha) is a leading medium-duration variety widely cultivated in India (Jain et al., 1995), while ICPW 12 and ICPW 29 were reported to have high levels of resistance against pod borer (Sujana et al., 2008).

ICPL 87119 was used as the female parent, whereas wild species accessions were used as the male parent to generate F1 hybrids. In each cross, true F1s were selected based on leaf, flower, and pod morphology and subsequently backcrossed with ICPL 87119 to produce BC1F1 seeds. Similarly, true BC1F1 plants in both crosses were identified based on morphological traits, and the confirmed BC1F1 plants were used for the second backcross with ICPL 87119 to produce BC2F1 seeds. True BC2F1 plants were selfed twice to produce BC2F3 populations that were subsequently advanced to produce stable ILs, 149 in ICPL 87119 × ICPW 12 (designated as Pop I) and 183 in ICPL 87119 × ICPW 29 (designated as Pop II). Considerable variability for plant type and morpho-agronomic traits was observed between and within lines in both populations. In the first round of selection, stable, non-segregating lines with good agronomic performance, high seed yield and 100-seed weight, and maturity ranging from mid-early (140–160 days) to medium (161–180 days to maturity) were selected. Overall, 30 stable ILs (12 ILs from Pop I and 18 ILs from Pop II) were selected to assess their agronomic performance across four locations during the 2016 rainy season in India (**Table 2**).
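The backcrossing scheme above can be related to linkage drag with a short calculation. The sketch below is not part of the authors' analysis; it uses the standard textbook expectation that, without selection, each backcross halves the average donor genome share, and the function name `expected_recurrent_fraction` is a hypothetical label.

```python
from fractions import Fraction


def expected_recurrent_fraction(n_backcrosses: int) -> Fraction:
    """Expected average share of the recurrent-parent genome after
    n backcrosses, assuming no selection (textbook expectation)."""
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses cannot be negative")
    # The F1 (zero backcrosses) carries 1/2 of each parent; every
    # backcross to the recurrent parent halves the donor share.
    return 1 - Fraction(1, 2) ** (n_backcrosses + 1)


for generation, n in [("F1", 0), ("BC1", 1), ("BC2", 2)]:
    print(generation, expected_recurrent_fraction(n))
```

Under this assumption, BC2-derived lines are expected to carry on average 7/8 of the ICPL 87119 genome; selfing the BC2F1 plants fixes segregating regions but does not change this expected average.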

The second round of selection was made to exploit within-line variability in the remaining lines in both populations. The selection was made in two stages. At the first stage, almost stable lines showing some segregation and overall good agronomic performance were selected. At the second stage, single plants were selected from each of the selected lines in both populations based on visual observations and overall plant aspect score. Overall, 16 single plants from 16 selected lines in Pop I and 29 single plants from 29 selected lines in Pop II were selected for evaluation over years at ICRISAT, Patancheru (**Table 2**).

#### Evaluation of Promising ILs for Yield-Related Traits

For precise phenotyping with minimum microenvironment errors across locations, two multilocation evaluation trials (designated as "MET") were constituted using 30 stable ILs. For this, the 30 ILs were randomly divided into two sets: set I with 15 ILs (five from Pop I and 10 from Pop II) was evaluated in MET 01, and set II with the remaining 15 ILs (seven ILs from Pop I and eight ILs from Pop II) was evaluated in MET 02. Both MET 01 and MET 02 trials were conducted under rainfed conditions across four locations, Patancheru, Kalaburagi, Tirupati, and Warangal, during the 2016 rainy season. These locations were selected based on the high importance of the pigeonpea crop in these areas, especially under rainfed conditions (**Table S1**). Both MET 01 and MET 02 were conducted in "Vertisols" at Patancheru, Kalaburagi, and Warangal and in "Alfisols" at Tirupati. Two popular pigeonpea varieties, ICPL 87119 (Jain et al., 1995) and ICP 8863, also known as 'Maruti' (ICRISAT, 1993), were used as checks in each trial.

Using 16 single plant selections (SPSs) from Pop I and 29 SPSs from Pop II, two trials, designated as "Trial 03" and "Trial 04," respectively, were conducted at ICRISAT, Patancheru, for the evaluation of yield-related traits during the 2016 and 2017 rainy seasons. In both Trial 03 and Trial 04, three checks, ICPL 87119, ICP 8863, and ICP 85010, were included in the evaluation studies.

Each trial across all locations/seasons was conducted in a randomized block design with three replications. Plot size was a four-row plot of 4-m length with 1.2-m row-to-row spacing in MET 01 and MET 02 and a four-row plot of 4-m length with 75-cm spacing in Trial 03 and Trial 04. Manual weeding and spraying of insecticide were done to control weeds and insect-pest damage. All other recommended agronomic practices were followed for raising a healthy crop.

Data were recorded on days to 50% flower, days to maturity, plant height (cm), 100-seed weight (g), and grain yield per plant (g) in MET 01 and MET 02 at each location. In Trial 03 and Trial 04, data were recorded on days to first flower, days to 50% flower, plant height (cm), primary branches per plant, pods per plant, pod weight per plant (g), 100-seed weight (g), and grain yield per plant (g). Data on days to first flower, days to 50% flower, and days to maturity were recorded on a plot basis, whereas plant height, primary branches, pods per plant, pod weight per plant, 100-seed weight, and grain yield per plant were recorded on five randomly selected representative plants per plot following pigeonpea descriptors (IBPGR and ICRISAT, 1993).

#### Screening for FW and SMD Resistance

A total of 45 ILs (16 ILs from Trial 03 and 29 ILs from Trial 04) were screened for FW and SMD in the sick plot under artificial epiphytotic conditions during the 2017 rainy season at ICRISAT, Patancheru, and 32 promising resistant ILs (12 ILs from Trial 03 and 20 ILs from Trial 04) were further evaluated during the 2018 rainy season to confirm resistance. For FW screening, chopped wilted pigeonpea was incorporated in the sick plot to maintain a threshold level of *F. udum*, the wilt pathogen. ICP 2376, a highly wilt-susceptible cultivar, was planted after every five rows to serve as an indicator/infector row. For SMD screening, SMD-infested leaves (Patancheru isolate) were used to inoculate every plant of the ILs at the two-leaf stage following the leaf staple technique (Nene et al., 1981). To provide a good source of virus inoculum, a highly susceptible cultivar, ICP 8863, was planted one month in advance of the regular planting after every five rows of test entries to serve as an indicator/infector row. Special care was taken to plant the test ILs and the susceptible cultivar along the wind direction to facilitate virus transmission through mites. The percent disease incidence was calculated using the formula: Percent disease incidence = (no. of plants infected in a row/total no. of plants in a row) × 100. Based on the disease incidence, ILs were categorized as resistant (0–10% disease incidence), moderately resistant (10.1–20%), moderately susceptible (20.1–40%), and susceptible (>40%) (Pande et al., 2012).
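The incidence formula and the four resistance classes can be illustrated with a minimal sketch. The function names and the row counts below are hypothetical and are not taken from the study; values falling between the stated class limits (for example 10.05%) are assigned to the lower class here, which is an assumption.

```python
def percent_disease_incidence(infected: int, total: int) -> float:
    """Percent disease incidence = (infected plants / total plants) x 100."""
    if total <= 0:
        raise ValueError("total number of plants must be positive")
    if not 0 <= infected <= total:
        raise ValueError("infected must lie between 0 and total")
    return 100.0 * infected / total


def resistance_class(incidence: float) -> str:
    """Map incidence (%) to the categories quoted from Pande et al. (2012)."""
    if incidence <= 10.0:
        return "resistant"
    if incidence <= 20.0:
        return "moderately resistant"
    if incidence <= 40.0:
        return "moderately susceptible"
    return "susceptible"


# Hypothetical rows: (label, infected plants, total plants in the row)
rows = [("line A", 0, 25), ("line B", 4, 25), ("line C", 12, 25)]
for label, infected, total in rows:
    pdi = percent_disease_incidence(infected, total)
    print(label, pdi, resistance_class(pdi))
```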

#### Statistical Analysis

Replicate-wise data on five agronomic traits in MET 01 and MET 02 and eight agronomic traits in Trial 03 and Trial 04 were analyzed using restricted maximum likelihood (REML) methods for each location considering genotypes as a random effect and replications as a fixed effect in the mixed-model procedure (Patterson and Thompson, 1971). Variance components due to genotypes (σ<sub>g</sub><sup>2</sup>) and their standard errors were determined. Environment-wise best linear unbiased predictors (BLUPs) were calculated for each genotype in each trial. The significance of variance components was tested using respective standard errors. Heritability (*H*<sup>2</sup>, broad sense) in an individual environment was estimated from the following formula:

$$H^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2 / r}$$

where σ<sub>g</sub><sup>2</sup> is the variance component due to genotypes, σ<sub>e</sub><sup>2</sup> is the variance component due to error, and *r* is the number of replications.
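As an illustration of the heritability formula, the sketch below estimates the two variance components from a balanced genotype × replication table and plugs them into *H*<sup>2</sup>. It deliberately swaps in a simple ANOVA method-of-moments estimator instead of REML (the study used REML in Genstat), and the data and function names are hypothetical.

```python
import numpy as np


def anova_variance_components(y):
    """Method-of-moments variance components for a balanced randomized
    block layout; y has one row per genotype and one column per replication.
    Returns (var_g, var_e). This is a stand-in for REML, not REML itself."""
    y = np.asarray(y, dtype=float)
    n_gen, n_rep = y.shape
    grand = y.mean()
    ss_gen = n_rep * ((y.mean(axis=1) - grand) ** 2).sum()
    ss_rep = n_gen * ((y.mean(axis=0) - grand) ** 2).sum()
    ss_err = ((y - grand) ** 2).sum() - ss_gen - ss_rep
    ms_gen = ss_gen / (n_gen - 1)
    ms_err = ss_err / ((n_gen - 1) * (n_rep - 1))
    var_g = max((ms_gen - ms_err) / n_rep, 0.0)  # truncate negatives at zero
    return var_g, ms_err


def broad_sense_heritability(var_g, var_e, r):
    """H^2 = var_g / (var_g + var_e / r), as in the formula above."""
    return var_g / (var_g + var_e / r)


# Hypothetical grain yield per plant (g): 4 genotypes x 3 replications
plots = [[50.0, 54.0, 52.0], [61.0, 63.0, 65.0],
         [45.0, 44.0, 49.0], [58.0, 55.0, 60.0]]
var_g, var_e = anova_variance_components(plots)
print(round(broad_sense_heritability(var_g, var_e, r=3), 3))
```

For balanced data with non-negative estimates, the method-of-moments and REML variance components generally coincide, which is why this shortcut is adequate for illustrating the formula.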

A phenotypic distance matrix was created by calculating the differences between each pair of entries for each trait. The diversity index was calculated by averaging the differences in the phenotypic values for each trait divided by the respective range (Johns et al., 1997). The mean diversity, minimum diversity, and maximum diversity were calculated, and the accessions showing the minimum diversity and maximum diversity were identified in each trial.
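A minimal sketch of this diversity index follows. It assumes that "differences" means absolute differences and that each trait is scaled by its observed range across entries; the trait values and function names are hypothetical.

```python
import numpy as np


def diversity_matrix(traits):
    """Pairwise phenotypic diversity: mean over traits of the absolute
    difference between two entries divided by that trait's range.
    traits has one row per entry and one column per trait."""
    traits = np.asarray(traits, dtype=float)
    trait_range = traits.max(axis=0) - traits.min(axis=0)
    if np.any(trait_range == 0):
        raise ValueError("every trait needs a non-zero range")
    scaled = np.abs(traits[:, None, :] - traits[None, :, :]) / trait_range
    return scaled.mean(axis=2)


def summarize(d):
    """Mean diversity plus the least and most diverse pairs of entries."""
    rows, cols = np.triu_indices(d.shape[0], k=1)
    values = d[rows, cols]
    lo, hi = values.argmin(), values.argmax()
    return {
        "mean": float(values.mean()),
        "min": (int(rows[lo]), int(cols[lo]), float(values[lo])),
        "max": (int(rows[hi]), int(cols[hi]), float(values[hi])),
    }


# Hypothetical entries x traits table (days to 50% flower, 100-seed weight)
data = [[100.0, 10.0], [110.0, 12.0], [120.0, 14.0]]
d = diversity_matrix(data)
summary = summarize(d)
print(summary)
```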

To study the adaptability and yield stability of the ILs across different locations, additive main effects and multiplicative interaction (AMMI) analysis was performed (Gauch, 1992). The AMMI model first separates the additive variance (genotype and environment main effects) from the multiplicative variance (interaction) and then applies principal component analysis (PCA) to the interaction, as detailed here:

$$Y_{ij} = \mu + g_i + e_j + \sum_{n=1}^{N} \tau_n \gamma_{in} \delta_{jn} + \varepsilon_{ij}$$

where *Y<sub>ij</sub>* is the yield of the *i*th genotype (*i* = 1, …, *L*) in the *j*th environment (*j* = 1, …, *J*); µ is the grand mean; *g<sub>i</sub>* and *e<sub>j</sub>* are the genotype and environment deviations from the grand mean, respectively; τ<sub>*n*</sub> is the eigenvalue of the PCA axis *n*; γ<sub>*in*</sub> and δ<sub>*jn*</sub> are the genotype and environment principal component (PC) scores for axis *n*; *N* is the number of PCs retained in the model; and ε<sub>*ij*</sub> is the error term.
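The structure of the AMMI model can be sketched with a singular value decomposition of the interaction residuals. This is an illustrative implementation, not the Genstat procedure used in the study; the yield table and the name `ammi_decompose` are hypothetical. In this formulation the weight on axis *n* is the singular value, and its square is the interaction sum of squares captured by that axis for a table of means.

```python
import numpy as np


def ammi_decompose(y, n_axes=2):
    """Fit an AMMI model to a genotype x environment table of mean yields.
    Returns the grand mean, genotype and environment deviations, singular
    values, genotype and environment axis scores, and the fitted table
    built from the first n_axes interaction axes."""
    y = np.asarray(y, dtype=float)
    mu = y.mean()
    g = y.mean(axis=1) - mu  # genotype deviations g_i
    e = y.mean(axis=0) - mu  # environment deviations e_j
    # Interaction residuals left after removing the additive part.
    resid = y - mu - g[:, None] - e[None, :]
    # SVD: resid = sum over n of s_n * gamma_in * delta_jn
    gamma, s, delta_t = np.linalg.svd(resid, full_matrices=False)
    delta = delta_t.T
    k = n_axes
    interaction = (gamma[:, :k] * s[:k]) @ delta[:, :k].T
    fitted = mu + g[:, None] + e[None, :] + interaction
    return mu, g, e, s, gamma, delta, fitted


# Hypothetical mean yields (g per plant): 4 genotypes x 3 environments
yields = np.array([
    [52.0, 61.0, 40.0],
    [48.0, 70.0, 45.0],
    [55.0, 58.0, 50.0],
    [60.0, 66.0, 38.0],
])
mu, g, e, s, gamma, delta, fitted = ammi_decompose(yields, n_axes=2)
share = s ** 2 / np.sum(s ** 2)  # share of interaction SS per axis
print("grand mean:", round(float(mu), 2))
print("interaction SS share per axis:", np.round(share, 3))
```

With three environments the doubly centered residuals have at most two non-zero axes, so retaining two axes reproduces the table exactly; with more environments the discarded axes form the residual term.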

AMMI stability value (ASV) was calculated for each IL according to the relative contribution of the PC axis scores (IPCA1 and IPCA2) to the interaction sum of squares (SS).

The ASV was estimated as described by Purchase et al. (2000):

$$ASV = \sqrt{\left[\frac{IPCA1_{\text{Sum of squares}}}{IPCA2_{\text{Sum of squares}}} \left(IPCA1_{\text{score}}\right)\right]^2 + \left(IPCA2_{\text{score}}\right)^2}$$

where *IPCA*1<sub>Sum of squares</sub>/*IPCA*2<sub>Sum of squares</sub> is the weight derived from dividing the IPCA1 SS [from the AMMI analysis of variance (ANOVA) table] by the IPCA2 SS. The larger the IPCA score is, either negative or positive, the more adapted a genotype is to a certain environment. Conversely, smaller ASV scores indicate a more stable genotype across environments.
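The ASV formula translates directly into a few lines of code. The sketch below uses hypothetical IPCA scores; because only the ratio of the two sums of squares enters the formula, the shares of interaction variability reported in the Results for MET 01 (48.2% and 34.8%) give the same weight as the raw sums of squares would.

```python
import math


def ammi_stability_value(ipca1_score, ipca2_score, ss_ipca1, ss_ipca2):
    """ASV after Purchase et al. (2000): the IPCA1 score is weighted by the
    ratio of the IPCA1 to IPCA2 sums of squares before combining."""
    weight = ss_ipca1 / ss_ipca2
    return math.sqrt((weight * ipca1_score) ** 2 + ipca2_score ** 2)


# Hypothetical scores for two lines; weights from the MET 01 shares.
for label, s1, s2 in [("line A", 0.3, -0.4), ("line B", -2.1, 1.2)]:
    print(label, round(ammi_stability_value(s1, s2, 48.2, 34.8), 2))
```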

Genotype selection index (GSI) was estimated (Farshadfar, 2008) using the sum of the ranking based on yield and ranking based on the ASV as

$$\text{GSI} = \text{RASV} + \text{RY}$$

where *RASV* is the rank of the genotypes based on the ASV and *RY* is the rank of the genotypes based on yield across environments.
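A short sketch of the GSI follows, assuming that rank 1 goes to the lowest ASV and to the highest yield, so that a low GSI marks a line that is both stable and high yielding. The values are hypothetical, and ties are broken by input order, which is a simplification.

```python
def rank(values, descending=False):
    """Return 1-based ranks; ties are broken by input order."""
    order = sorted(range(len(values)), key=lambda i: values[i],
                   reverse=descending)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def genotype_selection_index(asv, grain_yield):
    """GSI = RASV + RY, with rank 1 for the lowest ASV and highest yield."""
    rasv = rank(asv)                         # smaller ASV is more stable
    ry = rank(grain_yield, descending=True)  # larger yield is better
    return [a + b for a, b in zip(rasv, ry)]


# Hypothetical ASV and mean grain yield (g per plant) for three lines
asv = [0.4, 5.0, 1.5]
grain_yield = [60.0, 75.0, 50.0]
print(genotype_selection_index(asv, grain_yield))
```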

All analyses were performed in Genstat 19 (VSN International, Hemel Hempstead, UK, web page: genstat.co.uk).

# RESULTS

#### Variance Components, Trait Variability, and Heritability

REML analysis showed that variances due to genotypes (σ<sub>g</sub><sup>2</sup>) were significant for days to 50% flower, days to maturity, plant height, 100-seed weight, and grain yield per plant across four locations in both MET 01 and MET 02, indicating the presence of significant variability among genotypes (**Table 3**). In Trial 03 (**Table 4**) and Trial 04 (**Table 5**) also, significant variability was observed among genotypes for days to first flower, days to 50% flower, plant height, primary branches per plant, pods per plant, 100-seed weight, pod weight per plant, and grain yield per plant in 2016 and 2017 at ICRISAT, Patancheru.

Large variation in ranges and means was noted within individual locations as well as across locations (**Table 3**). The Newman–Keuls test of significance for mean values showed significant differences in the performance of genotypes across four locations for most of the traits in both MET 01 and MET 02. ILs in MET 01 flowered and matured significantly earlier at Patancheru, were taller at Kalaburagi and Warangal, but produced maximum grain yield at Kalaburagi (**Table 3**). In MET 02 also, ILs flowered and matured early at Patancheru, were significantly taller at Warangal, and produced higher grain yield at Kalaburagi (**Table 3**).

Significant differences in mean performance were also noted for most traits in Trial 03 and 04 at Patancheru. In both trials, the ILs flowered early in 2016, were taller in 2017, had more primary branches and pods per plant in 2016, but had higher grain yield in 2017 (**Tables 4** and **5**).

High heritability (*H*<sup>2</sup> > 70%) was recorded for most of the traits in MET 01 and MET 02 (**Table 3**) as well as in Trial 03 (**Table 4**) and Trial 04 (**Table 5**).

#### Phenotypic Diversity and Identification of Promising High-Yielding ILs

The mean phenotypic diversity index across four locations varied from 0.125 (Patancheru) to 0.149 (Tirupati) in MET 01 and from 0.138 (Kalaburagi) to 0.185 (Warangal) in MET 02. In Trial 03, the mean phenotypic diversity index was 0.1059 in 2016 and 0.1143 in 2017, and in Trial 04, it was 0.1164 in 2016 and 0.0637 in 2017. The maximum diversity was observed between ICPL 15065 and ICP 8863 at Patancheru and Kalaburagi, between ICPL 15060 and ICPL 15007 at Warangal, and between ICPL 87119 and ICPL 15006 at Tirupati in MET 01 (**Table S2a**). Similarly, in MET 02, the maximum diversity was observed between ICP 8863 and ICPL 15040 at Patancheru and between ICP 8863 and ICPL 15079 at Kalaburagi (**Table S2a**). Lines showing maximum diversity were also identified in Trial 03 and Trial 04 (**Table S2b**).

ICPL 15065 was the most diverse accession across the three locations, Patancheru, Kalaburagi, and Tirupati, whereas ICPL 15010 was similar to ICPL 87119 across most locations in MET 01 (**Tables S2a** and **S2c**). In MET 02, ICPL 15014 was the most diverse accession across two locations (Patancheru and Warangal). Similarly, ICPIL 17155 and ICPIL 17156 were the most diverse accessions in Trial 03, and ICPIL 17167 was the most diverse in Trial 04 in 2016 (**Table S2d**). Three lines with maximum diversity–similarity with ICPL 87119 were also identified (**Table S3**).

TABLE 3 | Variance components due to genotypes (σ<sub>g</sub><sup>2</sup>), mean, range, and heritability (*H*<sup>2</sup>) of agronomic traits across locations of MET 01 and MET 02 during the 2016 rainy season.

*\*Significant at P ≤ 0.05. †Mean values were tested using Newman–Keuls test, and means with different letters are significantly different at P ≤ 0.05.*

TABLE 4 | Variance components due to genotypes (σ<sub>g</sub><sup>2</sup>), mean, range, and heritability (*H*<sup>2</sup>) of agronomic traits in Trial 03 at ICRISAT Patancheru during the 2016 and 2017 rainy seasons.

*#DF, days to first flower; DF50, days to 50% flower; PH, plant height; NPB, number of primary branches; PPP, pods per plant; PWPP, pod weight per plant; HSW, 100-seed weight; GYPP, grain yield per plant.*

*\*Significant at P ≤ 0.05. †Mean values were tested using Newman–Keuls test, and means with different letters are significantly different at P ≤ 0.05.*

TABLE 5 | Variance components due to genotypes (σ<sub>g</sub><sup>2</sup>), mean, range, and heritability (*H*<sup>2</sup>) of agronomic traits in Trial 04 at ICRISAT Patancheru during the 2016 and 2017 rainy seasons.

*#DF, days to first flower; DF50, days to 50% flower; PH, plant height; NPB, number of primary branches; PPP, pods per plant; PWPP, pod weight per plant; HSW, 100-seed weight; GYPP, grain yield per plant.*

*\*Significant at P ≤ 0.05. †Mean values were tested using Newman–Keuls test, and means with different letters are significantly different at P ≤ 0.05.*

ILs, in general, flowered earlier than or at par with the popular high-yielding pigeonpea variety ICPL 87119 (Asha) across all four trials. Promising high-yielding ILs were identified (**Table 6**). Most of the ILs in MET 01 across four locations yielded at par with ICPL 87119. However, nine lines at Kalaburagi (20% to 62% yield superiority over ICPL 87119) and two each at Tirupati (65% and 69% yield superiority) and Warangal (25% and 37% yield superiority) were significantly higher yielding than ICPL 87119. Of these, six ILs at Kalaburagi, one IL at Tirupati, and one at Warangal also matured significantly earlier than ICPL 87119 and had a 100-seed weight ranging from 9.5 to 10.5 g (**Table S4a**). ICPL 15085 produced a significantly higher grain yield at Kalaburagi (over 20% yield superiority), Tirupati (over 65% yield superiority), and Warangal (over 25% yield superiority) and was similar to ICPL 87119 at Patancheru. This IL had a 100-seed weight ranging from 9.0 to 10.7 g across four locations (**Table S4a**). ICPL 15019 was found to be significantly higher yielding at Warangal (~37% yield superiority) and Kalaburagi (over 35% yield superiority). Similarly, ICPL 15062 exhibited significantly higher grain yield than ICPL 87119 at Kalaburagi (~30% yield superiority) and Tirupati (over 45% yield superiority). ICPL 15065 combined high grain yield and the highest 100-seed weight (12.5 to 13.5 g) across four locations (**Table S4a**).

In MET 02, 14 ILs at Kalaburagi (~22–71% yield superiority), eight each at Patancheru (~19–45% yield superiority) and Tirupati (~41–75% yield superiority), and three at Warangal (~21–32% yield superiority) significantly out-yielded ICPL 87119 (**Table 6**). On average, ICPL 15072 across four locations and ICPL 15077, ICPL 15014, ICPL 15021, and ICPL 15030 across three locations (Patancheru, Kalaburagi, and Tirupati) out-yielded ICPL 87119 by ~50% and ~45%, respectively (**Table S4b**).
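The yield superiority figures quoted here are percentages relative to the check. The text does not spell out the formula, so the sketch below assumes the conventional percent increase over the check mean; the yields used are hypothetical and the function name is illustrative.

```python
def yield_superiority(il_yield: float, check_yield: float) -> float:
    """Percent yield advantage of an introgression line over the check,
    assuming the conventional (IL - check) / check x 100 definition."""
    if check_yield <= 0:
        raise ValueError("check yield must be positive")
    return 100.0 * (il_yield - check_yield) / check_yield


# Hypothetical grain yield per plant (g): check at 40 g, line at 60 g
print(yield_superiority(60.0, 40.0))
```

Under this definition, a line yielding 60 g per plant against a check at 40 g per plant has 50% yield superiority.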

In Trial 03, the grain yield of most of the ILs was similar to that of ICPL 87119 (**Tables 7** and **S5a**). In Trial 04, nine ILs in the 2016 rainy season and only one IL, ICPL 17149, in the 2017 rainy season were significantly better than ICPL 87119 for grain yield per plant (**Tables 7** and **S5b**). Overall, six ILs (ICPIL # 17165, 17167, 17168, 17169, 17178, and 17188) produced more pods and higher pod weight than ICPL 87119. Based on consistent performance in 2016 and 2017, ICPIL 17165 and ICPIL 17167 were found promising for higher grain yield, pod numbers, and pod weight, with days to 50% flower at par with ICPL 87119 (**Table S5b**).

TABLE 7 | Promising high-yielding lines identified in Trial 03 and Trial 04 during the 2016 and 2017 rainy seasons at ICRISAT, Patancheru.

*Bold emphasis indicates significantly better lines at the 0.5% level of significance.*

#### FW and SMD Resistance

Fourteen ILs in Trial 03 were resistant to FW, of which 12 were SMD resistant, while in Trial 04, 24 ILs were resistant to FW, of which 20 ILs were resistant to SMD. ILs combining resistance to FW and SMD were further screened for resistance to these two diseases in the next season. The second-year evaluation confirmed SMD resistance in all ILs (12 ILs from Trial 03 and 20 ILs from Trial 04), whereas FW resistance was confirmed in 10 ILs (ICPIL # 17148, 17149, 17150, 17151, 17153, 17154, 17157, 17158, 17161, and 17162) from Trial 03 and 19 ILs (ICPIL # 17164, 17165, 17167, 17168, 17169, 17170, 17172, 17173, 17174, 17177, 17178, 17182, 17183, 17184, 17185, 17186, 17187, 17188, and 17191) from Trial 04.

#### AMMI Analysis

The genotype, location, and genotype × location interaction (GEI) effects were assessed by the AMMI model in MET 01 (**Table S6a**) and MET 02 (**Table S7a**). Variance analysis of the AMMI model for grain yield showed significant effects for genotype, location, and GEI in MET 01 and MET 02. Locations contributed the largest phenotypic variation, followed by GEI and genotype in both MET 01 and MET 02 (**Tables S6a** and **S7a**). The GEI was highly significant (*P* ≤ 0.01), accounting for over 29% and 32% of the total variation in MET 01 and MET 02, respectively, implying the differential response of the genotypes to locations. The presence of GEI was also clearly demonstrated by the AMMI model when the interaction was partitioned into the first two interaction PC axes (IPCA) (**Tables S6a** and **S7a**). IPCA1 and IPCA2 scores were highly significant, explaining 48.2% and 34.8% of the variability, respectively, in MET 01 and 55.6% and 30.6% of the variability, respectively, in MET 02 (**Tables S6a** and **S7a**).

In the AMMI biplot [second interaction PC axis (IPCA2) against the first interaction PC axis (IPCA1)], genotypes closer to the origin contribute less to the interaction than those farther from it. In the AMMI biplot for grain yield (**Figure 1**), ICPL 15023, ICPL 15010, and ICPL 15057 in MET 01 showed greater stability. Of these three ILs, the grain yield per plant of ICPL 15057 and ICPL 15010 was lower than the overall population mean and checks (ICP 8863 and ICPL 87119), whereas the grain yield of ICPL 15023 was better than the checks and population mean. From the AMMI biplot as well as AMMI selection per environment, it is evident that ICPL 15062, ICPL 15085, ICPL 15019, and ICPL 15075 were the best-suited ILs in Tirupati, Patancheru, Warangal, and Kalaburagi locations, respectively (**Figure 1** and **Table S6b**). Further, ICPL 15071 was better adapted at Kalaburagi and Warangal, whereas ICPL 15085 was better adapted at Patancheru and Tirupati.

Similarly, in MET 02, the AMMI biplot (IPCA2 vs IPCA1) for grain yield per plant showed that ICPL 15072 was the most stable genotype across locations (**Figure 2**). ICPL 15077 and ICPL 15014 were found to be the best-suited ILs at Patancheru and Tirupati, respectively. ICPL 15077 was placed closer to both Kalaburagi and Patancheru environmental vectors and hence was suitable for these locations. Based on the AMMI selections per environment (**Table S7b**), this genotype was ranked number 1 at Patancheru and number 2 at Kalaburagi.

Apart from the AMMI biplot, the AMMI stability value (ASV) makes it possible to quantify and classify genotypes that perform stably across different environmental conditions (Oliveira et al., 2014). A low ASV indicates that a genotype is stable across environments, whereas genotypes with high ASV values are less stable. ICPL 15023, ICPL 15010, ICPL 15057, and ICPL 15065 were found to be the most stable ILs in MET 01 with ASV values of 0.4 to 1.5, whereas ICPL 15071 and ICPL 15075 were the most unstable ILs with ASV values of 5.0 and 4.3, respectively (**Table S6c**). In MET 02, ICPL 15072, ICPL 15021, ICPL 15077, and ICPL 15067 were the most stable lines based on ASV value (1.1–1.6) (**Table S7c**).
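The ASV calculation is not spelled out in this section. A minimal sketch, assuming the commonly used definition in which the IPCA1 score is weighted by the ratio of the IPCA1 to IPCA2 sums of squares, is given below; the function name and the example scores are hypothetical and are not taken from Tables S6c or S7c.

```python
import math


def ammi_stability_value(ipca1, ipca2, ss_ipca1, ss_ipca2):
    """ASV = sqrt(((SS_IPCA1 / SS_IPCA2) * IPCA1)^2 + IPCA2^2).

    Assumed definition: the source does not state the formula it used.
    """
    weight = ss_ipca1 / ss_ipca2
    return math.sqrt((weight * ipca1) ** 2 + ipca2 ** 2)


# Hypothetical IPCA scores for two lines (illustrative only).
scores = {"line_A": (0.2, -0.1), "line_B": (1.5, 0.8)}

# The ratio of the explained percentages equals the ratio of the sums of
# squares, so 48.2 and 34.8 (MET 01) can stand in for SS_IPCA1 and SS_IPCA2.
asv = {
    name: ammi_stability_value(i1, i2, ss_ipca1=48.2, ss_ipca2=34.8)
    for name, (i1, i2) in scores.items()
}
```

Under this definition, a line lying close to the biplot origin receives a small ASV, which matches the interpretation used in the text.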

Stability with high yield potential should be considered for the selection, and hence, GSI may be useful in selecting the best genotypes. Based on low GSI value, ICPL 15023 and ICPL 15085 in MET 01 and ICPL 15072 and ICPL 15077 in MET 02 were found to be the most stable with high yield potential (**Tables S6c** and **S7c**).
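As an illustration of how stability and yield can be combined, the sketch below assumes that GSI is the sum of a genotype's ASV rank (lowest ASV ranked first) and its mean-yield rank (highest yield ranked first). This definition, the helper names, and the numbers are assumptions made for the example; the values actually used are in Tables S6c and S7c.

```python
def rank(values, reverse=False):
    """Return 1-based ranks for a {name: value} mapping (ties not handled)."""
    order = sorted(values, key=values.get, reverse=reverse)
    return {name: position for position, name in enumerate(order, start=1)}


def genotype_selection_index(asv, mean_yield):
    """GSI = rank of ASV + rank of mean yield (assumed definition)."""
    asv_rank = rank(asv)                          # lowest ASV gets rank 1
    yield_rank = rank(mean_yield, reverse=True)   # highest yield gets rank 1
    return {g: asv_rank[g] + yield_rank[g] for g in asv}


# Hypothetical values for three lines (illustrative only).
asv = {"line_A": 0.4, "line_B": 5.0, "line_C": 1.5}
mean_yield = {"line_A": 60.0, "line_B": 75.0, "line_C": 40.0}
gsi = genotype_selection_index(asv, mean_yield)
```

In this toy example the most stable line with good yield has the lowest index, which is the selection logic described above.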

# DISCUSSION

Global warming is adversely impacting agricultural production globally. Developing climate-resilient crops and their cultivation will contribute to food and nutritional security for the growing world population. The narrow genetic base may result in the vulnerability of food crops and render them susceptible to stresses. Developing new climate-resilient cultivars necessitates the exploitation of new and diverse sources of variation in breeding programs. Crop wild relatives are a reservoir of many useful genes, and their use in breeding programs will lead to enhanced levels of plasticity in new cultivars and thereby a higher capability to withstand environmental stresses (Khoury et al., 2015; Sharma, 2017).

Though the potential of wild species in improving crop cultivars is well known, breeders are mostly reluctant to use these important and unexploited genetic resources in breeding programs. Cross-incompatibility, poor adaptability, and linkage drag, among others, are the major reasons for the low use of wild relatives in crop breeding. Moreover, the difficulty of hybridization even with cross-compatible wild species, and the additional time, effort, and resources required to minimize linkage drag when developing interspecific populations, make introgression breeding with wild relatives lengthy and cumbersome (Sharma et al., 2013). Pre-breeding provides a unique platform for creating new genetic variability following interspecific hybridization and developing ILs with preferred traits for genetic enhancement. Thus, ILs with a higher frequency of useful traits introgressed from wild relatives provide new sources of variability in diverse genetic backgrounds for use in breeding to develop climate-resilient crops (Sharma, 2017).

Use of wild species in breeding programs is often associated with the introgression of many undesirable traits such as long maturity duration, pod shattering, and small pods, which is commonly known as linkage drag. Hence, for population development, an advanced backcross approach followed by selfing was used to recover the genetic background of the cultivated type and to identify promising recombinants with minimum linkage drag. To broaden the genetic base of pigeonpea, *C. acutifolius* (ICPW 12) and *C. cajanifolius* (ICPW 29) were crossed with the recurrent parent ICPL 87119 (Asha), and the F1s were backcrossed twice and selfed for three to four generations to derive 75 ILs. These ILs were evaluated for stress tolerance and productivity to identify promising lines with the characteristics that breeders may use to accelerate cultivar development in pigeonpea. The results showed that the advanced backcross approach was successful in creating useful genetic variability with minimum linkage drag using wild species.

#### ILs with Great Diversity in Phenology and Agronomic Traits

The large variation in maturity duration (141–176 days) noted in the present study makes these ILs an ideal genetic resource for use in pigeonpea breeding worldwide. Pigeonpea cultivars are categorized by maturity into super-early (<100 days), extra-early (100–120 days), early (120–140 days), mid-early (140–160 days), medium (160–180 days), and long-duration (>180 days) groups (Srivastava et al., 2012). Each maturity group is suited to a specific agro-ecosystem, which is defined by altitude, temperature, latitude, and day length. India is a major pigeonpea-growing country, and a medium-duration variety, Asha (ICPL 87119), has dominated production for the past two decades (Kumar et al., 2014).
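The maturity classes quoted above share boundary values (for example, 140 days appears in both the early and mid-early ranges). The short sketch below shows one way to assign a class from days to maturity; treating each lower bound as inclusive is an assumption made here, not something stated by Srivastava et al. (2012), and the function name is hypothetical.

```python
def maturity_group(days_to_maturity):
    """Assign a pigeonpea maturity group from days to maturity.

    Assumption: shared boundary values (120, 140, 160) go to the later group,
    and 180 days is still counted as medium because long-duration is >180.
    """
    if days_to_maturity < 100:
        return "super-early"
    if days_to_maturity < 120:
        return "extra-early"
    if days_to_maturity < 140:
        return "early"
    if days_to_maturity < 160:
        return "mid-early"
    if days_to_maturity <= 180:
        return "medium"
    return "long-duration"


# The range observed in this study (141-176 days) spans two groups.
observed_range = [maturity_group(d) for d in (141, 176)]
```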

In the national system of India, varieties more than 10 years old are not promoted in the seed chain and are termed "extant" varieties. As Asha was released for cultivation in 1995, there is no possibility of promoting this variety in the seed chain. Hence, there is a dire need to introduce new high-yielding varieties with FW and SMD resistance as a replacement for Asha. The high-yielding ILs identified in the present study, such as ICPL 15085, ICPL 15072, ICPL 15062, ICPL 15067, ICPIL 17164, ICPIL 17165, and ICPIL 17169, have on average 21–50% yield superiority over Asha and an average maturity ranging from 161 to 170 days across locations and/or over years. They provide a great opportunity for breeders to use this genetic resource to develop new cultivars that may replace Asha.

Further, due to short cropping seasons, pigeonpea improvement programs are focusing on developing short-duration varieties, particularly in the mid-early maturity duration group. ICPL 15010, ICPL 15019, ICPL 15023, ICPL 15021, ICPL 15077, and ICPIL 17160 across locations and/or years were more high yielding and matured earlier (<160 days) than Asha and hold great potential in developing high-yielding varieties in the mid-early maturity duration group.

Further, based on the mean phenotypic diversity index, the most diverse pairs of ILs have been identified. It will be interesting and fruitful to involve the most diverse ILs in hybridization programs to see the extent of segregations for different traits. Besides this, a few promising ILs such as ICPL 15065, ICPL 15014, and ICPIL 17167 were found to be more diverse than the recurrent parent ICPL 87119. Thus, these lines may be used to broaden the genetic base of cultivated pigeonpea.

#### Yield Stability

The AMMI, based on the two-way ANOVA and the PCA, is a unified approach to analyze multilocation trial data (Crossa et al., 1990). Being a powerful tool for visualizing as well as partitioning the GEI, AMMI determines the stable genotypes and the behavior of test environments (Silveira et al., 2013). A large SS for the environment in the AMMI analysis showed that the environments in which these lines were evaluated were highly diverse. ICPL 15023 in MET 01 and ICPL 15072 in MET 02, being close to the AMMI biplot origin, are the most stable ILs across environments. These lines may be further evaluated in large-scale trials prior to being recommended for cultivation. These two lines also scored high based on grain yield, ASV, and GSI. Hence, these lines should be given utmost importance for use in breeding programs or for direct release as varieties. The AMMI biplot further revealed that ICPL 15058, ICPL 15075, and ICPL 15071 are adapted to specific environments and therefore may be used in breeding for developing region-specific cultivars or may be deployed as cultivars for production in specific environments.

#### Biotic Stress Resistance

SMD and FW cause substantial losses to pigeonpea production, and resistance to both has been identified as a "must-have" trait for pigeonpea in India. In this study, the majority of the lines in Trial 03 and Trial 04 were resistant to FW, SMD, or both. Twenty-nine ILs from both trials showed high levels of resistance (<10% incidence) to both SMD and FW. *C. acutifolius* and *C. cajanifolius* were reported to be resistant to SMD (Khoury et al., 2015; Patil and Kumar, 2015). Ten *C. acutifolius*-derived ILs and 11 *C. cajanifolius*-derived ILs showed complete resistance to SMD (0% incidence), implying that SMD resistance has been successfully introgressed into these lines.

Three distinct isolates have been characterized for SMD, namely, Bangalore, Patancheru, and Coimbatore isolates; the Patancheru and Coimbatore isolates are mild strains, while the Bangalore isolate is the most virulent one (Kulkarni et al., 2003). A breakdown of SMD resistance has been reported based on multilocation field trials (Nene et al., 1989). The SMD resistance sources identified in the present study should be screened further across locations to identify isolate-specific sources of SMD resistance.

#### ILs: Potential To Be Released as Cultivars in India

The superiority of a few ILs over local and/or national checks provided an opportunity for breeders to include a few promising ILs in the initial varietal trials (IVTs) of the All India Coordinated Research Project (AICRP) on pigeonpea for a wider evaluation across different agro-ecological zones in India. ICPL 15010, ICPL 15072, and ICPL 15062, based on their high yield compared to local checks and market preference for seed color and size, have been nominated for the IVT of the AICRP on pigeonpea. ICPL 15010 has been nominated under the mid-early maturity duration group, whereas ICPL 15072 and ICPL 15062 are under the medium maturity duration group. After India, Myanmar is the second-highest pigeonpea-producing country, and its production is dominated by a long-duration variety, Monywa Shwedingar. These promising lines have also been shared with researchers in Myanmar for use in pigeonpea breeding programs. Utilization of these promising ILs derived from wild *Cajanus* species in pigeonpea breeding programs will assist in developing new climate-resilient cultivars with a broad genetic base.

# DATA AVAILABILITY STATEMENT

All datasets generated for this study are included in the article/**Supplementary Files**.

# AUTHOR CONTRIBUTIONS

SS conceived the idea; SS and CK coordinated the project; SS was involved in developing the pre-breeding populations; SS and CK selected material for this study and constituted the trials; CK, PR, LP, and SM evaluated the material across locations; MS screened the material for FW and SMD; PP assisted in statistical data analysis; SS and PP prepared the manuscript; CK, PR, LP, SM, and MS provided their inputs. All the authors reviewed and approved the final manuscript.

# FUNDING

Funding support was provided by the Global Crop Diversity Trust (GCDT; Grant Number GS15020) and the CGIAR Research Program on Grain Legumes and Dryland Cereals (GLDC).


# ACKNOWLEDGMENTS

This work is part of the initiative "Adapting Agriculture to Climate Change: Collecting, Protecting and Preparing Crop Wild Relatives," which is supported by the Government of Norway. The project is managed by the Global Crop Diversity Trust. For further information, visit the project website (http://www.cwrdiversity.org/). The support provided by the CGIAR Research Program on Grain Legumes and Dryland Cereals (GLDC) is duly acknowledged.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01269/full#supplementary-material


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Sharma, Paul, Kumar, Rao, Prashanti, Muniswamy and Sharma. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Seed Coat Pattern QTL and Development in Cowpea (*Vigna unguiculata* [L.] Walp.)

*Ira A. Herniter1\*, Ryan Lo1, María Muñoz-Amatriaín1†, Sassoum Lo1, Yi-Ning Guo1, Bao-Lam Huynh2, Mitchell Lucas1, Zhenyu Jia1, Philip A. Roberts2, Stefano Lonardi3 and Timothy J. Close1*

*1 Department of Botany and Plant Sciences, University of California, Riverside, CA, United States, 2 Department of Nematology, University of California, Riverside, CA, United States, 3 Department of Computer Sciences and Engineering, University of California, Riverside, CA, United States*

#### *Edited by:*

*Matthew Nicholas Nelson, Commonwealth Scientific and Industrial Research Organisation, Australia*

#### *Reviewed by:*

*Kirstin E. Bett, University of Saskatchewan, Canada Leonardo Velasco, Institute for Sustainable Agriculture, Spain*

#### *\*Correspondence:*

*Ira A. Herniter ihern014@ucr.edu*

#### *†Current address:*

 *Department of Soil and Crop Sciences, Colorado State University, Fort Collins, CO, United States*

#### *Specialty section:*

*This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science*

*Received: 25 March 2019 Accepted: 27 September 2019 Published: 25 October 2019*

#### *Citation:*

*Herniter IA, Lo R, Muñoz-Amatriaín M, Lo S, Guo Y-N, Huynh B-L, Lucas M, Jia Z, Roberts PA, Lonardi S and Close TJ (2019) Seed Coat Pattern QTL and Development in Cowpea (Vigna unguiculata [L.] Walp.). Front. Plant Sci. 10:1346. doi: 10.3389/fpls.2019.01346*

The appearance of the seed is an important aspect of consumer preference for cowpea (*Vigna unguiculata* [L.] Walp.). Seed coat pattern in cowpea has been a subject of study for over a century. This study makes use of newly available resources, including mapping populations, a reference genome and additional genome assemblies, and a high-density single nucleotide polymorphism genotyping platform, to map various seed coat pattern traits to three loci, concurrent with the *Color Factor* (*C*), *Watson* (*W*), and *Holstein* (*H*) factors identified previously. Several gene models encoding proteins involved in regulating the later stages of the flavonoid biosynthesis pathway have been identified as candidate genes, including a basic helix–loop–helix gene (*Vigun07g110700*) for the *C* locus, a WD-repeat gene (*Vigun09g139900*) for the *W* locus and an E3 ubiquitin ligase gene (*Vigun10g163900*) for the *H* locus. A model of seed coat development, consisting of six distinct stages, is described to explain some of the observed pattern phenotypes.

#### Keywords: *Vigna unguiculata*, cowpea, seed coat, pigment, pattern, quantitative trait loci

# INTRODUCTION

Cowpea (*Vigna unguiculata* [L.] Walp.) is a diploid (2n = 22) warm season legume which is primarily grown and serves as a major source of protein and calories in sub-Saharan Africa. Further production occurs in the Mediterranean Basin, Southeast Asia, Latin America, and the United States. Just over 7.4 million metric tonnes of dry cowpeas were reported worldwide in 2017 (FAOSTAT, 2019), though these numbers do not include Brazil, Ghana, and some other relatively large producers. Most of the production in sub-Saharan Africa is by smallholder farmers in marginal conditions, often as an intercrop with maize, sorghum, or millet (Ehlers and Hall, 1997). Due to its high adaptability to both heat and drought and its association with nitrogen fixing bacteria, cowpea is a versatile crop (Ehlers and Hall, 1997; Boukar et al., 2018).

The most common form of consumption is as dry grain. The seeds are used whole or ground into flour (Singh, 2014; Tijjani et al., 2015). Seed coat pattern is an important consumer-related trait in cowpea. Consumers make decisions about the quality and presumed taste of a product based on appearance (Kostyla et al., 1978; Jaeger et al., 2018). Cowpea displays a variety of patterns, including varied eye shapes and sizes, Holstein, Watson, and Full Coat pigmentation, among others (**Figure 1**). Each cowpea production region has preferred varieties, valuing certain color and pattern traits above others for determining quality and use. In West Africa, consumers pay a premium for seeds exhibiting certain characteristics specific to the locality, such as lack of color for use as flour or solid brown for use as whole beans (Langyintuo et al., 2004; Mishili et al., 2009; Herniter et al., 2019a). In the United States, consumers prefer varieties with tight black eyes, commonly referred to as "black-eyed peas" (Fery, 1985).

**Abbreviations:** 2049, IT84S-2049; 503, IT93K-503-1; 556, IT97K-556-6; B21, Bambey 21; BB, Big Buff (IT82E-18); bHLH, basic helix-loop-helix; *C*, *Color Factor*; CB27, California Blackeye 27; CB46, California Blackeye 46; CB50, California Blackeye 50; E3UL, E3 Ubiquitin ligase; GWA, Genome-Wide Association; *H*, *Holstein*; MAGIC, Multiparent Advanced Generation InterCross; QTL, quantitative trait locus; RIL, recombinant inbred line; SNP, single nucleotide polymorphism; UCR, University of California, Riverside; *W*, *Watson*; WD40, WD-repeat.

Seed coat traits in cowpea have been studied since the early 20th century, when Spillman (1911) and Harland (1919, 1920), reviewed by Fery (1980), explored the inheritance of factors controlling seed coat color and pattern. In a series of F2 populations Spillman (1911) and Harland (1919, 1920) identified genetic factors responsible for color expression, including "*Color Factor*" (*C*), "*Watson*" (*W*), "*Holstein-1*" (*H-1*), and "*Holstein-2*" (*H-2*). A three-locus system controlling seed coat pattern was established by Spillman and Sando (1930) and was confirmed by Saunders (1960) and Drabo et al. (1988), though "*O*" was used in place of "*C*."

A genotyping array for 51,128 single nucleotide polymorphisms (SNP) was recently developed for cowpea (Muñoz-Amatriaín et al., 2017) which offers opportunities to improve the precision of genetic mapping. Numerous biparental populations have been used to map major quantitative trait loci (QTL) for various traits, including root-knot nematode resistance (Santos et al., 2018), domestication-related traits (Lo et al., 2018), and black seed coat color (Herniter et al., 2018) and to develop consensus genetic maps of cowpea (Muchero et al., 2009; Lucas et al., 2011; Muñoz-Amatriaín et al., 2017). In addition, new populations have been developed for higher-resolution mapping including an eight-parent Multi-parent Advanced Generation Inter-Cross (MAGIC) population containing 305 lines (Huynh et al., 2018).

FIGURE 1 | Seed coat pattern traits. Images of lines from various populations demonstrating the phenotypes which were scored as part of this study.

A reference genome sequence of cowpea (Lonardi et al., 2019; phytozome.net) and genome assemblies of six additional diverse accessions (Muñoz-Amatriaín et al., 2019) have been produced recently. Here, we make use of these resources to map a variety of seed coat pattern traits, determine candidate genes, and develop a model for genetic control of seed coat pattern. Additionally, we posit a developmental pattern for the cowpea seed coat to explain some of the observed variation.

# MATERIALS AND METHODS

#### Plant Materials

Ten populations were used for mapping: an eight-parent MAGIC population containing 305 lines (Huynh et al., 2018), four biparental recombinant inbred line (RIL) populations, and five F2 populations. Descriptions of each pattern discussed below can be found in *Seed Coat Phenotyping Section* and examples can be seen in **Figure 1**.

One biparental population consisted of 87 RILs developed at the University of California, Riverside (UCR), derived from a cross between California Blackeye 27 (CB27), which has a black Eye 2 pattern, and IT82E-18, also known as "Big Buff" (BB), which has a brown Full Coat pattern (Muchero et al., 2009). The second biparental RIL population consisted of 80 RILs developed at UCR derived from a cross between CB27 and IT97K-556-6 (556), which has a brown Full Coat pattern (Huynh et al., 2015). The third biparental RIL population consisted of 101 RILs developed at UCR, derived from a cross between California Blackeye 46 (CB46), which has a black Eye 2 pattern, and IT93K-503-1 (503), which has a brown Eye 1 pattern (Pottorff et al., 2014). The fourth biparental RIL population consisted of 76 RILs developed at UCR and at the International Institute for Tropical Agriculture in Nigeria, derived from a cross between 524B, which has a black Eye 2 pattern, and IT84S-2049 (2049), which has a brown Eye 1 pattern (Menéndez et al., 1997). The F2 populations were developed at UCR as part of this work. Two F2 populations, consisting of 176 and 132 individuals, were developed from independent crosses between CB27 and Bambey 21 (B21), which has the No Color phenotype. One F2 population, consisting of 143 individuals, was developed from a cross between B21 and California Blackeye 50 (CB50), which has a black Eye 2 pattern. Two F2 populations, consisting of 175 and 119 individuals, were developed from independent crosses between Tvu-15426, which has a purple Full Coat pattern, and MAGIC014, a line developed as part of the MAGIC population but not included in the final population, which has a black Watson pattern.

To temporally describe seed coat development four accessions were examined: CB27, MAGIC059, Sanzi, and Sasaque. CB27 is described above. MAGIC059 has the Starry Night pattern in black and purple and is one of the lines included in the MAGIC population. Sanzi has a Speckled pattern in black and purple. Sasaque has the Full Coat pattern in red and purple.

#### SNP Genotyping and Data Curation

DNA was extracted from young leaf tissue using the Qiagen DNeasy Plant Mini Kit (Qiagen, Germany). A total of 51,128 SNPs were assayed in each sample using the Illumina Cowpea iSelect Consortium Array (Illumina Inc., California, USA; Muñoz-Amatriaín et al., 2017). Genotyping was performed at the University of Southern California Molecular Genomics Core facility (Los Angeles, California, USA). The same custom cluster file as in Muñoz-Amatriaín et al. (2017) was used for SNP calling. In the F2 populations the extracted DNA was bulked by phenotype, with DNA from 20 individuals combined in each genotyped sample.

For the MAGIC population, SNP data and a genetic map were available from Huynh et al. (2018). The map included 32,130 SNPs in 1,568 genetic bins (Huynh et al., 2018). For the biparental RIL populations, SNP data and genetic maps for the CB27 by BB and the CB46 by 503 populations were available from Muñoz-Amatriaín et al. (2017), and SNP data and a genetic map were available for the 524B by 2049 population from Santos et al. (2018). The CB27 by 556 genetic map was created using MSTMap (Wu et al., 2008). The CB27 by BB genetic map included 16,566 polymorphic SNPs in 977 genetic bins (Muñoz-Amatriaín et al., 2017); the CB27 by 556 genetic map contained 16,284 SNPs in 2,604 bins; the CB46 by 503 genetic map contained 16,578 SNPs in 683 bins (Muñoz-Amatriaín et al., 2017); the 524B by 2049 genetic map contained 14,202 SNPs in 933 bins (Santos et al., 2018). For each F2 population, SNPs were filtered to remove nonpolymorphic loci between the respective parents. The number of markers used for each population is as follows: the two CB27 by B21 populations, 8,550 SNPs (**Supplementary Table 1**); the B21 by CB50 population, 8,628 SNPs (**Supplementary Table 2**); the two Tvu-15426 by MAGIC014 populations, 20,010 SNPs (**Supplementary Table 3**).

#### Seed Coat Phenotyping

Phenotype data for seed coat traits were collected by visual examination of the seeds. The scored phenotypic classes consisted of No Color, Eye 1, Eye 2, Holstein, Watson, and Full Coat (**Figure 1**). No Color indicates no pigmentation present on the seed coat. Eye 1 consists of a loose eye in the shape of a teardrop with spots of color outside the eye on the wider side. Eye 2 consists of a tight eye in the shape of two wings with no pigment observed outside the edge of the eye. Holstein consists of an eye with a defined edge and additional spots of pigmentation spread over the seed coat up to almost completely covering the coat. Watson consists of an eye with an indefinite edge. Full Coat consists of pigment completely covering the seed coat. Two of the lines used for observing seed coat development had seed coat patterns other than those mapped. MAGIC059 had the Starry Night pattern, which consists of incomplete pigmentation covering the entire seed. Sanzi had the Speckled pattern, which consists of small dots of pigment covering the seed coat. In seeds with a paler brown color, the Eye 1 and Watson patterns are often difficult to distinguish. The MAGIC population was scored for Eye 1, Eye 2, Holstein, Watson, and Full Coat patterns (**Supplementary Table 4**). The CB27 by BB (**Supplementary Table 5**) and CB27 by 556 (**Supplementary Table 6**) biparental RIL populations were scored for Eye 2, Holstein, Watson, and Full Coat patterns. The CB46 by 503 (**Supplementary Table 7**) and 524B by 2049 (**Supplementary Table 8**) biparental RIL populations were scored for Eye 1, Eye 2, Holstein, Watson, and Full Coat patterns. The CB27 by B21 and B21 by CB50 F2 populations were scored for the No Color and Eye 2 patterns. The Tvu-15426 by MAGIC014 F2 populations were scored for the Watson and Full Coat patterns.

For mapping purposes, each observed pattern was scored individually and mapped independently, with "1" indicating presence of the trait and "0" indicating absence. For example, a line expressing the Eye 1 pattern would be scored as "1" for the Eye 1 trait and "0" for all other traits. Pattern phenotypes are mutually exclusive. As the Eye 1 pattern appears to be epistatic to the *H* and *W* loci, any lines with the Eye 1 phenotype were scored as missing data for other seed coat phenotypes to avoid biasing the mapping. This was done in all populations other than the MAGIC population. In the MAGIC population, for traits other than Eye 1 (Eye 2, Holstein, Watson, and Full Coat), individuals with the Eye 1 phenotype were scored as "0" instead of as missing data, because mpMap failed when too many lines were marked as missing data.
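The scoring rules above can be sketched as follows. This is an illustrative reading of the text rather than the script used in the study; the function name, the `magic` flag, and the use of `None` for missing data are assumptions.

```python
PATTERNS = ["Eye 1", "Eye 2", "Holstein", "Watson", "Full Coat"]


def score_line(pattern, magic=False):
    """Return {trait: 1, 0, or None} for one line; None marks missing data.

    Eye 1 lines are masked (None) for the other traits, except in the MAGIC
    population, where they are scored 0 instead.
    """
    scores = {}
    for trait in PATTERNS:
        if trait == pattern:
            scores[trait] = 1
        elif pattern == "Eye 1" and not magic:
            scores[trait] = None  # epistatic Eye 1 hides the H and W state
        else:
            scores[trait] = 0
    return scores


biparental_eye1 = score_line("Eye 1")
magic_eye1 = score_line("Eye 1", magic=True)
watson = score_line("Watson")
```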

#### Segregation Ratios

Expected segregation ratios reported in **Table 2** were determined based on the type of population, parental and F1 phenotypes. For example, the F2 populations were expected to segregate in a 3:1 ratio for traits controlled by single genes with complete dominant/recessive relationships, while the biparental RIL populations were expected to segregate in a 1:1 ratio. Expected segregation ratios were tested by chi-square analysis.
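A chi-square goodness-of-fit test for a two-class ratio can be sketched as below. The counts are hypothetical and are not taken from **Table 2**; the p-value shortcut through the complementary error function is valid only for one degree of freedom, which is the two-class case shown here.

```python
import math


def chi_square_two_class(observed, expected_ratio):
    """Chi-square statistic and p-value (1 df) for two phenotypic classes."""
    total = sum(observed)
    ratio_total = sum(expected_ratio)
    expected = [total * r / ratio_total for r in expected_ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, P(chi2 > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical F2 counts tested against a 3:1 ratio.
stat, p_value = chi_square_two_class([118, 42], [3, 1])
```

A large p-value means the observed counts are consistent with the expected ratio; for a RIL population the same function would be called with `[1, 1]`.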

For the MAGIC population, based on how the population was constructed (Huynh et al., 2018), it was assumed that each fully homozygous parent had a roughly 1/8 probability of passing its genotype at a particular locus to a given RIL. For example, at the *C* locus, three parents (IT84S-2049, IT89KD-288, and IT93K-503-1) express the Eye 1 phenotype and are proposed to have a *C1C1* genotype, while the other five parents are proposed to have a *C2C2* genotype. Based on this, a given line in the population is expected to have a 3/8 probability of having a *C1C1* genotype and a 5/8 probability of having a *C2C2* genotype. At the *W* and *H* loci, one parent (CB27) is proposed to have the *H0H0* and *W0W0* genotypes, while the other seven parents are proposed to have the *W1W1* and *H1H1* genotypes. Based on this, a line should have a 1/8 probability of having the *W0W0* genotype and a 1/8 probability of having the *H0H0* genotype. By multiplying the probabilities at each locus, the probability of a given genotype can be determined using the following equation:

$$P_{C} \times P_{W} \times P_{H} = P_{\text{net}}$$

where $P_{C}$ is the probability of a given allele at the *C* locus, $P_{W}$ is the probability of a given allele at the *W* locus, $P_{H}$ is the probability of a given allele at the *H* locus, and $P_{\text{net}}$ is the probability of a given genotype. For example, the probability of a *C2C2H1H1W0W0* genotype, which would have a Holstein phenotype, would be 35/512 ([5/8] × [7/8] × [1/8]). The above method results in a predicted 192:5:35:35:245 phenotypic ratio for the Eye 1 (*C1C1*), Eye 2 (*C2C2H0H0W0W0*), Holstein (*C2C2H1H1W0W0*), Watson (*C2C2H0H0W1W1*), and Full Coat (*C2C2H1H1W1W1*) patterns, respectively.
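The predicted ratio can be checked with exact fractions. The sketch below uses only the parental proportions given in the text; the dictionary and variable names are illustrative.

```python
from fractions import Fraction

# Allele probabilities from the parental proportions (3 of 8 parents carry C1;
# 1 of 8 parents carries W0 and H0).
P_C = {"C1": Fraction(3, 8), "C2": Fraction(5, 8)}
P_W = {"W0": Fraction(1, 8), "W1": Fraction(7, 8)}
P_H = {"H0": Fraction(1, 8), "H1": Fraction(7, 8)}

# Eye 1 masks the H and W loci, so only the C locus matters for that class.
expected = {
    "Eye 1": P_C["C1"],
    "Eye 2": P_C["C2"] * P_H["H0"] * P_W["W0"],
    "Holstein": P_C["C2"] * P_H["H1"] * P_W["W0"],
    "Watson": P_C["C2"] * P_H["H0"] * P_W["W1"],
    "Full Coat": P_C["C2"] * P_H["H1"] * P_W["W1"],
}

# Express each probability as a count out of 8^3 = 512.
ratio = [int(p * 512) for p in expected.values()]
```

The five classes sum to 512/512, which confirms that they partition the population under this model.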

#### Trait Mapping

Trait mapping was achieved with different methods for each type of population. In the MAGIC population, the R package "mpMap" (Huang and George, 2011) was used as described by Huynh et al. (2018). The significance cutoff values were determined through 1,000 permutations, resulting in a threshold of *p* = 8.10E−05 [−log10(*p*) = 4.09]. Due to the high number of markers in the genotype data, imputed markers spaced at 1 cM intervals were used.

In the biparental RIL populations, the R packages "qtl" (Broman et al., 2003) and "snow" (Tierney et al., 2015) were used as in Herniter et al. (2018). Briefly, probability values were assigned to each SNP using a Haley-Knott regression, tested for significance with 1,000 permutations, and marker effects were determined using a hidden Markov model.

For the F2 populations, the genotype calls of each bulked DNA pool in the population were filtered to leave only the markers known to be polymorphic between the parents, and these were then sorted based on physical positions in the pseudochromosomes available from Phytozome (Lonardi et al., 2019; phytozome.net). Each population's genotype was then examined visually in Microsoft Excel for areas where the recessive bulk was homozygous and the dominant bulk was heterozygous. Duplicated populations were examined in conjunction.
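The visual scan described above was carried out in a spreadsheet. As a hedged illustration of the same criterion, the sketch below flags markers where the recessive bulk is homozygous and the dominant bulk is heterozygous; the call encoding ("AA", "AB", "BB"), the tuple layout, and the marker positions are hypothetical.

```python
def candidate_markers(markers):
    """Return (chromosome, position) for markers matching the bulk criterion.

    markers: iterable of (chromosome, position, recessive_call, dominant_call),
    with calls encoded as "AA", "AB", or "BB" (assumed encoding).
    """
    hits = []
    for chrom, pos, recessive, dominant in sorted(markers):
        recessive_homozygous = recessive in ("AA", "BB")
        dominant_heterozygous = dominant == "AB"
        if recessive_homozygous and dominant_heterozygous:
            hits.append((chrom, pos))
    return hits


# Hypothetical calls for four markers that are polymorphic between parents.
example = [
    ("Vu07", 300, "AB", "AB"),
    ("Vu07", 100, "AA", "AB"),
    ("Vu07", 200, "BB", "AB"),
    ("Vu09", 50, "AA", "AA"),
]
hits = candidate_markers(example)
```

Runs of adjacent hits along a pseudochromosome would then mark the region of interest.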

#### Determining Haplotype Blocks

Once significant regions were established through mapping analysis, the overlapping area shared between the four biparental RIL populations was examined to determine the minimal area where all four biparental populations had overlapping haplotype blocks. SNPs located in the hotspots of pseudochromosomes Vu07, Vu09, and Vu10 were examined visually in Microsoft Excel for regions of identity within phenotypic groups. SNPs located in the hotspots which had been removed during trait mapping due to high levels of missing data were added back as presence/absence variations and segregated similar to nucleotide polymorphisms.
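Finding the minimal region shared by several mapped intervals amounts to an interval intersection. The sketch below assumes each population contributes one (start, end) interval on the same pseudochromosome; the coordinates are invented for illustration and do not correspond to the regions reported in this study.

```python
def shared_interval(intervals):
    """Return the (start, end) overlap common to all intervals, or None."""
    start = max(s for s, _ in intervals)
    end = min(e for _, e in intervals)
    return (start, end) if start <= end else None


# Hypothetical significant regions (bp) from four biparental populations.
regions = [(1_000, 9_000), (2_500, 8_000), (2_000, 7_500), (3_000, 10_000)]
overlap = shared_interval(regions)
```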

#### Determining Candidate Genes

Genes were examined within each minimal haplotype block. Gene expression data (Yao et al., 2016) from the cowpea reference genome accession (IT97K-499-35), which has a black Eye 1 (*C1C1*) pattern, are available from the Legume Information System (legumeinfo.org) and were examined for expression in developing seed tissue. Genes encoding proteins known to be involved in regulation of the flavonoid biosynthesis pathway were prioritized.

#### Determining Allelic Series

Dominance relationships were determined by examining the phenotypes of several F1 progeny in addition to segregation ratios in the F2 populations. Crosses were made between CB27 and three lines from the CB27 by BB population (BB-090, BB-113, and BB-074). Seeds from these F1 plants were visually examined for seed coat patterns. CB27/BB-090 seeds had a Watson pattern (*C2C2H0H0W1W1*), CB27/BB-113 seeds had a Holstein pattern (*C2C2H1H1W0W0*), and CB27/BB-074 seeds had a Full Coat pattern (*C2C2H1H1W1W1*). An additional cross was available from the early development of the MAGIC population, where the phenotype of the seed coat on seeds from a maternal *C1C2* heterozygote was Full Coat. IT84S-2246 (Full Coat, *C2C2H1H1W1W1*) was crossed with IT93K-503-1 (Eye 1, *C1C1H1H1W1W1*) to yield this *C2C1H1H1W1W1* maternal parent.

#### Comparing Sequence Variation

The sequences of the candidate genes and about 3 kb of upstream sequence from each of five genome assemblies (the reference genome sequence and four additional genome assemblies) were compared using A plasmid Editor (ApE; jorgensen.biology.utah.edu/wayned/ape/). Transcription factor binding sites were predicted in the upstream regulatory region of each gene using the binding site prediction function available from the Plant Transcription Factor Database (Jin et al., 2017; planttfdb.cbi.pku.edu.cn/). The species input was *Vigna radiata* (mung bean), as a map of cowpea was unavailable. The cowpea reference sequence is of IT97K-499-35. Among the additional sequenced genomes, CB5-2 has the Eye 2 pattern (*C2C2*), Suvita-2 has the Full Coat pattern (*C2C2H1H1W1W1*), Sanzi has a Speckled pattern, and UCR779 has the Full Coat pattern (*C2C2H1H1W1W1*). See the *Seed Coat Phenotyping* section for pattern descriptions and **Figure 1** for examples.

A larger set of SNPs (about 1 million), discovered from whole-genome shotgun sequencing of 37 diverse accessions (Muñoz-Amatriaín et al., 2017; Lonardi et al., 2019), was available from Phytozome (phytozome.net). Among the 37 accessions, 28 had phenotype data available. These lines were examined for variations in the SNP selection panel that were in the gene-coding and regulatory regions of the candidate genes.

#### Correlation Test

The 28 lines from the SNP selection panel with phenotype and genotype data available were tested for correlation in R, using the native "cor.test" function. For input, the phenotype was recorded as "+1" for accessions with the Eye 1 (*C1C1*) phenotype and "−1" for those without. The genotype was recorded as "+1" for accessions matching the reference genotype, "−1" for the alternate homozygote, and "0" for the heterozygote (**Supplementary Table 9**).
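By default, R's `cor.test` computes a Pearson correlation and tests it with a *t* statistic on *n* − 2 degrees of freedom. The sketch below reproduces that calculation in plain Python using the +1/0/−1 coding described above. The example vectors are invented for illustration; the actual per-accession values are in **Supplementary Table 9**.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation between two equal-length vectors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_statistic(r: float, n: int) -> float:
    """t statistic for testing r = 0, with n - 2 degrees of freedom (requires |r| < 1)."""
    return r * math.sqrt((n - 2) / (1 - r * r))


# Toy coding: phenotype +1 = Eye 1, -1 = not Eye 1;
# genotype +1 = reference homozygote, 0 = heterozygote, -1 = alternate homozygote.
phenotype = [1, 1, 1, -1, -1, -1, -1, -1]
genotype = [1, 1, 0, -1, -1, -1, 0, -1]
r = pearson_r(phenotype, genotype)
print(round(r, 3), round(t_statistic(r, len(phenotype)), 3))
```

As a consistency check, a correlation of 0.75 across 28 accessions corresponds to a *t* statistic of about 5.78 on 26 degrees of freedom, which agrees with a *p*-value on the order of 10⁻⁶.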

### Seed Color Development

The four accessions for which pattern development was recorded (CB27, MAGIC059, Sanzi, and Sasaque) were grown in a greenhouse at the University of California, Riverside (Riverside, California; 33.97° N 117.32° W) at a constant temperature of about 32°C from March through May 2018. Three plants were used for each accession. Upon flowering, each flower was tagged with the date it opened. The flowers were permitted to self-fertilize. For each day after the flower opened, beginning on the second day, on each of the three test plants a pod was collected until no more green pods were observed.

Seeds from each collected pod were photographed using a Canon EOS Rebel T6i at a 90° angle under consistent lighting conditions. The length of the most advanced seed within the pod was measured using ImageJ (imagej.nih.gov). A developmental scale from 0 to 5 was designed based on the visual observations of the spread of pigmentation (see *Results* section). Each photograph was scored using this scale.

# RESULTS

#### Phenotypic Data and Segregation Ratios

Phenotypic data and proposed genotypes for each parent in the observed populations can be found in **Table 1**. A summary of the phenotypic data, along with predicted segregation ratios, chi-square values, and probability can be found in **Table 2**.
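For readers unfamiliar with how a predicted segregation ratio is tested, the following sketch shows a chi-square goodness-of-fit calculation. It is an illustration only: the counts and the 3:1 ratio are hypothetical and are not taken from **Table 2**, and the closed-form *p*-value used here is valid only for one degree of freedom (two phenotypic classes).

```python
import math
from typing import Sequence, Tuple


def chi_square(observed: Sequence[float], ratio: Sequence[float]) -> Tuple[float, int]:
    """Chi-square statistic and degrees of freedom for observed counts
    against an expected ratio such as (3, 1)."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


def p_value_df1(stat: float) -> float:
    """Upper-tail probability of a chi-square statistic with exactly 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


# Hypothetical F2 counts (dominant class, recessive class) tested against 3:1.
stat, df = chi_square([70, 30], (3, 1))
print(round(stat, 3), df, round(p_value_df1(stat), 3))
```

Tests with more than two classes have more degrees of freedom and need the general chi-square distribution, which is available in standard statistics packages.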

#### Identification of Loci Controlling Seed Coat Pattern

A total of 35 SNP loci were identified using different methods for each population type (see *Materials and Methods* section for details) and were concentrated on three chromosomes: Vu07 (*C* locus), Vu09 (*W* locus), and Vu10 (*H* locus). Mapping results can be found in **Supplementary Table 10**. The overlapping mapping results allowed a narrowing of the area examined for candidate genes.

#### Determination of Minimal Haplotype Blocks

Following trait mapping, all called SNPs on chromosomes Vu07, Vu09, and Vu10 were examined for minimal haplotype blocks in the overlapping significant regions in the four biparental RIL populations. On Vu07 (*C* locus) the minimal haplotype block was between 2\_12939 and 2\_09638 (228,331 bp) and contained ten genes. On Vu09 the minimal haplotype block was between 2\_33224 and 2\_12692 (166,724 bp) and contained seventeen genes. On Vu10 the minimal haplotype block was between 2\_12467 and 2\_15325 (120,513 bp) and contained eleven genes. The list of candidate genes can be found in **Supplementary Table 11** and on Phytozome (Lonardi et al., 2019; phytozome.net). The minimal haplotype block regions can be found in **Supplementary Table 12**.

#### Identification of Candidate Genes

A predominant candidate gene was identified at each locus based on high relative expression in the developing seeds (**Supplementary Figure 1**) and a review of the literature on the regulation of the flavonoid biosynthesis pathway (see *Discussion* section for details). This led to the determination of a single major candidate gene on each of Vu07, Vu09, and Vu10. Each of the candidate genes belongs to a class which is known to be involved in transcriptional control of the later stages of flavonoid biosynthesis. No Color, Eye 1, and Full Coat mapped to an overlapping area on Vu07, where the gene *Vigun07g110700*, encoding a basic helix–loop–helix protein, was noted as a strong candidate gene. Eye 2, Holstein, Watson, and Full Coat mapped to a similar area on Vu09, where the gene *Vigun09g139900*, encoding a WD-repeat protein, was noted as a strong candidate gene. Eye 1, Eye 2, Holstein, Watson, and Full Coat mapped to an overlapping area on Vu10, where the gene *Vigun10g163900*, encoding an E3 ubiquitin ligase protein with a zinc finger, was noted as a strong candidate gene.

#### TABLE 1 | Parental phenotypes and expected genotypes of the examined populations.

TABLE 2 | Phenotypes, segregation ratios, and probability values for the tested populations.

#### Determination of Allelic Series

Segregation ratios indicated the dominance of *H1* over *H0* (*Holstein* locus, **Figure 2E**, **G**ii), *W1* over *W0* (*Watson* locus, **Figure 2G**i), *C2* over *C0* (*Color Factor* locus, **Figure 2F**), and *C2* over *C1* (*Color Factor* locus, **Figure 2G**iv). The dominance relationship between the *C1* and *C0* alleles could not be determined from these data.

#### Sequence Comparisons of Candidate Genes

Multiple sequence alignments for each of the three candidate genes and regulatory regions (~3 kb upstream of the transcription start site) revealed SNPs and small insertions or deletions (**Supplementary Datasets 1**, **2**, and **3**). None of the variants in the transcript sequence were predicted to cause changes in the amino acid sequence.

The regulatory region of *Vigun07g110700* (*C* locus candidate gene) showed a C/T SNP variation between the reference genome and the four other genome sequences on Vu07 at 20,544,306 bp. The reference genome has a T at this position while the other four sequences have a C. Transcription factor binding site prediction from the Plant Transcription Factor Database (planttfdb.cbi.pku.edu.cn/) indicated that this variation constitutes either a WRKY binding site in the C allele or an ERF binding site in the T allele. Of the 28 accessions in the SNP selection panel, 11 expressed the Eye 1 (*C1*) pattern and 17 did not. Twenty accessions had a CC genotype, six had a TT genotype, and two had a TC genotype. The correlation test gave an estimated correlation value of 0.75, with a *p*-value of 3.51E-06, indicating significant correlation between the genotype and phenotype values such that this SNP is a reliable marker for distinguishing between the *C1* (Eye 1) and the *C2* (Eye 2) alleles. Two of the 28 lines had the No Color (*C0*) phenotype, but had the CC genotype, indicating that this SNP is not a good marker for the *C0* allele (for a possible explanation see *Discussion* section).

The regulatory region of *Vigun09g139900* (*W* locus candidate gene) showed a C/T variation between the reference genome and CB5-2 against the other three genome sequences on Vu09 at 30,207,722 bp. This SNP was not included in the list from the SNP selection panel and so could not be examined like the previous SNP. Transcription factor binding site prediction did not indicate that the site was a target for any transcription factor in either form. The upstream regulatory region of *Vigun10g163900* (*H* locus candidate gene) did not have any distinguishing variation.

FIGURE 2 | Interaction of seed coat pattern loci. (A) Table displaying the pattern loci identified in mapping, their locations, the trait encoded, alleles identified, and phenotypes. (B) Table displaying the allelic series and relative dominance of alleles. (C) Segregation patterns for the CB27 by BB and CB27 by 556 F8 populations. (D) Segregation patterns for the CB46 by 503 and 524B by 2049 F8 populations. (E) Segregation pattern for the Tvu-15426 by MAGIC014 F2 populations. (F) Segregation pattern for the CB27 by B21 and B21 by CB50 F2 populations. (G) Phenotype of seeds from the F1 plants resulting from a series of crosses. (i) Cross between CB27 and a line from the CB27 by BB population with a Watson pattern, resulting in a Watson pattern. (ii) Cross between CB27 and a line from the CB27 by BB population with a Holstein pattern, resulting in a Holstein pattern. (iii) Cross between CB27 and a line from the CB27 by BB population with a Full Coat pattern, resulting in a Full Coat pattern. (iv) Cross between IT84S-2246 and IT93K-503-1 from the early development of the MAGIC population, resulting in a Full Coat pattern in the seed coats on seeds of the F1 maternal parent.

#### Stages of Color Development

A model of seed coat development has been formulated consisting of six stages based on the spread of pigmentation. In Stage 0, there is no color on the seed coat. In Stage 1, color appears at the base of the hilum. In Stage 2, color appears around the hilum. In Stage 3, color begins to spread along the outside edges of the seed. In Stage 4, color begins to fill in on the edges of the testa. In Stage 5, the color has completely developed to the mature level. After Stage 5 the pod and seeds begin to desiccate. Of the observed varieties, only Sasaque and Sanzi completed all six stages. MAGIC059 reached Stage 4, while CB27 only reached Stage 2. No seeds in Stage 0 were observed for Sasaque. Images of each tested variety at various stages can be seen in **Figure 3**. Color development was associated with seed size; the pigmentation spread as the seeds grew larger.

# DISCUSSION

#### Segregation Ratios and Epistatic Interaction of Seed Coat Pattern Loci

Segregation ratios and dominance data (**Table 2** and **Figure 2**) in the tested populations were consistent with a three gene system with simple dominance and epistatic interactions that matches the *C* (*Color Factor*), *W* (*Watson*), and one of the *H* (*Holstein*) factors identified by Spillman (1911) and Harland (1919, 1920). In brief, the *C* locus encodes a "constriction" factor while the *W* and *H* loci encode distinct "expansion" factors. The *C* locus is the primary locus controlling seed coat pattern. Pigmentation may not be visible (No Color, *C0*), constrained to an eye (Eye 1, *C1*), or distributed throughout the seed coat (Eye 2, Holstein, Watson, or Full Coat, *C2*). The extent of distribution is modified by the *H* and *W* loci, whose contribution is visible only with an unconstrained allele (*C2*) at the *C* locus. In the presence of *Holstein* (*H1*) and absence of *Watson* (*W0*), a Holstein pattern is expressed. Conversely, in the presence of *Watson* (*W1*) and absence of *Holstein* (*H0*), a Watson pattern is expressed. In combination, the *Watson* (*W1*) and *Holstein* (*H1*) factors result in the Full Coat phenotype.

Based on the above proposed allelic series, an individual with the *C0C0* genotype will express the No Color pattern, regardless of the genotypes at the *W* and *H* loci, and an individual with the *C1C1* genotype will express the Eye 1 pattern, regardless of the genotypes at the *W* and *H* loci. However, when not constricted by a *C0* or *C1* allele (having the *C2* allele) the "expansion" factors can be observed. An individual with the *C2––W0W0H1––* genotype expresses the Holstein pattern, while an individual with the *C2––W1––H0H0* genotype expresses the Watson pattern. An individual with the *C2––W1––H1––* genotype, with both "expansion" factors, expresses the Full Coat pattern. An individual with the *C2––W0W0H0H0* genotype expresses the Eye 2 pattern. In this latter case the eye pattern is observed despite the unconstricted *C2* allele due to the absence of the "expansion" factors. Based on this model, the CB27 by BB and CB27 by 556 populations segregate at the *W* and *H* loci (**Figure 2C**), while the MAGIC, CB46 by 503, and 524B by 2049 populations segregate at all three loci (**Figure 2D**). Similarly, the Tvu-15426 by MAGIC014 populations segregate at the *H* locus (**Figure 2E**) and the CB27 by B21 and B21 by CB50 populations segregate at the *C* locus (**Figure 2F**).
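The three-locus model described above can be written as a short decision rule. The sketch below is one possible encoding, assuming each locus is given as a pair of allele names; the function name and data layout are illustrative choices. Because the dominance relationship between *C1* and *C0* was not determined, the *C1C0* heterozygote is returned as "undetermined" rather than guessed.

```python
from typing import Tuple

Genotype = Tuple[str, str]  # two alleles at one locus, e.g. ("C2", "C1")


def predict_pattern(c: Genotype, w: Genotype, h: Genotype) -> str:
    """Predict the seed coat pattern from genotypes at the C, W, and H loci."""
    if "C2" in c:
        # C2 is dominant over C1 and C0, so the "expansion" factors are visible.
        has_watson = "W1" in w      # W1 dominant over W0
        has_holstein = "H1" in h    # H1 dominant over H0
        if has_watson and has_holstein:
            return "Full Coat"
        if has_holstein:
            return "Holstein"
        if has_watson:
            return "Watson"
        return "Eye 2"
    if set(c) == {"C1"}:
        return "Eye 1"      # W and H genotypes are masked
    if set(c) == {"C0"}:
        return "No Color"   # W and H genotypes are masked
    return "undetermined"   # C1C0: dominance not established in this study


print(predict_pattern(("C2", "C2"), ("W1", "W1"), ("H0", "H0")))
print(predict_pattern(("C2", "C1"), ("W1", "W1"), ("H1", "H1")))
print(predict_pattern(("C1", "C1"), ("W1", "W1"), ("H1", "H1")))
```

The second call corresponds to the *C2C1H1H1W1W1* maternal parent described in the Methods, for which a Full Coat pattern was observed.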

An additional pattern phenotype of Blue-grey Ring was noted in some of the tested populations. Blue-grey Ring consists of a pale ring of bluish-grey surrounding the eye (**Figure 1**). It appears only with the Eye 1 (*C1*) phenotype but is not always present when the phenotype is Eye 1 (*C1*). The Blue-grey Ring phenotype may represent another (fourth) allele at the *C* locus, or it may result from a combination of the *C1* (Eye 1) allele and other pigmentation genes. However, from other unpublished work on seed coat color there does not appear to be a strict correlation between seed coat color and presence of the Blue-grey Ring. Further research is required to clarify the basis of the Blue-grey Ring phenotype.

#### Pattern Traits QTL Overlap

Several regions of the genome are hotspots for seed coat pattern traits (**Supplementary Table 11**). These correspond to locations of genetic factors identified by Spillman (1911) and Harland (1919, 1920), who identified four factors controlling seed coat patterning: *Color Factor* (*C*), *Watson* (*W*), *Holstein-1* (*H-1*), and *Holstein-2* (*H-2*). The present data suggest the presence of only one *Holstein* locus or that the two loci are very closely linked in the tested populations. To avoid possible confusion, the *Holstein* locus discussed here is simply termed "*H*."

The major QTL and regions of interest for No Color and Eye 1 are clustered in an overlapping region on Vu07, suggesting that the "constriction" factor at locus *C* is at that position with allelism at the locus. Mapping results from the Tvu-15426 by MAGIC014 F2 populations indicate that the *H* locus is on Vu10. Additional evidence for the *H* locus being located on Vu10 comes from Wu et al. (2019), who identified the *Anasazi* locus (equivalent to the cowpea *H* locus) on chromosome 10 of common bean, which is homologous to Vu10 (Lonardi et al., 2019). While none of the biparental F2 populations segregated solely for the *W* locus, the identification of the *C* locus on Vu07 and the *H* locus on Vu10 must, by process of elimination, identify the location of the *W* "expansion" locus on Vu09.

#### Seed Coat Pattern Is Due to Failure to Complete the Normal Color Developmental Program

It was noted that the varieties with the Full Coat pattern at maturity followed the developmental pattern described in the *Stages of Color Development* section and shown in **Figure 3** to completion. In contrast, varieties which do not display the Full Coat pattern appear to have color development arrested at certain points. This is most obvious in CB27 (Eye 2, *C2*), where color development proceeds only to Stage 2. It is likely that other varieties which have distinct eye sizes proceed to varied stages of development. For example, varieties with the No Color (*C0*) phenotype would not proceed past Stage 0. However, the three gene model presented here does not explain every seed coat pattern. An example is the pattern observed in mature Sanzi seed, which exhibits a Speckled black and purple seed coat (see *Seed Coat Phenotyping Section* for a description and **Figure 1**). According to this analysis, Sanzi completes all six stages of seed coat development, indicating that the Speckled pattern is controlled separately. A biparental RIL population, consisting of lines derived from a cross between Sanzi and Vita 7, which has a brown Full Coat pattern (*C2C2W1W1H1H1*), was used for mapping the black seed coat color; there was a perfect correlation between black seed coat color and the Speckled pattern (Herniter et al., 2018). This indicates that genetic control of the Speckled pattern is colocalized with black seed coat color and may be an allele at the *Bl* locus, which is located on Vu05.

Further research is needed to determine if all cowpea accessions follow the pattern observed in the four tested lines shown in **Figure 3**. It may be that each of the observed stages of seed coat pigmentation development is controlled by a different gene, and that failures of normal gene function cause the observed variation in patterning. Evidence for this model is furnished by the noted developmental pattern of the seed coats where development appears to be arrested at Stage 2 in CB27, which expresses the Eye 2 (*C2*) pattern, and at Stage 4 in MAGIC059, which expresses the Starry Night pattern (see *Seed Coat Phenotyping Section* for a description and **Figure 1**). The mechanism by which this occurs is not elucidated here and requires further research. Transcriptome data could be gathered for the seed coat at each developmental stage. The currently available transcriptome data (Yao et al., 2016; legumeinfo.org) used whole seeds at specific days post flowering and do not distinguish between transcripts in the seed coat and those in the embryo or cotyledons, and further do not separate transcripts by developmental stage.

#### Candidate Gene Function

The later steps in flavonoid biosynthesis are controlled by a transcription factor complex composed of an R2-R3 MYB protein, a basic helix-loop-helix protein (bHLH), and a WD-repeat protein (WD40; Xu et al., 2015). E3 Ubiquitin ligases (E3UL) are believed to negatively regulate this complex (Shin et al., 2015). The color and location (leaf, pod, seed coat) of the pigmentation are determined by expression patterns (Wu et al., 2003; Iorizzo et al., 2018). Candidate genes on Vu07 (*C* locus) and Vu09 (*W* locus) encode a bHLH and WD40 protein, respectively. A candidate gene on Vu10 (*H* locus) encodes an E3UL protein. This information lends itself to a model in which *Vigun07g110700* (bHLH) serves as a "master switch" controlling the extent of pigmentation constriction while *Vigun09g139900* (WD40) and *Vigun10g163900* (E3UL) act as "modulating switches" controlling the type of expanded pattern, altering the effect of the pathway to result in the observed Holstein and Watson patterns (**Figure 4**). The R2-R3 MYB directs the DNA binding of the complex, with expression of different genes in different tissues resulting in the observed color and location of the pigments. For example, MYB genes identified by Herniter et al. (2018) are required for black seed coat and purple pod tip color. Further, *Vigun07g110700* (bHLH) was identified as a candidate gene controlling flower color in cowpea by Lo et al. (2018), indicating a possible dual function of the gene. Indeed, Harland (1919, 1920) noted that a lack of pigment in the flower was often associated with a lack of pigment in the seed coat. Finally, homologs of *Vigun07g110700* have been identified in other legumes as Mendel's *A* gene controlling flower color in *Pisum sativum* (Hellens et al., 2010) and as the *P* gene in *Phaseolus vulgaris* (McClean et al., 2018).

Two R2R3 *MYB* genes (*Vigun10g165300* and *Vigun10g165400*) are located only 110 kb downstream of *Vigun10g163900* (*H* locus candidate gene). However, these fall outside of the haplotype blocks identified in the CB27 by BB and CB27 by 556 populations, indicating that they are not the source of the observed phenotypic variation. Nonetheless, there may be interaction between one or both of these MYBs and the E3UL responsible for the Holstein pattern; this hypothesis could be investigated through additional research.

The observed C/T SNP variation in the regulatory sequence of *Vigun07g110700* (bHLH) at 20,544,306 bp constitutes a difference between a WRKY binding site in the *C2* (Eye 2) allele versus an ERF binding site in the *C1* (Eye 1) allele. WRKY proteins are positive regulators of seed coat pigment biosynthesis in Arabidopsis (Lloyd et al., 2017) while ERF proteins negatively regulate the same pathway (Matsui et al., 2008). This SNP could be used as a genetic marker to distinguish between the *C1* and *C2* alleles. The lack of correlation between an observed marker and the *C0* (No Color) allele may be caused by other variants, such as a small deletion interrupting gene function, which has been shown in *P. vulgaris* (McClean et al., 2018). Such a variation would not be detected by the genotyping platform used for this study. Similarly, the observed C/T SNP variation in the regulatory region of *Vigun09g139900* at 30,207,722 bp could be used as a marker to distinguish between the *W0* (not Watson) and *W1* (Watson) alleles, despite not necessarily being the cause of the observed phenotypic variation. No single variation was identified for *Vigun10g163900* alleles. However, haplotype blocks determined from the biparental RIL populations can be used for future breeding efforts. Two SNPs which fall within the genome sequence of *Vigun10g163900* segregate with the phenotype in the biparental RIL populations. At 2\_24359, the lines with the *H0* (not Holstein) allele have an A genotype and the lines with the *H1* (Holstein) allele have a G genotype. At 2\_24360, the lines with the *H0* (not Holstein) allele have an A and the lines with the *H1* (Holstein) allele have a C. Future research is needed to develop more perfect markers for the three loci.

#### DATA AVAILABILITY STATEMENT

All datasets [SNPs] for this study are included in the manuscript/**Supplementary Files**.

#### AUTHOR CONTRIBUTIONS

IH performed all trait mapping, statistical analysis, and interpretation. RL performed analysis of the seed coat development. MM-A assisted in trait mapping and provided SNP data. SaL assisted in trait mapping. Y-NG extracted DNA for genotyping. B-LH provided the MAGIC population and its genotypic information. ML performed crosses used for allelic series analysis. ZJ assisted in statistical analysis. PR and TC provided guidance and access to population and genetic resources. StL assisted with the SNP selection panel data. TC assisted IH with the writing.

# FUNDING

This study was supported by the Feed the Future Innovation Lab for Climate Resilient Cowpea (USAID Cooperative Agreement AID-OAA-A-13-00070), the National Science Foundation BREAD project "Advancing the Cowpea Genome for Food Security" (NSF IOS-1543963), and Hatch Project CA-R-BPS-5306-H.

#### ACKNOWLEDGMENTS

This manuscript has been released as a Pre-Print in bioRxiv (Herniter et al., 2019b). The authors thank Amy Litt for helpful discussion and guidance on pattern development; Eric Castillo and Sabrina Phengsy for assistance with seed photography; and Steve Wanamaker for assistance in the analysis of the various genome sequences.

#### REFERENCES

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01346/full#supplementary-material

SUPPLEMENTARY FIGURE 1 | Relative expression levels of the candidate genes. TPM, Transcripts per million; dap, days after pollination. Data retrieved from legumeinfo.org.

carrot (*Daucus carota* L.) root and petiole. *Front. Plant Sci.* 9, 1927. doi: 10.3389/fpls.2018.01927


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Herniter, Lo, Muñoz-Amatriaín, Lo, Guo, Huynh, Lucas, Jia, Roberts, Lonardi and Close. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Biotic and Abiotic Constraints in Mungbean Production—Progress in Genetic Improvement

*Ramakrishnan M. Nair1\*, Abhay K. Pandey1, Abdul R. War1, Bindumadhava Hanumantharao1, Tun Shwe2, AKMM Alam3, Aditya Pratap4, Shahid R. Malik5, Rael Karimi6, Emmanuel K. Mbeyagala7, Colin A. Douglas8, Jagadish Rane9 and Roland Schafleitner10*

*1 World Vegetable Center, South Asia, Hyderabad, India, 2 Myanmar Department of Agricultural Research, Nay Pyi Taw, Myanmar, 3 Pulses Research Centre, Bangladesh Agricultural Research Institute (BARI), Gazipur, Bangladesh, 4 Crop Improvement Division, ICAR-Indian Institute of Pulses Research (IIPR), Kanpur, India, 5 Pakistan Agricultural Research Council, Islamabad, Pakistan, 6 Kenya Agricultural and Livestock Research Organization (KALRO), Katumani, Kenya, 7 National Agricultural Research Organization-National Semi-Arid Resources Research Institute (NARO-NaSARRI), Soroti, Uganda, 8 Agri-Science Queensland, Department of Agriculture and Fisheries, Hermitage Research Facility, Warwick, QLD, Australia, 9 National Institute of Abiotic Stress Management, Baramati, India, 10 World Vegetable Center, Tainan, Taiwan*

#### *Edited by:*

*Penelope Mary Smith, La Trobe University, Australia*

#### *Reviewed by:*

*Prakit Somta, Kasetsart University, Thailand Shimna Sudheesh, Department of Economic Development Jobs Transport and Resources, Australia*

#### *\*Correspondence:*

*Ramakrishnan M. Nair ramakrishnan.nair@worldveg.org*

#### *Specialty section:*

*This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science*

*Received: 27 March 2019 Accepted: 25 September 2019 Published: 25 October 2019*

#### *Citation:*

*Nair RM, Pandey AK, War AR, Hanumantharao B, Shwe T, Alam A, Pratap A, Malik SR, Karimi R, Mbeyagala EK, Douglas CA, Rane J and Schafleitner R (2019) Biotic and Abiotic Constraints in Mungbean Production—Progress in Genetic Improvement. Front. Plant Sci. 10:1340. doi: 10.3389/fpls.2019.01340*

Mungbean [*Vigna radiata* (L.) R. Wilczek var. *radiata*] is an important food and cash legume crop in Asia. Development of short duration varieties has paved the way for the expansion of mungbean into other regions such as Sub-Saharan Africa and South America. Mungbean productivity is constrained by biotic and abiotic factors. Bruchids, whitefly, thrips, stem fly, aphids, and pod borers are the major insect-pests. The major diseases of mungbean are yellow mosaic, anthracnose, powdery mildew, Cercospora leaf spot, halo blight, bacterial leaf spot, and tan spot. Key abiotic stresses affecting mungbean production are drought, waterlogging, salinity, and heat stress. Mungbean breeding has been critical in developing varieties with resistance to biotic and abiotic factors, but many constraints remain to be addressed, including the precise and accurate identification of resistance source(s) for some of the traits and traits conferred by multiple genes. The latest technologies in phenotyping, genomics, proteomics, and metabolomics could be of great help in understanding insect/pathogen-plant and plant-environment interactions and the key components responsible for resistance to biotic and abiotic stresses. This review discusses current biotic and abiotic constraints in mungbean production and the challenges in genetic improvement.

#### Keywords: mungbean, breeding, stresses, insect-pests, diseases, marker-assisted selection

# INTRODUCTION

Mungbean [*Vigna radiata* (L.) R. Wilczek var. *radiata*] is a short-duration grain legume cultivated on over 7 million hectares, predominantly across Asia, and is rapidly spreading to other parts of the world. Mungbean seeds are rich in proteins (~24% easily digestible protein), fiber, antioxidants, and phytonutrients (Itoh et al., 2006). Mungbean is consumed as whole or split seeds in cooking, as flour, or as sprouts, and thus forms an important source of dietary protein. Mungbean sprouts contain high amounts of thiamine, niacin, and ascorbic acid. The yield potential of mungbean is in the range of 2.5–3.0 t/ha; however, the average productivity of mungbean is staggeringly low at 0.5 t/ha. The low productivity is due to abiotic and biotic constraints, poor crop management practices, and non-availability of quality seeds of improved varieties to farmers (Chauhan et al., 2010; Pratap et al., 2019a). The major biotic factors include diseases such as yellow mosaic, anthracnose, powdery mildew, Cercospora leaf spot (CLS), dry root rot, halo blight, and tan spot, and insect-pests, especially bruchids, whitefly, thrips, aphids, and pod borers (Lal, 1987; Singh et al., 2000; War et al., 2017; Pandey et al., 2018). Abiotic stresses affecting mungbean production include waterlogging, salinity, heat, and drought stress (HanumanthaRao et al., 2016; Singh and Singh, 2011). Genetic diversity in cultivated mungbean is limited because breeding efforts were restricted to relatively few parental lines; hence, there is a need to broaden the narrow genetic base of cultivated mungbeans. Development of short-duration varieties has paved the way for expansion of mungbean into different cropping systems (rice–rice, rice–wheat, and rice–maize intercropping) and for cultivation in other regions of the world including Sub-Saharan Africa and South America (Shanmugasundaram, 2007; Moghadam et al., 2011).

In order to improve productivity and stabilize crop production, there is a need to develop varieties resistant to biotic and abiotic stress factors. Breeding information on the biotic and abiotic stresses in mungbean and on the influence of environmental stresses at different plant development stages is essential to identify the sources for tolerance traits expressed at the right stage. With advanced technologies *viz*., phenotyping, genomics, proteomics and metabolomics, the genetic basis of plant interactions with pest, pathogen, and environment can be dissected to design effective crop improvement strategies. In this context, we discuss the biotic and abiotic constraints in mungbean, and the breeding efforts to improve this short duration crop.

# BIOTIC STRESS IN MUNGBEAN

#### Major Diseases and Economic Impacts

Viral, bacterial, and fungal diseases are of economic importance in South Asia, South East Asia, and Sub-Saharan Africa (Taylor et al., 1996; Singh et al., 2000; Raguchander et al., 2005; Mbeyagala et al., 2017; Pandey et al., 2018). Mungbean yellow mosaic disease (MYMD) is an important viral disease of mungbean (Singh et al., 2000; Noble et al., 2019). MYMD is caused by several begomoviruses, which are transmitted by whitefly *Bemisia tabaci* (Gennadius) (Hemiptera: Aleyrodidae) (Nair et al., 2017). The major fungal diseases are Cercospora leaf spot (CLS) [*Cercospora canescens* Ellis & G. Martin], powdery mildew (*Podosphaera fusca*  (Fr.) U. Braun & Shishkoff, *Erysiphe polygoni* (Vaňha) Weltzien) and anthracnose (*Colletotrichum acutatum* (J.H. Simmonds), *C. truncatum* (Schwein.) Andrus & Moore*, C. gloeosporioides*  (Penz.) Penz. & Sacc)*.* Dry root rot [*Macrophomina phaseolina*  (Tassi) Goid] is an emerging disease of mungbean. The less important ones are web blight (*Rhizoctonia solani* Kuhn), Fusarium wilt (*Fusarium solani* (Mart.) Sacc) and Alternaria leaf spot (*Alternaria alternata* (Fr.) Keissl) (Ryley and Tatnell, 2011; Pandey et al., 2018). Halo blight (*Pseudomonas syringae* pv. *phaseolicola*), bacterial leaf spot (*Xanthomonas campestris*  pv. *phaseoli*), and tan spot (*Curtobacterium flaccumfaciens* pv. *flaccumfaciens*) are the important bacterial diseases. The economic losses due to MYMD account for up to 85% yield reduction in India (Karthikeyan et al., 2014). Dry root rot caused 10–44% yield losses in mungbean production in India and Pakistan (Kaushik and Chand, 1987; Bashir and Malik, 1988). Reports of yield losses of 33–44% due to Rhizoctonia root rot (Singh et al., 2013a) and 30–70% due to anthracnose (Kulkarni, 2009; Shukla et al., 2014) from India were estimated. 
Yield losses due to CLS reached 97% in Pakistan and in different states of India (Iqbal et al., 1995; Chand et al., 2012; Bhat et al., 2014), and losses due to powdery mildew reached 40% (Khajudparn et al., 2007). Among the minor fungal diseases, 20% yield loss was reported due to Fusarium wilt (Anderson, 1985) and 10% due to Alternaria leaf spot (Maheshwari and Krishna, 2013). A survey of mungbean fields throughout China between 2009 and 2014 reported average yield reductions of 30–50%, and total crop failure in severely infected fields, due to halo blight (Sun et al., 2017). Halo blight is an emerging disease in China (Sun et al., 2017) and Australia (Noble et al., 2019). In Iran, 70% incidence (Osdaghi, 2014) and in India 30% incidence (Kumar and Doshi, 2016) of bacterial leaf spot (*X. phaseoli*) have been reported. Studies have investigated the efficacy of bactericides, fungicides, bio-fungicides and botanicals applied as seed treatments and foliar sprays, and the impact of cultural practices, in reducing mungbean diseases (Pandey et al., 2018). Deployment of varieties with genetic resistance is the most effective and durable method for integrated disease management.

#### BREEDING FOR RESISTANCE TO VIRAL DISEASES

Research into resistance to MYMD has been underway since 1980, with mutant genotypes developed from local germplasm by mutation breeding (gamma irradiation) at the National Institute for Agriculture and Biology, Pakistan, which later led to the development of the popular NM series varieties including NM 92 and NM 94 (Ali et al., 1997). Researchers reported that in mungbean, the genetic resistance against MYMD is governed by a single recessive gene (Reddy, 2009a), a dominant gene (Sandhu et al., 1985), two recessive genes and complementary recessive genes (Pal et al., 1991; Ammavasai et al., 2004). The mungbean variety NM 92 showed a resistant reaction against MYMD due to a single recessive gene (Khattak et al., 2000). Dhole and Reddy (2012) reported that two recessive genes governed the segregation ratio in the F2 population in six crosses between resistant and susceptible genotypes. However, F2 and F3 populations developed through an inter-specific [TNAU RED × VRM (Gg) 1] and intraspecific [KMG 189 × VBN (Gg)] crosses showed role of a single recessive gene in MYMD resistance (Sudha et al., 2013). Saleem et al. (1998) in their study with F2 populations derived from crosses between two local lines (NM-92 and NM-93-resistant to MYMD) and four exotic lines (VC-1973A, VC-2254A, VC-2771A and VC-3726A-susceptible to MYMD), found that susceptibility and resistance were controlled by a single genetic factor and that susceptibility was dominant over resistance. Similar results were recorded by Jain et al. (2013) in F2 and F3 populations of crosses between five susceptible (LGG 478, KM6 202, PUSA 9871, K 851, and KM6 204) and 4 resistant (KM6201, Sonamung, Samrat, and KM6 220) lines, and it was reported that the inheritance was governed by single dominant gene. 
In contrast, two recessive genes were found to be responsible for MYMD resistance in the populations developed from crosses between two resistant (Satya and ML 818) and two susceptible (Kopergoan and SML 32) cultivars (Singh et al., 2013b), whereas in the study of Mahalingam et al. (2018) two dominant genes governed MYMD resistance in the crosses between resistant (SML 1815, MH 421) and susceptible [VBN (Gg) 3, VBN (Gg) 2, LGG 460, RMG 10-28, and TM 96-2] genotypes. The number of major genes controlling MYMD resistance in two crosses (KPSI × BM 6 and BM1 × BM 6), estimated from six generations (P1, P2, F1, F2, BC1, and BC2), was 1.63–1.75 loci (Alam et al., 2014).
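The competing inheritance models summarized above are usually told apart by testing observed F2 class counts against an expected Mendelian ratio with a chi-square goodness-of-fit test. The following is a minimal sketch of that test, not the procedure of any cited study: the F2 counts are hypothetical, the three ratios are textbook expectations chosen for illustration, and the function names are invented here.

```python
# Chi-square goodness-of-fit test of F2 counts against Mendelian ratios.
# All counts and model labels below are illustrative assumptions.

CRITICAL_5PCT_DF1 = 3.841  # chi-square critical value for 1 df at alpha = 0.05

# Expected resistant : susceptible ratios in an F2 (textbook expectations).
MODELS = {
    "single recessive gene (1 R : 3 S)": (1, 3),
    "single dominant gene (3 R : 1 S)": (3, 1),
    "two recessive genes, both required (1 R : 15 S)": (1, 15),
}


def chi_square_fit(observed, ratio):
    """Return the chi-square statistic for observed counts vs. an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(ratio)
    expected = [total * r / ratio_sum for r in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def compatible_models(resistant, susceptible):
    """Map each model name to (statistic, fits), where fits means not rejected at 5%."""
    results = {}
    for name, ratio in MODELS.items():
        stat = chi_square_fit((resistant, susceptible), ratio)
        results[name] = (stat, stat < CRITICAL_5PCT_DF1)
    return results


# Hypothetical F2 population: 52 resistant and 148 susceptible plants.
f2_results = compatible_models(resistant=52, susceptible=148)
for model_name, (statistic, fits) in f2_results.items():
    print(f"{model_name}: chi-square = {statistic:.3f}, fits = {fits}")
```

With these hypothetical counts only the single recessive gene model is retained, which illustrates why different crosses and population sizes can lead different studies to different conclusions about the number and dominance of resistance genes.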

It is important to identify the strain/species of the virus causing the disease so that the results of different studies can be compared. In repeated samplings over consecutive years in India, Nair et al. (2017) reported genetic similarity of MYMV strains from mungbean to a strain from Urdbean [*Vigna mungo* (L.) Hepper] (MYMV-Urdbean) dominant in North India, strains most similar to MYMV-*Vigna* predominant in South India, and *Mungbean yellow mosaic India virus* (MYMIV) strains predominant in Eastern India. The mungbean genotypes identified as sources of resistance to MYMD (**Table 1**) can be used as potential donors and to develop mapping populations for the development of markers for MYMD resistance. For the development of resistant lines, researchers have deployed plant breeding methods with traditional methods of disease screening. In this regard, marker-assisted selection (MAS) is the most promising technique for disease resistant cultivar development. The study of genotypic diversity, the discovery of markers linked to *R* genes, and the construction of quantitative trait loci (QTL) maps with molecular markers have improved the efficiency of breeding programs for MYMD resistance (Sudha et al., 2013). Basak et al. (2004) developed a yellow mosaic virus resistance linked marker named 'VMYR1' in mungbean. Of the 24 pairs of resistance gene analog (RGA) primers screened, one pair, RGA 1F-CG/RGA 1R (445 bp DNA), was found to be polymorphic among the parents. In F2 and F3 families, the polymorphism was found to be linked with the YMV reaction. Binyamin et al. (2015) used sequence characterized amplified region-based markers linked with the MYMD-resistance gene for the screening of mungbean genotypes against the disease. In the resistant and tolerant genotypes, the markers amplified the desired bands, while no amplification was observed in susceptible genotypes. Maiti et al. (2011) identified two MYMD-resistance marker loci, *CYR1* and *YR4*, completely linked with MYMD-resistant germplasms and co-segregating with MYMD-resistant F2 and F3 progenies. Holeyachi and Savithramma (2013) identified random amplified polymorphic DNA (RAPD) markers linked with MYMD resistance in recombinant breeding lines. They reported that out of 20 random decamers, only 10 primers showed polymorphism between the parents China mung (S) and BL 849 (R) and, among them, only one primer (UBC 499) amplified a single 700 bp band in the resistant parent (BL 849) that was absent in the susceptible genotype (China mung). Kalaria et al. (2014) studied the polymorphism by using 200 RAPD and 17 inter simple sequence repeat (ISSR) markers.


*\*(T, Tolerant; I, Immune; HR, Highly resistant; R, Resistant; MR, Moderately resistant).*

Among the RAPD markers, OPJ-18, OPG-5, and OPM-20, and among the ISSR markers, DE-16, were found to be promising, as they produced 28, 35, 28, and 61 amplicons, respectively. The resistant genotypes NAUMR1, NAUMR2, NAUMR3, and Meha were clearly separated from the susceptible cultivar, GM4. In another study, five QTLs based on simple sequence repeat (SSR) markers were identified for MYMD resistance; three of them were from India (*qYMIV1, qYMIV2*, and *qYMIV3*) and two were from Pakistan (*qYMIV4* and *qYMIV5*) (Kitsanachandee et al., 2013). The QTL *qYMIV1* explained 9.33% of the variation in disease response. Similarly, *qYMIV2* explained 10.61%, *qYMIV3* explained 12.55%, *qYMIV4* explained 21.55% and *qYMIV5* explained 6.24% of the variation in disease response. Two major QTLs on linkage groups 2 (*qMYMIV2*) and 7 (*qMYMIV7*) conferring resistance to MYMD were reported. These QTLs conferred resistance in both F2 and BC1F1 populations with a coefficient of determination (R²) of 31.42–37.60 and 29.07–47.36%, respectively (Alam et al., 2014). Markers linked to the QTLs in this study will be useful in marker-assisted breeding for the development of MYMD resistant mungbean varieties. With linked markers, plant breeders can conduct repeated genotyping during the growing season even in the absence of disease incidence. This technique will save labor and time during the introgression of MYMD resistance through molecular breeding, as phenotyping against begomoviruses is complex, laborious, and time consuming. New donors of MYMD resistance have also been identified from interspecific sources (Chen et al., 2012; Nair et al., 2017).
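The percentages of variation explained by the QTLs reported above are coefficients of determination. As a rough illustration of what such a figure means, the sketch below computes R² from a simple regression of a disease score on a marker genotype code. This single-marker regression is a simplification and is not claimed to be the mapping method of the cited studies; the genotype coding, the scores, and the function name are hypothetical.

```python
# R^2 of a single-marker regression: share of phenotypic variation in a
# disease score that is explained by the genotype at one marker.
# Data and names are illustrative assumptions, not values from any study.

def marker_r_squared(genotypes, phenotypes):
    """Return R^2 of the linear regression of phenotype on genotype code."""
    n = len(genotypes)
    mean_g = sum(genotypes) / n
    mean_p = sum(phenotypes) / n
    sxy = sum((g - mean_g) * (p - mean_p) for g, p in zip(genotypes, phenotypes))
    sxx = sum((g - mean_g) ** 2 for g in genotypes)
    syy = sum((p - mean_p) ** 2 for p in phenotypes)
    return sxy ** 2 / (sxx * syy)


# Hypothetical F2 plants: genotype = number of alleles from the resistant
# parent (0, 1, or 2); phenotype = disease score (higher = more diseased).
genotype_codes = [0, 0, 1, 1, 2, 2]
disease_scores = [5, 4, 4, 3, 2, 3]

r2 = marker_r_squared(genotype_codes, disease_scores)
print(f"Variation explained by the marker: {100 * r2:.2f}%")
```

A marker with a low R² may still be linked to a real QTL of small effect, which is one reason several QTLs of modest effect are often reported together for the same trait.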

#### BREEDING FOR RESISTANCE TO FUNGAL DISEASES

Researchers in different countries have screened mungbean genotypes against fungal diseases under controlled and field conditions in order to identify sources of resistance. Resistant genotypes reported by investigators against various fungal diseases are presented in **Table 2**. Screening of mungbean genotypes against powdery mildew and Cercospora leaf spot has been much explored. However, little work has been done on the identification of sources of resistance against anthracnose and dry root rot, and this needs to be addressed as a future priority. The screenings of mungbean genotypes against fungal diseases listed in **Table 2** were carried out under natural conditions, except for dry root rot, for which Khan and Shuaib (2007) screened under laboratory conditions.

Efficient breeding for resistance to fungal stresses requires readily available resistant germplasm and markers linked with QTL regions or major genes that can be employed in marker-assisted selection (MAS). In mungbean, molecular markers for Cercospora leaf spot and powdery mildew have been identified for application in breeding programs. However, QTLs or molecular markers for dry root rot and anthracnose have not been investigated. Both qualitative and quantitative modes of inheritance have been reported for resistance to powdery mildew (Kasettranan et al., 2009). Single dominant gene control of resistance to powdery mildew was reported (AVRDC, 1979; Khajudparn et al., 2007; Reddy, 2009b), while Reddy et al. (1994) reported that two major dominant genes control the resistance. Chaitieng et al. (2002) and Humphry et al. (2003) found that one QTL conferred the resistance to powdery mildew, while Young et al. (1993) reported three QTLs linked with powdery mildew resistance. Young et al. (1993) drew this conclusion from a mapping population developed with mungbean line VC3890 as the resistant parent. The population developed from a cross between KPS 2 (moderately resistant) and VC 6468-11-1A (resistant) mungbean genotypes was investigated by Sorajjapinun et al. (2005), who reported additive gene action in the control of resistance. Kasettranan et al. (2010) identified SSR marker-based QTLs, *qPMR-1* and *qPMR-2*, associated with resistance to powdery mildew. One major QTL on linkage group 9 and two minor QTLs on linkage group 4 were identified in mungbean line V4718 (Chankaew et al., 2013). The mapping population for powdery mildew resistance developed from mungbean line RUM5 revealed two major QTLs on LG6 and LG9 and one minor QTL on LG4 (Chankaew et al., 2013). Fine mapping with populations developed from crosses between highly susceptible and highly resistant parents would support the identification of reliable markers.

Lee (1980) reported that a single dominant gene governs the resistance to CLS. Quantitative genetic control of resistance to CLS (Chankaew et al., 2011) and single recessive gene control (Mishra et al., 1988) have also been reported. One major QTL (*qCLS*) for CLS resistance located on linkage group 3, which explained 66–81% of the phenotypic variation, was reported (Chankaew et al., 2011) using F2 (CLS-susceptible cultivar Kamphaeng Saen 1, KPS1 × CLS-resistant mungbean line V4718) and BC1F1 [(KPS1 × V4718) × KPS1] populations.

#### BREEDING FOR RESISTANCE TO BACTERIAL DISEASES

Bacterial pathogens are seed-borne and can persist in crop residue. Varietal resistance is recognized as the cornerstone of integrated disease management (Noble et al., 2019). Little work has been done on screening mungbean genotypes against bacterial diseases or on identifying genetic markers associated with resistance to bacterial diseases in mungbean. In India, Patel and Jindal (1972) evaluated 2160 genotypes of mungbean for resistance to bacterial leaf spot (*X. phaseoli*) and reported that the genotypes Jalgaon 781, P 646, P 475, and PLM 501 were resistant. In Pakistan, 8 out of 100 mungbean genotypes were reported as resistant against bacterial leaf spot under field conditions (Iqbal et al., 1991; Iqbal et al., 2003). Munawar et al. (2011) screened 51 genotypes against bacterial leaf spot in Pakistan and found the genotypes NCM11-8, NCM 15-11, AZRI-1, and 14063 to be resistant under natural incidence of the disease. In their field evaluation, a few genotypes such as NCM 258-10, NCM-21, NCM 11-6, AZRI-06, and NCM 11-3 showed a moderately resistant reaction.

The inheritance of resistance to bacterial leaf blight is governed by a single dominant gene (Thakur et al., 1977). Patel and Jindal (1972) reported that in mungbean genotypes Jalgaon 781, P 646, P 475, and PLM 501, the inheritance of resistance to bacterial leaf blight (BLB) was monogenic dominant. While QTLs have been identified for bacterial leaf blight resistance in other crops like chickpea (Dinesh et al., 2016), no records are available on QTLs for resistance to bacterial diseases in mungbean. Screening for halo blight and tan spot has been carried out by the Australian breeding program in both controlled (glasshouse) and field conditions to identify useful donors as well as resistant progenies (Noble et al., 2019). Identification of genetic markers/QTLs associated with halo blight, tan spot, and bacterial leaf spot disease resistance in mungbean will accelerate the development of resistant commercial cultivars. These markers can be established through genome-wide association studies using large, diverse mungbean mapping populations representative of worldwide germplasm (Schafleitner et al., 2015; Noble et al., 2019).

*\*HR, Highly resistant; R, Resistant; MR, Moderately resistant; adopted from Pandey et al. (2018).*

#### MAJOR INSECT-PESTS AND ECONOMIC IMPACTS

Insect-pests attack mungbean at all crop stages from sowing to storage and take a heavy toll on crop yield. Some insect-pests directly damage the crop, while others act as vectors of diseases. The economically important insect-pests in mungbean include stem fly, thrips, aphids, whitefly, pod borer complex, pod bugs, and bruchids (Swaminathan et al., 2012). Stem fly (bean fly), *Ophiomyia phaseoli* (Tryon), is one of the major pests of mungbean. Other species of stem fly that infest mungbean include *Melanagromyza sojae* (Zehntner) and *Ophiomyia centrosematis* (de Meijere) (Talekar, 1990). This pest infests the crop within a week after germination and under epidemic conditions, it can cause total crop loss (Chiang and Talekar, 1980). Whitefly, *B. tabaci* is a serious pest in mungbean and damages the crop either directly by feeding on phloem sap and excreting honeydew on the plant that forms black sooty mould or indirectly by transmitting MYMD. Whitefly's latent period is less than four hours and a single viruliferous adult can transmit the MYMV within 24 h of acquisition and inoculation. The male and female whiteflies can retain the infectivity of the virus for 10 and 3 days, respectively. Further, *B. tabaci* complex consists of 34 cryptic species (Boykin and De Barro, 2014). Whitefly causes yield losses between 17 and 71% in mungbean (Marimuthu et al., 1981; Chhabra and Kooner, 1998; Mansoor-Ul-Hassan et al., 1998). Thrips infest mungbean both in the seedling and in flowering stages. The seedling thrips are *Thrips palmi* Karny and *Thrips tabaci* Lindeman and the flowering thrips are *Caliothrips indicus* Bagnall or *Megalurothrips* spp. During the seedling stage, thrips infest the seedling's growing point when it emerges from the ground, and under severe infestation, the seedlings fail to grow. Flowering thrips cause heavy damage and attack during flowering and pod formation. They feed on the pedicles and stigma of flowers. 
Under severe infestation, flowers drop and no pod formation takes place. Spotted pod borer, *Maruca vitrata* (Fab.), is a major insect-pest of mungbean in the tropics and subtropics. With an extensive host range, it is widely distributed in Asia, Africa, the Americas and Australia (Zahid et al., 2008). The pest causes a yield loss of 2–84% in mungbean amounting to US \$30 million (Zahid et al., 2008). The larvae damage all the stages of the crop including flowers, stems, peduncles, and pods; however, heavy damage occurs at the flowering stage where the larvae form webs combining flowers and leaves (Sharma et al., 1999). Cowpea aphid, *Aphis craccivora* Koch, sucks plant sap, which causes loss of plant vigour and may lead to yellowing, stunting or distortion of plant parts. Further, aphids secrete honeydew (unused sap) that leads to the development of sooty mould on plant parts. Cowpea aphid also acts as a vector of bean common mosaic virus. Bruchids are the most important storage pests of legume seeds worldwide. They infest seeds both in the field and in storage; however, major damage is caused in storage. Bruchid damage can cause up to 100% losses within 3–6 months, if not controlled (Tomooka et al., 1992; Somta et al., 2007). Twenty species of bruchids have been reported infesting different pulse crops (Southgate, 1979). Of these, the Azuki bean weevil (*Callosobruchus chinensis* L.) and cowpea weevil (*Callosobruchus maculatus* Fab.) are the most serious pests of mungbean. The cryptic behaviour of bruchids, whose grubs feed inside the legume seeds, makes it easy to spread them through international trade.

#### BREEDING FOR INSECT RESISTANCE

Identification of sources of resistance is important for the introgression of resistance into cultivars through breeding. The primary gene pool is the breeder's first choice as a source of resistance. The secondary and tertiary gene pools provide further choices of variation to be incorporated into the crop. Although a number of screening methods have been developed, the lack of uniform insect infestation across seasons and locations for some key pests, whose rearing and multiplication on artificial diets is difficult, makes screening plants against insect-pests highly challenging. For pod borers, screening in field and greenhouse conditions is generally done by releasing ten first-instar larvae on the plant placed in a net wire framed cage (40 cm in diameter, 45 cm long) under no-choice and free-choice conditions (Sharma et al., 2005). Under laboratory conditions, the easiest and most reliable technique used for screening plants for pod borer and foliage feeding insects is the detached leaf bioassay (Sharma et al., 2005). This technique is very useful to screen germplasm where antibiosis and non-preference are important components of plant resistance. Under field conditions, screening is also done by augmenting insect populations, adjusting planting dates, tagging the inflorescences and grouping plants according to maturity and height (Sharma et al., 2005). For screening against *Maruca,* plant phenology is an important criterion to be taken into consideration (Dabrowski et al., 1983; Sharma et al., 1999). Plants are screened for resistance on the basis of the number of shoots prior to flowering and the number of eggs per plant during the early stages of the crop (Oghiakhe et al., 1992). Whitefly, thrips, and cowpea aphid resistance screening in mungbean is done on the basis of the number of insects and by scoring the plants for insect damage on a visual rating scale (Taggar and Gill, 2012). Screening for bruchid resistance is done by using small plastic cups with 10–50 seeds under no-choice or free-choice conditions and releasing up to five pairs of newly emerged adults (Somta et al., 2007; Somta et al., 2008).

To breed for resistance to insect-pests, understanding plant-insect interactions is very important. Important requirements for successful breeding for insect resistance are an understanding of the biology of the insect pest, the infesting stage, and the biochemical and molecular aspects of insect-plant interactions. The role of various agro-ecological and environmental conditions along with uniform insect infestation is very important, as the evaluation techniques, insect population and plant ecology depend on these factors. Further, it is important to have an optimum population build-up of the insect-pests during the most vulnerable stage of the crop. Uniform infestation at appropriate stages of plant development plays an important role in identifying insect-resistant genotypes and in reducing or eliminating escapes (Maxwell and Jennings, 1980). The basic strategy in breeding for insect resistance is to identify resistance genes from wild/cultivated species and introgress them into improved lines through recombination, hybridization, and selection. Though conventional plant breeding has some limitations, it has contributed to significant improvement in yield and in disease and insect resistance in mungbean (Fernandez and Shanmugasundaram, 1988). Induced mutation using physical and chemical mutagens has been used in the development of insect and disease resistant varieties along with other target traits in mungbean (Lamseejan et al., 1987; Wongpiyasatid et al., 2000; Watanasit et al., 2001). Some of the techniques in conventional breeding to develop insect resistant cultivars include mass selection, pure line selection and recurrent selection (Dhillon and Wehner, 1991; Burton and Widstorm, 2001). Techniques such as backcross breeding, pedigree breeding and bulk selection are being used for developing insect resistance in mungbean along with improved agronomic traits.

#### SOURCES OF RESISTANCE AGAINST INSECT-PESTS

Host plant resistance plays an important role in crop protection against insect pests. The identification of new insect resistance sources provides breeders with avenues to breed for resistance to insect pests. The variability in the primary gene pool available to breeders could serve as an important source for various traits including insect resistance. Generally, many valuable genes that confer resistance to insect pests can be found in the wild species and/or non-domesticated crop relatives (Sharma et al., 2005). Extensive screening studies have been carried out under controlled and natural conditions to identify insect resistance sources in mungbean (**Table 3**). For stem fly, very few studies have been carried out for the identification of resistant sources in mungbean. The World Vegetable Center and the International Center for Tropical Agriculture (CIAT) identified some stem fly resistant genotypes, which have been used as potential sources in breeding for resistance against stem fly (Talekar, 1990; Abate et al., 1995). CIAT identified G 05253, G 05776, G 02005, and G 02472 as highly resistant to stem fly. Co 3 has been reported as resistant to *Ophiomyia centrosematis* (De Meijere) (Devasthali and Joshi, 1994). Some whitefly resistant sources have been identified globally and used to breed for resistance to this pest. Abdullah-Al-Rahad et al. (2018) reported Bari Mung -6 as resistant to whitefly and cowpea aphid under natural infestation. Sources of resistance to both seedling and flower thrips have been identified in mungbean under natural and artificial infestation (**Table 3**). Breeding for resistance to spotted pod borer has led to the identification of some sources of resistance in mungbean (Chhabra et al., 1988; Sahoo et al., 1989; Gangwar and Ahmed, 1991; Sahoo and Hota, 1991; Bhople et al., 2017). In mungbean, not much work has been done to identify the sources of resistance against cowpea aphid. Only a couple of resistant sources are available (Bhople et al., 2017; Abdullah-Al-Rahad et al., 2018).

Despite screening a large number of lines against bruchids, only a few resistant sources have been identified till date. These include V2709, V2802, V1128, and V2817 (Somta et al., 2008). The first bruchid resistant source was TC1966, a wild mungbean (*V. radiata* var. *sublobata* (Roxb.) Verdc.), collected in Madagascar and was used as a source of resistance (Tomooka et al., 1992; Watanasit and Pichitporn, 1996). TC1966 showed complete resistance to *C. maculatus* and *C. chinensis* and the resistant reaction was observed to be controlled by a single dominant gene, *Br* (Fujii and Miyazaki, 1987; Kitamura et al., 1988; Fujii et al., 1989). However, they found linkage drag that resulted in pod shattering in the cultivars developed using TC 1966 (Watanasit and Pichitporn, 1996). Two mungbean lines, V2709 and V2802 were identified by the World Vegetable Center with complete resistance to bruchids and have been extensively used in breeding programs to develop bruchid resistant mungbean (Talekar and Lin, 1981; AVRDC, 1991; Talekar and Lin, 1992). V2709 has been used as a source of resistance to develop three bruchid-resistant lines (Zhonglv 3, Zhonglv 4, and Zhonglv 6) in China (Yao et al., 2015) and, one bruchid-resistant variety (Jangan) in Korea (Hong et al., 2015). Somta et al. (2008) identified two mungbean cultivated lines, V1128 and V2817 as resistant to *C. maculatus*. At the World Vegetable Center, bruchid resistance from two black gram accessions, VM2011 and VM2164 was introgressed into mungbean successfully (AVRDC, 1987). Out of 101 breeding lines screened against bruchids, five lines (VC1535-11-1-B-1- 3-B, VC2764-B-7-2-B, VC2764-B-7-1-B, VC1209-3-B-1-2-B, and VC1482-C-12-2-B) were reported as tolerant to bruchids (AVRDC, 1988). Recently, World Vegetable Center has developed promising lines that are resistant to bruchids, thrips and cowpea aphid (ACIAR, 2018; ACIAR, 2019).

Among insect-pests, bruchid resistance in mungbean has been extensively studied using molecular techniques. However, QTL mapping for resistance to field insect-pests that are common in legumes has been carried out mainly in common bean and cowpea. In common bean, resistance to *Empoasca* spp. (Murray et al., 2004), *T. palmi* (Frei et al., 2005), *Apion godmani* Wagner (Blair et al., 2006) and bruchids (Blair et al., 2010), and in cowpea, resistance to *Megalurothrips sjostedti* (Trybom) (Omo-Ikerodah et al., 2008) and *A. craccivora* (Huynh et al., 2015) have been studied in detail. Stem fly resistance in mungbean has been found to be governed by additive, dominance and epistatic effects (Distabanjong and Srinives, 1985). The wild mungbean accession TC 1966, which is resistant to *C. maculatus, C. chinensis, C. analis* and *C. phaseoli*, has been widely used by breeders to develop bruchid resistant lines by crossing with agronomically superior cultivars (Fujii et al., 1989; Talekar and Lin, 1992; Tomooka et al., 1992; Somta et al., 2007). Molecular techniques have been utilized to identify bruchid resistant mungbean, locate genes that code for bruchid resistance, clone these genes and develop molecular markers for mapping bruchid resistance (Tomooka et al., 1992; Tomooka et al., 2000; Somta et al., 2008; Schafleitner et al., 2016). The molecular markers developed have increased selection efficiency and reduced the number of tests needed for screening breeding material against insect pests, including bruchids (Schafleitner et al., 2016).

TABLE 3 | Resistant sources of mungbean against insect pests.

*\*R, Resistant; MR, Moderately resistant.*

Various molecular markers such as restriction fragment length polymorphism (RFLP), RAPD, single nucleotide polymorphism (SNP) and SSR have been used to map bruchid resistance in mungbean (Young et al., 1992; Villareal et al., 1998; Chen et al., 2007; Chotechung et al., 2011); most of these studies are qualitative, and the results are based on phenotypic data. In TC1966, bruchid resistance has been mapped using RFLP (Young et al., 1992). The authors constructed a map of 14 linkage groups containing 153 RFLP markers and spanning 1,295 centiMorgans (cM), with an average distance of 9.3 cM between markers. The analysis of 58 F2 progenies from a cross between TC1966 and a susceptible mungbean cultivar showed that an individual in the F2 population possessed a bruchid resistance gene within a tightly linked double crossover; this individual was used for the development of bruchid resistant mungbean. A population derived from a cross between the cultivar Berken and ACC41 (a wild mungbean genotype, *V. radiata* subsp. *sublobata*) was used with RFLP probes to develop a linkage map (Humphry et al., 2002). The mungbean bacterial artificial chromosome libraries have been developed by *STSbr1* and *STSbr2* [polymerase chain reaction-based markers] (Miyagi et al., 2004). The authors reported close linkage in a recombinant inbred line (RIL) population from a cross between ACC41 and 'Berken'. Further, Sarkar et al. (2011) showed that *STSbr1* amplified a 225 bp fragment in a *V. sublobata* accession (sub2) and 12 other cultivars that were resistant to bruchids. Though RAPD markers are fast and simple, their distance from the bruchid resistance gene is large. RAPD markers for bruchid resistance have also been used with mapping populations of RILs and near-isogenic lines (NILs; B4P 5-3-10, B4P3-3-23, DHK 2-18, and B4Gr3-1 with bruchid resistant genes from Pagasa 5, Pagasa 3, VC 1973A and Taiwan Green, respectively, by using TC 1966 as a resistance source) (Villareal et al., 1998). NILs were differentiated by using 31 RAPD markers, of which 25 showed co-segregation in the RIL population. A RIL population obtained from crossing 'Berken' (bruchid-susceptible line) with ACC41 (bruchid-resistant line) was used to map the Br1 locus (Wang et al., 2016). Ten RAPD markers were identified by Chen et al. (2007) for bruchid resistance in 200 RILs from a cross between TC1966 and NM 92. These included UBC66, UBC168, UBC223, UBC313, UBC353, OPM04, OPU11, OPV02, OPW02, and OPW13. Out of these, four markers (OPW02, UBC223, OPU11, and OPV02) were closely linked. For bruchid resistance in mungbean, a few SSR markers have been reported. These include SSRbr1, DMB-SSR158, and GBssr-MB87 (Miyagi et al., 2004; Chotechung et al., 2011; Chen et al., 2013; Hong et al., 2015). In V2802 and TC 1966, chromosome 5 carries the DMB-SSR158 marker associated with the *Vradi05g03940-VrPGIP1* and *Vradi05g03950-VrPGIP2* genes, which code for polygalacturonase inhibitors involved in bruchid resistance (Chen et al., 2013; Chotechung et al., 2016). The major QTL in TC1966 and the DMB-SSR158 marker are <0.1 cM away from the bruchid resistance gene (Chen et al., 2013). Also, the QTL *qBr* has been reported between markers VrBr-SSR013 and DMB-SSR158 at the same position.
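The average marker spacing quoted earlier in this section for the RFLP map of Young et al. (1992) can be reproduced from the other reported figures if one assumes that the average is taken over marker intervals, so that each linkage group with k markers contributes k − 1 intervals. The source does not state how the average was computed, so this is an assumption, and the function name below is invented for illustration.

```python
# Mean marker spacing on a genetic map, counting intervals rather than markers.
# Assumption: each linkage group with k markers contributes k - 1 intervals.

def mean_marker_spacing(total_cm, n_markers, n_linkage_groups):
    """Return the mean distance in cM between adjacent markers on the map."""
    intervals = n_markers - n_linkage_groups
    if intervals <= 0:
        raise ValueError("need more markers than linkage groups")
    return total_cm / intervals


# Figures reported for the RFLP map: 1,295 cM, 153 markers, 14 linkage groups.
spacing = mean_marker_spacing(total_cm=1295, n_markers=153, n_linkage_groups=14)
print(round(spacing, 1))  # 9.3
```

Dividing the map length by the number of markers instead (1,295 / 153) gives about 8.5 cM, so the interval-based calculation is the one consistent with the reported 9.3 cM.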

The sequence-changed protein genes (SCPs) and differentially expressed genes (DEGs) retain the transcript diversity and specificity of the *Br* genes (Liu et al., 2016), and the variations in the promoters of DEGs and in SCPs can be potential markers in breeding for resistance against bruchids. Two QTLs, *MB87* and *SOPU11*, have been reported to be associated with bruchid resistance genes in a population developed from a cross between Sunhwa (susceptible) and Jangan (a resistant variety developed by backcrossing with V2709) (Hong et al., 2015). Mei et al. (2009) reported a QTL in wild mungbean ACC41 that accounts for about 98.5% of bruchid resistance.

Recently, SNP markers have gained momentum for use in breeding for pest and disease resistant plants. Their abundance and ubiquity in the genome and their ready availability for genotyping make them very useful (Brumfield et al., 2003). Further, being co-dominant, single-locus, and biallelic markers, SNPs are particularly suitable for use in breeding programs. Owing to the small genome size of mungbean (515 Mb/1C), full genome sequencing or reduced representation library sequencing is feasible and would generate many SNP markers (Moe et al., 2011). Further, SNPs have been extensively studied in breeding for resistance in mungbean against stink bug, *Riptortus clavatus*, and adzuki bean weevil, *C. chinensis* (Moe et al., 2011; Schafleitner et al., 2016). Schafleitner et al. (2016) identified the SNP markers dCAPS2, dCAPS3, CAPS1, and CAPS12 for bruchid resistance in mungbean. Despite being physically mapped to different chromosomes, these markers showed genetic linkage by co-segregating in 96.5% of the F3 families of the crosses TC 1966 × NM 92 and V2802 × NM 94. They reported that in both crosses, the QTL for bruchid resistance was mapped to chromosome 5 and the markers showed a prediction rate of 100%. Kaewwongwal et al. (2017) reported that *VrPGIP1* and *VrPGIP2*, which are tightly linked genes, confer bruchid resistance in V2709. They identified two alleles for *VrPGIP1* and *VrPGIP2* in V2709 as *VrPGIP1-1* and *VrPGIP2-2*, respectively.

Next generation sequencing (NGS) technologies are being utilized to develop SNPs for genotyping several traits, and they yield far larger numbers of transcripts than the cloning and Sanger sequencing approaches in plants and animals. The genetic complexities of various traits including resistance to biotic and abiotic stresses are being studied using genotyping by sequencing (GBS) methods. Some of the areas in which GBS has been utilized include purity testing, genetic mapping, MAS, marker-trait associations, and genomic selection (Schafleitner et al., 2016). Schafleitner et al. (2016) used GBS technology on populations derived from crossing TC1966 (a bruchid resistant wild mungbean accession) and V2802 (a cultivated mungbean accession) with the bruchid susceptible lines NM 92 and NM 94. A total of 32,856 SNPs were obtained, out of which 9,282 SNPs were scored in RIL populations. Finally, 7,460 SNP sequences were aligned to 11 chromosomes and 1,822 were aligned to scaffold sequences. SuperSAGE in combination with NGS has been applied to study biotic and abiotic stress resistance/tolerance in some legumes (Rodrigues et al., 2012; Almeida et al., 2014); however, such combinations have not been studied in detail for insect resistance. The RNAseq technique is very important for studying pest and disease resistance in plants in a given situation. In RNAseq, all the transcripts expressed in response to pest pressure are sequenced; the approach is highly powerful because the transcriptomes can be assembled *de novo* and can also be used to compare the expression of genes under different insect pressures. Additionally, RNAseq can be used to study the simultaneous expression of genes both in the plant and in the pest in a given situation (Liu et al., 2012). 
Genome-wide transcriptome profiling techniques provide the expression of a huge number of genes in response to insect damage; however, it is challenging to identify which of them are involved in resistant plant phenotypes. Studies on the co-localization of these genes with QTLs and functional genomics have been quite helpful; however, it will be critical to study the generation and application of high-throughput reverse genetic platforms. Though functional genomics is applied to understand the genetic basis of resistance and is implicated in breeding for resistance against insect-pests, further in-depth investigations are needed to stabilize insect resistance in mungbean. Furthermore, although the identification of molecular markers linked to genes/QTLs controlling insect-pest resistance has been studied in many legumes, these markers have been used in MAS breeding in only a few cases, the main constraint being the large distance between the markers and the gene/QTL controlling resistance (Shi et al., 2009; Schafleitner et al., 2016).
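
The SNP counts quoted above from Schafleitner et al. (2016) can be cross-checked with simple arithmetic. The short sketch below is only an illustration: the variable and function names are hypothetical, and the only data used are the four counts stated in the text.

```python
# SNP counts exactly as quoted in the text (Schafleitner et al., 2016).
total_snps = 32_856             # SNPs obtained by GBS
scored_in_rils = 9_282          # SNPs scored in the RIL populations
aligned_to_chromosomes = 7_460  # scored SNPs aligned to the 11 chromosomes
aligned_to_scaffolds = 1_822    # scored SNPs aligned to scaffold sequences


def percentage(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# The chromosome-aligned and scaffold-aligned SNPs should add up to the scored set.
aligned_total = aligned_to_chromosomes + aligned_to_scaffolds
counts_are_consistent = aligned_total == scored_in_rils

# Share of all SNPs that were scored, and share of scored SNPs placed on chromosomes.
scored_pct = percentage(scored_in_rils, total_snps)
chromosome_pct = percentage(aligned_to_chromosomes, scored_in_rils)

print(counts_are_consistent, scored_pct, chromosome_pct)
```

The check confirms that the two alignment classes together account for all scored SNPs, so roughly a quarter to a third of the discovered SNPs were usable for mapping in these populations.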

#### ABIOTIC STRESSES IN MUNGBEAN

Abiotic stresses negatively influence plant growth and productivity and are the primary cause of extensive agricultural losses worldwide (Arun and Venkateswarlu, 2011; Ye et al., 2017). Reduction in crop yield due to environment variations has increased steadily over the decades (Boyer et al., 2013). Abiotic stresses include extreme events and factors related to atmosphere (heat, cold, and frost); water (drought and flooding); radiation (UV and ionizing radiation); soil (salinity, mineral or nutrient deficiency, heavy metal pollutants, pesticide residue, etc.) and mechanical factors (wind,
soil compaction) (HanumanthaRao et al., 2016). Crops utilize resources (light, water, carbon and mineral nutrients) from their immediate environment for their growth. The microenvironment and the management practices of cultivation influence crop growth and development directly (**Figure 1**). Climate change further adds to the complexity of plant-environment interactions (Goyary, 2009). Eco-physiological models that integrate the understanding of crop physiology and crop responses to environmental cues from detailed phenotyping are therefore used to understand the impact of environmental factors on crop growth and development, predict yield/plant response and also assist in developing management strategies (**Figure 2**) (APSIM: Chauhan et al., 2010; MungGro: Biswas et al., 2018). The plant response to abiotic stress at the cellular level is often interconnected (Beck et al., 2007), leading to molecular, biochemical, physiological and morphological changes that affect plant growth, development and productivity (Ahmad and Prasad, 2012). Several crop production models project a reduction in the yields of major agricultural crops, mostly due to climate change (Rosenzweig et al., 2014), which tends to make the crop growth environment unfavorable through abiotic stresses. Such efforts in crops like mungbean are rare and require special attention. In the current era, environmental stresses are a menace to global agriculture and there is a need to emphasize trait based breeding to ensure yield stability across locations as well as crop seasons. Efforts are underway to develop new tools for understanding possible mechanisms related to stress tolerance and for identifying stress tolerance traits to promote sustainable agriculture (Cramer et al., 2011; Fiorani and Schurr, 2013). Basic tolerance mechanisms involve the activation of different stress-regulated genes through integrated cellular as well as molecular responses (Latif et al., 2016). 
Plants respond to their immediate surroundings in diverse ways, which help the cells to adapt and achieve cellular homeostasis, manifested in the phenotypes of plants under a particular environment (James et al., 2011). While breeding lines are regularly phenotyped for easily visible traits including growth and yield components, many traits that contribute to stress tolerance are ignored. This is largely due to the difficulty of measuring these traits precisely and rapidly. Hence, recent phenotyping tools deploy image capture and automation in advanced plant phenotyping platforms. These recent efforts are expected to boost efforts to translate the basic physiology of crop plants into products of practical value to support breeding programs in harsh environments (viz., stresses like salinity, soil moisture, extreme temperatures, etc.), as explained in the following sections.

#### SALINITY

In agriculture, soil salinity has been a threat in some parts of the world for over 3000 years (Flowers, 2006) and it has been aggravated by irrigation water sourced through surface irrigation in arid and semi-arid environments (HanumanthaRao et al., 2016). As in most crops, salt stress reduces seed germination, fresh and dry biomass, shoot and root length, and yield attributes of mungbean (Promila and Kumar, 2000; Rabie, 2005; Ahmed, 2009). It affects root growth and elongation, thereby hampering nutrient uptake and distribution. Root growth was significantly reduced at higher sodium chloride (NaCl) concentrations. Nevertheless, BARI Mung4 performed better at higher NaCl concentrations with respect to a yield-contributing character. The number of nodules per plant decreased with increasing salinity, although the nodule size increased (Naher and Alam, 2010). Being polygenic in nature, salinity tolerance is a genotype-dependent and growth stage-specific phenomenon; therefore, tolerance at an initial (seedling) stage may not be corroborated by tolerance at later growth (maturity) stages (Sehrawat et al., 2013). It also involves multidimensional responses at several organ levels in plants (e.g., tissue, molecular, physiological and plant canopy levels) (HanumanthaRao et al., 2016). Because of this complexity and the lack of appropriate techniques for introgression, little progress has been achieved in developing salt-tolerant mungbean varieties over the years (Ambede et al., 2012; HanumanthaRao et al., 2016). Appreciable improvement in the salt tolerance of important crops (barley, rice, pearl millet, maize, sorghum, alfalfa, and many grass species) has been attained in the past, but not in legumes in general and mungbean in particular (Ambede et al., 2012). Rapid screening methods are required to identify putative donor parents in a breeding program (Saha et al., 2010). In a comprehensive study, Manasa et al. (2017) screened 40 mungbean lines sourced from the World Vegetable Center for salinity tolerance using the Salinity Induction Response (SIR) technique at the seedling level, as well as at the whole plant level by canopy phenotyping assay, under 150 and 300 mM NaCl stress scenarios. The results showed a marked reduction in the growth and yield performance of both tolerant and susceptible lines, but a few lines displayed relatively better biomass and pod yield, on par with non-stressed control plants. The intrinsic ability of tolerant lines to partition salt into the vacuole (greater influx of Na+ ions) when the salt concentration in the cytosol is high could be one of the reasons for their tolerance. Based on the extent of salt tolerance at both seedling and whole plant stages, a few salt tolerant lines (EC 693357, 58, 66, 71, and ML1299) were identified (Manasa et al., 2017) for further validation under field conditions.

#### SOIL MOISTURE STRESS

The response of legumes to the onset of drought varies, and the final harvestable yield can be significantly reduced (Nadeem et al., 2019). Global climate change makes drought episodes, and their effect on crop yields, increasingly erratic and difficult to predict. Being grown on marginal lands, mungbean is largely considered drought tolerant (able to grow with limited soil moisture). However, like any other plant, it responds to a decrease in available soil moisture by reducing its growth and hence its productivity. One experiment showed that a 30% decrease in water supply relative to the optimum for crop growth results in a nearly 20% decrease in seed weight per plant if the soil moisture stress is imposed around the vegetative stage. Plants subjected to stress during flowering showed a 50 to 60% decrease in seed yield (Fathy et al., 2018). Soil moisture stress did not affect the number of pods per plant as severely as it affected seed weight or biomass per plant in this experiment, clearly indicating that seed formation or filling is the process most sensitive to soil moisture stress. It has also been suggested that dry matter partitioning is a potential screening trait for drought tolerance in mungbean (Hossain et al., 2010; Nadeem et al., 2019). When drought stress was severe enough to reduce plant biomass per m2 from 359 to 138 g, the resultant reduction in pod number was nearly 50% and that in seed yield was nearly 60% relative to well-watered plants (Kumar and Sharma, 2009).
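
To put the biomass figures quoted from Kumar and Sharma (2009) on the same footing as the percentage losses in pod number and seed yield, the relative reduction can be computed directly. The following is a minimal sketch; the function name is hypothetical and the only inputs are the two biomass values given above.

```python
def percent_reduction(control, stressed):
    """Relative loss of a trait under stress, as a percentage of the control value."""
    if control <= 0:
        raise ValueError("control value must be positive")
    return 100.0 * (control - stressed) / control


# Plant biomass (g per m2) for well-watered and drought-stressed plants, as quoted in the text.
biomass_reduction = percent_reduction(359.0, 138.0)
print(f"Biomass reduction under drought: {biomass_reduction:.1f}%")
```

On these figures the loss in biomass is of the same order as the reported loss in seed yield (nearly 60%) and larger than the loss in pod number (nearly 50%).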

Decreases in total plant dry weight and harvest index were the main reasons for reduced seed yield due to drought stress in mungbean (Sadasivan et al., 1988; Thomas et al., 2004). Significant reductions in pod initiation and pod growth rates were the major responses to soil moisture stress during flowering and pod-filling stages (Begg, 1980). Water stress during flowering results in reduced yield mainly due to flower abscission (Moradi et al., 2009). The relative water content in leaves and the partitioning of biomass have been cited as traits contributing to drought tolerance in summer mungbean (Kumar and Sharma, 2009). Yield losses of 31-57% at flowering and 26% at post flowering/podding stages in mungbean due to drought stress were reported by Nadeem et al. (2019). The drought-induced imbalance between electrons produced and consumed during photosynthesis gives rise to harmful superoxide radicals, which have been cited as a major cause of damage at the cellular level. Hence, key factors that can alleviate oxidative stress are the focus of research for alleviating drought stress. Recent studies infer that the alleviation of drought-caused oxidative stress depends largely on the status of the ascorbic acid and glutathione pools in their reduced and oxidized states (Anjum et al., 2015). There is a need to explore genetic variation for these traits and the possibility of introgressing the relevant genes for improving drought tolerance in mungbean. Decreased leaf water potential was associated with reduced activity of nitrogenase, glutamine synthetase, asparagine synthetase, aspartate aminotransferase, xanthine dehydrogenase and uricase, which are associated with nitrogen fixation (Kaur et al., 1985). New insights into the roles of these metabolites and enzymes can be obtained through recently developed metabolomics approaches.

Water stress-induced inhibition of hypocotyl elongation is more conspicuous in separated cotyledons than in intact ones. It is necessary to check whether larger cotyledons can be a solution for better plant establishment under soil moisture stress. When two mungbean genotypes exhibiting more than two-fold variation in leaf water loss were explored for genetic variation in their physiological and molecular responses to drought, efficient stomatal regulation was observed in the water saving, low leaf water loss (LWL) genotype (Raina et al., 2016). The stomatal closure under drought was accompanied by a concomitant downregulation of the farnesyl transferase gene in this genotype. However, other genotypes had a cooler canopy temperature facilitated by a branched root system that allowed better extraction of soil moisture (Raina et al., 2016). These mechanisms and traits of mungbean are suitable for harsh environments but need to be prioritized based on the type of drought and agro-ecological features. Other key physiological traits, viz., water use efficiency, root growth/biomass, carbon isotope discrimination (∆13C) and leaf temperature (canopy temperature difference), may be beneficial for screening mungbean for drought tolerance.

#### HIGH TEMPERATURE OR HEAT STRESS AND INCREASING ATMOSPHERIC CARBON DIOXIDE (CO2)

Of the various environmental stresses that a plant can experience, temperature has the widest and most far-reaching effects on legumes. Temperature extremes, both high (heat stress) and low (cold stress), are injurious to plants at all stages of development, resulting in severe loss of productivity. Legumes such as chickpea, lentil, mungbean, soybean, and peas show varying degrees of sensitivity to high and low-temperature stresses, which reduce their potential performance at different developmental stages such as germination, seedling emergence, vegetative phase, flowering, and pod/seed filling phase (HanumanthaRao et al., 2016; Sharma et al., 2016). The optimum temperature for growth and development of mungbean is 28–30°C and the range under which the plant continues to develop seed is 33–35°C. Each degree rise in temperature above the optimum reduces the seed yield by 35–40% relative to plants grown under optimum temperature (Sharma et al., 2016).

Temperatures >45°C, which often coincide with the flowering stage, can lead to flower abortion and yield losses. Sharma et al. (2016) evaluated the effect of high temperature on the vegetative and reproductive performance of different mungbean lines using Temperature Induction Response (TIR) and physiological screening techniques at seedling and whole plant levels. The promising tolerant lines were shortlisted for further investigation at the whole plant level. These lines were grown outdoors in containers under full irrigation and screened for growth and yield traits at two sowings: normal sowing (NS), where day/night temperatures during the reproductive stage were <40/28°C, and late sowing (LS), where temperatures were higher (>40/28°C). The leaves of LS plants showed symptoms of leaf rolling and chlorosis, and accelerated phenology led to a marked reduction in leaf area, biomass, flowers and pods. Interestingly, shortening of the flowering and podding duration was also observed.

To address the ever-fluctuating temperature extremes that various legumes are exposed to, efforts are being made to develop heat-tolerant varieties through conventional breeding methods (exposing breeding lines to open-air growing seasons with high temperature episodes, either throughout the growth stages or specifically during the flowering or reproductive phase) in order to select promising tolerant lines. These shortlisted entries are subsequently subjected to varied growing environments that coincide with drier/heat periods for confirmatory validation, to identify truly tolerant genotypes for use in heat stress breeding programs. With the advancement of the 'omics' era, phenomics platforms (phenotyping) can conveniently be applied to screen field-shortlisted or promising subsets of candidates under more precisely controlled high-temperature regimes (at customized growth periods) to identify truly tolerant types along with their expressed plant architectures. Tolerance to suboptimal temperatures has not been studied extensively in crops like mungbean. However, to improve the grain yield of this crop in hilly areas or at higher latitudes, it is necessary to introgress traits associated with cold or low-temperature tolerance.

Increasing atmospheric CO2 concentration along with temperature also poses a constraint to plant growth and development, which would be more pronounced in C3 plant species (like mungbean) than in C4 species. Some physiological functions (activation of carboxylating enzymes, photosynthetic rates, cell expansion, carbohydrate synthesis, etc.) will be enhanced, with an impact on leaf area and biomass. Improved biomass by virtue of increased leaf expansion may not always result in higher yield levels. However, in mungbean, higher pod and seed yields were documented when a few high temperature tolerant genotypes were exposed to elevated CO2 of 550 ppm compared to ambient CO2 of 400 ppm (Bindumadhava et al., 2018). Nevertheless, the molecular mechanism governing aggravated metabolic functions at different growth stages is still unclear, and the possibility of employing CO2 fertigation as a breedable trait needs more research attention in the context of the changing global climate.

#### WATERLOGGING

Anthropogenic studies reveal that the frequency and severity of flooding events increase with climate change (Arnell and Liu, 2001). Waterlogging adversely affects germination, seedling emergence and growth, crop establishment and root and shoot growth (Bailey-Serres and Voesenek, 2008; Toker and Mutlu, 2011). Heavy rains during the pod ripening stage result in premature sprouting, leading to inferior seeds. Mungbean is predominantly cultivated in rice-fallow systems and is sensitive to waterlogging (Singh and Singh, 2011). Excess rainfall in such cultivation systems can result in waterlogging, wherein roots are completely immersed in water and shoots are (sometimes) partially or fully submerged. Ahmed et al. (2013) highlighted biochemical mechanisms, *viz.*, increased availability of soluble sugar, enhanced enzymatic activity of the glycolytic pathway, antioxidant defense mechanisms, and altered aerenchyma formation, that help plants withstand waterlogging. In addition to causing oxygen deficiency, waterlogging can alter the mineral nutrient composition accessible to plants, which needs to be considered during genetic crop improvement (Setter et al., 2009). Spring grown crops are more prone to water stress as the rainfall is scanty and farmers mostly prefer to grow this crop on residual moisture. Therefore, cultivating short duration cultivars may help in escaping terminal moisture stress (Pratap et al., 2013).

#### BREEDING FOR ABIOTIC TRAITS

At the plant level, there were several satisfying attempts in mungbean to screen and identify tolerant types for high temperature (heat stress), salinity, waterlogging, and water stress from physiological, biochemical, and molecular perspectives (Kaur et al., 2015; HanumanthaRao et al., 2016; Bhandari et al., 2017; Manasa et al., 2017; Sehgal et al., 2018). The breeding lines selected and identified for these aforementioned stresses would form a panel of donor resources for future trait-navigated crop improvement (**Table 4**).

The initial phase of breeding in mungbean resulted in the selection of a few locally adapted germplasm lines, mainly for biotic stress resistance and high yield. While selection for abiotic stress resistance was not practiced directly, selection for yield, plant type, and adaptation related traits indirectly led to selection for abiotic stress resistance as well. Selection has been a useful strategy to identify superior cultivars with significant drought tolerance. Warm season food legumes generally encounter two types of drought stress: (i) terminal drought, which is more prominent in summer/spring crops, usually coincides with the late reproductive stage and increases towards the generative stage, and (ii) intermittent drought, which may occur anytime during vegetative growth and results from a break in rainfall or insufficient rains at the vegetative stage. The ranking of warm season food legumes in increasing order of drought resistance was soybean, followed by blackgram, mungbean, groundnut, bambara nut, lablab bean and cowpea (Singh et al., 1999). Fernandez and Kuo (1993) used a stress tolerance index (STI) to select genotypes with high yield and tolerance to temperature and water stresses in mungbean. Singh (1997) described the plant type of mungbean suitable for Kharif (rainy) as well as dry (spring/summer) seasons. Pratap et al. (2013) also suggested the development of short duration cultivars for spring/summer cultivation so that these escape terminal heat and drought stress. Cultivars with a 60–65 day crop cycle, determinate growth habit, high harvest index, reduced photoperiod sensitivity, fast initial growth, longer pods with more than 10 seeds/pod and large seeds are more suitable for the summer season. Against this backdrop, a number of early maturing mungbean lines have been selected and released as commercial cultivars.
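
The stress tolerance index mentioned above can be illustrated with a small calculation. The sketch below assumes the widely used definition STI = (Yp × Ys) / (mean Yp)², where Yp and Ys are the yields of a genotype under non-stress and stress conditions and mean Yp is the average non-stress yield over all genotypes; whether exactly this form was applied in the cited study is an assumption. The genotype names and yield values are hypothetical and serve only to show how the index ranks entries.

```python
from statistics import mean


def stress_tolerance_index(yield_nonstress, yield_stress):
    """Compute STI = (Yp * Ys) / (mean Yp)^2 for each genotype.

    Both arguments are dicts mapping genotype name -> yield, and must
    share the same set of genotype names.
    """
    if yield_nonstress.keys() != yield_stress.keys():
        raise ValueError("both yield tables must cover the same genotypes")
    mean_yp = mean(yield_nonstress.values())
    return {
        genotype: yield_nonstress[genotype] * yield_stress[genotype] / mean_yp ** 2
        for genotype in yield_nonstress
    }


# Hypothetical yields (arbitrary units) for three illustrative genotypes.
yp = {"A": 10.0, "B": 12.0, "C": 8.0}   # non-stress environment
ys = {"A": 8.0, "B": 4.0, "C": 7.0}     # stress environment

sti = stress_tolerance_index(yp, ys)
ranking = sorted(sti, key=sti.get, reverse=True)  # highest STI first
print(sti, ranking)
```

Because the index multiplies the two yields, it favours genotypes that perform well in both environments: in this made-up example genotype B has the highest potential yield but ranks last because its yield collapses under stress.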

#### RNAi TECHNOLOGY: BIOTIC AND ABIOTIC STRESS RESISTANCE

TABLE 4 | Tolerant/resistant sources of mungbean against abiotic stresses.

Though conventional breeding strategies have helped breeders to produce disease and insect resistant, and high yielding varieties, the challenges in conventional breeding make it time-consuming and often lead to the transfer of undesired traits along with desired traits. Further, the functional analysis of candidate genes that code for physiological and biochemical pathways in plants responsible for resistance against diseases and insect-pests has been studied in detail in legumes. However, these studies are limited in mungbean. To further advance the functional genomic analysis of plants, gene silencing technologies using RNA interference (RNAi) or virus-induced gene silencing have been developed to study the expression or inhibition of candidate genes (Wesley et al., 2001). RNAi technology offers a new and innovative potential tool for plant breeding for resistance/tolerance to biotic and abiotic stresses through the introduction of small non-coding RNA sequences that are able to regulate gene expression in a sequence-specific manner (**Figure 3**; Dubrovina and Kiselev, 2019). The suppression of expression of a specific gene provides an opportunity to remove or accumulate a specific trait in plants that would lead to biochemical or phenotypic changes, which in turn provide resistance/tolerance to plants against biotic and abiotic stresses. Furthermore, RNAi-mediated gene silencing techniques can be used by plant breeders to suppress genes fully or partially using specific promoters and construct design (Senthil-Kumar and Mysore, 2010). In RNAi technology, the candidate gene activity is disrupted and/or silenced in a sequence-specific manner by introducing constructs that generate double-stranded RNAs (Dennis et al., 1999). Though this technology is generally used as a pest and disease control strategy directed at the pest, plant-mediated or host-induced RNAi (HI-RNAi) can be used to engineer crop plants with a hairpin RNAi vector that produces dsRNA targeting insect and pathogen genes. 
When the insect feeds on the plant parts, the entry of dsRNA into the insect gut will induce RNAi activity and silence the target gene in the insect pest (Zha et al., 2011). Further, RNAi can be used to alter the expression of plant genes involved in resistance against diseases (Senthil-Kumar and Mysore, 2010) and abiotic stresses (Abhary and Rezk, 2015). Haq et al. (2010) studied the silencing of complementary-sense virus genes involved in MYMV replication in soybean by targeting a complementary-sense gene (AC1) encoding Replication Initiation Protein (Rep) against Mungbean yellow mosaic India virus. Similarly, Kumar et al. (2017) used RNAi technology with three different intron hairpin RNAi constructs to generate cowpea plants resistant to MYMV. RNAi technology has been used against a number of insect-pests such as *H. armigera* by targeting the *CYP6AE14* gene 9 (Mao et al., 2007). When transcriptional factor genes of *H. armigera* were targeted by HI-RNAi, a significant reduction in mRNA and protein levels was observed that resulted in deformed larvae and larval mortality (Xiong et al., 2013). Additionally, this technology has been implicated in increasing the production of unique secondary metabolites, increasing the shelf life of fruits, improving crop yield and improving insect and disease resistance (Abhary and Rezk, 2015). Sunkar and Zhu (2004) reported that in *Arabidopsis* plants, miRNAs are involved in tolerance against abiotic stresses including cold, drought, and salinity. They further showed that exposure to higher salinity levels, dehydration, cold, and abscisic acid upregulated the expression of miR393. While RNAi technology can be used to improve biotic and abiotic stress resistance/tolerance in mungbean, large-scale field studies are needed to study any potential risks of this technology.

exogenous RNAs transported into the cytoplasm. (C) The dsRNA or hpRNA molecules are recognized by a ribonuclease, DICER-like (DICER), which cleaves the dsRNA into siRNAs. (D) The siRNAs are then incorporated in the RNA-induced silencing complex (RISC) that guides sequence-specific degradation or translational repression of homologous mRNAs. (E) The components of the siRNA/mRNA complex can be amplified into secondary siRNAs by the action of RNA-dependent RNA-polymerase (RdRP). (F) Movement of the RNA silencing signal between plant cells and through the vasculature. Dashed arrows depict different steps of the RNAi induction process and dsRNA/siRNA movement between plant cells and plant pathogens. The solid arrow depicts the RdRP-mediated amplification of siRNA. Red arrows depict the local and systemic movement of the RNA silencing signal in the plant (From Dubrovina and Kiselev, 2019).

#### BREEDING CONSTRAINTS FOR DEVELOPING BIOTIC/ABIOTIC STRESS RESISTANT/TOLERANT MUNGBEAN

In breeding for resistance to biotic and abiotic stresses in legumes, the important factors taken into consideration include the genetic distance between the resistant source and the cultivars to be improved, the screening methodology, the inheritance pattern and the resistance traits to be improved. The genetic diversity and the genetic distances between cultivars and the resistance sources can be integrated in breeding approaches such as gene pyramiding (Kelly et al., 1998; Kim et al., 2015). Important breeding approaches such as the pedigree and single seed descent methods are used to transfer major resistance alleles and QTLs between cultivars and elite breeding lines. However, increased genetic distance between the source and the cultivars leads to segregation of characters, which can be reduced by repeated backcrossing such as inbred-backcrossing, recurrent backcrossing, or congruity backcrossing (i.e., backcrossing alternately with either parent). During the early stages of a breeding program for disease and insect resistance, when resistance alleles and QTLs are introgressed from wild populations, recurrent or congruity backcrossing or their modifications are highly important. Although gamete selection using multiple-parent crosses (Asensio-S.-Manzanera et al., 2005, 2006) and recurrent selection (Kelly and Adams, 1987; Singh et al., 1999; Terán and Singh, 2010), respectively, could be effective, their use in legumes, where a large number of pollinations are required, may not be feasible.

Linkage drag is one of the important challenges in developing disease or insect resistant cultivars, especially when wild sources are used as donors. To reduce linkage drag, repeated backcrosses are needed (Keneni et al., 2011). Deployment of wild germplasm in resistance breeding, which is an important source of resistance for introgression into commercial cultivars, is often impeded by undesirable genetic linkages, which may result in the co-inheritance of undesired and desired traits and may affect seed quality, germination and other traits (Edwards and Singh, 2006; Acosta-Gallegos et al., 2008; Keneni et al., 2011). Breeding for resistance to diseases and insect-pests is easier where resistance is controlled by a single gene than where it is multigenic (Miyagi et al., 2004; Somta et al., 2008; War et al., 2017). Multigenic disease and insect resistance with low dominance may result in the transfer of undesirable traits such as leaf size, seed texture, and color along with the desired traits (Edwards and Singh, 2006). Crossing over between homologous chromosomes during meiosis is important to transfer the genes controlling desired traits and to overcome linkage drag. For this, large F2 populations need to be grown to increase the recovery of new recombinants arising from crossing-over.
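
The value of repeated backcrossing in removing donor genome can be illustrated with the standard expectation that, for loci unlinked to the target gene and in the absence of selection, the average proportion of recurrent-parent genome is 1 − (1/2)^(n+1) after n backcrosses (n = 0 being the F1). The sketch below uses a hypothetical function name and is only a genome-wide average; donor segments tightly linked to the resistance gene, which are the cause of linkage drag, are lost more slowly than this expectation suggests.

```python
def recurrent_parent_fraction(n_backcrosses):
    """Expected average fraction of recurrent-parent genome after n backcrosses.

    Assumes loci unlinked to the target gene and no selection on the
    genetic background; n_backcrosses = 0 corresponds to the F1.
    """
    if n_backcrosses < 0:
        raise ValueError("number of backcrosses cannot be negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


# Expected recovery of the recurrent-parent genome from the F1 up to BC6.
recovery = {n: recurrent_parent_fraction(n) for n in range(0, 7)}
for n, fraction in recovery.items():
    label = "F1" if n == 0 else f"BC{n}"
    print(f"{label}: {100 * fraction:.2f}% recurrent-parent genome")
```

Each additional backcross halves the remaining donor genome on average, which is why several backcross generations are usually needed before wild-derived background is reduced to a small fraction.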

Another very important factor impeding breeding for resistance is the development of various strains by a pathogen in the case of diseases, and biotypic variation in the case of insect-pests. Plant genotypes that are resistant to one pathogen strain or insect biotype may be susceptible to another strain of the same pathogen or another biotype of the same insect. Insect biotypes reflect genetic variability within a pest population. Biotypes are morphologically similar; however, their biological traits vary. The emergence and spread of whitefly-transmitted viruses are attributed to the evolution of virus strains, the development of aggressive biotypes and the increase in whitefly populations (Chiel et al., 2007). While studying the MYMV begomoviruses infecting mungbean and their interaction with *B. tabaci* in India, Nair et al. (2017) found that the MYMV resistant variety NM 94 was susceptible to the disease in different locations. The MYMV strains identified were MYMV-Urdbean, MYMV-*Vigna* and MYMIV. They further identified that three cryptic species of *B. tabaci* are responsible for spreading MYMD. The cryptic species of whitefly included Asia II 1 (dominant in Northern India), Asia II 8 (dominant in most of Southern India) and Asia 1 (present in the Hyderabad, Telangana, and Coimbatore, Tamil Nadu locations of Southern India). Gene pyramiding, the incorporation of multiple resistance genes in a cultivar, is seen as an alternative for breeding for disease/insect resistance against several strains/biotypes.
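
The practical cost of gene pyramiding grows quickly with the number of genes combined. As a rough illustration, assume that the resistance genes are unlinked and that the desired F2 plant is homozygous for the resistance allele at every locus; such a plant then occurs with frequency (1/4)^k for k loci, and the smallest population giving at least one such plant with probability p is the smallest N satisfying 1 − (1 − f)^N ≥ p. The function name and the 95% default below are hypothetical choices, and linkage or marker-assisted selection in earlier generations would change these numbers.

```python
import math


def min_f2_population(n_loci, confidence=0.95):
    """Smallest F2 population expected to contain at least one plant that is
    homozygous for the desired allele at all n_loci unlinked loci, with the
    given probability."""
    if n_loci < 1:
        raise ValueError("n_loci must be at least 1")
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    target_frequency = 0.25 ** n_loci  # frequency of the desired genotype in the F2
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - target_frequency))


# Required F2 population sizes for pyramiding one, two or three unlinked genes.
sizes = {k: min_f2_population(k) for k in (1, 2, 3)}
print(sizes)
```

The steep increase with each added locus is one reason why pyramiding programs usually rely on linked molecular markers to enrich for the desired genotypes rather than on phenotypic screening of very large populations.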

Though there have been several continued attempts to develop crop varieties/genotypes for a specific biotic or abiotic stress, success on a larger scale has been limited owing to the combined impact of several stresses and unexpected sudden episodes of pests and diseases across the growth stages of the plants; hence, only a few successes have been reported in legumes, more so in cereals. Identifying the critical stage of crop growth to target in breeding itself needs a thorough assessment, be it seed germination, early vigour or field establishment, the vegetative phase, flowering and early podding to podding stage, or reproductive to final maturity stages. Across this array of developmental stages, pinning down a specific stage and the most influential trait for breeding is very challenging, though several strategies have centered on the flowering and reproductive phase (termed 'sensitive') with the objective of developing breeding lines that withstand stress load and produce relatively better pod and seed yield.

#### FUTURE OUTLOOK

Though a number of disease-resistant lines have been developed for yellow mosaic, powdery mildew, and CLS, very few resistant sources are available for anthracnose, dry root rot and bacterial diseases. Further, the molecular markers developed for powdery mildew and CLS need to be used in breeding programs to develop further disease-resistant lines, and markers for dry root rot and anthracnose need to be developed to fast-track the development of resistant lines. Sources of resistance to a few insects, such as bruchids and whiteflies, are available and are being used in breeding programs to develop insect-resistant mungbean. However, undesired traits may be introgressed from these resistant sources into the cultivars. To obtain stable resistance to a specific disease or pest in mungbean, synergy between conventional breeding techniques and molecular technologies is very important (Kim et al., 2015; Schafleitner et al., 2016). Identification of molecular markers will help in evaluating disease and pest resistance and will reduce dependency on phenotypic data, which can be laborious to collect in big trials (Kitamura et al., 1988; Chen et al., 2007). Further, molecular markers can help to transfer insect resistance from related legumes such as black gram into mungbean. It is also very important to identify multiple resistance genes and combine them in the same cultivar. Gene pyramiding should therefore be a target for breeders seeking to develop mungbean with resistance to diseases and insect-pests and to avoid strain/biotype development. The mechanisms of disease and insect resistance need to be studied to identify herbivore- and pathogen-specific signal molecules and their mode of action. Furthermore, RNAi technology can be used to improve biotic stress resistance in mungbean. However, to establish RNAi technology as a potential pest management strategy in plant breeding, large-scale field studies are essential. Further, the potential risks of this technology need attention.

Breeding mungbean lines for stressful environments is very important. While a particular stress dominates in some environments, many agroecologies are characterized by multiple stresses. This often makes a particular agroecology unique, so systematic solutions are essential. To choose the best combination of abiotic stresses to target and traits to incorporate, it is essential to have insight into the fundamental mechanisms of stress tolerance from intrinsic physiological and biochemical perspectives. We aim to develop root systems that help plants withstand moisture deficits by drawing water from deeper soil layers. Screening for various abiotic stresses needs to be more precise and stringent to identify robust donor/s for these traits, and the identified donors need to be put to use by breeders at a faster pace. Plant type/s with a deep root system, early maturity, erect stature with sympodial pod bearing, multiple pods per cluster, and longer pods with many nodes and shorter internodes will help in withstanding heat- and drought-related stresses. Of late, the convergence of modern technologies such as infrared thermography, automated robotics, camera imaging, and computational algorithms, which together make up high-throughput phenotyping facilities (phenomics and phenospex), can facilitate high-throughput phenotyping for stress tolerance (Pratap et al., 2019b). However, the non-destructive methods used for targeted regions or environments need optimization to establish the relationship between known difficult-to-measure traits and the surrogate parameters derived from images, which represent plant responses to abiotic stresses. These phenomics methods can help to precisely quantify plant shoot architectural responses to stresses caused by soil moisture deficit, salinity, high temperature, etc. More than a dozen image parameters have been described to illustrate the responses of plants to stress; these can guide the identification of relevant traits and of protocols for screening large numbers of breeding lines or mapping populations aimed at identifying stress tolerance genes. As is evident from the published literature, some traits, such as high photosynthesis or quantum yield, have been associated with tolerance to drought, salinity or high temperature. Generally, this is attributed to the capacity of plants to maintain tissue water balance, reflected by relative water content, and to stress avoidance mechanisms. However, it is essential to look into traits such as the capacity to retain physiological function, for example, even at 50% of optimum relative water content. Such traits are not feasible to apply in plant breeding programs with conventional approaches. However, plant phenomics platforms allow non-destructive measurement of physiological functions, such as chlorophyll fluorescence-based assessment of the PS-II system. They are also equipped with NIR-based tools to assess tissue water status non-destructively in plants subjected to stress. These tools allow measurement of PS-II system health at given levels of tissue water content and hence of true tolerance to stresses such as soil moisture deficit, salinity and high temperatures. Further, mechanisms of escape from abiotic stresses like drought and high temperatures have been extensively explored in many crops to obtain optimum yield in stress-prone agroecologies. However, there is scope for exploring diurnal escape from stress, in which plants exhibit water-saving mechanisms during peak stress hours of the diurnal cycle yet keep their stomata open long enough to capture sufficient ambient CO2. It is possible to quantify such traits by strategically employing phenomics tools such as infrared imaging systems. High night temperatures are likely to enhance respiratory loss of assimilates; however, there are no methods to measure these traits, and it is essential to devise tools/protocols for these measurements in either high- or semi-high-throughput modes. Since mungbean is grown largely in marginal environments or in a short window between the harvest of the preceding crop and the sowing of the subsequent crop, it is essential to assess recovery from stress and performance in terms of seed yield. Continuous image-based monitoring systems can allow precise quantification of these traits by separating developmental changes from the actual impact of stress. Recently developed CT scan-based tools and protocols will help in understanding root-soil-water interactions and can quantify root system architecture more precisely. This will open up new avenues for designing phenomics and genomics approaches to support the improvement of stress tolerance in crops.
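The idea of relating retained physiological function to tissue water status can be made concrete with two standard quantities: relative water content, computed from fresh, turgid and dry sample weights, and the maximum quantum yield of PS-II (Fv/Fm), computed from minimal and maximal chlorophyll fluorescence. The sketch below is a minimal illustration only; the function names and the sample readings are hypothetical and are not drawn from any platform or dataset mentioned in this review.

```python
def relative_water_content(fresh_g: float, turgid_g: float, dry_g: float) -> float:
    """Relative water content in percent from leaf sample weights (grams)."""
    if turgid_g <= dry_g:
        raise ValueError("turgid weight must exceed dry weight")
    return (fresh_g - dry_g) / (turgid_g - dry_g) * 100.0


def fv_fm(f0: float, fm: float) -> float:
    """Maximum quantum yield of PS-II: (Fm - F0) / Fm."""
    if fm <= 0:
        raise ValueError("Fm must be positive")
    return (fm - f0) / fm


def function_retained(fv_fm_stressed: float, fv_fm_control: float) -> float:
    """Fraction of control PS-II efficiency retained under stress."""
    return fv_fm_stressed / fv_fm_control


# Hypothetical readings for one genotype, for illustration only.
rwc = relative_water_content(fresh_g=0.50, turgid_g=0.80, dry_g=0.20)
retained = function_retained(fv_fm(f0=300, fm=1000), fv_fm(f0=200, fm=1000))
print(f"RWC = {rwc:.1f}%, PS-II efficiency retained = {retained:.2f}")
```

Comparing genotypes on the retained fraction at a matched relative water content, rather than at a matched number of days without irrigation, is one way to separate tolerance from avoidance, assuming both measurements are taken on the same tissue.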

Molecular approaches are becoming handy in revealing resistance/tolerance mechanisms, which will help in modifying mungbean plants to suit biotic and abiotic stresses. Genome-wide association studies (Noble et al., 2018; Breria et al., 2019) would help in better understanding the genetic basis of the phenotypes. Association mapping for biotic and abiotic resistance/tolerance traits on a panel of adapted elite breeding lines is highly important for identifying the desired haplotypes, and it provides ample justification for using these lines directly in breeding programs. The selection of favorable haplotypes through MAS will reduce the material to be phenotyped in advanced breeding generations and increase breeding efficiency. With the development of NGS technologies, the discovery of SNPs/alleles has become easy. The mungbean diversity panel constitutes a valuable resource for the genetic dissection of important agronomic traits to accelerate mungbean breeding. Genetic variability within mungbean and between closely related species can be studied from sequence-based information, which is a prerequisite for breeding for resistance/tolerance to biotic and abiotic stresses. This is also important for species conservation and provides breeders with new and/or beneficial alleles for developing advanced breeding materials. Further, advanced technologies such as NGS help to increase the discovery of trait-allele and genotype-phenotype interactions. There must be systematic efforts towards exploring the physiological and biochemical regulation of biotic and abiotic stress responses and studying the whole profile of genes, proteins and metabolites imparting resistance/tolerance so that the same can be manipulated to develop improved cultivars of mungbean.
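As a simplified illustration of what a marker-trait association test involves, the sketch below correlates allele dosage at a single SNP (0, 1 or 2 copies of one allele) with a phenotype score and assesses significance by permutation. It is a toy single-marker test under stated assumptions: it ignores population structure, kinship and multiple-testing correction, all of which a real genome-wide analysis would need to address. The data and function names are hypothetical.

```python
import random


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # no variation, so no measurable association
    return sxy / (sxx * syy) ** 0.5


def marker_permutation_test(dosages, phenotypes, n_perm=2000, seed=0):
    """Return (|r|, permutation p-value) for one marker."""
    rng = random.Random(seed)
    observed = abs(pearson_r(dosages, phenotypes))
    shuffled = list(phenotypes)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson_r(dosages, shuffled)) >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return observed, (hits + 1) / (n_perm + 1)


# Hypothetical data: allele dosage and disease score for 12 lines.
dosages = [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2]
scores = [8, 7, 9, 8, 5, 6, 5, 4, 2, 1, 3, 2]
strength, p_value = marker_permutation_test(dosages, scores)
print(f"|r| = {strength:.2f}, permutation p = {p_value:.4f}")
```

A permutation test is used here only because it needs no distributional assumptions and no external libraries; it is not presented as the method used in the studies cited above.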

#### AUTHOR CONTRIBUTIONS

RN conceived the idea and contributed to the review in general. AKP and AW contributed mainly to the biotic stress section. HB and JR contributed mainly to the abiotic stress section. TS, AA, AP, SM, RK, EKM, CD, and RS contributed to the review in general.

#### FUNDING

Financial assistance for this review was provided by the Australian Center for International Agricultural Research (ACIAR) through the projects on the International Mungbean Improvement Network (CIM-2014-079) and the UKaid project on "Unleashing the economic power of vegetables in Africa through quality seed of improved varieties," and by the strategic long-term donors to the World Vegetable Center: Republic of China (Taiwan), UK aid from the UK government, United States Agency for International Development (USAID), Germany, Thailand, Philippines, Korea, and Japan. The authors also thank ICAR-NICRA for supporting research on stress tolerance in mungbean.



**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Nair, Pandey, War, Hanumantharao, Shwe, Alam, Pratap, Malik, Karimi, Mbeyagala, Douglas, Rane and Schafleitner. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these*

# Genomics of Plant Disease Resistance in Legumes

#### *Prasanna Kankanala†, Raja Sekhar Nandety† and Kirankumar S. Mysore\**

*Noble Research Institute, LLC, Ardmore, OK, United States*

The constant interactions between plants and pathogens in the environment and the resulting outcomes are of significant importance for agriculture and agricultural scientists. Disease resistance genes in plant cultivars can break down in the field due to the evolution of pathogens under high selection pressure. Thus, the protection of crop plants against pathogens is a continuous arms race. Like any other type of crop plant, legumes are susceptible to many pathogens. The dawn of the genomic era, in which high-throughput and cost-effective genomic tools have become available, has revolutionized our understanding of the complex interactions between legumes and pathogens. Genomic tools have enabled a global view of transcriptome changes during these interactions, from which several key players in both the resistant and susceptible interactions have been identified. This review summarizes some of the large-scale genomic studies that have clarified the host transcriptional changes during interactions between legumes and their plant pathogens while highlighting some of the molecular breeding tools that are available to introgress the traits into breeding programs. These studies provide valuable insights into the molecular basis of different levels of host defenses in resistant and susceptible interactions.

Keywords: genomics, legumes, plant–pathogen interactions, transcriptome analysis, GWAS, QTLs, markers, CRISPR/Cas9

# INTRODUCTION

Legumes belong to the third-largest angiosperm family, Fabaceae or Leguminosae. This family comprises around 750 genera and 20,000 species, including grain, forage, and other economically important legumes (Polhill et al., 1981). Legumes contribute 33% of human dietary protein (Vance et al., 2000). Although legumes are cultivated on 12 to 15% of the Earth's arable land and account for 27% of the world's primary crop production (Vance et al., 2000), their yield is limited by environmental adaptability challenges and by damage caused by pests and pathogens (Graham and Vance, 2003). Some of the major fungal diseases of legumes include rusts, mildews, root rots, wilts, blights, and anthracnoses. Bacterial diseases are mainly grouped into leaf blights, leaf spots, bacterial wilts, and a diverse group with symptoms such as dwarfing and rots (Rubiales et al., 2015; Wille et al., 2019). Viral diseases are caused by *Bean pod mottle virus*, *Soybean mosaic virus*, and *Peanut stripe virus*, among others. Cyst and root-knot nematodes are devastating parasites of legumes (Rubiales et al., 2015).

Plants have evolved robust defense mechanisms against pathogen attack that are triggered by initial recognition of the pathogen. These mechanisms involve a cascade of signaling responses known as the Pathogen-Associated Molecular Pattern (PAMP) Triggered Immune (PTI) response, which eventually leads to changes in the gene expression of the host. Depending on the type of interaction, this can result in either disease susceptibility or disease resistance. The pathogens, on the other end of the spectrum, have evolved several mechanisms involving effector delivery to evade the host defenses. The host defense response to effectors is called Effector-Triggered Immunity (ETI) (Young et al., 2005). The continuous arms race between the host and pathogen eventually determines the outcome of the interaction (Jones and Dangl, 2006). The host responses also vary based on pathogen infection strategies. The current understanding is that successful defense responses against biotrophic pathogens are predominantly mediated by the salicylic acid (SA)-dependent pathway and that those against hemibiotrophs and necrotrophs involve ethylene and jasmonic acid (JA) signaling (Glazebrook, 2005).

#### *Edited by:*

*Karam B. Singh, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia*

#### *Reviewed by:*

*Nicolas Rispail, Institute for Sustainable Agriculture, (CSIC), Spain Tom Warkentin, University of Saskatchewan, Canada*

#### *\*Correspondence:*

*Kirankumar S. Mysore ksmysore@noble.org*

*†These authors have contributed equally to this work*

#### *Specialty section:*

*This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science*

*Received: 30 April 2019 Accepted: 27 September 2019 Published: 30 October 2019*

#### *Citation:*

*Kankanala P, Nandety RS and Mysore KS (2019) Genomics of Plant Disease Resistance in Legumes. Front. Plant Sci. 10:1345. doi: 10.3389/fpls.2019.01345*

Leguminosae includes a diverse variety of plants. *Medicago truncatula* and *Lotus japonicus* have been chosen as model species to advance the study of legumes (Zhu et al., 2005). Several genetic and genomic resources have been developed in these two model legumes to assist breeding programs for enhanced tolerance/resistance to abiotic and biotic stresses in legume crop species. These include genome sequences (Sato et al., 2008; Young et al., 2011), expressed sequence tags (ESTs) (Asamizu et al., 2004; Gamas et al., 2006), physical and genetic maps (Choi et al., 2004; Yan et al., 2004; Young et al., 2005; Wang et al., 2008; Ohmido et al., 2010; Shah et al., 2016), and insertional mutagenesis lines (Tadege et al., 2008; Urbanski et al., 2013), among others. In addition to the model plants, the genome sequences of crop plants such as *Glycine max* (cultivated soybean), *Glycine soja* (wild soybean), *Cajanus cajan* (pigeon pea), *Cicer arietinum* (chickpea), *Vigna radiata* (mung bean), *V. angularis* (adzuki bean), *V. unguiculata* (cowpea), *Arachis hypogaea* (cultivated peanut), *A. duranensis* (wild peanut A genome), *A. ipaensis* (wild peanut B genome), *Medicago sativa* (alfalfa), *Phaseolus vulgaris* (common bean), *Trifolium pratense* (red clover), *Lupinus angustifolius* (lupin), and *Lens culinaris* (lentil) are currently available at https://legumeinfo.org/genomes. The macrosynteny and microsynteny studies among some of these genomes have been useful for translating knowledge from model to crop plants (Zhu et al., 2005). The availability of genome sequences coupled with recent advancements in affordable Next-Generation Sequencing (NGS) techniques and bioinformatics tools has enabled extensive study of genome-wide expression changes during plant–pathogen interactions to identify the pathways involved in plant defense.
Macroarrays, microarrays, RNAseq, suppressive subtractive hybridization (Mehrtens et al., 2005), cDNA-amplified fragment length polymorphism (cDNA-AFLP) techniques and gene-expression atlases have been used extensively to identify candidate genes for disease resistance. In this review, we focus on the interactions of legumes with plant pathogens such as fungi, oomycetes, bacteria, nematodes, and viruses at the genomic level and the use of genomic technologies in breeding for resistance.

# USING GENOMICS TO UNDERSTAND THE BASICS OF PLANT–PATHOGEN INTERACTIONS IN LEGUMES

#### Genomics of Plant–Fungal Interactions

Fungi are among the most challenging plant pathogens to tackle owing to their genetic flexibility and plasticity, which allow them to adapt quickly to their changing environments (Perez-Nadales et al., 2014). Considerable effort has gone into understanding the plant–fungal interaction mechanisms in both model and crop legumes. Large-scale genomic studies have enabled understanding of the various plant disease-resistance mechanisms against hemibiotrophic, biotrophic, and necrotrophic fungal pathogens.

#### Hemibiotrophic Interactions

*Mycosphaerella pinodes* is a broad-host-range fungal pathogen that causes ascochyta blight disease. It is known to have a transient biotrophic phase in some hosts and to behave like a necrotrophic pathogen in others (Fondevilla et al., 2011; Almeida et al., 2015). *Medicago truncatula*-based microarrays were used to study resistant interactions of pea with *M. pinodes*. The functional gene categories involved in the resistance mechanism included phytohormones, *Pathogenesis Related* (*PR*) genes, the phenylpropanoid pathway, cell-wall fortification, and genes involved in ethylene- and jasmonic acid (JA)-related defense pathways (Fondevilla et al., 2011). This work was later extended by investigating the transcriptome of host pea plants using deep SuperSAGE analysis to enrich for transcripts in the pea–*M. pinodes* interaction, followed by next-generation sequencing (NGS) of the transcripts (Fondevilla et al., 2014). Several factors that play key roles in resistance were identified, such as the WRKY protein in pathogen perception, proteases as an active defense against fungal toxins, and the roles of ethylene, abscisic acid, and indole-3-acetic acid as phytohormones in defense. Flavonoids, terpenoids, reactive oxygen species (ROS) and phytoalexins were identified as antifungal components that inhibit hyphal growth and destroy toxins (Fondevilla et al., 2014).

In *Vigna unguiculata* (cowpea) and orphan legumes such as *Lathyrus sativus* (grass pea) and *Vicia faba* (fava bean) where whole-genome sequence information is not available, SuperSAGE and Deep SuperSAGE coupled with NGS have been used successfully to study transcriptomes during *Ascochyta* infections (Madrid et al., 2013; Almeida et al., 2015). Early gene expression profiling in a resistant variety of grass pea during *Ascochyta* infection showed that classical defense-response genes involved in cell-wall fortification and the phenylpropanoid pathway were differentially expressed in resistant interactions. In addition, homologs of several candidate resistance genes were identified, such as receptor kinases containing thaumatin-like protein (TLP) domains, leucine-rich repeat (LRR) domains, and a gene homologous to Resistance to *Pseudomonas syringae* pv *maculicola* 1 (*RPM1*), which is involved in conferring resistance to *P. syringae* in Arabidopsis (Almeida et al., 2015). During resistant interactions in the fava bean-*Ascochyta fabae* infection process, genes involved in JA signaling and pectin esterase-encoding genes were identified with the SuperSAGE technique (Madrid et al., 2013). In a later study, *de novo* transcriptome assembly was used to identify transcripts in susceptible and resistant interactions of fava bean and *A. fabae.* Genes encoding LRR proteins, Rho2 GTPase-activating protein 2 (RGA2), several plant growth regulators, heat shock proteins, chitin elicitor-binding protein, and genes involved in producing chlorogenic acid, scopoletin, and flavonoids were among the significant genes involved in fava bean defenses (Ocaña et al., 2015). Functional characterization of these candidate genes will be the next essential step toward including them in breeding programs.

Anthracnose disease is caused by hemibiotrophic fungal pathogens, *Colletotrichum* spp. The genomics of *Colletotrichum*–host and –nonhost interactions was investigated using microarray analysis in *M. truncatula* (Jaulneau et al., 2010). In this study, resistant and susceptible *M. truncatula* varieties were infected with the pathogenic strain *Colletotrichum trifolii* and the non-adapted strains *C. higginsianum* and *C. lindemuthianum*. Resistance responses to non-adapted *Colletotrichum* spp. were similar to the incompatible responses induced by the adapted strain on the resistant line. The nonhost responses included a localized oxidative burst and the release of fluorescent compounds. The host resistance response was characterized by defense gene signaling and SA accumulation (Jaulneau et al., 2010). To identify the genetic components of bean immunity against *C. lindemuthianum,* EST analysis was carried out in common bean with putative *A. thaliana* orthologs (Oblessuc et al., 2012). This study suggested that the ETI-triggered hypersensitive response is mediated by downregulation of *FLS2*-like and *MKK5*-like putative orthologs of *A. thaliana* genes involved in pathogen perception (Oblessuc et al., 2012). The resistant and susceptible interactions of *P. vulgaris* with *C. lindemuthianum* were investigated using NGS methods (Padder et al., 2016). Most of the differentially expressed genes (DEGs) were expressed in the biotrophic phase in the susceptible interaction, while most of the DEGs were expressed in the necrotrophic phase in the resistant interaction. DEGs in the resistant interaction were over-represented by genes expressing PR proteins and peroxidases, while the susceptible interaction was over-represented by genes encoding sugar transporters (Padder et al., 2016). The genomics of partially resistant and susceptible interactions of *Lens culinaris* and *C. lentis* were studied using EST analysis (Bhadauria et al., 2017).
Twenty-six resistance genes were identified during the symptomatic phase of infection in the compatible interaction. Further, a complex interplay of plant hormone pathways was also observed in this study (Bhadauria et al., 2017).
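Statements that a functional category is over-represented among DEGs are typically backed by an enrichment test. The sketch below shows one common choice, the upper tail of the hypergeometric distribution, using only the Python standard library. It is a generic illustration rather than the procedure used in the studies cited above, and the gene counts in the example are hypothetical.

```python
from math import comb


def enrichment_p_value(genome_size, category_size, deg_count, deg_in_category):
    """Upper-tail hypergeometric probability P(X >= deg_in_category).

    genome_size      -- genes tested in total
    category_size    -- genes annotated to the category (e.g. PR proteins)
    deg_count        -- differentially expressed genes (DEGs)
    deg_in_category  -- DEGs that belong to the category
    """
    upper = min(category_size, deg_count)
    total = comb(genome_size, deg_count)
    tail = sum(
        comb(category_size, k) * comb(genome_size - category_size, deg_count - k)
        for k in range(deg_in_category, upper + 1)
    )
    return tail / total


# Hypothetical counts: 20,000 genes, 300 in the category, 500 DEGs, 30 overlap.
p = enrichment_p_value(20000, 300, 500, 30)
print(f"enrichment p-value = {p:.3g}")
```

With these made-up counts only about 7.5 category genes would be expected among the DEGs by chance, so an overlap of 30 yields a very small p-value. In practice the test is repeated across many categories and the p-values are adjusted for multiple testing.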

Fusarium wilt is a destructive disease in several legumes that is caused by host-specific *Fusarium oxysporum* strains. Extensive studies have been done to understand the molecular basis of this disease interaction in various legumes. In *Phaseolus vulgaris* (common bean), the cDNA-AFLP technique was used to determine transcriptionally regulated genes in response to *F. oxysporum* f. sp. *phaseoli* (*Fop*) infection in resistant and susceptible interactions (Xue et al., 2015). This study identified 122 defense-related gene fragments that are distributed across the genome. This distribution could serve to tag defense-related molecular markers in breeding programs (Xue et al., 2015). RNAseq analysis of *Glycine max* infected with both pathogenic and non-pathogenic strains of *F. oxysporum* identified overrepresentation of defense-related genes corresponding to necrosis in resistant interactions (Lanubile et al., 2015). RNAseq was also carried out to understand the molecular differences in defense responses between cultivated and wild soybean against the pathogenic *F. oxysporum* Schltdl (Chang et al., 2019). That study identified roles for secondary metabolites and plant hormones in wild germplasm that could be transferred into cultivated species for enhanced resistance.

Several races of *F. oxysporum* f. sp. *ciceri* (*Foc*) have been identified across the chickpea-growing regions of the world. While most *F. oxysporum* strains are considered necrotrophic or hemibiotrophic pathogens, *Foc* race 1 (*Foc*1) is reported to be an obligate biotrophic pathogen of chickpea (Gupta et al., 2009). cDNA-AFLP-based analyses, cDNA-based microarrays, and cDNA RAPD methods have been used to study chickpea interactions with *Foc*1 (Nimbalkar et al., 2006; Ashraf et al., 2009; Gupta et al., 2009; Gupta et al., 2010; Gurjar et al., 2012). Although host responses during biotrophic infections are mediated by SA-dependent pathways, gene expression analyses in the above-referenced studies indicate non-traditional responses. The cDNA-AFLP method showed that genes encoding sucrose synthases, invertases, and β-amylase were induced in resistant interactions. *14-3-3* gene expression was overrepresented in the susceptible interaction, indicating potential nutrient starvation, and the resistant interaction potentially copes with this sugar starvation by over-inducing sugar-metabolism genes. This study strongly suggested that sugar also acts as a signaling molecule in response to pathogen perception (Gupta et al., 2010). A comparative study of resistant and susceptible interactions with *Foc* races 1, 2, and 7 was conducted in chickpea using the cDNA-RAPD method (Gurjar et al., 2012). This study identified a role for plant glucosyltransferase genes in the resistance response. Further, race-dependent defense responses were observed (Gurjar et al., 2012). A similar study of resistant and susceptible interactions with *Foc* races 1, 2, and 4 was conducted recently with the LongSAGE method (Upasani et al., 2017). Clustering analysis and interaction networks of differentially expressed genes (DEGs) showed that the resistant interaction is characterized by ROS production, SA production, lignification, *R*-gene expression, and hormone homeostasis.
The susceptible interaction was enriched for actin depolymerization genes, aquaporin genes, and tetrapyrrole synthesis genes (Upasani et al., 2017). Another study detailing the transcriptional changes during resistant and susceptible chickpea interactions with *Foc*1 was done using RNAseq analysis (Gupta et al., 2017). Plant–pathogen interaction networks constructed with these transcriptional data identified several nodal hub genes that modulate defense responses and could be further characterized for resistance (Gupta et al., 2017). A microarray-based study of resistant and susceptible chickpea interaction transcriptomes with *Foc*1 was used to create regulatory gene networks (Ashraf et al., 2018). This study identified 76 disease- and immunity-related genes. The gene regulatory networks revealed transcriptional plasticity in immune pathways and disease pathways during wilt interactions. This work also highlighted that the primary metabolic components are shared between defense and disease (Ashraf et al., 2018).
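One simple way to think about "hub" genes in an interaction network is by degree, that is, the number of distinct partners a gene has. The sketch below ranks genes by degree from an edge list. It is only an illustration of the concept; the cited studies may have used other centrality measures, and the gene labels are placeholders rather than real chickpea loci.

```python
from collections import Counter


def hub_genes(edges, top_n=3):
    """Rank genes by degree (number of distinct interaction partners)."""
    # Treat edges as undirected, and drop self-loops and duplicates.
    unique_edges = {frozenset(pair) for pair in edges if pair[0] != pair[1]}
    degree = Counter()
    for edge in unique_edges:
        for gene in edge:
            degree[gene] += 1
    return degree.most_common(top_n)


# Hypothetical edge list; gene labels are placeholders.
edges = [
    ("geneA", "geneB"), ("geneA", "geneC"), ("geneA", "geneD"),
    ("geneB", "geneC"), ("geneD", "geneE"), ("geneC", "geneA"),
]
print(hub_genes(edges, top_n=1))
```

Here `geneA` has three distinct partners and ranks first. Genes with equal degree are returned in arbitrary order, so ties should be broken with additional evidence such as expression change or annotation.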

#### Biotrophic Interactions

Asian soybean rust (ASR), caused by *Phakopsora pachyrhizi*, is a devastating disease that is listed among the top five biotic threats to agriculture (Pennisi, 2010). Although six resistance genes, *Resistance to Phakopsora pachyrhizi* (*Rpp1-6*), have been identified in soybean that confer resistance to ASR in a race-specific manner, no single soybean genotype is resistant to all races of the rust fungus (Langenbach et al., 2016a). Several studies have been conducted to identify key players in resistance to ASR. Initial studies to identify key players in the *R*-gene-mediated response to ASR employed the suppression subtractive hybridization (SSH) complementary DNA (cDNA) method (Choi et al., 2008; Soria-Guerra et al., 2010b). These studies indicated a time-dependent coordinated gene expression pattern in *Rpp*-mediated resistant and susceptible interactions and identified the role of peroxidases and lipoxygenases in resistance. Later work using whole-genome microarrays confirmed these two findings and provided further detail (Van De Mortel et al., 2007; Panthee et al., 2009; Soria-Guerra et al., 2010a; Soria-Guerra et al., 2010b; Morales et al., 2013). Several of these studies have reported an overrepresentation of transcription factors (TFs) and roles for flavonoids and cell-wall lignification in the active resistance mechanisms. Metabolite analysis of the ASR interactions has confirmed some of these findings (Lygin et al., 2009). Several genomic studies involving TFs identified roles for WRKY, the Basic Leucine Zipper (bZIP) domain, and predicted TF families in resistant interactions (Pandey et al., 2011; Aoyagi et al., 2014; Bencke-Malato et al., 2014; Alves et al., 2015). The SuperSAGE technique was used to identify several antimicrobial peptides such as defensins, thionin, and lipid transfer protein (LTP) family genes in cowpea and soybean infected with the rust pathogen *P. pachyrhizi* (Kido et al., 2010).

Nonhost resistance (NHR) is a type of resistance that is displayed by plants against most potential pathogens. This type of resistance can be multi-layered, and plants can exhibit this either prior to infection (pre-invasive NHR) or post-infection (post-invasive NHR) (Senthil-Kumar and Mysore, 2013; Gill et al., 2015; Lee et al., 2017; Fonseca and Mysore, 2019). The NHR responses of Arabidopsis and *M. truncatula* against *P. pachyrhizi* were explored to identify sources of durable disease resistance. Studies with Arabidopsis have indicated that the ASR fungus exploits the necrotrophic pathway by inducing JA-mediated responses to evade host defense responses (Loehrer et al., 2008; Campe et al., 2014). The roles of *PENETRATION 1-4 (PEN1-4), SENESCENCE-ASSOCIATED GENE 101, BRIGHT TRICHOMES 1,* and *POSTINVASION-INDUCED NONHOST RESISTANCE GENES4/5/9* (*PING4/5/9*) in pre-invasive and post-invasive NHR mechanisms have been explored (Loehrer et al., 2008; Langenbach et al., 2013; Langenbach et al., 2016b). Furthermore, the potential of transferring NHR *PING* genes to soybeans and conferring enhanced ASR resistance has been demonstrated (Langenbach et al., 2016b). Transcriptional changes during the interaction of *P. pachyrhizi* with *M. truncatula* were used to identify genes involved in NHR (Ishiga et al., 2015). A combination of transcriptome and metabolite analysis indicated the role of the secondary metabolite, medicarpin, in inhibiting the germination and differentiation of rust urediniospores. Transcriptome analysis also indicated the role of chlorophyll catabolism genes in disease resistance. Further characterization of the *STAY GREEN* gene indicated its role in the hypersensitive-like response during the resistance interaction (Ishiga et al., 2015). Further, a forward genetics-based screening of *M. truncatula Tnt1* insertion lines (Tadege et al., 2008; Sun et al., 2019) for alterations in response against *P. 
pachyrhizi* identified the *inhibitor of rust germ tube differentiation1 (irg1)* mutant (Uppalapati et al., 2012). *IRG1* encodes a Cys(2)His(2) zinc finger transcription factor, PALM1, which plays a role in regulating epicuticular wax metabolism and transport, and epicuticular wax is important for ASR spore differentiation (Uppalapati et al., 2012; Ishiga et al., 2013).

An interesting study involving resistant interactions of *M. truncatula* with two foliar pathogens, *Colletotrichum trifolii* (hemibiotrophic pathogen) and *Erysiphe pisi* (biotrophic pathogen), and a partially resistant interaction with a root pathogen, *Phytophthora medicaginis* (necrotrophic pathogen), identified three *Pathogenesis Related* (*PR*) 10 genes, a *TLP*, and a gene encoding hevein-like protein to be upregulated (Samac et al., 2011). The phenylpropanoid pathway involving isoflavonoid synthesis was also upregulated. Further characterization of these genes using RNAi lines identified the role of the *Chalcone Synthase* gene in the phenylpropanoid pathway in conferring resistance to necrotrophic pathogens (Samac et al., 2011).

A quantitative PCR-based TF platform in *M. truncatula* was used to conduct TF expression profiling during interactions with *Uromyces striatus* (Madrid et al., 2010; Villegas-Fernández et al., 2014). The TF profiling in resistant interactions of *M. truncatula* with *U. striatus* identified genes encoding pathogenesis-related ethylene response factor (PR-ERF), WRKY, and the Myb class of TFs to be differentially expressed (Madrid et al., 2010). When TF expression profiles in the two pathosystems, *Botrytis* spp. and *U. striatus*, were compared, more TFs were constitutively expressed in *M. truncatula*-*Botrytis* spp. interactions, indicating an NHR-like response, although this resistance was compromised in the lab-experimental system, allowing infection. This may also be indicative of the differences in the host response to biotrophic versus necrotrophic pathogens (Madrid et al., 2010).

#### Necrotrophic Pathogen Interactions

The availability of whole-genome microarrays of the model legume *M. truncatula* has advanced the understanding of various other fungal pathogen interactions in legumes (Uppalapati et al., 2009; Samac et al., 2011). *M. truncatula* is a susceptible host for Phymatotrichopsis root rot caused by the fungus *Phymatotrichopsis omnivora* (Uppalapati et al., 2010). Microarray analysis of this interaction identified JA- and ethylene-responsive genes, indicating a necrotrophic infection strategy (Uppalapati et al., 2009). Secondary metabolite genes involved in isoflavonoid synthesis were upregulated at the early infection stage but were eventually reduced to basal level during later disease progression stages, indicating the role of fungal manipulation of host defenses (Uppalapati et al., 2009).

Although *M. truncatula* is a nonhost for *Botrytis* spp., by screening several *M. truncatula* genotypes under lab conditions, a partially resistant genotype and a susceptible genotype were identified (Villegas-Fernández et al., 2014). This study identified *Botrytis fabae* as a more aggressive pathogen compared to *B. cinerea* on *M. truncatula*. However, microscopic studies indicated higher spore germination of the latter pathogen species. Transcription factor (TF) profiling indicated that the host perceives *B. fabae* to be a more virulent pathogen by upregulating diverse TFs involved in stress responses even before the visible symptoms appear (Villegas-Fernández et al., 2014).

An integrated omics approach using RNAseq and metabolomics (1H NMR) data was used to understand the primary metabolism regulation of soybean in response to *Rhizoctonia solani* infection (Copley et al., 2017). A significant flux of responses in redox reactions and ROS signaling along with changes in peroxidases, post-infection, were observed in soybean leaves (Copley et al., 2017).

#### Genomics of Plant-Oomycete Interactions

The oomycete pathogens of legumes that have been most studied using genomic tools are *Phytophthora* spp. and *Aphanomyces* spp. Some of the early studies of gene expression changes in soybean with *Phytophthora sojae* reflected the hemibiotrophic infection strategy of the pathogen at the molecular level (Moy et al., 2004). A cDNA microarray with genes from both host plant and pathogen was custom-built, and a time course of susceptible interaction studies revealed the expression of active defenses in the host mediated by SA-triggered pathways at early infection stages. This included expression of the *PR1a* gene and genes involved in the phenylpropanoid pathway. The host and pathogen responses peaked at around 24 hpi, followed by a reduction in the host responses, indicating the shift from biotrophy to necrotrophy (Moy et al., 2004). With the availability of the Affymetrix® gene chip for soybean, a detailed mapping of the soybean transcriptome change was carried out using three genotypes: resistant, partially resistant, and susceptible soybean varieties. This experiment was conducted with 72 biological replicates to understand the effect of genotypic variation on transcriptome changes (Zhou et al., 2009). The large number of replicates coupled with detailed statistical analysis demonstrated that almost the entire genome underwent low-level transcriptional changes in response to disease and genetic variation, yet most of the differences were less than two-fold in magnitude. This work hypothesized that these pervasive and statistically significant low-level changes may reflect the genotype-specific host adaptive changes in response to the pathogen and that studying these might be valuable. 
A macroarray study of resistant and susceptible interactions in the same disease system identified a role for putative regulators of chromosome condensation 1 protein family in the resistant interaction, suggesting the suppression of nucleocytoplasmic trafficking as one of the host strategies for combating disease (Narayanan et al., 2009). A more recent attempt to enrich the transcripts differentially expressed during the disease process employed the SSH cDNA library coupled with NGS (Xu et al., 2012). This study identified genes encoding several traditional proteins involved in disease-resistance strategies, including various PR-like proteins, the WRKY class of transcription factors, and proteins involved in the phenylpropanoid pathway. A novel discovery of this work involved identifying the allergen gene *Pru ar 1* (*Prunus armeniaca*) in soybean, which could be involved in resistance. Functional characterization of the *Pru ar 1* gene identified it as a novel gene encoding PR10 protein (Fan et al., 2015).
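The observation reported above for Zhou et al. (2009), that an expression change can be statistically significant yet smaller than two-fold, can be illustrated with a minimal sketch. The function names, the 0.05 significance level, and the example expression values are illustrative assumptions and are not taken from the study; only the two-fold cutoff comes from the text.

```python
import math


def log2_fold_change(treated_mean, control_mean):
    """Return log2(treated / control) for two positive mean expression values."""
    if treated_mean <= 0 or control_mean <= 0:
        raise ValueError("expression means must be positive")
    return math.log2(treated_mean / control_mean)


def classify_change(treated_mean, control_mean, p_value,
                    alpha=0.05, fold_cutoff=2.0):
    """Label a gene by significance and by the magnitude of its change.

    alpha is an assumed significance level; fold_cutoff=2.0 mirrors the
    two-fold magnitude mentioned in the text.
    """
    lfc = log2_fold_change(treated_mean, control_mean)
    if p_value >= alpha:
        return "not_significant"
    if abs(lfc) >= math.log2(fold_cutoff):
        return "significant_large"
    return "significant_low_level"


# Hypothetical example values (arbitrary expression units).
print(classify_change(150.0, 100.0, 0.001))  # 1.5-fold, small p-value
print(classify_change(400.0, 100.0, 0.001))  # 4-fold, small p-value
print(classify_change(150.0, 100.0, 0.20))   # 1.5-fold, large p-value
```

With many replicates, small p-values become attainable for modest changes, so the "significant_low_level" class is the one that a strict fold-change filter alone would discard.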

MicroRNAs (miRNAs) are 20- to 24-nucleotide long, single-stranded non-coding RNAs that play critical roles in various biological functions, including plant innate immunity. miRNA-mRNA complexes regulate these responses (Navarro et al., 2008). Microarray analysis was conducted with susceptible, qualitative-resistant, and quantitative-resistant cultivars of soybean infected with *Phytophthora sojae* (Guo et al., 2011). This study identified different microRNAs in the three different interactions. The bioinformatics search indicated that the predicted targets fell into diverse categories such as defense response genes, kinases, and transcription factors. Several microRNAs had expression patterns inverse to those of their putative target genes. These data indicated a role for microRNAs in regulating plant defense responses in resistant interactions. To understand the resistance mechanisms mediated by single dominant *Resistance to Phytophthora sojae* (*Rps*) genes in soybean-*P. sojae* interactions, researchers conducted transcriptome analysis of 10 near-isogenic lines, each with a unique *Rps* gene/allele (Lin et al., 2014). This study identified that *Rps* recognition was characterized by induction of SA-, ethylene-, and brassinosteroid phytohormone-signaling pathways, repression of JA pathways, ROS, WRKY transcription factors, MAP kinase-signaling pathways, and phytoalexin production. The compatible reaction was characterized by the induction of the JA pathway, repression of the ethylene pathway, and no changes to the SA and brassinosteroid pathways (Lin et al., 2014).
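The inverse relationship between a microRNA and its putative target, as described above for the Guo et al. (2011) data, is commonly summarized with a correlation coefficient. The following is a minimal sketch under stated assumptions: the expression values are invented for illustration, the helper names are hypothetical, and the -0.8 threshold is an arbitrary choice rather than a value from the cited work.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


def is_inversely_expressed(mirna_levels, target_levels, threshold=-0.8):
    """True if the miRNA and target profiles are strongly anti-correlated.

    threshold is an assumed cutoff chosen only for this illustration.
    """
    return pearson(mirna_levels, target_levels) <= threshold


# Hypothetical expression levels across four samples or time points.
mirna = [1.0, 2.0, 3.0, 4.0]
target = [8.0, 6.0, 4.0, 2.0]
print(round(pearson(mirna, target), 3))
print(is_inversely_expressed(mirna, target))
```

An anti-correlated profile is consistent with miRNA-mediated repression but does not demonstrate it; approaches such as degradome sequencing, mentioned later in this review, provide more direct evidence of target cleavage.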

*Aphanomyces euteiches*, the causal organism of Aphanomyces root rot, is a major soilborne oomycete pathogen that infects various legume species, including pea, lentil, and alfalfa (Pilet-Nayel et al., 2009). In an attempt to understand the strategies employed during *A. euteiches* interactions with *M. truncatula* in a compatible interaction, a cDNA-AFLP approach was first employed to determine the optimal infection time for evaluation, followed by cDNA enrichment with SSH (Nyamsuren et al., 2003). This study identified classical *PR* and defense genes. The molecular analysis indicated abscisic acid-mediated signaling that could induce PR-10 protein. A PR-4 protein containing a hevein domain, which could bind chitin, was also identified. A more recent study compared the transcriptional responses in compatible interactions of pea plants with two oomycete pathogens, *Phytophthora pisi* and *A. euteiches*, using a *M. truncatula* microarray (Hosseini et al., 2015). The study revealed different recognition and signaling components in the host against the two pathogens. PTI and ETI responses were detected in the early stages of infection with both pathogens. JA- and ET-hormone signaling were involved in both interactions, while the auxin-induced SAUR family proteins were specific to *A. euteiches*. The interactions of the downy mildew pathogen, *Peronospora viciae* f. sp. *pisi*, with pea leaves were investigated with SSH cDNA libraries (Feng et al., 2012). The study identified downy mildew resistance genes *RPP6/6/27* involved in this interaction.

#### Genomics of Plant–Bacteria Interactions

The application of genomic tools in understanding bacterial pathogenesis in legumes is relatively limited compared with their use in fungal pathogenesis studies. NGS technologies have been employed to understand the interactions of *Xanthomonas axonopodis* pv. *glycines* (*Xag*), which causes bacterial leaf pustule (BLP) disease in soybean (Kim et al., 2011; Chatnaparat et al., 2016). Kim et al. (2011) studied the transcriptome profiles of BLP-resistant and BLP-susceptible near-isogenic lines (NILs), while Chatnaparat et al. (2016) studied the gene expression of the pathogen in susceptible host leaves. In the former study, several genes involved in PTI response such as *EF-TU RECEPTOR (EFR)-* and *FLAGELLIN SENSING 2 (FLS2)-*, *ATPASE 4 (ACA4)-*, *ACA11-*, *MAP KINASE 4 (MPK4)-*, *MPK6-,* and *RESPIRATORY BURST OXIDASE HOMOLOGUE (RBOH)-*like genes, and Damage-Associated Molecular Pattern (DAMP) receptors such as *PLASMA MEMBRANE LRR RECEPTOR KINASE 1(PEPR1)* and *PEPR2,* were induced at 0 hours post inoculation (hpi) in BLP-resistant NILs and not in BLP-susceptible NILs. Defense response genes such as *RPP*-, *RPM1*-, and *Mildew Locus O (MLO)*-like genes also were induced at this time point in BLP-resistant NILs. The authors speculate that this early up-regulation of PTI-related genes potentiates immune response during pathogen attack. Although *Xanthomonas* species are known to be biotrophs, *Xag* behaves like a necrotrophic pathogen in soybean, as demonstrated by the molecular mechanisms in this study. Several genes encoding jasmonate-zim domain (JAZ)-like and MYC2 TF proteins were also induced at 0 hpi (Kim et al., 2011). This work demonstrates that the activity of PTI components at the early stage of infection is an important defense mechanism in the resistant soybean NIL tested.

Bacterial wilt caused by *Ralstonia solanacearum* is an important disease of peanuts, and genomic tools have been employed to understand the host–pathogen interactions. Early studies in this system were done using cDNA libraries where both the roots and leaves of peanuts were challenged with the pathogen, although in nature *Ralstonia* is a root pathogen (Huang et al., 2012). Ethylene and JA pathway genes were induced in both the roots and leaves of a highly resistant peanut cultivar. Several secondary metabolite genes were induced in roots and not in leaves, indicating the natural adaptation of the host to a root pathogen (Huang et al., 2012). In a more recent study of this pathosystem, NGS technology was used to study the gene expression differences between susceptible and resistant cultivars (Chen et al., 2014). In this study, the suppression of primary metabolism, especially carbohydrate metabolism, was an important feature in the resistant interaction, indicating the shift of energy investment from the primary metabolism to defense mechanisms. The PTI defense pathway was triggered in both resistant and susceptible interactions, and its partial suppression by the pathogen was observed. The expression patterns of secondary metabolites and defense response genes and hormone analysis indicated that resistance was primarily conferred by defense response genes in the ETI response cascade. Bacterial blight disease of soybean is caused by *Pseudomonas syringae* pv. *glycinea* (*Psg*). In a cDNA microarray study of both resistant and susceptible interactions of *Psg* with soybean using a virulent and an avirulent strain of *Psg*, a three-phase response was studied. In phase I, which corresponded to the induction stage at 2 hpi, no significant differences were seen between susceptible and resistant interactions. 
Phase II, which lasted from 3 to 10 hpi, and phase III, up to 24 hpi, corresponded to the effector stage and programmed cell death (PCD) stages in resistant interaction, respectively. Several gene expression changes were observed in phases II and III. An important reported observation was a 92% reduction in the expression of chloroplast-related genes in the resistant interaction even at 8 hpi with no visible symptoms. Physiological measurements supported these data. There was a lack of ROS in the susceptible interaction at phase II. This study suggested the role of photosystem centers as a potential source of the secondary ROS or the oxidative burst response that eventually leads to PCD in phase III (Zou et al., 2005). *Pseudomonas syringae* pv. *syringae* causes bacterial stem blight in alfalfa. RNAseq was conducted to study the host–pathogen interactions in resistant and susceptible alfalfa cultivars at two different time points (Nemchinov et al., 2017). The timing of the resistance response differed between the two cultivars. The ZG9830 cultivar triggered ETI responses much earlier than the Maverick cultivar. The resistance response in cultivar ZG9830 may involve NBS-LRR, TIR-unknown (TX), and nematode-resistance proteins named based on their homology to Hs1pro-1 (*HSPRO2*) like *R* genes, while the cultivar Maverick may involve the CNL class of *R* genes (Nemchinov et al., 2017).

#### Genomics of Nematode–Plant Interactions

*Heterodera glycines,* commonly known as soybean cyst nematode (SCN), is among the most devastating soybean pathogens. In an attempt to engineer resistance against SCN, detailed characterization of the molecular changes during the infection process in both soybean and the pathogen has been performed. cDNA-based microarrays were initially used to profile the transcriptome changes in the soybean roots during different infection stages in both compatible and incompatible interactions with SCN (Alkharouf et al., 2004; Khan et al., 2004; Alkharouf et al., 2006). These studies identified compatible interaction-specific and incompatible interaction-specific genes as well as time-based induction of genes. Stress-induced gene *PR-10* was identified in both compatible and incompatible interactions. Several genes belonging to carbohydrate metabolism, plant defense response, and signaling were implicated in the compatible interaction. In a time-course study by Alkharouf et al. (2006), plant responses were documented prior to feeding cell selection (pre-FCS) as well as after feeding cell selection (post-FCS) during compatible interaction. The pre-FCS stages induced *PR-10* genes, stress-related genes, carbohydrate-metabolism genes, and secondary metabolism genes. The differentially expressed genes during post-FCS were involved in transcription and protein synthesis. Later studies employing the Affymetrix® gene chip identified differentially expressed genes like *PR-5*, *PR1a*, *Expansins*, cell wall-fortification genes, and phenylpropanoid pathway genes during the post-FCS stage (Ithal et al., 2007). Differential gene expression changes were observed in different genotypes even during the pre-FCS based on the interaction type (compatible/incompatible) in whole-root analysis. 
Genes belonging to No Apical Meristem (NAM) domain-containing TFs, the WRKY class of TFs, Nucleotide Binding Site-Leucine Rich Repeat (NBS-LRR) kinases, signal transduction, cell wall-fortification, and GC-enriched elements in promoters of genes were identified in the incompatible interaction (Klink et al., 2007b; Mazarei et al., 2011; Wan et al., 2015). In an attempt to enrich for genes differentially expressed during pathogenesis, RNA isolated from laser-capture microdissection samples of syncytial cells was used for microarray analysis with the soybean Affymetrix® gene chip to understand the defense responses in both compatible and incompatible interactions (Klink et al., 2007a; Klink et al., 2009b; Klink et al., 2010; Kandoth et al., 2011; Matsye et al., 2011). The developmental stages of the syncytium are divided into a parasitism phase, in which the syncytium develops, and a second phase, in which resistance develops. Microarray studies of gene expression during these specific stages indicated that the whole-root analysis masked several key players that are involved in the specific interactions. There were no significant differences in gene expression between the compatible and incompatible interactions in the parasitism stage (Klink et al., 2010). *Lipoxygenases*, *14-3-3*, and genes involved in JA and ethylene biosynthesis, the S-adenosyl methionine pathway, the flavonoid pathway, and coumarin and cellulose biosynthesis were highly induced in the resistant interaction at different stages of resistance response (Klink et al., 2007a; Klink et al., 2009b; Klink et al., 2010). The gene expression profiling in *Resistance to H. glycines* (*Rhg1*)-mediated soybean resistance utilizing laser capture microdissections identified apoptosis-related, hypersensitive, and SA-induced defense response genes in the resistant interaction. Several of these genes were either partially or completely suppressed during susceptible interactions with SCN (Kandoth et al., 2011). 
Genotype-specific defense response studies in soybean indicated two different types of resistant responses involving the varieties Peking and PI88788. Resistance in the Peking variety involved rapid and potent cell wall appositions, while resistance in the PI88788 variety involved potent but slow response without cell wall appositions (Matsye et al., 2011). Microarray studies in these two varieties indicated the role of amino acid transporter and alpha soluble NSF (*N*-ethylmaleimide-sensitive factor) attachment protein in plant defense (Matsye et al., 2011). Another study with two different SCN populations that invoke a resistant and susceptible response in the same soybean genotype, Peking, revealed that SCN might have evolved different mechanisms to overcome host resistance (Klink et al., 2009a). Approximately 71 genes were induced and 44 genes were suppressed in the SCN strain that triggered a resistant reaction in the host during pre-infection stage. As the infection progressed, many SCN genes were suppressed in the resistant interaction. These data indicate that the feeding and nutritional uptake mechanisms of SCN might be the targets of the host defense. A recent study conducted by Tian et al. identified the microRNAs that are differentially expressed between two soybean cultivars, KS4313N and KS4607, which have differential resistance response to SCN. They identified a total of 60 differentially expressed miRNAs belonging to 25 families correlating to the response of the cultivars (Tian et al., 2017). Black soybean, Huipizhi Heidou, has different grades of resistance to SCN. RNAseq analyses at three different infection time points were conducted in two cultivars representing resistant and susceptible interactions (Li et al., 2018b). The study suggested roles for five plant hormones in the resistance. While SCN is a pathogen of soybean, it can also infect and reproduce in the roots of common bean, causing yield reductions. 
Gene expression profiling of common bean roots upon infection with SCN resulted in the differential expression of genes encoding nucleotide-binding site leucine-rich repeat resistance (NLR) proteins, WRKY TFs, PR proteins, and heat shock proteins (Jain et al., 2016).

*Meloidogyne* spp., commonly known as root-knot nematodes (RKN), are biotrophic parasites of soybean and cause major crop losses. The resistance mechanisms of incompatible soybean interactions with *Meloidogyne incognita* indicated auxin-mediated defense responses (Beneventi et al., 2013). Based on transcript profiling, a potential defense model was proposed in which ROS-mediated calcium signaling and nucleoside sugar formation play a critical role in plant hormone signaling. ROS homeostasis was proposed through a balance of auxin-mediated gibberellic acid (GA) and ROS-mediated JA and GA pathway signaling, which maintains low oxidative stress in plants and allows for plant growth. DELLA-like protein was proposed to be a key element in the plant hormone signaling pathway. On the other hand, the gene expression studies on the compatible interactions with RKN indicated an induction of genes involved in the cell cycle, sugar metabolism, and cell wall metabolism. These processes are involved in the successful establishment of giant cells during the infection process. Host defense response genes involving JA-mediated pathways were induced in the early infection stages, while most of them were downregulated by the time the infection had progressed, indicating that RKN actively manipulates host defenses (Ibrahim et al., 2011). The studies on the *Rk* locus-mediated resistant interaction of RKN indicated that most of the host defenses were suppressed during the infection and feeding stages. Based on this work, it was proposed that the host defenses are triggered against the nematode infection, likely due to the high accumulation of toxins involving unique resistance mechanisms (Das et al., 2010). NGS studies during the early and late stages of the compatible interaction of *M. incognita* in common bean identified biotic and abiotic stress responses (Santini et al., 2016). 
Enhanced expression of wound-responsive genes at early stages and of the TMV resistance protein-encoding *N* gene indicated an active host response to block pathogen infection. This basal response was broken by suppression of ET/JA pathways at the later infection stage (Santini et al., 2016). *M. incognita* can also infect alfalfa and cause disease in some varieties or accessions. Resistant and susceptible interactions of *M. incognita* with alfalfa were profiled using both cDNA libraries and NGS on the Illumina HiSeq 2000 (Potenza et al., 2001; Postnikova et al., 2015). There was a high induction of defense-related and stress-response genes in the susceptible interaction, indicating basal defense responses. Analysis with the bioinformatics platform for plant resistance (*R*) gene analysis, PRGdb, identified two potential *R* genes specific to the resistant interaction. Recently, NGS was used to study the genomics of resistance in wild diploid peanut *Arachis stenosperma* that harbors resistance to *M. arenaria* (Guimaraes et al., 2015). This study identified components of genetic resistance and induced resistance that could be integrated into breeding programs for durable resistance to RKN.

#### Genomics of Plant–Virus Interactions

*Soybean mosaic virus* (SMV) is an RNA virus and is one of the most prevalent viral pathogens of soybean. A handful of genomics studies have been conducted to understand the molecular changes involved in this disease interaction, including transcriptomics, degradome-seq, and small RNA-seq (sRNA-seq). One of the earliest genomic studies of SMV-soybean interaction was conducted using cDNA microarrays to investigate the transcriptional changes from early to late infection stages. This study revealed that the plant immune responses are activated at late infection stages in the compatible interaction and that this delayed defense response may be critical to establishing systemic infection (Babu et al., 2008). To study the impact of elevated ozone on the SMV–soybean compatible interaction, gene expression profiling was conducted using soybean microarrays. Increasing ozone concentrations delayed the onset of disease, and this delay corresponded to the expression of basal defense response genes (Bilgin et al., 2008). Comprehensive RNA-seq, sRNA-seq, and degradome-seq were performed in soybean during compatible and incompatible interactions with SMV in two different studies (Chen et al., 2016; Chen et al., 2017). An miRNA-mRNA regulatory network was developed based on these data to elucidate the role of miRNAs in the SMV infection process. This study further identified 71 genes that potentially play a role in defense during SMV infection (Chen et al., 2016). One of the differentially expressed genes, *Eukaryotic Elongation Initiation Factor 5A* (*eIF5A*), was further characterized, and the knockout mutant of this gene was hyper-susceptible to SMV (Chen et al., 2017). A time-course RNA-seq study during the soybean–SMV compatible interaction identified roles for SA and NLR family genes that were downregulated during compatible interaction and upregulated during incompatible interactions (Zhao et al., 2018).

Transcriptional responses during *Bean common mosaic virus* (BCMV) interaction with common bean were investigated with two known and one unknown strains of BCMV. The known strains that caused moderate disease symptoms induced more transcriptional changes than the unknown strain that caused severe symptoms (Martin et al., 2016). More recently, a study was conducted to identify miRNAs during the infection of *Mungbean yellow mosaic India virus* (MYMIV) in common bean employing high-throughput sequencing and identified 107 differentially expressed miRNAs during infection and 3,367 potential target genes for these miRNAs (Patwa et al., 2018).

#### Genomic Applications in Legume Breeding

#### Molecular Markers in Legume Plant–Pathogen Interactions

Engineering or breeding for resistance against plant diseases and nematodes is a more economical and eco-friendly approach than the use of pesticides. The selective breeding process depends on the type of trait and whether the resistance is inherited in a qualitative or quantitative manner (Poland and Rutkoski, 2016). Qualitative disease-resistance breeding involves large screening assays that are often laborious and require extensive knowledge of plant–pathogen interactions. Lately, introgression of resistance genes into selective breeding material has relied on the use of molecular markers to assist breeders in the breeding process, which is often called Marker Assisted Selection (MAS) (Cobb et al., 2019). Molecular markers, including AFLPs, simple sequence repeats (SSRs), and more commonly single nucleotide polymorphisms (SNPs), have been developed in a variety of crops and used for different breeding programs. Nucleotide binding site (NBS) profiling, a marker technology that improves the detection of molecular markers for disease resistance, was developed to identify markers by using NBS regions in the genomes (Van Der Linden et al., 2004). Due to its gene-targeting nature, NBS profiling directs a PCR reaction to NBS domains, through which a large number of *R* genes can be identified as molecular markers (Van Der Linden et al., 2004). More recently, the availability of genome sequence information for a number of plant species, including legumes, has helped the identification of molecular markers such as SNP markers that can be integrated into breeding programs for resistance screening. Some of the recent genomic resources available in legumes such as *M. truncatula*, *L. japonicus*, soybean, chickpea, and pigeon pea are described here (Sato et al., 2007; Sato et al., 2008; Schmutz et al., 2010; Young et al., 2011; Varshney et al., 2012; Varshney et al., 2013; Wang et al., 2013; Pecrix et al., 2018). 
Markers developed in a variety of ways are integrated into legume breeding programs for resistance against plant pathogenic fungi, oomycetes, bacteria, and nematodes. Though a common approach for engineering resistance into plants is through the integration of race-specific resistance against a known pathogen, this may not impart long or durable resistance, as the single *R* gene-mediated resistance can be overcome in an arms race by the rapidly evolving pathogens (Fonseca and Mysore, 2019). Hence, a better approach to create a durable resistance is through the deployment of quantitative trait loci (QTLs) through breeding strategies (Kou and Wang, 2010; Kou and Wang, 2012; Zhou et al., 2018) or through a transgenic approach using genes involved in NHR from different plant species (Fonseca and Mysore, 2019). Previous studies identified several genes involved in NHR against important legume pathogens (Fonseca and Mysore, 2019). *Phytophthora sojae* is an oomycete pathogen that causes root rot in soybean and is non-pathogenic on *M. truncatula*, alfalfa, and Arabidopsis. Penetration mutants (*pen1-1*) in Arabidopsis were found to be compromised in resistance to *Phytophthora sojae*, thus establishing the case for pre-invasive non-host resistance (Sumit et al., 2012). This gene, when transferred to soybean, resulted in enhanced resistance to *Fusarium virguliforme* (Wang et al., 2018). Similarly, the Asian soybean rust pathogen, *P. pachyrhizi*, was not able to infect *M. truncatula,* alfalfa, or Arabidopsis (Loehrer et al., 2008; Langenbach et al., 2013; Ishiga et al., 2015). The strategy of transferring genes involved in NHR was used to increase the resistance of soybean to *P. pachyrhizi*. In this case study, 10 *PING* genes were overexpressed in soybean, resulting in enhanced resistance to *P. pachyrhizi* infections (Langenbach et al., 2016b).

In this section, we will explore the use of markers and QTLs in legume resistance breeding. Bulk segregant analysis (BSA) was used to map resistance to *A. euteiches* (*AER1*) in *M. truncatula* (Pilet-Nayel et al., 2009). Meta-QTL analysis in pea resulted in the identification of 27 meta-QTLs for resistance to *A. euteiches*. Six of them were found to co-localize with six of the meta-QTL regions identified for plant height and earliness (Hamon et al., 2013). Two major QTLs *Ae-Ps7.6* and *Ae-Ps4.5* were identified in pea near-isogenic lines (NILs) that were able to delay the symptoms of oomycete pathogen *A. euteiches* in pea (Lavaud et al., 2016).

Ascochyta blight (AB) of pea is caused by a complex of fungal pathogens including *Didymella pinodes* and *Phoma medicaginis* var. *pinodella*. QTLs for resistance to the blight complex pathogens were identified in two QTL mapping populations, A26 × Rovar and A88 × Rovar. QTL peaks for the *Asc2.1*, *Asc4.2*, *Asc4.3*, and *Asc7.1* QTLs were defined by four of the pea defense candidate genes (Timmerman-Vaughan et al., 2016). These regions were identified on linkage group I in the vicinity of markers c206 and sB17-655, on linkage group III in the vicinity of markers M2P5-169 and PI39, and on linkage group VII in the vicinity of markers Z12-2400, HSP18.1, and MAPKinase (Timmerman-Vaughan et al., 2016). Similarly, AB of dry pea is predominantly caused by *Didymella pisi*. In an effort to identify QTLs that show consistency across locations and years, Jha et al. (2016) identified two QTLs, *abIII-1* and *abI-IV-2*, for AB resistance. AB is also an important disease in faba bean, resulting in yield losses of 35-40% (Atienza et al., 2016). Two QTLs governing resistance to *Ascochyta fabae* were identified on chromosome II (Af2) and chromosome II (Af3) of faba bean (Atienza et al., 2016).

Rusts in pea are caused by pea rust pathogen *Uromyces pisi*. Using DArT-Seq and 8,514 SNP markers, two QTLs, *UpDSII* and *UpDSIV*, were identified in the Linkage Groups (LGs) II and IV that controlled resistance to *Uromyces pisi* (Barilli et al., 2018). In cowpea, rust is caused by the *Uromyces vignae* pathogen. A single dominant *R* gene (*Ruv2*) that confers resistance against *U. vignae* was found to be inherited in RILs against the *U. vignae* isolate, Auv-LS (Wu et al., 2018).

Powdery mildew in pea is caused by *Erysiphe pisi*. The infection results in the formation of small diffuse spots on the upper surface of the leaves and, at advanced stages, covers the entire plant surface as a white powdery growth (Ek et al., 2005). Powdery mildew resistance in pea is governed by a pair of recessive alleles "*er1er1*" (W.H, 1948; Tiwari et al., 1997). Humphry et al. (2011) identified the *Er1* locus as *PsMLO1* and established through complementation that the loss of *PsMLO1* function conditions durable broad-spectrum powdery mildew resistance in pea. Besides the recessive allele *er1*, another recessive allele *er2* and a dominant gene *Er3* have been identified, as reviewed by Fondevilla and Rubiales (2012). *Er3*, the dominant gene conferring resistance to powdery mildew in pea, was mapped to pea linkage group IV (Cobos et al., 2018). A variety of molecular markers close to the *Er* locus were developed to screen pea genotypes for powdery mildew resistance (Ek et al., 2005; Sun et al., 2016a; Ganopoulos et al., 2018). A novel *er1-7* allele conferring pea powdery mildew resistance was identified through a 10-bp deletion in *PsMLO1* cDNA (Sun et al., 2016b). Other natural variations of *er1* alleles have been identified, and markers have been designed to screen for powdery mildew resistance in pea (Sudheesh et al., 2015; Sun et al., 2016a; Sun et al., 2016b).

Pea root rot is caused by a variety of fungal plant pathogens, one of which has been identified as *Fusarium solani* f. sp. *pisi* (*Fsp*). A strong QTL, *Fsp-Ps 2.1*, governing resistance to *Fsp* has been detected in the recombinant inbred line (RIL) populations of pea (Baccara × PI 180693). The QTL *Fsp-Ps 2.1* was identified along with two other minor-variance QTLs using three criteria: root disease severity, ratios of diseased vs. healthy shoot heights, and dry plant weights under controlled conditions (Coyne et al., 2019).

Soybean cultivation is significantly affected by SCN and by the sudden death syndrome (SDS) caused by the soilborne fungus *F. virguliforme*. SDS of soybean results in necrosis/rot of roots, while SCN infection results in yellow dwarf symptoms in soybean. Using soybean plant populations resistant to SCN and SDS, QTL mapping populations have been developed to identify QTLs for both SDS and SCN (Swaminathan et al., 2018).

Verticillium wilt, caused by the soilborne fungus *Verticillium alfalfae*, is one of the most serious diseases of alfalfa. Through the use of BSA on alfalfa genotypes and marker-trait associations based on SSRs and SNPs, 17 SNP markers linked to Verticillium wilt resistance were identified (Zhang et al., 2014). Similarly, using *M. truncatula* as a model to develop QTLs for resistance against Verticillium, a population of recombinant inbred lines (RILs) from a cross between resistant line F83005.5 and susceptible line A17 was inoculated with a potato isolate of *V. albo-atrum*, LPP0323. Following the inoculation and screening, a set of four QTLs was identified for the area under the disease progress curve and for maximum symptom score (Negahi et al., 2014). A similar study design was used to identify three distinct QTLs (MtVa1, MtVa2 and MtVa3) that confer resistance to *V. albo-atrum* in a population of A17 and DZA45.5 (Ben et al., 2013). A recent transcriptomic study conducted on the early root responses of *M. truncatula* lines A17 (resistant) and F83005.5 (susceptible) identified core transcriptional responses against root pathogens and showed that the resistant line A17 displayed higher expression of defense-related genes upon inoculation with *V. alfalfae* V31-2 (Toueni et al., 2016).

Phytophthora root rot is caused by the oomycete pathogen *Phytophthora sojae*, resulting in damping-off, yellowing, and wilting in soybean (Li et al., 2017). A QTL, *Resistance to Phytophthora sojae* (*RpsQ*), which confers resistance against *P. sojae* in soybean cultivar Qichadou 1, was mapped using SSR markers to a 118-kb region on soybean chromosome 3. This 118-kb region contains 11 candidate genes, and one of them, *Glyma.03g027200*, was found to encode a serine threonine receptor-like kinase (RLK), which was later confirmed as a likely candidate gene of *RpsQ*. In a study of *M. truncatula* roots colonized by the pathogenic oomycete *Phytophthora palmivora*, SNP markers associated with plant colonization response were identified upstream of the *Required for Arbuscule Development 1 (RAD1)* locus, a positive regulator of colonization by arbuscular mycorrhizal (AM) fungi (Rey et al., 2017). The *rad1* mutant was impaired in colonization by AM fungi as well as by *P. palmivora* (Rey et al., 2017). This is one example showing how association mapping in legumes can help identify the genes responsible for genetic resistance against an oomycete pathogen. Readers are encouraged to read the more recent review on fungal root diseases in grain legumes and the implications of plant genetic variation in plant breeding (Wille et al., 2019).
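The area under the disease progress curve (AUDPC), used as a phenotype in the Verticillium QTL studies above, is commonly computed with the trapezoidal rule over repeated symptom scores. The following is a minimal sketch only: the function name, the scoring days, and the 0-4 symptom scale are hypothetical and are not taken from the cited studies.

```python
def audpc(days, scores):
    """Area under the disease progress curve by the trapezoidal rule.

    days   -- scoring dates (e.g. days after inoculation), strictly increasing
    scores -- disease scores recorded on those dates
    """
    if len(days) != len(scores) or len(days) < 2:
        raise ValueError("need at least two paired observations")
    area = 0.0
    for i in range(1, len(days)):
        interval = days[i] - days[i - 1]
        if interval <= 0:
            raise ValueError("time points must be strictly increasing")
        # Trapezoid between two consecutive scoring dates
        area += interval * (scores[i] + scores[i - 1]) / 2.0
    return area


# Hypothetical scores for a single line, on an assumed 0-4 scale
example_days = [7, 14, 21, 28]
example_scores = [0.0, 1.0, 2.5, 4.0]
print(audpc(example_days, example_scores))
```

A single AUDPC value per line summarizes both how early and how severely symptoms develop, which is why it is often mapped alongside the maximum symptom score.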

Anthracnose of lentils is caused by *Colletotrichum lentis* and accounts for 70% of the crop loss in lentils (Bhadauria et al., 2019). Recent genomic sequencing studies on one of the pathogenic races of *C. lentis* (virulent race 0) combined with QTL mapping led to the identification of a single QTL, *qClVIR-11*, located on mini chromosome 11, thus explaining 85% of the variability in virulence of the *C. lentis* population (Bhadauria et al., 2019).

Cowpea is one of the most widely cultivated legume crops and is susceptible to many biotic stresses caused by nematodes, bacteria, and fungi. Root-knot nematodes (RKN) are the most important pests of cowpea, resulting in huge losses due to their interference with the root architecture, which results in poor development of the plants (Santos et al., 2018). Previously, two resistance genes, *Resistance to Root knot* (*Rk*) and *Rk2*, were identified to confer resistance against RKN in cowpea (Das et al., 2010; Ndeve et al., 2019). A recent QTL mapping study using the RIL population 524B × IT84S-2049 in cowpea resulted in the identification of a major QTL, *QRk-vu9.1*, associated with resistance to *Meloidogyne javanica* reproduction (Santos et al., 2018). This QTL was mapped on linkage group LG9 at position 13.37 cM using egg production data. Interestingly, the mapped intervals for this QTL corresponded with six *TIR-NBS-LRR* (*TNL*) genes that were identified using transcriptomic analysis between NILs resistant and susceptible to RKN (Santos et al., 2018). Most of the examples cited here are early-stage studies toward achieving economic benefits through the development of disease-resistant cultivars.

#### Use of GWAS in Legume–Pathogen Interactions

In plant species, the genetic variation underlying phenotypic data can be quantified, and genome-wide association studies (GWAS) can be applied to identify genes and associate them with phenotypes. This type of GWAS analysis for SNP discovery is made possible through the development of several target-enrichment or reduction-of-genome-complexity methods such as Genotyping-by-Sequencing (GBS) (Elshire et al., 2011; Glaubitz et al., 2014) or restriction site-associated DNA sequencing (RADseq) (Davey and Blaxter, 2010). RADseq combines simple molecular biology steps: restriction digestion of DNA into fragments, tagging of the fragments with identifier tags, and NGS (Das et al., 2010). Recently, Diversity Arrays Technology (DArT) in combination with NGS platforms, known as DArTseq™, was used to develop a relatively large number of polymorphic markers to build dense genetic maps (Kilian et al., 2012; Kilian and Graner, 2012). In the GBS methodology, genome complexity is reduced through the use of methylation-sensitive restriction enzymes. Such a method helps avoid the sequencing of repetitive regions and can aid in the sequencing of low copy regions with high efficiency (Elshire et al., 2011). The majority of the GWAS studies described below in this section followed the GBS methodology to generate the genotyping data for SNP identification and exclusively used the GWAS pipelines developed for model and non-model plants (Lipka et al., 2012; Glaubitz et al., 2014; Tang et al., 2016). A variety of experiments have been conducted using GWAS to study the variations associated with the phenotypes in Arabidopsis (Atwell et al., 2010), maize (Tian et al., 2011), rice (Huang et al., 2010), soybean (Lam et al., 2010), and *M. truncatula* (Branca et al., 2011). In this section, we will review GWAS studies of legumes in relation to disease resistance phenotypes.
Alternatively, HapMap accessions that are sequenced by whole genome sequencing provide an excellent opportunity for the identification of SNPs across the HapMap populations (http://www.medicagohapmap.org/). Some of the *M. truncatula* HapMap accessions (288 accessions) were used to map flowering time traits linked to nitrogen fixation and for identifying other traits of interest (Stanton-Geddes et al., 2013; Kang et al., 2015; Kang et al., 2019). GWAS was used to estimate linkage disequilibrium levels and identify quantitative resistance loci (QRL) controlling resistance to both anthracnose and angular leaf spot (ALS) diseases in 180 accessions of common bean. The study resulted in the identification of 21 and 17 statistically significant SNPs associated with anthracnose and ALS diseases of common bean, respectively (Perseguini et al., 2016). Bonhomme and colleagues (Bonhomme et al., 2014) used high-density SNPs (~5.1 million single nucleotide polymorphisms) to perform GWAS for Aphanomyces root rot resistance in 179 HapMap accessions of *M. truncatula*. With the use of GWAS, they were able to identify two loci on chromosome 3, with candidate SNPs in the promoter and coding regions of an F-box protein coding gene (Bonhomme et al., 2014). GWAS was recently applied to 175 *Pisum sativum* lines that were evaluated for resistance to *A. euteiches* and genotyped using 13,204 SNPs from the GenoPea Infinium® BeadChip (Desgroux et al., 2016). The study resulted in the identification of 52 QTLs of small size intervals associated with resistance to *A. euteiches* and further validated six of the seven previously reported QTLs (Desgroux et al., 2016). A similar GWAS study was performed to find the association between the root system architecture of pea and *A. euteiches* resistance by using 266 pea lines that varied in both traits (root system architecture and disease resistance).
Genotyping the lines with 14,157 SNP markers resulted in the identification of one significant SNP mapped to major QTL *Ae-Ps7.6* associated with both resistance and root system architecture (RSA) traits (Desgroux et al., 2017).
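At its core, the association step in the studies above tests, marker by marker, whether allele dosage explains variation in the phenotype. The following is a minimal single-marker sketch under strong simplifying assumptions: it uses ordinary least-squares regression on dosages coded 0/1/2, the function name and toy data are hypothetical, and it omits the population structure and kinship corrections that mixed-model GWAS pipelines typically include.

```python
import math


def marker_association(dosages, phenotypes):
    """Regress a phenotype on allele dosage (0, 1, 2) for one SNP.

    Returns (slope, r_squared, t_statistic) for the simple linear model
    phenotype = intercept + slope * dosage.
    """
    n = len(dosages)
    if n != len(phenotypes) or n < 3:
        raise ValueError("need at least three paired observations")
    mean_g = sum(dosages) / n
    mean_p = sum(phenotypes) / n
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((p - mean_p) ** 2 for p in phenotypes)
    sxy = sum((g - mean_g) * (p - mean_p) for g, p in zip(dosages, phenotypes))
    if sxx == 0 or syy == 0:
        raise ValueError("monomorphic marker or constant phenotype")
    slope = sxy / sxx                     # estimated allele effect
    r_squared = min(1.0, sxy ** 2 / (sxx * syy))
    if r_squared == 1.0:
        t_stat = math.copysign(math.inf, slope)
    else:
        # t statistic with n - 2 degrees of freedom
        t_stat = math.copysign(
            math.sqrt(r_squared * (n - 2) / (1.0 - r_squared)), slope
        )
    return slope, r_squared, t_stat


# Hypothetical data: six lines, one SNP, one disease score per line
toy_dosages = [0, 0, 1, 1, 2, 2]
toy_scores = [1.0, 2.0, 2.0, 3.0, 3.0, 4.0]
print(marker_association(toy_dosages, toy_scores))
```

In a genome-wide scan this test would be repeated for every SNP and the resulting statistics corrected for multiple testing; the sketch is meant only to make the per-marker logic concrete.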

Brown stem rot (BSR) of soybean, caused by the soilborne fungus *Cadophora gregata*, affects soybean production in the Northern United States, Canada, and Brazil. Using GWAS, a BSR resistance QTL has been identified on chromosome 16, located between 32.8 and 33.1 Mb based on the Glyma2.0 assembly (Rincker et al., 2016). More importantly, this region also maps to the previously identified *Resistance to Brown Stem Rot* genes (*Rbs*) *Rbs1*, *Rbs2*, and *Rbs3* (Rincker et al., 2016). This narrow resistance QTL interval could be useful for MAS breeding programs. A comprehensive global view of disease resistance loci in soybean against multiple plant pathogens was presented through the use of GWAS on public Germplasm Resources Information Network and public SNP data (SoySNP50K; Chang et al., 2016). Using GWAS, the authors identified significant novel SNPs associated with resistance to: bacterial pustule caused by *Xanthomonas axonopodis* pv. *glycines*; BSR caused by fungus *C. gregata*; Diaporthe stem canker caused by *Diaporthe phaseolorum* var. *caulivora* and *D. phaseolorum* var. *meridionalis*; SDS caused by *F. virguliforme*; ASR caused by *P. pachyrhizi*; SCN caused by reniform nematode, *Rotylenchulus reniformis*; and bean pod mottle virus (Chang et al., 2016).

GWAS was applied to detect SNPs significantly associated with resistance to *H. glycines* in the core collection of the common bean. There were 84,416 SNPs identified in 363 common bean accessions (Wen et al., 2019). GWAS identified SNPs on chromosome 1 that were significantly associated with resistance to *H. glycines* type 2.5.7. These SNPs were in linkage disequilibrium with a gene cluster orthologous to the three genes at the *Resistance to H. glycines* (*Rhg1*) locus in soybean. A novel signal on chromosome 7 was detected and associated with resistance to *H. glycines* type 1.2.3.5.6.7. Genomic predictions for resistance to these two *H. glycines* types in common bean achieved prediction accuracy of 0.52 and 0.41, respectively (Wen et al., 2019).

The molecular markers developed in legume species such as *M. truncatula*, pea, lentil, faba bean, and lupin can be used in other legumes. Recently, the transferability of molecular markers was tested in legumes such as chickling pea (*Lathyrus cicera*) and grass pea (*L. sativus*) (Almeida et al., 2014). During this study, ~130 markers were successfully cross-amplified in *L. cicera* and *L. sativus* with an efficiency of 55% for gene-based markers (Almeida et al., 2014). Such comparative mapping can greatly boost the use of resources and expand the knowledge base in other related species as well.

#### Gene Editing in Legume–Pathogen Interactions

Gene introgression through breeding often comes with some undesirable trait inheritance that can perturb a desired outcome. Hence, the integration of plant breeding with precise editing of target genes can efficiently aid in the implementation of pathogen resistance in plants. This precision editing has recently been made possible through the use of programmable sequence-specific nucleases such as zinc finger nucleases (ZFNs), transcription activator-like effector nucleases (TALENs), and, more recently, the clustered regularly interspaced short palindromic repeat (CRISPR)-associated protein 9 (Cas9)-based genome editing tool (CRISPR/Cas9). In the CRISPR/Cas9 system, target site mutations are generated through base-pairing of engineered single-guide RNAs (sgRNAs) with the target DNA sites. More information on the development and applications of the CRISPR-Cas9 technologies in plant genomes can be found in previous reviews (Cong et al., 2013; Jiang et al., 2013; Perez-Pinera et al., 2013; Shan et al., 2013; Belhaj et al., 2015; Piatek et al., 2015; Kleinstiver et al., 2016; Ma et al., 2016; Tsai and Joung, 2016; Knott and Doudna, 2018). CRISPR/Cas9 or TALEN entry vectors were developed for gateway cloning in soybean and *M. truncatula* (Curtin et al., 2018). Several web tools, such as E-CRISP (Heigwer et al., 2014) and CHOPCHOP (Montague et al., 2014), are available for the identification of both CRISPR-Cas9 target sites and off-target sites. In legumes, one such tool was developed for CRISPR/Cas9 design (Michno et al., 2015), and a methodology to perform gene editing in *M. truncatula* is also available (Curtin, 2018).
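To illustrate what target site identification involves at its simplest, the sketch below scans a DNA sequence for 20-nt protospacers immediately followed by an NGG protospacer adjacent motif (PAM), the motif recognized by the commonly used *Streptococcus pyogenes* Cas9. This is an illustrative sketch only: the function name and example sequence are hypothetical, only the given strand is scanned, and no off-target scoring is attempted, unlike the dedicated design tools mentioned above.

```python
import re


def find_cas9_targets(sequence, protospacer_len=20):
    """List candidate Cas9 sites (protospacer + NGG PAM) on the given strand.

    Returns a list of (start_index, protospacer, pam) tuples, with
    start_index being the 0-based position of the protospacer.
    """
    seq = sequence.upper()
    # A lookahead is used so that overlapping candidate sites are all reported
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % protospacer_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Hypothetical 27-nt sequence containing a single N20-NGG site
example = "TT" + "ACGTACGTACGTACGTACGT" + "TGG" + "AA"
for start, protospacer, pam in find_cas9_targets(example):
    print(start, protospacer, pam)
```

A practical design workflow would also scan the reverse complement and rank candidates by predicted specificity and efficiency.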

Using the CRISPR/Cas9 technology, severe loss-of-function mutants were developed in the necrotrophic fungal pathogen *Sclerotinia sclerotiorum*. Using the previously characterized *Ssoah1* gene as the gene target, insertional gene mutants were generated that were found to be less virulent on soybean, *Brassica* spp., and tomato (Li et al., 2018a). Similarly, gene editing was adapted to *P. sojae* to generate mutants by manipulating the *Avr4/6* genes of the pathogen (Fang and Tyler, 2016). These studies were important for determining the function of fungal or oomycete genes in pathogen virulence.

Recently, it has been demonstrated that, by using CRISPR/Cas9 genome editing of promoters, diverse cis-regulatory alleles can be generated and that quantitative variation can be an invaluable tool for breeding. A genetic scheme designed by Rodriguez-Leal et al. (2017) exploits transgenerational heritability of Cas9 activity in heterozygous loss-of-function mutant backgrounds. Such a system could also be used in the screening of QTLs for disease resistance if we knew the functions of the cis-regulatory alleles and could be a valuable tool for breeding (Rodriguez-Leal et al., 2017). This concept of generating variations was made possible through the use of epimutagenesis, a method that rapidly generates DNA methylation variation through random demethylation. This ability to manipulate plant methylomes to create epigenetically distinct individuals could be an invaluable breeding tool (Ji et al., 2018). Even though currently not many legume plants have been gene-edited to confer resistance against pathogens, in the future, we anticipate that gene editing will be used more frequently to engineer legume plants with yield-saving disease resistance.

# CONCLUSION

The advancements in cost-effective sequencing technologies have enabled global transcription profiling during plant–pathogen interactions in legumes and identified several pathways and candidate genes responsible for either disease susceptibility or resistance (**Table 1**). This progress has enabled a broader understanding of both plant and pathogen strategies during resistant and susceptible disease interactions. These studies have identified a repertoire of candidate genes that play key roles in resistance or disease processes. However, functional studies to evaluate their roles in plant–pathogen interactions are limited in some legume species, largely due to lack of mutant resources and appropriate methods for gene function validation. The *Tnt1*-mediated insertion mutagenesis in *M. truncatula* has generated ~21,000 lines with ~90% gene-tagging coverage in the genome (Tadege et al., 2008; Cheng et al., 2014; Sun et al., 2019). This genetic resource has been utilized to evaluate some candidate genes involved in plant–pathogen interactions. Similarly, several genetic resources are being developed for other legume species such as soybean and *L. japonicus* (Sato et al., 2007; Libault and Dickstein, 2014). Functional characterization of a few candidate genes has been achieved through RNAi methods and recombinant gene expression studies (Singh et al., 2013). Several genes identified in microarray analysis of SCN–soybean interactions have been characterized by overexpression studies and grouped into genes that enhance, reduce, or have no impact on disease susceptibility (Matthews et al., 2013). Such studies will augment the genomics data generated through whole-genome transcriptional studies. A variety of molecular markers, including AFLPs, SSRs, and SNPs, have been developed and then used to identify QTLs governing resistance to fungal and bacterial pathogens and to root-knot nematodes (**Table 2**). More recently, the underlying phenotypic variations combined with genotype information (SNPs) have been used for GWAS and are being used extensively in legume crops to identify the QTLs associated with the resistance loci against plant microbes. Precise genome editing technologies such as CRISPR-Cas9 have been employed to effectively knock out *P. sojae* effector *Avr4/6* and uncover the functional role of the corresponding resistance gene *RPS4/6* (Fang and Tyler, 2016). The utilization of these resources will help to better understand the biological function of genes identified through various genomic approaches. Introgression of plant defense-related traits identified through genomics is in its infancy and could lead to economic success in the next few years. We predict that the use of the genomics tools in breeding mentioned in this review, such as QTL introgression, GWAS, and CRISPR/Cas9 editing of genomes for generating plant variation, will become increasingly popular in the next few years and will further advance our understanding as well as define our approaches to making improved cultivars in legumes. Genomic studies of plant–pathogen interaction will continue to provide us with novel disease resistance or defense-related genes that can be incorporated into elite legume cultivars, either by classical breeding or by biotechnological approaches.

#### TABLE 1 | Summary of genomic methods and legume–pathogen interaction studies.

#### TABLE 2 | Summary of QTL/marker analysis in legume–pathogen interaction studies.

# AUTHOR CONTRIBUTIONS

PK and KM conceived the idea. PK and RN wrote the manuscript. KM edited the manuscript.

# FUNDING

Projects in the KSM laboratory are funded by the Noble Research Institute, LLC, and the National Science Foundation. We thank Courtney Leeper (Noble Research Institute) for editing the manuscript.



**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Kankanala, Nandety and Mysore. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Genome Wide Association Study and Genomic Selection of Amino Acid Concentrations in Soybean Seeds

*Jun Qin<sup>1,2</sup>, Ainong Shi<sup>2</sup>, Qijian Song<sup>3</sup>, Song Li<sup>4</sup>, Fengmin Wang<sup>1</sup>, Yinghao Cao<sup>5</sup>, Waltram Ravelombola<sup>2</sup>, Qi Song<sup>4</sup>, Chunyan Yang<sup>1</sup> and Mengchen Zhang<sup>1</sup>\**

<sup>1</sup> National Soybean Improvement Center Shijiazhuang Sub-Center, North China Key Laboratory of Biology and Genetic Improvement of Soybean, Ministry of Agriculture, Laboratory of Crop Genetics and Breeding of Hebei, Cereal & Oil Crop Institute, Hebei Academy of Agricultural and Forestry Sciences, Shijiazhuang, China, <sup>2</sup> Department of Horticulture, University of Arkansas, Fayetteville, AR, United States, <sup>3</sup> Soybean Genomics and Improvement Lab, USDA-ARS, Beltsville, MD, United States, <sup>4</sup> Crop and Soil Environmental Science, Virginia Tech, Blacksburg, VA, United States, <sup>5</sup> Bioinformatics Center, Allife Medical Science and Technology Co., Ltd, Beijing, China

#### Edited by:

Jose C. Jimenez-Lopez, Experimental Station of Zaidín (EEZ), Spain

#### Reviewed by:

Rafael Nisa-Martínez, Zaidin Experimental Station (EEZ), Spanish National Research Council (CSIC), Spain

Jose V. Die, Departamento de Genética, Universidad de Córdoba, Spain

\*Correspondence:

Mengchen Zhang mengchenzhang@hotmail.com

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 30 April 2019 Accepted: 17 October 2019 Published: 15 November 2019

#### Citation:

Qin J, Shi A, Song Q, Li S, Wang F, Cao Y, Ravelombola W, Song Q, Yang C and Zhang M (2019) Genome Wide Association Study and Genomic Selection of Amino Acid Concentrations in Soybean Seeds. Front. Plant Sci. 10:1445. doi: 10.3389/fpls.2019.01445

Soybean is a major source of protein for human consumption and animal feed. Releasing new cultivars with high nutritional value is one of the major goals in soybean breeding. To achieve this goal, genome-wide association studies of seed amino acid contents were conducted based on 249 soybean accessions from China, US, Japan, and South Korea. The accessions were evaluated for 15 amino acids and genotyped by sequencing. Significant genetic variation was observed for amino acids among the accessions. Among the 231 single nucleotide polymorphisms (SNPs) significantly associated with variations in amino acid contents, fifteen SNPs were localized near 14 candidate genes involved in amino acid metabolism. The amino acids were classified into two groups with five in one group and seven amino acids in the other. Correlation coefficients among the amino acids within each group were high and positive, but the correlation coefficients of amino acids between the two groups were negative. Twenty-five SNP markers associated with multiple amino acids can be used to simultaneously improve the concentrations of multiple amino acids in soybean. Genomic selection analysis of amino acid concentration showed that the selection efficiency for amino acids based on the markers significantly associated with all 15 amino acids was higher than that based on random markers or markers associated with only an individual amino acid. The identified markers could facilitate selection of soybean varieties with improved seed quality.

Keywords: Glycine max, genome-wide association study, genomic selection, genotyping by sequencing, amino acid concentration, single nucleotide polymorphism

# INTRODUCTION

Soybean [*Glycine max* (L.) Merr.] is a major source of protein for humans and livestock worldwide. For the past several decades, soybean meal has been the leading protein feed source for animal and poultry production operations because of its high concentration of protein. Poultry and livestock industries use about 68 and 77% of the soybean meal consumed in the European Union and United States, respectively<sup>1,2</sup>. A major function of proteins in nutrition is to supply adequate amounts of required amino acids (Friedman and Brandon, 2001). Thus, genetic improvement of amino acid composition and balance is an important goal in soybean breeding. Developing new molecular markers for marker-assisted selection (MAS) and genomic selection (GS) of amino acid composition in soybean will help to achieve this goal.

**Abbreviations:** SNP, Single nucleotide polymorphism; GWAS, Genome-wide association study; AA, Amino acids; Ala, Alanine; Arg, Arginine; Asp, Aspartic acid; Glu, Glutamic acid; Gly, Glycine; His, Histidine; Ile, Isoleucine; Leu, Leucine; Lys, Lysine; Phe, Phenylalanine; Pro, Proline; Ser, Serine; Thr, Threonine; Tyr, Tyrosine; Val, Valine; SSR, Simple sequence repeat; MAS, Marker-assisted selection.

Quantitative trait loci (QTL) mapping of amino acids has been reported in soybean. Panthee et al. (2006) identified 32 simple sequence repeat (SSR) markers associated with 16 amino acids in soybean seeds based on 101 F6-derived recombinant inbred lines (RIL) from a cross of N87-984-16 × TN93-99. Fallen et al. (2013) reported ten QTLs associated with 17 amino acids, as well as three genomic regions on chromosome 13 (4.89, 21.51, 40.69 cM) that controlled multiple amino acids, in 282 F5:9 RILs derived from a cross of Essex × Williams 82. As a sole dietary source of protein, soybean is deficient in lysine (Lys), threonine (Thr), methionine (Met), and cysteine (Cys) for poultry and swine. Warrington et al. (2015) conducted QTL analysis for these four amino acids in the Benning × Danbaekkong soybean population with 98 SSRs and 323 single nucleotide polymorphism (SNP) markers, and detected two QTLs on chromosomes 8 and 20 for Lys; three on chromosomes 9, 17, and 20 for Thr; four on chromosomes 6, 9, 10, and 20 for Met; and one on chromosome 10 for Cys (Van Warrington, 2011; Warrington et al., 2015). Khandaker et al. (2015) analyzed the "MD96-5722" × "Spencer" RIL population and identified 13 QTLs associated with amino acids. However, reports on the genetic diversity of amino acids and on the mapping of QTLs controlling amino acids in soybean germplasm are limited.

Because SSR, SNPs, and indels are abundant in plants and can be assayed with high-throughput technology, the markers have been widely used for genetic linkage mapping, association studies, diversity analysis, and tagging of genes controlling important traits (Liang et al., 2010; Lehne et al., 2011; Li et al., 2014; Shi et al., 2016; Taranto et al., 2016; Zatybekov et al., 2017; Qin et al., 2017a; Qin et al., 2017b; Chang et al., 2018). Genotyping by sequencing (GBS) takes advantage of the next-generation sequencing platforms and utilizes a highly-multiplexed system to assay DNA variants from reduced representation DNA libraries of plant materials (Elshire et al., 2011; Sonah et al., 2013). As a cost-effective technique, GBS has been successfully used in implementing genome wide association study (GWAS), genomic diversity study, genetic linkage analysis, molecular marker discovery and GS in plant breeding programs (Heslot et al., 2013; He et al., 2014; Qin et al., 2016; Shi et al., 2017).

With decreased genotyping costs and improved statistical methods, GWAS and GS offer new approaches for genetic improvement of complex traits in crop species (Bernardo and Yu, 2007; Li et al., 2013; Morris et al., 2013; Yano et al., 2016; Zhang et al., 2017). GWAS is a powerful tool for overcoming limitations of traditional QTL mapping (Luo et al., 2019). To date, it has been used to identify molecular markers for a broad range of complex traits in different plant species including Arabidopsis (Angelovici et al., 2017), wheat (Peng et al., 2018), maize (Li et al., 2013; Deng et al., 2017), rice (Huang et al., 2010; Yano et al., 2016), soybean (Fang et al., 2017), and sorghum (Morris et al., 2013). In soybean research, GWAS has been applied to agronomic traits (Zatybekov et al., 2017; Chang et al., 2018), seed quality (Zhang et al., 2018), seed traits (Xia et al., 2018), phosphorus efficiency (Lü et al., 2018), and disease resistance (Qin et al., 2017b; Hanson et al., 2018), among other traits. Soybean is cultivated globally primarily for its protein and oil, and soybean protein is a complete protein because it contains all the essential amino acids required for human health. Numerous studies have reported QTL mapping and GWAS for protein (Li et al., 2018; Li et al., 2019). GS selects desired individuals within a population based on genomic estimated breeding values (GEBVs) (Hayes et al., 2009) and has been shown to be more efficient than traditional MAS for the improvement of traits controlled by QTL with minor effects (Bernardo and Yu, 2007; Heffner et al., 2009; Shikha et al., 2017; Zhang et al., 2017).
GS has been applied to various agronomic traits and disease resistance in maize (Bernardo, 1996; Piepho, 2009; Albrecht et al., 2011; Technow et al., 2013; Shikha et al., 2017), rice (Onogi et al., 2015; Spindel et al., 2015; Duhnen et al., 2017), soybean (Jarquin et al., 2016; Xavier et al., 2016), and wheat (Heffner et al., 2011; Rutkoski et al., 2011; Poland et al., 2012; Battenfield et al., 2016), among other crops. Previous studies evaluated the efficiency of GS prediction using a cross-validation approach (Dawson et al., 2013; Michel et al., 2016) and suggested that the size of the training population was critical (Xavier et al., 2016). Zhang et al. (2018) conducted GWAS for seed composition, including protein, oil, fatty acids, and amino acids, using 313 diverse soybean germplasm accessions genotyped with the high-density Illumina Infinium SoySNP50K BeadChip SNP array (Song et al., 2013). After filtering, a total of 31,850 SNPs with minor allele frequency (MAF) ≥5% were used for GWAS in their analysis, and 87 chromosomal regions were identified as associated with seed composition, explaining 8–89% of genetic variances.
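A minor allele frequency filter of the kind mentioned above (MAF ≥5%) can be sketched as follows. This is a minimal illustration and not the procedure of any cited study: the function names, marker names, and toy genotypes are hypothetical, SNPs are assumed to be biallelic with alternate-allele dosages coded 0/1/2, and `None` is assumed to mark a missing call.

```python
def minor_allele_frequency(dosages):
    """MAF of one biallelic SNP from 0/1/2 dosages; None marks a missing call."""
    called = [d for d in dosages if d is not None]
    if not called:
        return 0.0  # no genotype calls: treat the marker as uninformative
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def filter_by_maf(snp_matrix, threshold=0.05):
    """Keep only the SNPs whose minor allele frequency reaches the threshold."""
    return {
        snp: dosages
        for snp, dosages in snp_matrix.items()
        if minor_allele_frequency(dosages) >= threshold
    }


# Hypothetical genotypes: marker name -> dosages across accessions
toy_snps = {
    "snp_a": [0, 0, 1, 2, None, 1],    # common variant, one missing call
    "snp_b": [0] * 19 + [1],           # rare allele, 1 copy in 40 chromosomes
    "snp_c": [2] * 18 + [1, 0],        # alternate allele is the major allele
}
print(sorted(filter_by_maf(toy_snps)))
```

Rare alleles are usually removed before association testing because too few accessions carry them for an effect to be estimated reliably.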

However, few GWAS and no GS studies for amino acid concentrations in soybean have been reported so far. The main objectives of this study were to (1) evaluate amino acid composition in soybean germplasm from China, Korea, Japan, and the U.S.; (2) identify SNP markers associated with amino acid concentrations of soybean *via* GWAS; and (3) explore the efficiency of GS for amino acids in soybean breeding. The newly identified markers are anticipated to facilitate MAS and GS of nutritional traits in soybean, and the soybean accessions with high concentrations of amino acids will be potential parents for soybean breeding.

#### MATERIALS AND METHODS

#### Panel for Genome-Wide Association Analysis and Genomic Selection

A panel of 249 soybean accessions was chosen for this study (**Supplementary Table 1**). These accessions were collected from China, the United States, South Korea, and Japan, with 169 (67.9% of 249), 75 (30.1%), 3 (1.2%), and 2 (0.8%) accessions, respectively (**Supplementary Table 1**).

<sup>1</sup> http://www.soystats.com, accessed on August 10, 2019

<sup>2</sup> https://www.fediol.eu, accessed on August 10, 2019

#### DNA Extraction, GBS, and SNP Discovery

Genomic DNA was extracted from freeze-dried fresh leaves of soybean plants using the CTAB (hexadecyltrimethyl ammonium bromide) method (Kisha et al., 1997). The DNA library was prepared from fragments digested with the restriction enzyme ApeKI following the GBS protocol described by Elshire et al. (2011), and DNA sequencing was performed using the GBS method (Elshire et al., 2011; Sonah et al., 2013). The 90 bp paired-end sequences were obtained for each soybean genotype at the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, Beijing, China. The GBS dataset contained 3.26 M short reads, or 283.74 Mbp of sequence, for each accession. The short reads were aligned to the soybean whole genome sequence (Wm82.a1.v1)<sup>3,4</sup> using SOAPaligner/soap2, and SOAPsnp v. 1.05 was used for SNP calling (Li et al., 2009; Li, 2011).

Approximately half a million SNPs were discovered from the 249 soybean germplasm accessions. SNPs were eliminated if the MAF was less than 5%, or if missing and ambiguous allele calls exceeded 15%. After filtering, 23,279 SNPs remained for genetic diversity and association analyses.
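The filtering rule above can be illustrated with a minimal sketch. It assumes genotypes coded as alternate-allele counts (0, 1, 2) with `None` standing in for missing or ambiguous calls; the coding, function names, and toy SNPs are illustrative assumptions and not the pipeline used in the study.

```python
def minor_allele_frequency(genotypes):
    """MAF from diploid calls coded 0/1/2 (alternate-allele count); None = missing."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def missing_rate(genotypes):
    """Fraction of accessions with a missing or ambiguous call."""
    return sum(g is None for g in genotypes) / len(genotypes)


def keep_snp(genotypes, min_maf=0.05, max_missing=0.15):
    """Keep a SNP only if MAF >= 5% and missing calls <= 15% (thresholds from the text)."""
    return (minor_allele_frequency(genotypes) >= min_maf
            and missing_rate(genotypes) <= max_missing)


# Hypothetical toy data: three SNPs scored on a handful of accessions.
snps = {
    "snp_common": [0, 1, 2, 1, 0, 2, 1, 1, 0, 2],
    "snp_rare": [0] * 19 + [1],                           # MAF = 1/40
    "snp_sparse": [0, 2, None, None, 1, 2, 0, 1, 2, 1],   # 20% missing
}
kept = [name for name, calls in snps.items() if keep_snp(calls)]
print(kept)
```

Only the first toy SNP passes both thresholds; the second fails on MAF and the third on missing data.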

#### Amino Acid Content Determination and Phenotypic Data Analysis

Soybean germplasm was planted in June 2012 at three locations in Hebei province, Shijiazhuang (114°83′E, 38°03′N), Cangzhou (116°7′E, 38°03′N), and Handan (114°48′E, 36°62′N), in a randomized complete block design with three replications. Each plot consisted of six rows with a row length of three meters and a row spacing of 50 cm in all trials. The density was 225,000 plants per ha. The soil at Shijiazhuang was cinnamon soil, with organic matter, available P, and available K concentrations of 1.74%, 29.9 mg/kg, and 94.3 mg/kg, respectively. The soil at Cangzhou was light loam, with organic matter, available P, and available K concentrations of 1.0–1.2%, 15 mg/kg, and 100 mg/kg, respectively. The soil at Handan was fluviatile loam, with organic matter, available P, and available K concentrations of 1.6%, 19.3 mg/kg, and 156.2 mg/kg, respectively. The plots were irrigated once at the seed-filling stage. Plants were harvested after 95% of the leaves had fallen. Ten plants were randomly chosen from the middle of each plot for seed trait analysis.

A total of 15 amino acids, Ala, Arg, Asp, Glu, Gly, His, Ile, Leu, Lys, Phe, Pro, Ser, Thr, Tyr, and Val in soybean seeds were measured by Biochrom 30 amino acid analyzer (Biochrom Ltd, Cambridge, UK) using the acid hydrolysis method (Davies and Thomas, 1973; Tsugita and Scheffler, 1982). Analysis was carried out by ion exchange chromatography under the experimental conditions recommended for protein hydrolysates. Each sample containing 0.1 g soybean seed powder was acid hydrolyzed with 10 ml of 6 N HCl at 110°C for 22 h in a 15 ml vacuum-sealed glass tube. The top hydrolysate in the tube was filtered into another 50 ml tube, and water was added to the tube. A total of 1 ml liquid from the 50 ml tube was transferred to a 1.5 ml tube and dried at 55°C, re-dissolved with 1 ml loading buffer and measured in the analyzer. The amino acid composition was calculated from the standard area obtained from the integrator and expressed as a percentage of the total weight.

Statistical analyses of the 15 amino acids were performed with JMP Genomics 7 (SAS Institute, Cary, NC, USA)<sup>5</sup> (Sall et al., 2012). The mean, range, standard deviation (SD), standard error (SE), and coefficient of variation (CV) were estimated for each amino acid concentration using 'Tabulate'; the distributions of amino acid concentrations were drawn using 'Distribution' in JMP Genomics 7.
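For readers without JMP Genomics, the summary statistics listed above can be sketched with the Python standard library. The sample SD (n − 1 denominator), the SE as SD/√n, and the CV as 100 × SD/mean are conventional definitions assumed here; the toy values are hypothetical and are not data from the study.

```python
import math
import statistics


def describe(values):
    """Mean, range, SD, SE, and CV (%) for one amino acid across accessions."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample SD, n - 1 in the denominator
    return {
        "mean": mean,
        "min": min(values),
        "max": max(values),
        "range": max(values) - min(values),
        "sd": sd,
        "se": sd / math.sqrt(n),
        "cv_percent": 100 * sd / mean,
    }


# Hypothetical concentrations for a single amino acid.
toy = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
stats = describe(toy)
for key, value in stats.items():
    print(f"{key}: {value:.3f}")
```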

#### Population Structure, Genetic Diversity, and Association Analysis

STRUCTURE, a program that uses a Bayesian method to analyze multi-locus data in population genetics (Pritchard et al., 2000)<sup>6</sup>, was used to analyze population structure and to create the Q-matrix for association analysis. We used the default parameters of the STRUCTURE 2.0 software: Admixture Model; Allele Frequencies Correlated; and Compute Probability of the Data (Kaeuffer et al., 2007). The number of subpopulations (K) was assumed to be between 1 and 12. Each K was run 10 times, with a Markov Chain Monte Carlo (MCMC) burn-in length of 20,000 and 20,000 MCMC iterations after the burn-in. For each simulated K, the statistic delta K was calculated using the formula described by Evanno et al. (2005). The optimal K was determined using STRUCTURE HARVESTER<sup>7</sup> (Earl, 2012). After the optimal K was determined, a Q-matrix was obtained and used in TASSEL 5 (Bradbury et al., 2007) for association analysis. Each soybean accession was then assigned to a cluster (Q) based on the probability that the genotype belonged to that cluster, with a cut-off probability of 0.5 for assignment. Based on the optimum K, a bar plot with 'Sort by Q' was obtained to visualize the population structure among the 249 accessions.

Genetic diversity was also assessed, and the phylogenetic tree was drawn using MEGA 6 (Tamura et al., 2013) based on the Maximum Likelihood (ML) tree method (Shi et al., 2016) with the following parameters: Test of phylogeny: bootstrap method with 500 bootstrap replications; Model/Method: General Time Reversible model; Rates among Sites: Gamma distributed with Invariant sites (G/I); Number of Discrete Gamma Categories: 6; Gaps/Missing Data Treatment: Use all sites; ML Heuristic Method: Subtree-Pruning-Regrafting-Extensive (SPR level 5); Initial Tree for ML: Make initial tree automatically (Neighbor Joining); and Branch Swap Filter: Moderate. The population structure and the cluster information were imported into MEGA 6 for combined analysis of genetic diversity. For the sub-tree of each Q (cluster), the shape of the 'Node/Subtree Marker' and the 'Branch Line' was drawn using the same color scheme as in the STRUCTURE analysis.
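The delta K statistic mentioned above can be sketched as follows. This version takes the absolute second-order difference of the mean log-likelihoods, |L(K+1) − 2L(K) + L(K−1)|, and divides it by the standard deviation of L(K) across replicate runs, which is our reading of the Evanno et al. (2005) formula; the log-likelihood values are invented for illustration and the function name is hypothetical.

```python
import statistics


def delta_k(lnp_by_k):
    """lnp_by_k maps K to the list of Ln P(D) values from replicate STRUCTURE runs."""
    mean = {k: statistics.mean(v) for k, v in lnp_by_k.items()}
    sd = {k: statistics.stdev(v) for k, v in lnp_by_k.items()}
    result = {}
    for k in sorted(lnp_by_k):
        # Delta K is undefined for the smallest and largest K tested.
        if k - 1 in mean and k + 1 in mean and sd[k] > 0:
            second_diff = abs(mean[k + 1] - 2 * mean[k] + mean[k - 1])
            result[k] = second_diff / sd[k]
    return result


# Hypothetical Ln P(D) values, three replicate runs per K.
toy_runs = {
    1: [-1000.0, -1002.0, -998.0],
    2: [-900.0, -901.0, -899.0],
    3: [-890.0, -892.0, -888.0],
    4: [-885.0, -887.0, -883.0],
}
dk = delta_k(toy_runs)
best_k = max(dk, key=dk.get)
print(dk, best_k)
```

In this toy example the likelihood gain flattens after K = 2, so delta K peaks there, which mirrors how the peak at K = 6 is read in the study.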

<sup>3</sup> https://www.soybase.org/GlycineBlastPages/archives/Gma1.01.20140304.fasta.zip

<sup>4</sup> https://www.soybase.org/GlycineBlastPages/index.php?db\_select=Gma1.01

<sup>5</sup>https://www.jmp.com/en\_us/software/genomics-data-analysis-software.html; accessed on August 10, 2019

<sup>6</sup>https://web.stanford.edu/group/pritchardlab/structure\_software/release\_versions/v2.3.4/html/structure.html, accessed on August 10, 2019

<sup>7</sup> http://taylor0.biology.ucla.edu/structureHarvester/

Association mapping for the 15 amino acids was conducted separately based on the mixed linear model (MLM-Q+K) in TASSEL 5<sup>8</sup> (Bradbury et al., 2007). SNP markers were considered significantly associated with an amino acid if the logarithm of the odds (LOD) value, reported here as −log(P-value), was ≥3.0 in the MLM-Q+K model.
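The significance rule can be sketched as a simple filter on P-values. The sketch assumes a base-10 logarithm (the source writes only "−log(P-value)") and uses hypothetical marker names and P-values rather than TASSEL output.

```python
import math


def lod_from_p(p_value):
    """LOD as used in this study: -log10 of the association P-value (base 10 assumed)."""
    return -math.log10(p_value)


def significant_snps(p_values, threshold=3.0):
    """Return {marker: LOD} for markers whose LOD meets the threshold."""
    return {
        snp: round(lod_from_p(p), 2)
        for snp, p in p_values.items()
        if lod_from_p(p) >= threshold
    }


# Hypothetical P-values for three markers and one trait.
toy_p = {"snp_a": 1e-4, "snp_b": 5e-3, "snp_c": 5e-4}
hits = significant_snps(toy_p)
print(hits)
```

Under this definition, LOD ≥ 3.0 corresponds to P ≤ 0.001.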

#### Linkage Disequilibrium Analysis and SNP-Based Haplotype Blocks

TASSEL 5.0 (Bradbury et al., 2007) was used to calculate the linkage disequilibrium (LD; r<sup>2</sup>) for all pairwise loci within a 1 Mb window on each chromosome. Haplotype blocks (HAP) were constructed in Haploview (Barrett et al., 2004) with a cutoff of 1% (Contreras-Soto et al., 2017). The LD (r<sup>2</sup>) for all marker pairs was analyzed using the R script LDit<sup>9</sup>.
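As a rough illustration of pairwise LD within a 1 Mb window, the sketch below computes r<sup>2</sup> as the squared Pearson correlation between genotype dosages. This is a simplification chosen for illustration; it is not claimed to reproduce the estimator implemented in TASSEL, and the marker names, positions, and genotypes are hypothetical.

```python
from itertools import combinations


def r_squared(geno_a, geno_b):
    """Squared Pearson correlation of two dosage vectors (0/1/2); None = missing."""
    pairs = [(a, b) for a, b in zip(geno_a, geno_b) if a is not None and b is not None]
    if not pairs:
        return 0.0
    n = len(pairs)
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return 0.0  # monomorphic marker: correlation undefined, reported as 0 here
    return cov * cov / (var_a * var_b)


def pairwise_ld(markers, max_distance=1_000_000):
    """markers: list of (name, position_bp, genotypes) on one chromosome."""
    result = {}
    for (name_a, pos_a, g_a), (name_b, pos_b, g_b) in combinations(markers, 2):
        if abs(pos_a - pos_b) <= max_distance:
            result[(name_a, name_b)] = r_squared(g_a, g_b)
    return result


# Hypothetical markers on a single chromosome.
toy_markers = [
    ("snp1", 100_000, [0, 0, 2, 2, 0, 2]),
    ("snp2", 600_000, [0, 0, 2, 2, 0, 2]),
    ("snp3", 2_500_000, [0, 2, 0, 2, 2, 0]),
]
ld = pairwise_ld(toy_markers)
print(ld)
```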

#### Candidate Gene Selection

Two databases, the gene annotations at Soybase (https://www.soybase.org/dlpages/)<sup>10</sup> and the plant metabolic network (PMN) database<sup>11</sup>, were used to search for candidate genes related to amino acids in soybean.

Currently, three Williams 82 genome sequence assemblies are available at Soybase (including Glyma1.1 and Glyma 2.0)<sup>10</sup>. However, we used Glyma1.1 as the reference because the SNP data were provided by the Institute of Genetics and Developmental Biology, Chinese Academy of Sciences, and Glyma1.1 was the best assembly available at the time. We downloaded the gene annotation of Glyma1.1 from Soybase, and the corresponding gene positions in Glyma 2.0 were obtained from https://www.soybase.org/correspondence/index.php<sup>12</sup>. For each SNP significantly associated with amino acids, we searched for candidate genes within 10 kb of the SNP position. We also downloaded gene annotations from PMN for candidate gene discovery, because the metabolic pathways in PMN are updated with a newer version of the genome (Phytozome v12: Gmax\_275\_Wm82.a2.v1.protein.fa).
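The window search can be sketched as an interval-overlap test. The sketch assumes that "within 10 kb" means 10 kb on either side of the SNP and that a gene counts when any part of it overlaps the window; the gene records and identifiers below are hypothetical placeholders, not Soybase annotations.

```python
def genes_near_snp(snp_chrom, snp_pos, genes, window=10_000):
    """Return IDs of genes lying wholly or partly within +/- window bp of a SNP."""
    lower, upper = snp_pos - window, snp_pos + window
    hits = []
    for gene in genes:
        if gene["chrom"] != snp_chrom:
            continue
        # Two intervals overlap when each starts before the other ends.
        if gene["start"] <= upper and gene["end"] >= lower:
            hits.append(gene["id"])
    return hits


# Hypothetical annotation records (chromosome, start, end in bp).
toy_genes = [
    {"id": "geneA", "chrom": "Gm03", "start": 95_000, "end": 97_000},    # inside window
    {"id": "geneB", "chrom": "Gm03", "start": 109_500, "end": 115_000},  # partial overlap
    {"id": "geneC", "chrom": "Gm03", "start": 120_000, "end": 125_000},  # outside window
    {"id": "geneD", "chrom": "Gm04", "start": 99_000, "end": 101_000},   # other chromosome
]
candidates = genes_near_snp("Gm03", 100_000, toy_genes)
print(candidates)
```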

#### Genomic Selection

#### Method 1: Ridge Regression Best Linear Unbiased Prediction

Ridge regression best linear unbiased prediction (RR-BLUP) was used to predict the genomic estimated breeding value (GEBV) in GS and was performed with the rrBLUP package (Endelman, 2011) in R version 3.5.0 (Thuiller et al., 2009). RR-BLUP has been shown to be an effective and accurate prediction method across a wide range of traits and crops (Heslot et al., 2012; Jarquín et al., 2014; Lipka et al., 2014; Zhang et al., 2016).

We used a 4:1 size ratio of training set to validation set, randomly selected from the 249 accessions (referred to here as fourfold cross-validation), and repeated the procedure 100 times. Each training subset of 199 accessions was randomly selected from the association panel, and the remaining 50 accessions served as the validation set (Resende et al., 2012; Shikha et al., 2017).

<sup>9</sup> https://github.com/rossibarra/r\_buffet/blob/master/LDit.r, verified on May 10, 2018

<sup>10</sup>https://www.soybase.org/dlpages/; accessed on August 10, 2019

Two sets of SNPs were used to predict GEBV for each amino acid concentration in each accession: (1) all 23,279 high quality SNPs from GBS, and (2) all 231 SNP markers associated with 15 amino acid concentrations with LOD ≥3.0 from GWAS. In addition, we predicted GEBV for each amino acid concentration based on the SNP markers associated with the amino acid.

The prediction accuracy was estimated using the average Pearson's correlation coefficient (r) between the GEBVs and observed values for each amino acid concentration in the validation set (Zhang et al., 2010; Resende et al., 2012; Shikha et al., 2017). The training and validation sets were randomly created 100 times and the r value was estimated each time. The average r value was calculated for each amino acid. The r value indicates the prediction accuracy and the selection efficiency of GS.
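The validation scheme described above can be sketched in Python with NumPy on simulated data. Two assumptions should be kept in mind: the ridge penalty `lam` is fixed by hand here, whereas the rrBLUP package estimates the variance components from the data, and the genotypes and phenotypes are simulated rather than taken from the study. Function names are hypothetical.

```python
import numpy as np


def ridge_marker_effects(Z, y, lam):
    """Solve (Z'Z + lam*I) u = Z'(y - mean(y)) for marker effects u on centered markers."""
    mu = y.mean()
    col_means = Z.mean(axis=0)
    Zc = Z - col_means
    A = Zc.T @ Zc + lam * np.eye(Z.shape[1])
    u = np.linalg.solve(A, Zc.T @ (y - mu))
    return mu, col_means, u


def cross_validated_accuracy(Z, y, lam=1.0, n_train=199, n_rep=100, seed=0):
    """Mean Pearson r between GEBV and observed values over random train/validation splits."""
    rng = np.random.default_rng(seed)
    r_values = []
    for _ in range(n_rep):
        order = rng.permutation(len(y))
        train, valid = order[:n_train], order[n_train:]
        mu, col_means, u = ridge_marker_effects(Z[train], y[train], lam)
        gebv = mu + (Z[valid] - col_means) @ u
        r_values.append(np.corrcoef(gebv, y[valid])[0, 1])
    return float(np.mean(r_values))


# Simulated panel: 249 accessions, 300 markers coded -1/0/1, 20 causal markers.
rng = np.random.default_rng(1)
n_acc, n_snp = 249, 300
Z = rng.integers(0, 3, size=(n_acc, n_snp)).astype(float) - 1.0
true_u = np.zeros(n_snp)
true_u[:20] = rng.normal(0.0, 1.0, 20)
y = Z @ true_u + rng.normal(0.0, 1.0, n_acc)

mean_r = cross_validated_accuracy(Z, y, lam=10.0)
print(f"mean r over 100 splits: {mean_r:.2f}")
```

Each split leaves 50 accessions for validation, matching the 199/50 partition in the text; the averaged r plays the role of the prediction accuracy reported in the results.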

#### Method 2: Genomic Best Linear Unbiased Prediction

GS was also performed with genomic best linear unbiased prediction (gBLUP), and the method was extended to compressed best linear unbiased prediction (cBLUP) by using the Compressed Mixed Linear Model (CMLM) approach in GAPIT (Lipka et al., 2012; Tang et al., 2016; http://www.zzlab.net/GAPIT/gapit\_help\_document.pdf). To conduct a four-fold cross-validation for estimating prediction efficiency, we randomly selected 199 accessions as the training set and the remaining 50 accessions as the validation set to predict the GEBV for each accession. GEBV was calculated using cBLUP in GAPIT with the SNP markers that were associated with the 15 amino acid concentrations at LOD ≥3.0 in GWAS. Pearson's correlation coefficients (r) between the GEBV and the observed values of the amino acid concentrations in both the training and validation sets were calculated based on the 249 accessions. A total of 100 replications were used to calculate the r values, and the average r value for each amino acid was used as the indicator of prediction accuracy.

#### RESULTS

#### Phenotypic Variation and Association of Amino Acids in Soybean Seeds

The concentrations of the 15 amino acids, Ala, Arg, Asp, Glu, Gly, His, Ile, Leu, Lys, Phe, Pro, Ser, Thr, Tyr, and Val, varied widely among the 249 accessions (**Supplementary Table 2**). The concentration distributions of all amino acids except Val, Ile, and Gly were near normal, indicating that the amino acids are complex traits (**Supplementary Figure 1**). Glu and Asp were the main components in soybean seeds, accounting for 20.1% and 13.3% of the total of the 15 amino acids, respectively. Glu had the highest concentration (74.42 ppm) among the 15 amino acids, followed by Asp (49.15 ppm). Two- to five-fold differences were observed between the accessions with the lowest and the highest concentrations of Arg, Gly, Ile, Leu, Pro, Thr, and Val (**Supplementary Table 2**). The large variation in the amino acids was also indicated by the high CV values (**Supplementary Table 2**).

<sup>8</sup> http://www.maizegenetics.net/tassel

<sup>11</sup>https://www.plantcyc.org/; accessed on August 10, 2019

<sup>12</sup>https://www.soybase.org/correspondence/index.php, accessed on August 10, 2019

Most of the correlation coefficients among the 15 amino acids were greater than the threshold of 0.124 at the P = 0.05 significance level (**Table 1**). Significant negative coefficients were also observed between Asp and Ile, Asp and Val, Ile and Gly, Ile and Ser, etc. (**Table 1**). Based on the correlation coefficients, the 15 amino acids, except for Arg, His, and Pro, could be divided into two groups (**Table 1**). Group one consisted of five amino acids, Ala, Asp, Glu, Gly, and Ser, whose pairwise correlation coefficients were greater than 0.75 except for the pair Glu and Gly (r = 0.6) (**Table 1**). Group two contained seven amino acids, Ile, Leu, Lys, Phe, Thr, Tyr, and Val, with r values greater than 0.48 for all pairs. However, most correlation coefficients between amino acids of the two groups were negative (**Table 1**). Since the contents of the amino acids within each group were all significantly and highly correlated, this grouping could have practical application in breeding programs: breeders do not need to improve each amino acid individually and can instead improve multiple amino acids within the same group simultaneously.
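The significance threshold for r quoted above can be approximated from the sample size alone. The sketch below uses the relation r = t/√(t² + df) with df = n − 2, and substitutes a standard normal quantile for the Student's t quantile, which is a large-sample approximation adopted here to stay within the Python standard library; it is not claimed to be how the thresholds in Table 1 were derived.

```python
import math
from statistics import NormalDist


def critical_r(n, alpha):
    """Approximate two-sided critical |r| for n samples (normal quantile in place of t)."""
    df = n - 2
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return z / math.sqrt(z * z + df)


r05 = critical_r(249, 0.05)
print(f"critical r at P = 0.05 for n = 249: {r05:.3f}")
```

For n = 249 this gives a value of about 0.124 at P = 0.05, in line with the threshold stated in the text.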

Based on the concentrations of the 15 amino acids, we identified the three accessions with the highest concentration of each amino acid. Specifically, we ranked the 249 soybean accessions by the concentration of each amino acid and chose the 20 accessions that ranked in the top three for at least one amino acid. These 20 soybean accessions, Zhonghuang 10, Zhongzuo 983, 8588, Jian 31, Jidou 12, Zhengzhou 135, Wandou 15, Nanguanxiaopiqing, Lu 93748-1, Dabaipi, Bendidahuangdou, Jidou 12-3l, Lvrouheipidou, Xinliuqing, PI 547850, Zhongdou 33, Zheng 8516, Yudou 12, Huaheihu, and Lv 96150, would be good genetic resources for improving amino acid concentrations in soybean breeding programs (**Supplementary Table 2** and **Figure 1**).

#### Association Mapping and SNP Marker Identification

The population structure of the 249 soybean accessions was initially inferred using STRUCTURE 2.3.4 (Pritchard et al., 2000) and the peak of delta K was observed at K = 6, indicating the presence of six sub-populations (clusters, Q1-Q6) (**Figure 2A**). In total, 51 of the 249 accessions were assigned to Q1 subpopulation with 50 accessions from China; 65 assigned to Q2 with 42 from U.S., 21 from China and two from Korea; 55 assigned to Q3 with 54 cultivars from China; 42 assigned to Q4 with 27 cultivars from China and 12 accessions from U.S.; 21 assigned to Q5 with 16 from U.S.; and 15 to Q6 with all 15 from China (**Figure 2B**, and **Supplementary Table 1**). Phylogenetic analysis of the 249 soybean accessions using MEGA 6 also showed that the clustering of accessions was consistent with that inferred by STRUCTURE (**Figure 2C**).

A total of 318 marker-trait associations, involving 231 unique SNPs, were detected for the 15 individual amino acids at LOD ≥3 (**Supplementary Table 3** and **Supplementary Figure 2**). Because some SNPs were associated with two or more amino acids (pleiotropic associations), the number of unique SNPs was only 231 (**Table 2**). Of the 318 associations, 11 were with Ala, 29 with Arg, 9 with Asp, 34 with Glu, 29 with Gly, 19 with His, 51 with Ile (**Figure 3**), 20 with Leu, 14 with Lys, 9 with Phe, 24 with Pro, 11 with Ser, 21 with Thr, 13 with Tyr, and 24 with Val (**Supplementary Table 3** and **Supplementary Figure 3**).

The total number of haplotype blocks was 3,458 based on the 23,279 SNPs, and the 231 SNPs were positioned in 85 of these haplotype blocks (**Supplementary Table 3**). Many haplotype blocks contained two or more SNP markers. For example, Gm12\_4525341 and Gm12\_4525326 were in the same haplotype block and associated with Arg; Gm06\_289575, Gm06\_399885, and Gm06\_582930 were in the same haplotype block on Chr 6 and were associated with Gly (**Supplementary Table 3**).

The number of haplotype blocks varied among chromosomes; e.g., 12 of the 85 haplotype blocks were on Chr 16, 11 on Chr 18, and one each on Chrs 6 and 9. Twenty of the 85 haplotype blocks were significantly associated with more than one amino acid; e.g., Gm20\_42531505 in Chr. 20\_Block 2 was significantly associated with Thr, Gly, Ile, Tyr, Leu, and Phe; two SNP markers, Gm04\_43207248 and Gm04\_43207187, in Chr. 4\_Block 3 were significantly associated with Ile, Phe, Gly, and Thr; and two markers, Gm15\_42452169 and Gm15\_42452285, in Chr. 15\_Block 2 were associated with Val, Phe, and Lys (**Supplementary Table 4**).

TABLE 1 | Correlation coefficients among 15 amino acid concentrations in soybean seeds.

\*The significance threshold based on 249 samples: r = 0.124 at P = 0.05; r = 0.162 at P = 0.01; and r = 0.206 at P = 0.001. P < 0.00001 for those r values bolded.

Based on the phenotypic patterns of amino acid concentration among accessions, the 15 amino acids could be divided into two groups, as shown in the phenotypic variation section. SNP markers associated with the amino acids in each group were also found. Twenty-five SNP markers were associated with the five amino acids of group one, Ala, Asp, Glu, Gly, and Ser (**Table 3**), and 28 SNP markers with the seven amino acids of group two, Ile, Leu, Lys, Phe, Thr, Tyr, and Val (**Table 4**). The SNP markers in each group can be used to select for multiple amino acids within the group simultaneously. For example, Gm10\_48103776 was associated with the five amino acids of group one, Ala, Asp, Glu, Gly, and Ser, with LOD values of 2.93, 3.15, 3.51, 2.35, and 3.60, respectively (**Table 3**), and it can be used to simultaneously select soybean lines with higher contents of the five amino acids in soybean breeding programs.

For group two, Gm20\_42531505, for example, was associated with the seven amino acids Ile, Leu, Lys, Phe, Thr, Tyr, and Val, with LOD values of 3.53, 4.55, 2.89, 4.79, 5.04, 3.87, and 2.10, respectively (**Table 4**), indicating that it can be used to simultaneously select soybean lines with higher contents of the seven amino acids. Thus, both the phenotypic and the genetic data supported the existence of two groups of amino acids in soybean.

#### Candidate Gene Selection

The linkage disequilibrium (LD) of the soybean genome was analyzed, and LD decayed to half of its maximum value at an average marker distance of about 200 kb. Considering that the LD decay value may vary among genomic regions, we used 10 kb windows as previously reported (Xie et al., 2018). We identified 704 genes lying wholly or partly within the 10 kb windows flanking 217 of the 231 unique SNPs associated with one or more amino acids (**Supplementary Table 5**); the other 14 SNPs had no candidate genes within their 10 kb windows.

Based on gene annotations of the soybean whole genome assembly Gmax\_275\_Wm82.a2.v1 from Soybase and PMN (Phytozome v12: Gmax\_275\_Wm82.a2.v1.protein.fa), we found that 15 SNPs were in 14 genes related to amino acid metabolism according to their gene ontology annotation terms (**Supplementary Table 6**). For example, in the region flanking the SNP Gm03\_36417795, there was a candidate gene, Glyma03g28476 (Glyma 1.1)/Glyma.03g129100 (Glyma 2.0), encoding pyrroline-5-carboxylate reductase (Delauney and Verma, 1990)<sup>13</sup> (**Supplementary Table 6**). This enzyme catalyzes the last step of L-proline biosynthesis through the L-glutamate degradation pathway. In the region flanking the SNP Gm03\_36465287, there was a gene, Glyma03g28530 (Glyma 1.1)/Glyma.03g129700 (Glyma 2.0), encoding β L-selenocystathionase, a key enzyme catalyzing the interconversion of L-homocysteine and L-cysteine. This interconversion is an intermediate step in the conversion between methionine and cysteine (McCluskey et al., 1986)<sup>14</sup> (**Supplementary Table 6**).

#### Genomic Selection for Amino Acid Concentration Based on RR-BLUP in rrBLUP

Based on RR-BLUP in rrBLUP, the GEBV of each amino acid was estimated using three different sets of SNPs, i.e., all 23,279 SNPs, the 231 SNP markers associated with the 15 amino acids, and the SNP markers associated with an individual amino acid.

<sup>13</sup>https://link.springer.com/article/10.1007/BF00259392, accessed on August 10, 2019

<sup>14</sup>https://doi.org/10.1016/0031-9422(86)80067-X, accessed on August 10, 2019

TABLE 2 | List of SNP markers associated with each amino acid concentration at LOD ≥ 3.0.

TABLE 3 | Twenty-five SNP markers simultaneously associated with the five amino acids of group one.

\*LOD (-log(P-value)) from MLM of TASSEL.

Based on all 23,279 SNPs, the correlation coefficients between GEBV and observed values varied among amino acids (column-2 in **Table 5**): the r value was 0.61 for Arg; 0.50 for Phe; between 0.35 and 0.50 for His, Lys, Thr, and Tyr; between 0.25 and 0.35 for Ala, Glu, Ile, Leu, Pro, and Val; and less than 0.25 for Asp, Gly, and Ser. The r values for most amino acids were less than 0.5, suggesting that GS prediction accuracy for most amino acids was low when based on genome-wide random SNPs.

The correlation coefficients between GEBV and observed values of the 15 amino acids obtained from the 231 SNPs were equal to or higher than those obtained from the 23,279 SNPs (column-3 vs. column-2 in **Table 5**). The *r* value was larger than 0.6 for Arg, Ile, Lys, Phe, and Thr, and between 0.5 and 0.6 for Asp, Gly, His, Leu, Tyr, and Val, indicating that the associated markers predicted amino acid concentrations in soybean lines more efficiently than all the SNPs (**Figure 4** and column-3 in **Table 5**).

Of the 231 SNPs, 171, 42, 12, 4, 1, and 1 SNPs were associated with one, two, three, four, five, and six amino acids, respectively. A total of 11, 29, 9, 34, 29, 19, 51, 20, 14, 9, 24, 11, 21, 13, and 24 SNP markers were associated with Ala, Arg, Asp, Glu, Gly, His, Ile, Leu, Lys, Phe, Pro, Ser, Thr, Tyr, and Val, respectively (**Supplementary Table 3**). When we used only the SNP markers associated with an individual amino acid to predict the GEBV for that amino acid, the r values for 14 amino acids were higher than those from the 23,279 SNPs, the exception being Phe, but equal to or lower than those from the 231 SNP markers, except for Val (**Table 5**).
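The counts above can be cross-checked with a short piece of arithmetic: the SNPs grouped by the number of associated amino acids should sum to the 231 unique SNPs, and both ways of counting marker-trait associations should agree at 318. The sketch uses only the figures quoted in the text.

```python
# Number of SNPs associated with exactly k amino acids (k = 1..6), from the text.
snps_by_n_traits = {1: 171, 2: 42, 3: 12, 4: 4, 5: 1, 6: 1}

# Number of associated SNP markers per amino acid, from the text.
associations_per_trait = {
    "Ala": 11, "Arg": 29, "Asp": 9, "Glu": 34, "Gly": 29,
    "His": 19, "Ile": 51, "Leu": 20, "Lys": 14, "Phe": 9,
    "Pro": 24, "Ser": 11, "Thr": 21, "Tyr": 13, "Val": 24,
}

unique_snps = sum(snps_by_n_traits.values())
total_from_sharing = sum(k * count for k, count in snps_by_n_traits.items())
total_from_traits = sum(associations_per_trait.values())

print(unique_snps, total_from_sharing, total_from_traits)
```

Both totals equal 318, so the per-trait counts and the sharing pattern among the 231 SNPs are mutually consistent.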

A t-test was conducted to compare the r values from the 231 SNPs with those from all 23,279 SNPs. The r value from the 231 SNPs (column-3) for each amino acid was significantly higher than that from all SNPs (column-2) at the P = 0.01 level (**Table 5**), indicating that the associated SNPs gave better GS predictions than the genome-wide random SNPs (**Table 5**).

\*LOD (-LOG(P-value)).

#### Genomic Selection for Amino Acid Concentration Based on CMLM in GAPIT

Based on the cBLUP method using CMLM in GAPIT, the average r was estimated (**Table 5** and **Figure 5**). The average correlation coefficient in the training set was greater than 0.7 and was higher than that in the validation set. The average values in the validation set were greater than 0.5 for all amino acids except Pro.

Two comparisons were made to assess the stability of GS across estimation methods and approaches: (1) RR-BLUP in rrBLUP vs. cBLUP in GAPIT, and (2) self-validation (training set by itself) vs. cross-validation (validation set). For the first comparison, the 15 r values in column-3 ("231 SNPs in 249 accessions") were compared to those in column-6 ("231 SNPs in validation set") in **Table 5**, and we found a strong association between the average r values from RR-BLUP in rrBLUP and those from cBLUP in GAPIT (r = 0.85) based on the 231 associated SNPs. For the second comparison, the 15 r values in column-5 ("231 SNPs in training set") were compared to those in column-6 ("231 SNPs in validation set") in **Table 5**, and we again found a strong association between the average r values from cBLUP in GAPIT (r = 0.84) based on the 231 associated SNPs. The strong associations (r > 0.8) between the different methods and approaches suggest that the 231 SNPs can be used to select for high amino acid content in soybean through GS.

#### DISCUSSION

#### Application of Marker-Assisted Selection to Genetic Improvement of Soybean

Previous studies using bi-parental segregating populations have identified QTLs controlling 15 amino acids in soybean seeds (Panthee et al., 2006; Fallen et al., 2013; Khandaker et al., 2015; Warrington et al., 2015). The QTL were associated with 84 molecular markers on 14 chromosomes (**Supplementary Table 7**). In this study, we identified 231 unique SNP markers significantly associated with 15 amino acids (**Supplementary Table 3**). Eight SNPs were in the same regions as SSR markers associated with amino acid concentrations reported by Panthee et al. (2006); e.g., the SNP marker Gm07\_4574178 (located at 4.5 Mb on chr 7), associated with Ser, was near the SSR marker Satt567 (located at 63,663 bp on chr 7); Gm19\_41048945 at 41 Mb on chr 19 for Glu was near Satt076 at 374,148 bp on chr 19; Gm02\_15368490 at 15,368,490 bp on chr 2 for Val was near Satt537; Gm01\_45320366 at 45,320,366 bp on chr 1 for Ile was near Satt203; Gm19\_35491961 at 35,491,961 bp on chr 19 for Ile was near Satt313; Gm02\_50269310 at 50,269,310 bp on chr 2 for Arg was near Satt274 and Satt196; and Gm09\_43488824 at 43,488,824 bp on chr 9 for Asp was near Satt196 (Panthee et al., 2006). Two SNP markers, Gm09\_43488824 at 43,488,824 bp on chr 9 for Asp and Gm10\_48103776 at 48,103,776 bp on chr 10 for His, were close to the regions controlling the two amino acids reported by Fallen et al. (2013) (**Supplementary Table 7**). In addition, Gm09\_43488824 at 43,488,824 bp on chr 9, associated with Asp, was in the regions reported by both Panthee et al. (2006) and Fallen et al. (2013).

TABLE 5 | The average correlation coefficient (r) for the 15 amino acids between the observed values (each amino acid content) and the GEBVs predicted from (1) all 23,279 SNPs, (2) the 231 SNP markers, and (3) only the SNP markers associated with the specific amino acid content using RR-BLUP in the rrBLUP software, and from (4) the 231 SNP markers in the reference set (training set) and inference set (validation set) using cBLUP in GAPIT.

\*Associated SNPs signifies that the average correlation coefficient (r) for each amino acid in column-4 was calculated using only the SNP markers associated with the individual amino acid to predict the GEBV for that amino acid; e.g., r = 0.33 for Ala was calculated from 11 associated SNPs.

In a GWAS for amino acid concentrations in soybean, Zhang et al. (2018) reported that 54 SNPs (counted as 92 markers across traits) were associated with 18 amino acids; 38 of the 54 SNPs were associated with only one amino acid, and 11 SNPs were associated with 2 to 12 amino acids. The SNP markers for each amino acid were located on one chromosome (e.g., Pro or Ser), nine chromosomes (e.g., Arg or Asp), or up to 11 chromosomes (e.g., Try) (**Supplementary Table 7**). Compared with the SNP markers associated with amino acids reported by Zhang et al. (2018), most of our SNP markers were located in different regions of the soybean chromosomes. However, four regions were similar to our results: (1) 3.71–3.82 Mb on chr 7 for Arg; (2) 33.85–35.73 Mb on chr 16 for Arg; (3) 16.28–17.65 Mb on chr 13 for Asp; and (4) 8.27–9.33 Mb on chr 8 for Gly. In our study, the SNP marker Gm07\_3811476, at 3,811,476 bp on chr 7, was associated with Arg and lay about 98 kb from the SNP marker ss715597475 at 3,713,267 bp on chr 7, reported for Arg by Zhang et al. (2018). Another SNP, Gm16\_33853366, at 33,853,366 bp on chr 16, was also associated with Arg and lay 1.87 Mb from ss715624781 at 35,721,993 bp. For Asp, Gm13\_17646967 at 17,646,967 bp was close to ss715616790 at 16,286,313 bp, at a distance of 1,360,654 bp on chr 13. The SNP markers Gm08\_8480396 and Gm08\_8538031, associated with Gly in this study, were close to the two SNP markers ss715602750 and ss715602851 reported for Gly (Zhang et al., 2018), and the four markers are located within a region of about one Mb on chr 8. Thus, each of the four regions was validated as associated with one of the amino acids Arg, Asp, or Gly.
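The physical distances quoted above follow directly from the marker positions. The short sketch below recomputes them from the positions given in the text; the dictionary labels are only descriptive keys.

```python
# (position in this study, position reported by Zhang et al., 2018), in bp.
marker_pairs = {
    "Arg, chr 7: Gm07_3811476 vs ss715597475": (3_811_476, 3_713_267),
    "Arg, chr 16: Gm16_33853366 vs ss715624781": (33_853_366, 35_721_993),
    "Asp, chr 13: Gm13_17646967 vs ss715616790": (17_646_967, 16_286_313),
}

distances = {label: abs(a - b) for label, (a, b) in marker_pairs.items()}
for label, bp in distances.items():
    print(f"{label}: {bp:,} bp ({bp / 1e6:.2f} Mb)")
```

The results, roughly 98 kb, 1.87 Mb, and 1.36 Mb, agree with the distances stated in the paragraph.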

The SNPs identified for the 15 amino acids in this study can be used as molecular markers to select lines with high amino acid content through marker-assisted selection (MAS). PCR-based KASP SNP genotyping can be used in soybean breeding programs for this purpose. Targeted region sequencing such as tGBS (targeted genotyping-by-sequencing) (Simko et al., 2018) can also be used for MAS and GS based on the sequences flanking these SNPs (Ott et al., 2017).

In this study, 14 candidate genes were found to be related to amino acid metabolism based on gene annotations from Soybase and PMN with gene ontology annotation terms, using the DNA sequences in the 15 regions containing the 15 SNPs in column-B of **Supplementary Table 7** that were significantly associated with amino acids (**Supplementary Table 6**). Our further research will develop molecular markers, such as PCR-based assays or targeted region sequencing, to validate these candidate genes in our association panel and others. Gene silencing through CRISPR/Cas9 may also be used as an approach to validate these candidate genes.

#### Genomic Selection

Prediction accuracy is the main parameter for measuring the performance of GS (Jarquin et al., 2016; Zhang et al., 2016; Duhnen et al., 2017). Prediction accuracy is affected by several factors, including the GS model, marker density, level of LD, QTL number, population size (especially the training population size), the relationship between the training and validation populations, and trait heritability (Jarquin et al., 2016).

Zhang et al. (2016) estimated the prediction accuracy (r value) for seed size based on 309 soybean accessions and reported r = 0.85 when 2000 SNPs or 31,045 SNPs were included, and r = 0.8 when 1000 SNPs or 500 SNPs were used. They also identified 48 SNPs on 12 chromosomes associated with seed size based on GWAS. The r value ranged from 0.64 to 0.74 when 5, 10, and 15 of the 48 SNP markers were used, which was 25% higher than the values calculated from the same number of randomly selected SNPs. Our results showed that the highest r value (0.56) was obtained with the model including the 231 SNPs significantly associated with one or multiple amino acids, followed by the model including SNPs significantly associated with an individual amino acid (r = 0.45), and the lowest with the model including all SNPs (r = 0.34). A t-test showed that the r values were significantly different among the sets.

We also estimated the GEBV and r values using cBLUP in GAPIT. Based on the set of 231 SNPs, the correlation coefficient was greater than 0.7 in the training population and greater than 0.5 in the validation population. The high correlation between the reference and inference (0.84) based on 15 amino acids further confirmed the reliability of the GS. A high correlation (0.85) of the prediction accuracy between rrBLUP and GAPIT based on 231 SNPs indicated that RR-BLUP in rrBLUP and cBLUP in GAPIT gave consistent results.
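The r values discussed in this section are correlation coefficients between predicted and observed values. As a minimal sketch, assuming prediction accuracy is taken as the Pearson correlation between GEBVs and observed phenotypes in a validation set, the calculation could look as follows. The function name and the numeric values are hypothetical and are not data from this study.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical validation-set values (illustration only).
observed = [6.1, 5.4, 7.0, 6.6, 5.9]   # measured trait values
gebv = [6.0, 5.7, 6.8, 6.3, 6.1]       # genomic estimated breeding values
accuracy = pearson_r(observed, gebv)
```

In a cross-validation scheme, this value would typically be averaged over repeated random splits into training and validation sets.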

## CONCLUSION

In this study, soybean accessions with high concentrations of amino acids in seeds, and molecular markers associated with individual and groups of amino acids, were identified. The soybean accessions with high amino acid concentrations could be used as parents in soybean breeding programs. The SNP markers strongly associated with the concentrations of the amino acids could be used to improve the nutritional quality of soybean through marker-assisted selection. In addition, fourteen candidate genes related to amino acid metabolism were identified. These candidate genes will lead to a better understanding of the molecular mechanisms that control amino acid metabolism in soybean seeds. Genomic selection analysis of amino acid concentration showed that the selection efficiency based on the markers significantly associated with 15 amino acids was higher than that based on genome-wide random markers or markers associated with only an individual amino acid. These results suggest that including a set of markers significantly associated with multiple amino acids in genomic selection is likely to help breeders efficiently select soybean varieties with improved amino acid content.

### DATA AVAILABILITY STATEMENT

SNP data can be found in the ENA using accession number PRJEB34546 (https://www.ebi.ac.uk/ena/data/view/PRJEB34546).


# ETHICS STATEMENT

None of the data or materials in this study involve humans or animals. This research does not involve any plant specimens that need to be deposited as vouchers.

#### AUTHOR CONTRIBUTIONS

JQ, AS, YC, FW, and WR carried out phenotyping and genotyping. AS, JQ, SL, and QS analyzed the data. JQ composed the draft of the manuscript. MZ and CY directed and managed this research. AS and QJS reviewed and edited the manuscript. All authors have read, made corrections, and approved the final manuscript.

# FUNDING

The authors would like to thank Prof. Lijuan Qiu (Chinese Academy of Agricultural Sciences) for providing seeds of 249 soybean accessions. This study was supported by the Hebei Province Natural Science Foundation for Distinguished Young Scholars (C2014301035), National Natural Science Foundation of China (31100880), and Key Project of the Natural Science Foundation of Hebei Province (C2012301020).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01445/full#supplementary-material


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Qin, Shi, Song, Li, Wang, Cao, Ravelombola, Song, Yang and Zhang. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Genome-Wide Association Mapping for Agronomic and Seed Quality Traits of Field Pea (Pisum sativum L.)

*Krishna Kishore Gali1, Alison Sackville1, Endale G. Tafesse1, V.B. Reddy Lachagari2, Kevin McPhee3, Mick Hybl4, Alexander Mikić5, Petr Smýkal6, Rebecca McGee7, Judith Burstin8, Claire Domoney9, T.H. Noel Ellis10, Bunyamin Tar'an1 and Thomas D. Warkentin1\**

1 Crop Development Centre, Department of Plant Sciences, University of Saskatchewan, Saskatoon, SK, Canada, 2 AgriGenome Labs Pvt. Ltd, Hyderabad, India, 3 Department of Plant Sciences and Plant Pathology, Montana State University, Bozeman, MT, United States, 4 Crop Research Institute/Department of Genetic Resources for Vegetables, Medicinal and Special Plants, Olomouc, Czechia, 5 Forage Crops Department, Institute of Field and Vegetable Crops, Novi Sad, Serbia, 6 Department of Botany, Palacký University, Olomouc, Czechia, 7 Grain Legume Genetics and Physiology Research Unit, USDA, ARS, Pullman, WA, United States, 8 INRA, UMRLEG, Dijon, France, 9 Department of Metabolic Biology, John Innes Centre, Norwich, United Kingdom, 10 School of Biological Sciences, University of Auckland, Auckland, New Zealand

#### Edited by:

Karam B. Singh, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

#### Reviewed by:

Zhiying Ma, Hebei Agricultural University, China

Mulatu Geleta, Swedish University of Agricultural Sciences, Sweden

\*Correspondence:

Thomas D. Warkentin tom.warkentin@usask.ca

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 17 April 2019 Accepted: 04 November 2019 Published: 26 November 2019

#### Citation:

Gali KK, Sackville A, Tafesse EG, Lachagari VBR, McPhee K, Hybl M, Mikić A, Smýkal P, McGee R, Burstin J, Domoney C, Ellis THN, Tar'an B and Warkentin TD (2019) Genome-Wide Association Mapping for Agronomic and Seed Quality Traits of Field Pea (Pisum sativum L.). Front. Plant Sci. 10:1538. doi: 10.3389/fpls.2019.01538

A genome-wide association study (GWAS) was conducted to identify loci associated with agronomic (days to flowering, days to maturity, plant height, seed yield and seed weight), seed morphology (shape and dimpling), and seed quality (protein, starch, and fiber concentrations) traits of field pea (Pisum sativum L.). A collection of 135 pea accessions from 23 different breeding programs in Africa (Ethiopia), Asia (India), Australia, Europe (Belarus, Czech Republic, Denmark, France, Lithuania, Netherlands, Russia, Sweden, Ukraine and United Kingdom), and North America (Canada and USA) was used for the GWAS. The accessions were genotyped using genotyping-by-sequencing (GBS). After filtering for a minimum read depth of five and a minor allele frequency of 0.05, 16,877 high quality SNPs were selected to determine marker-trait associations (MTA). The LD decay (LD1/2max,90) across the chromosomes varied from 20 to 80 kb. Population structure analysis grouped the accessions into nine subpopulations. The accessions were evaluated in multi-year, multi-location trials in Olomouc (Czech Republic), Fargo, North Dakota (USA), and Rosthern and Sutherland, Saskatchewan (Canada) from 2013 to 2017. Each trait was phenotyped in at least five location-years. MTAs that were consistent across multiple trials were identified. Chr5LG3\_566189651 and Chr5LG3\_572899434 for plant height, Chr2LG1\_409403647 for lodging resistance, Chr1LG6\_57305683 and Chr1LG6\_366513463 for grain yield, Chr1LG6\_176606388, Chr2LG1\_457185, Chr3LG5\_234519042 and Chr7LG7\_8229439 for seed starch concentration, and Chr3LG5\_194530376 for seed protein concentration were identified from different locations and years. This research identified SNP markers associated with important traits in pea that have potential for marker-assisted selection towards rapid cultivar improvement.

Keywords: field pea, genetic diversity, genome-wide association study, genotyping-by-sequencing, single nucleotide polymorphisms

# INTRODUCTION

Pea (*Pisum sativum* L., 2*n* = 14) is an important cool season pulse crop grown in more than 100 countries on over 12 million hectares worldwide (FAOSTAT 2016; http://www.fao.org/faostat/en/#data/QC). Pea seeds are considered a nutritional powerhouse because they are rich in protein, complex carbohydrates, vitamins, minerals and phytochemicals (Burstin et al., 2011). Pea seeds have a large crude protein proportion (~25% w/w) and high levels of the amino acids lysine and tryptophan, which are relatively low in cereal grains. To enhance pea productivity and meet the global demand for pea consumption, pea breeding programs worldwide have made significant improvements in yield, disease resistance, plant architecture, and lodging resistance over the last three decades (Warkentin et al., 2015). In order to meet future demands, pea breeding must focus on both crop productivity and seed quality (Duc et al., 2015).

The use of diverse genetic resources is important for breeding crop varieties (Glaszmann et al., 2010). Crop species with narrow genetic diversity are susceptible to emerging pathogens or other constraints leading to loss of productivity and this may lead to a serious decline in the areas of adaptation (Dyer et al., 2014). Significant morphological diversity exists within pea accessions (Warkentin et al., 2015). The pea leaf type varies from normal with both leaflets and tendrils to semi-leafless that has leaflets replaced by ramified tendrils, and flower color varies from white to reddish-purple (Mikić et al., 2011). Pea growth habit can be indeterminate or determinate, and cotyledon color can be yellow, green or red. Pea accessions also differ substantially in yield potential, ease of harvest, vine length, maturity, seed shape, seed size, and disease resistance (Ouafi et al., 2016; Rana et al., 2017). Thus, knowledge of the genetic diversity of pea accessions is of importance to select genetically diverse parents and to broaden the genetic basis of the cultivated peas.

Initial attempts to estimate the genetic diversity of pea accessions and to assist breeding programs to select diverse accessions were based on a limited number of DNA markers. Tar'an et al. (2005) studied the relations among pea cultivars from USA, Canada, Europe, and Australia using simple sequence repeat (SSR) markers. The cultivars from Canada were observed to group somewhat separately from cultivars from Europe. However, the molecular marker-based genetic similarity did not correlate significantly with similarity based on the agronomic characters, suggesting that the two systems give different estimates of genetic relationship among the varieties. Smýkal et al. (2008) used SSR and retrotransposon-based insertion polymorphism (RBIP) markers to study the genetic diversity of 164 Czech and Slovak pea varieties. The clustering of accessions based on molecular markers did not completely separate the fodder and food types, supporting the findings of Tar'an et al. (2005). Jing et al. (2010) studied the genetic diversity of 3020 *Pisum* accessions using RBIP markers, which separated the landraces, cultivars and wild *Pisum* accessions into distinct groups, and provided a framework for designing core collections. Genetic variation of pea accessions based on SSR markers has also been reported in other studies and the test accessions were clustered into distinct gene pools (Kumari et al., 2013; Jain et al., 2014; Rana et al., 2017; Wu et al., 2017).

Single nucleotide polymorphism (SNP) markers are desirable for estimation of genetic diversity because of their abundance in the genome. SNPs have the ability to discriminate between closely related individuals at a higher resolution. SNP markers have been developed and used to study genetic diversity (Burstin et al., 2015; Diapari et al., 2015; Siol et al., 2017) and genetic mapping in pea (Sindhu et al., 2014; Tayeh et al., 2015a). These genome-wide SNP markers were used to develop SNP arrays for high throughput genotyping of pea germplasm and mapping populations (Sindhu et al., 2014; Tayeh et al., 2015a). Kulaeva et al. (2017) integrated the information of pea gene-based SNP markers from different studies and provided an easy-to-use online tool called the Pea Marker Database. Using next-generation sequencing (NGS) technologies and inexpensive high throughput genotyping platforms, SNPs were used to assess the genetic diversity and to estimate the linkage disequilibrium (LD) in many crop species including pea (Cui et al., 2017; Holdsworth et al., 2017). Using NGS platforms for simultaneous SNP discovery and genotyping, many more SNP markers have been developed and used to construct dense pea linkage maps for the identification of quantitative trait loci (QTLs) for various agronomic and seed quality traits (Tayeh et al., 2015b; Ma et al., 2017; Huang et al., 2017; Gali et al., 2018). While the markers identified in these studies can potentially be used for marker-assisted selection (MAS) of traits in breeding programs, there is also a need to identify additional markers based on a larger gene pool than the bi-parental mapping populations.

Genome-Wide Association Study (GWAS) is an efficient approach to dissect the genetic basis of complex traits using the naturally occurring genetic diversity (Korte and Farlow, 2013). GWAS provides higher mapping resolution than classical bi-parental populations to detect associations between molecular markers and traits of interest, and has been used for identification of markers associated with desirable traits in a wide range of crops (Liu et al., 2016; Cui et al., 2017; Xu et al., 2017). GWAS requires an assessment of the population structure of the diversity panel to determine the genetic relatedness of individuals and minimize detection of false associations (Korte and Farlow, 2013; Sul et al., 2016), and is dependent on the use of an adequately large number of markers. Recent advances in NGS platforms and SNP genotyping provide additional tools to characterize genetic diversity at a high resolution and allow breeders to select for useful diversity to develop new varieties.

The overall objectives of the current study were to characterize the diversity of the genetic sources that are available for pea breeding internationally, and to identify SNP markers associated with agronomic and seed quality traits. A total of 135 accessions from different pea breeding programs around the globe were assembled and used for GWAS. The accessions were genotyped using the genotyping-by-sequencing (GBS) method and evaluated in multi-year, multi-location trials for agronomic and seed quality traits.

#### MATERIALS AND METHODS

#### Plant Material

The GWAS panel consisted of 135 cultivated pea accessions from 23 breeding programs in Africa (Ethiopia), Asia (India), Australia, Europe (Belarus, Czech Republic, Denmark, France, Lithuania, Netherlands, Russia, Sweden, Ukraine, and United Kingdom), and North America (Canada and USA) as listed in **Table 1**. All the accessions are within the primary gene pool of *Pisum sativum* and most are cultivars released over the past 50 years for local production. The accessions were derived from self-fertilizing lineages, and as such, significant heterozygosity was not expected. All the accessions used were pure lines of F10 generation or later, and progeny seeds were used from year to year for phenotyping. All the accessions flowered and matured under the growing conditions at the field test sites, allowing the successful evaluation of the phenotypic traits of interest. With its wide distribution of geographic origin and high phenotypic variation, this panel is expected to be a good model to explore the genetic diversity of pea and to identify significant marker-trait associations (MTAs).

# Phenotyping of the GWAS Panel

The GWAS accessions were phenotyped for multiple characteristics in four locations: Sutherland (Canada; 2013–2017), Rosthern (Canada; 2016 and 2017), Fargo (USA; 2013, 2014, and 2015) and Olomouc (Czech Republic; 2013). In each location and year, the accessions were arranged in a randomized complete block design with two replicates. Plots consisted of 3 rows of 4 m length with 30 cm row spacing and a planting density of 75 seeds m⁻².

The location descriptors are Sutherland (near the city of Saskatoon) (52°12′ N, 106°63' W), Rosthern (52°66′ N, 106°33′ W) in Saskatchewan, Canada, Fargo (47°00′ N, 97°11′ W) in North Dakota, USA, and Olomouc (49°59′ N, 17°25′ E) in Czech Republic. At each location, agronomic practices best suited for field pea production were utilized.

The phenotypes including days to flower, days to maturity, plant height, lodging (1–9 rating scale, 1 = no lodging and 9 = completely lodged (flat) at physiological maturity), grain yield and 1000 seed weight were measured at all locations-years as described by Warkentin et al. (2015). The seeds harvested from selected trials were evaluated for the concentration of acid detergent fiber (ADF), neutral detergent fiber (NDF), starch, and protein, as well as seed shape and seed dimpling according to methods reported by Arganosa et al. (2006) and Ubayasena et al. (2011).

TABLE 1 | List of pea accessions used as genome-wide association study panel.

| Breeding organization/country | Pea accession |
| --- | --- |
| Pulse Breeding Australia, Australia | EXCELL(72), KASPA(73), Morgan(71), OZP0805(74), OZP0819(75), OZP0902(76), OZP0903(77), OZP1001(78), OZP1002(79), OZP1004(81), OZP1101(80), OZP1102(84), OZP1104(83), PARAFIELD(85), PBA GUNYAH(86), PBA OURA(87), PBA PERCY(88), PBA TWILIGHT(89) and STURT(90) |
| Belarus | TMP 15213(142) |
| Agriculture and Agri-Food Canada, Canada | Agassiz(171), MPG87(141), MP1401(155) and Trapper(165) |
| Palacký University, Czech Republic | B 99/108(53), Bohatyr(6), Dalibor(48), Dick Trom(49), Hrotovicky Moravska krajova(56), Kamelot(52), Kapucin(59), Klatovsky zeleny(44), Moravsky Hrotovicky krajovy(47), Milion zeleny(45), Moravsky Odeon(51), Prebohatyr(50), Purpurviolett Schottige Nero(57), Slovensky expres(46), Sponsor(54), Stupicka jarni(58) and Terno(55) |
| Crop Development Centre, University of Saskatchewan, Canada | CDC 1-150-81(169), CDC 1-2347-144(170), CDC Acer(163), CDC Bronco(144), CDC Centennial(145), CDC Dakota(177), CDC Golden(146), CDC Meadow(147), CDC Sage(158), CDC Striker(150), CDC Vienna(167) and Cutlass(143) |
| McFayden Seed Co., Canada | GRAY'S(36) |
| Danisco Seeds, Denmark | DS Admiral(148) and Lido(175) |
| DLF Trifolium, Denmark | Nitouche(152) |
| Ethiopia | 22778(42), 22791(43), G 9173(38), No. 8120(39) and No. 9292(37) |
| Agriobtentions, France | Dove HR(35) |
| INRA, Dijon, France | Cameor(135), Carouby de Maussane(60), Champagne(61), Chemin Long(62), Cote D'or(63), D'auvergne(70), Fin de la Bievre(64), Gloire de Correze(65), Merveille D'etampes(66), Normand(67), Picar(68), Piver(69), Serpette Terese(160) and Torsdag(161) |
| Sarasem, France | Hardy(172) and Cartouche(173) |
| India | Matar(153) and PLP 105A(41) |
| Limagrain, Netherlands | Abarth(20), Alfetta(157), Audit(11), Aukland(30), Avantgarde(12), Camry(26), CEB-Montech 4152(28), Cooper(151), Delta(162), Eclipse(149), Emerald(18), Espace(159), Evergreen(19), Garde(25), Lasso(13), Matrix(27), Neon(22), Nette(17), Prophet(24), Quadril(14), Rebel(15), Satelit(16), Sorento(21), Spider(29) and Strada(23) |
| Lithuania | TMP 15133(137) |
| Svalof-Weibull, Sweden | Carneval(154) and Highlight(168) |
| Booker, UK | Radley(166) |
| John Innes Centre, Norwich, UK | Brutus(132), Enigma-NIAB(134) and Kahuna-NIAB(133) |
| Russia | AMPLISSIMO ZAZERSKIJ(40), TMP 15159(138), TMP 15202(139) and TMP 15206(140) |
| Sharpes, UK | Orb(156) |
| Progene, Othello, WA, USA | Aragorn(176) |
| Ukraine | Naparnyk(164) and TMP 15116(136) |
| USDA, Pullman, WA, USA | Lifter(31), Medora(33), Melrose(34), NDP080111(4), NDP080138(5), PS05ND0232(1), PS05ND327(8), PS05ND330(9), PS05ND0434(10), PS07ND0164(2), PS07ND0190(3), Serge(32), Shawnee(7) and Superscout(174) |

The number indicated in parenthesis after the name of each accession represents the entry number used for field trials.

For trait measurements in each trial, normal distribution of residuals and homogeneity of variance were checked using Shapiro-Wilk and Levene tests, respectively (Shapiro and Wilk, 1965; Levene, 1960). Analysis of variance was then conducted for each trait using SAS Proc MIXED (Version 9.4, SAS Institute). The effect of genotype was treated as a fixed factor, while the effect of replication was treated as a random factor. Associations among traits were determined from Pearson correlation coefficients calculated with the correlation function of Minitab 18, and significance was declared at P < 0.05.

# Genotyping of the GWAS Panel

The GWAS panel was genotyped using the GBS method following the protocol described by Elshire et al. (2011). For DNA extraction, the GWAS panel was grown in a growth chamber at the University of Saskatchewan phytotron facility. Leaf tissue from a single plant of each accession was harvested and freeze dried. DNA was extracted using the QIAGEN DNeasy 96 plant kit and quantified using PicoGreen. Individual DNA samples were diluted to 20 ng/µl using 1× TE buffer, pH 8.0.

Two hundred ng of each DNA sample (10 µl volume) was digested with the restriction enzymes *Pst*I and *Msp*I, and ligated to unique barcode adapters of 4 to 8 bp. Five µl aliquots of adapter-ligated DNA samples were pooled in a single tube to produce 59-plex libraries. The pooled DNA was PCR-amplified using sequencing primers, followed by purification using a QIAGEN PCR purification kit. For restriction, ligation and PCR amplification, standard experimental conditions as described by Elshire et al. (2011) were followed. The purified DNA library was quantified using a Bioanalyzer (Agilent Technologies) and the 59-plex libraries were sequenced on a single lane of Illumina HiSeq™ 2500 platform (Illumina® Inc., San Diego, CA, USA) using V4 sequencing chemistry at the Sick Kids Hospital, University of Toronto, Canada.

# SNP Variant Calling

The raw reads from Illumina sequencing were assigned to individual accessions based on the 4 to 8 base pair barcode adapters ligated to individual DNA samples using in-house Perl scripts. Following the deconvolution step, barcode sequences were removed from the read sequences, and the reads were trimmed for quality using the read trimming tool Trimmomatic-0.33. To discover SNP polymorphisms, filtered reads were mapped to the *P. sativum* (cv. Cameor) genome assembly (Kreplak et al., 2019) using the sequence alignment tool Bowtie 2 version 2.2.5. Samtools-1.1 and BCFtools-1.1 were utilized to call variants and save them in variant call format (VCF). After filtering for a minimum read depth of five and a minor allele frequency of 0.05, 16,877 SNP markers were selected and used to determine the population structure and marker-trait associations. The selected SNPs were named to represent the corresponding chromosome number, linkage group number, and the base pair position of the SNP.
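The filtering and naming steps described above can be illustrated with a short sketch. It assumes genotypes are coded as alternate-allele counts (0, 1, 2) and that the read depth threshold is applied per genotype call, with low-depth calls treated as missing; the source does not state whether depth was filtered per call or per site. All function names are hypothetical.

```python
def mask_low_depth(genotypes, depths, min_depth=5):
    """Set genotype calls supported by fewer than min_depth reads to missing (None)."""
    return [g if (g is not None and d >= min_depth) else None
            for g, d in zip(genotypes, depths)]


def minor_allele_frequency(genotypes):
    """MAF from alternate-allele counts (0, 1, 2); None marks a missing call."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)


def passes_filters(genotypes, depths, min_depth=5, min_maf=0.05):
    """Apply the depth mask first, then the MAF threshold (assumed order)."""
    masked = mask_low_depth(genotypes, depths, min_depth)
    return minor_allele_frequency(masked) >= min_maf


def snp_name(chromosome, linkage_group, position):
    """Build a marker name from chromosome, linkage group and base pair position."""
    return f"Chr{chromosome}LG{linkage_group}_{position}"
```

For example, `snp_name(5, 3, 566189651)` gives `Chr5LG3_566189651`, the form used for the markers reported in this article.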

# Analysis of Population Structure

The population structure of the GWAS panel based on SNP genotyping data was determined by estimating the most likely number of clusters (*K*) into which the accessions could be grouped, and their degree of admixture, using the program fastSTRUCTURE (Raj et al., 2014). The value of *K* that best fits the data, which is the most likely number of clusters in the population, was determined based on the lowest prediction error and the smallest number of iterations for convergence. From the matrix of contributions (*Q*), the probabilities of belonging to each of the clusters were derived, and accessions were assigned accordingly. An unweighted neighbor-joining (NJ) tree was constructed using a shared allele index based on a dissimilarity matrix estimated from the SNP dataset (Perrier et al., 2003).
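As an illustration of how accessions can be assigned from a *Q* matrix, the sketch below places each accession in the cluster with the largest membership value and flags it as admixed when that value falls below a cutoff. The cutoff of 0.99, the function name, and the accession labels are hypothetical choices for illustration, not values taken from this study.

```python
def assign_clusters(names, q_matrix, pure_threshold=0.99):
    """Return {name: (cluster_number, is_admixed)} from membership proportions.

    Each row of q_matrix holds one accession's membership in clusters 1..K.
    """
    result = {}
    for name, row in zip(names, q_matrix):
        if abs(sum(row) - 1.0) > 1e-6:
            raise ValueError(f"membership values for {name} do not sum to 1")
        best = max(range(len(row)), key=lambda k: row[k])
        # Clusters are numbered from 1; admixed means no single cluster dominates.
        result[name] = (best + 1, row[best] < pure_threshold)
    return result


# Hypothetical accessions with K = 3 (illustration only).
names = ["acc_A", "acc_B"]
q_matrix = [[1.0, 0.0, 0.0], [0.2, 0.7, 0.1]]
clusters = assign_clusters(names, q_matrix)
```

Here `acc_A` is assigned to cluster 1 without admixture, while `acc_B` is assigned to cluster 2 and flagged as admixed.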

# Linkage Disequilibrium Analysis

LD between SNP marker pairs on each chromosome was calculated from the Pearson correlation coefficient (r). LD decay was estimated by quantile regression (R package 'quantreg'; Koenker, 2017) by plotting r² values as a function of genetic distance.
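A minimal sketch of the pairwise LD statistic, assuming each SNP is coded as the alternate-allele count (0, 1, 2) per accession and that pairs with a missing call (None) are dropped. The function name is hypothetical.

```python
def ld_r2(snp_a, snp_b):
    """Squared Pearson correlation between two SNPs coded as allele counts."""
    pairs = [(a, b) for a, b in zip(snp_a, snp_b)
             if a is not None and b is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two accessions called at both SNPs")
    mean_a = sum(a for a, _ in pairs) / n
    mean_b = sum(b for _, b in pairs) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in pairs)
    var_a = sum((a - mean_a) ** 2 for a, _ in pairs)
    var_b = sum((b - mean_b) ** 2 for _, b in pairs)
    if var_a == 0 or var_b == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return (cov * cov) / (var_a * var_b)
```

Two SNPs with identical genotype vectors give r² = 1, and two SNPs that vary independently across accessions give r² close to 0.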

# Association Analysis

The association between SNP genotypes and phenotypes was determined using the software GAPIT (Genome Association and Prediction Integrated Tool – R package; Lipka et al., 2012). The Q values, which account for the genetic structure of the GWAS panel, and the kinship coefficient matrix (K), which describes the most probable identity by state of each allele between accessions, were used in the analysis. The mixed linear model (MLM) and SUPER (Tang et al., 2016) methods were tested for association analysis. MLM was run using K values calculated by the GAPIT and identity-by-state (IBS) methods, with principal coordinate values as covariates. To select the appropriate model for association analysis, the quantile-quantile (Q-Q) plots of observed against expected log10 P values for each model were compared, and the MLM based on Q and K values from IBS was used for association analysis.
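To make the identity-by-state idea concrete, the sketch below computes a simple IBS similarity between accessions as the average proportion of shared alleles across SNPs. This is one common definition and is an assumption here; it is not necessarily the exact kinship calculation used by GAPIT. Genotypes are assumed to be coded as allele counts (0, 1, 2) with None for missing calls.

```python
def ibs_similarity(geno_a, geno_b):
    """Average proportion of alleles shared identical by state across SNPs."""
    shared = [1 - abs(a - b) / 2
              for a, b in zip(geno_a, geno_b)
              if a is not None and b is not None]
    if not shared:
        raise ValueError("no SNPs called in both accessions")
    return sum(shared) / len(shared)


def ibs_matrix(genotypes):
    """Symmetric accession-by-accession IBS matrix from a list of genotype vectors."""
    n = len(genotypes)
    return [[ibs_similarity(genotypes[i], genotypes[j]) for j in range(n)]
            for i in range(n)]
```

A matrix of this kind plays the role of K in the mixed model, while the Q matrix accounts for subpopulation membership.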

# RESULTS

# Genotyping of the GWAS Panel

From the three lanes of sequencing on HiSeq™ 2500, a total of 1005.1 million reads of 100 bp length were obtained, with a minimum of 1.47 million and a maximum of 12.9 million reads per accession. The average Q30 ratio and guanine–cytosine (GC) content of the reads were 92.3 and 44.1%, respectively. Of the raw reads, 98.0% remained after trimming for barcode adapter sequences and quality. These high quality reads were aligned to the pea genomic sequence (Kreplak et al., 2019). On average, 60.5% of the reads per accession were aligned to the reference sequence and 91.9% of the aligned sequences were uniquely aligned. After filtering the identified SNPs for a minor allele frequency (MAF) of 0.05 and a minimum read depth of five, 16,877 SNPs were selected and used for analysis of population structure and marker-trait association. Of the selected SNPs, 15,608 loci were located on the seven chromosomes of pea (**Figure 1**). The remaining 1269 markers were chromosomally non-assigned, and were designated by their corresponding scaffolds or superscaffolds. The SNP markers were named according to their assigned chromosome and linkage group followed by the base pair position within the chromosome. The designation of chromosomes and linkage groups is in accordance with the pea genome sequence assembled by Kreplak et al. (2019).

#### Linkage Disequilibrium Analysis

LD between SNP marker pairs on each chromosome was calculated as the squared Pearson correlation coefficient (r²). The r²max,90, which is the maximum r² achieved in the 90th percentile, was 0.35, 0.25, 0.26, 0.24, 0.32, 0.32, and 0.29 for chromosomes 1 to 7, respectively. The LD decay varied among the seven chromosomes, with chromosomes 2 and 5 having the most rapid and slowest decay, respectively. The LD1/2max,90, which is the physical distance in Mb at which LD has decayed to half of r²max,90, was 0.06, 0.02, 0.02, 0.04, 0.08, 0.06, and 0.07 for chromosomes 1 to 7, respectively. LD plots of each chromosome are presented in **Figure 2**.
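The half-decay distance can be approximated without quantile regression by binning marker pairs by distance and taking the 90th percentile of r² in each bin. The sketch below uses this binned approximation, which is a simplification swapped in for illustration and not the quantile regression used in this study. The bin width, function names, and input values are hypothetical; distances are given in whole kb.

```python
from math import ceil


def percentile(values, pct):
    """Nearest-rank percentile (pct given as an integer from 1 to 100)."""
    ordered = sorted(values)
    rank = max(1, ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def half_decay_distance_kb(pairs, bin_width_kb=10, pct=90):
    """Start (kb) of the first distance bin, at or after the bin with the highest
    percentile r2, where the percentile r2 is at most half of that maximum.

    pairs: iterable of (distance_kb, r2) for marker pairs on one chromosome.
    Returns None if r2 never decays to half of the maximum.
    """
    bins = {}
    for distance_kb, r2 in pairs:
        bins.setdefault(distance_kb // bin_width_kb, []).append(r2)
    curve = [(idx, percentile(vals, pct)) for idx, vals in sorted(bins.items())]
    r2_max = max(value for _, value in curve)
    max_idx = next(idx for idx, value in curve if value == r2_max)
    for idx, value in curve:
        if idx >= max_idx and value <= r2_max / 2:
            return idx * bin_width_kb
    return None


# Hypothetical marker pairs (illustration only).
example_pairs = [(5, 0.30), (5, 0.10), (15, 0.20), (25, 0.14), (35, 0.05)]
half_decay = half_decay_distance_kb(example_pairs)
```

With these example values the 90th-percentile r² peaks at 0.30 in the first bin and first falls to 0.15 or below in the bin starting at 20 kb.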

#### Genetic Structure of GWAS Accessions

The genetic structure of the 135 accessions was analyzed using fastSTRUCTURE. A model-based, maximum likelihood ancestry estimation procedure was used for the analysis. The most likely number of clusters (k) was tested from 2 to 10, and a k-value of 9 was selected to describe the genetic structure of the 135 accessions. The admixture analysis estimated the probability of membership of each individual accession to each cluster (**Figure 3**). The corresponding Q-matrix at k = 9 was used for marker-trait association analysis. The admixture analysis assigned individual accessions to clusters to study hybrid regions of the genome, and identified common ancestry of accessions from different pea breeding programs. In general, accessions from specific breeding programs tended to cluster together.

FIGURE 1 | Number of SNPs in each million bp of genetic distance of the seven pea chromosomes. The chromosome and linkage group assignment is in accordance with the pea genome assembled by Kreplak et al. (2019). The graphs are based on the number of SNPs identified on chromosomes 1 to 7 (1685, 1768, 1786, 2356, 2917, 2349 and 2747, respectively).

In cluster 1, 10 accessions from USA breeding programs clustered with 2 accessions from Canada, 4 accessions from Czech Republic, Carneval from Sweden, and Brutus from United Kingdom, and showed varying degrees of hybrid zones from accessions of other geographical regions. The four accessions, Kahuna (John Innes Centre, UK), Neon (Limagrain, Netherlands), Strada (Limagrain, Netherlands), and Kapucin (Palacký University, Czech Republic), which formed cluster 2 are accessions of the marrowfat market class, characterized by large green cotyledon seeds with blocky seed shape and typically used as snack foods. Seven accessions from four breeding programs formed cluster 3, and six of the accessions had no admixture from other clusters. Some of these accessions, Champagne (INRA, France), CDC Vienna (CDC, Canada) and Melrose (USDA, USA), are known to have greater frost tolerance. Five older pea accessions from different breeding programs formed cluster 4, of which Trapper and Torsdag are known forage pea accessions. Clusters 5 and 6 comprise 20 and 38 accessions from multiple breeding programs, respectively. The accessions in cluster 5 are relatively older varieties, and cluster 6 has many relatively recent western European varieties (like Delta, Alfetta, Nitouche, Lido) and a few Canadian varieties (like Agassiz, MP1401 and CDC Centennial). Twelve of the 19 accessions from Pulse Breeding Australia (PBA) clustered together in cluster 7. Eight of the 12 accessions from CDC, Canada and Highlight from Svalof-Weibull (Sweden) formed cluster 8. The four accessions in this cluster which had no admixture are CDC Bronco, Highlight (parent of CDC Bronco), CDC 1-150-81, and CDC 1-2347-144 (the two latter are mutants of CDC Bronco). Cluster 9 has many accessions from Eastern European programs and all five accessions from Ethiopia.

The neighbor-joining (NJ) tree presented in **Figure 4** is based on the shared-allele genetic distance. The grouping of phylogenetic clusters differed to some extent from the grouping of accessions based on the extent of admixture as shown in **Figure 3**. For example, the 18 accessions represented as cluster 1 in structure analysis, were regrouped with 9 accessions as one cluster, 7 accessions as another cluster along with other accessions. Two accessions PS07ND0164 and Bohatyr of cluster 1 and four accessions Kahuna-NIAB, Neon, Strada and Kapucin of cluster 2 in structure analysis were grouped as one phylogenetic cluster along with accessions from Australia. In structure analysis, 12 accessions from PBA formed cluster 7, while accessions EXCELL and OZP0805 from PBA were grouped in cluster 5. In the NJ tree, these fourteen accessions from PBA were clustered together along with accessions from other sources. The nine accessions in cluster 8 of the admixture plot (**Figure 3**), along with DS-Admiral and CDC Centennial which showed significant admixture from this cluster, were part of one cluster in the NJ tree.

# Phenotypic Measurements

Phenotypic data collected for the GWAS panel in multi-location, multi-year trials are summarized in **Table 2**. The accessions varied widely in the characteristics measured. The days to flowering (DTF) varied significantly within the GWAS panel by an average of 16.8 days between the early flowering and late flowering accessions compared across the years and locations. In comparison the accessions differed by 18.1 days in days to maturity (DTM). Substantial variation of plant height was observed, where the average of minimum and maximum plant height measured across the trials is 43.7 and 151.3 cm, respectively. In terms of lodging resistance, the accessions varied from a score of 1.0 to 9.0 measured on a 1-9 rating scale. The yield of individual accessions ranged from less than 100 kg/ha to >6000 kg/ha. The seed weight of the accessions, measured as 1000 seed weight, varied from 70 g to 436 g. The GWAS accessions were also quite diverse for seed dimpling and seed shape.

The GWAS panel is also highly diverse for the seed quality traits measured as percentage of acid detergent fiber, neutral detergent fiber, starch, and protein content. The acid and neutral detergent fiber concentrations varied from 3.2% and 7.4% to 15.9% and 26.3%, respectively. The starch concentration varied from 17.8% to 58.3%, and protein concentration varied from 19.1% to 30.9%. Overall, there is sufficient phenotypic diversity in the GWAS panel, in terms of agronomic traits, seed morphology and seed quality traits, to support association analysis.

# Association Analysis

Of the MTAs identified for individual trials, 251 MTAs as listed in **Table 3** were selected based on their P value and occurrence in multiple trials. The flanking sequences of the markers listed are provided in **Table S1**. Nine markers, positioned on chromosomes 1, 2, 4 and 6 and on three non-chromosomal scaffolds, were associated with DTF in at least four trials, and on average each marker explained 3-11% of the phenotypic variance (PV), measured as the difference in R-square of the model with the SNP and without the SNP. SNP marker Chr1LG6\_362652367 was associated with DTF in seven of the 11 trials. Five markers, four on chromosome 3 and one on chromosome 5, were associated with DTM in multiple trials. SNP marker Chr3LG5\_126657675 was associated with DTM in eight of the 11 trials.
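The phenotypic variance attributed to a marker is described above as the gain in R-square when the SNP is added to the model. The following is a minimal sketch of that arithmetic only. It assumes ordinary least squares in place of the mixed linear model used in the study, and the covariate `q`, the 0/2 genotype coding, and all data values are hypothetical.

```python
from typing import List, Sequence


def _solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def r_squared(y: Sequence[float], predictors: Sequence[Sequence[float]]) -> float:
    """R-square of a least-squares fit of y on an intercept plus the predictors."""
    n = len(y)
    cols = [[1.0] * n] + [[float(v) for v in p] for p in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = _solve(xtx, xty)
    fitted = [sum(beta[k] * cols[k][i] for k in range(len(cols))) for i in range(n)]
    mean_y = sum(y) / n
    ss_tot = sum((v - mean_y) ** 2 for v in y)
    ss_res = sum((v - f) ** 2 for v, f in zip(y, fitted))
    return 1.0 - ss_res / ss_tot


# Hypothetical data: one structure covariate, one SNP, days to flowering.
q = [0.1, 0.9, 0.2, 0.8, 0.3, 0.7, 0.4, 0.6]
snp = [0, 0, 2, 2, 0, 2, 0, 2]
dtf = [50, 58, 55, 63, 52, 61, 53, 60]

# PV attributed to the SNP: R-square with the SNP minus R-square without it.
pv = r_squared(dtf, [q, snp]) - r_squared(dtf, [q])
print(f"PV explained by SNP: {pv:.3f}")
```

Because the two models are nested, the difference cannot be negative in a least-squares fit; in a mixed model the corresponding statistic depends on how R-square is defined by the software.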

Four SNP markers on chromosome 5 were associated with plant height in four to seven of the nine trials. The R-square value of a model with SNP ranged up to 0.72. Five SNP markers associated with lodging resistance were positioned on chromosomes 1, 2, 3 and 5. SNP marker Chr2LG1\_409403647 was identified in four of the 10 trials. Manhattan plots showing the association of SNP markers with plant height and lodging resistance in multiple trials, and the corresponding Q-Q plots are presented as examples from this research in **Figure 5** and **Figure 6**, respectively. The Q-Q plots represent the observed P values of each SNP marker against the expected P values. The Manhattan plots in **Figure 5** show the significant association of SNP markers on chromosome 5 (LG3) with plant height in each of the individual trials presented. The Manhattan plots in **Figure 6** show the significant association of SNP markers on multiple chromosomes with lodging resistance. In all the Q-Q plots of lodging resistance (**Figure 6**), the observed P values are almost the same as expected values.
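The Q-Q comparison described above pairs each observed P value with the value expected under the null hypothesis. A minimal sketch of how such points can be computed is given below. It assumes the common convention that the expected P value for the i-th smallest of n tests is i/(n + 1); the function name and the plotting convention of -log10 axes are illustrative choices, not details taken from the study.

```python
import math
from typing import List, Sequence, Tuple


def qq_points(p_values: Sequence[float]) -> List[Tuple[float, float]]:
    """Return (expected, observed) pairs on the -log10 scale for a Q-Q plot.

    Assumes every P value is strictly between 0 and 1.
    """
    observed = sorted(p_values)
    n = len(observed)
    points = []
    for rank, p_obs in enumerate(observed, start=1):
        p_exp = rank / (n + 1)  # expected quantile under a uniform null
        points.append((-math.log10(p_exp), -math.log10(p_obs)))
    return points


# Hypothetical P values that match the null expectation exactly,
# so every point falls on the diagonal.
example = qq_points([0.75, 0.25, 0.5])
for expected, observed in example:
    print(f"expected={expected:.3f} observed={observed:.3f}")
```

Points that follow the diagonal, as reported for lodging resistance, indicate that the model is not inflating the test statistics; points rising above the diagonal at the upper end correspond to markers with stronger association than expected by chance.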

Two SNP markers on chromosome 1, Chr1LG6\_57305683 and Chr1LG6\_366513463, were associated with yield in three of the 10 trials. Four SNP markers were associated with seed weight, of which SNP marker Chr1LG6\_176606388 is located on chromosome 1, and three other SNP markers were positioned on non-chromosomal scaffolds.

Seven SNP markers associated with two seed morphological traits, seed shape and seed dimpling, were identified. Four markers associated with seed shape are distributed on chromosomes 2, 5 and 7, and were associated with seed shape in three to four of the six trials. Two markers on chromosome 1 and one marker on chromosome 3 were associated with seed dimpling. SNP marker Chr1LG6\_100615820 was associated with seed dimpling in four of the six trials.

Multiple SNP markers were associated with four of the seed quality traits including concentrations of seed acid detergent fiber (ADF), neutral detergent fiber (NDF), starch and protein. Five SNP markers on chromosomes 5, 6 and 7, and eight markers on chromosomes 2, 3, 5, 6 and 7 were identified to be associated with ADF and NDF, respectively. Two markers, Chr6LG2\_372463590 and Chr7LG7\_7724682, were common for ADF and NDF concentrations. Multiple markers positioned on chromosomes 2, 3, 5 and 7 were associated with seed starch concentration, of which three markers, Chr2LG1\_457185, Chr3LG5\_234519042, and Chr7LG7\_8229439, were associated with starch concentration in four of the five trials. Two SNP markers on chromosome 3 and one marker on chromosome 5 were associated with seed protein concentration. Chr3LG5\_138253621 and Chr3LG5\_194530376 were associated with protein concentration in three and four of the five trials, respectively.

TABLE 2 | Minimum, maximum and mean values of phenotypic traits measured in 135 pea accessions of genome-wide association study panel.

TABLE 3 | Trait linked SNP markers identified by association analysis of various pea phenotypes using the mixed linear model (MLM).

The number indicated in parentheses after the name of each trait represents the number of trials in which the trait was measured. Only markers which were significant in multiple trials for a given trait are listed in the table. In each SNP ID, Chr and LG refer to chromosome and linkage group, followed by the base pair position. For non-chromosomal SNPs, Sc and SSC refer to scaffold and superscaffold, followed by the scaffold number and base pair position. Each locus is represented by one SNP marker of the LD block. †R-square value presented is the difference of R-square explained by the model with and without SNP.

Of all the MTAs that were observed in ≥50% of the trials, the following markers explained the highest average phenotypic variance (PV) across the traits: Sc00936\_29805 (8% PV) and Sc03817\_83023 (8% PV) for DTF, Chr3LG5\_112288560 (7% PV) and Chr3LG5\_126657675 (6% PV) for DTM, Chr5LG3\_566189651 (9% PV), Chr5LG3\_572899434 (9% PV) and Chr5LG3\_573518168 (8% PV) for plant height, Chr3LG5\_197482300 (10% PV) and Chr6LG2\_68264764 (10% PV) for seed shape, Chr1LG6\_46289124 (6% PV) and Chr1LG6\_100615820 (6% PV) for seed dimpling, Chr7LG7\_7724682 (8% PV) as a common marker for both ADF and NDF, Chr2LG1\_457185 (8% PV) and Chr7LG7\_486526644 (8% PV) for seed starch concentration, and Chr3LG5\_194530376 (6% PV) for seed protein concentration.

# DISCUSSION

With the availability of cost-effective, high throughput SNP genotyping methods and genomic resources, GWAS has been used as an effective method to identify alleles associated with traits of many crop species including legumes (Desgroux et al., 2016; Sun et al., 2017; Mourad et al., 2018). The current GWAS was undertaken to identify SNP markers associated with several important field pea breeding traits. The natural diversity of pea accessions selected in the 23 pea breeding programs across the world was used to identify trait-linked SNP markers, which could potentially be used for MAS in pea breeding programs. The pea accessions used in this study include accessions from pea breeding programs in Africa, Asia, Australia, Europe and North America, which represent the genetic variations used in these breeding programs as genetic sources for multiple traits. These accessions were expected to possess a diversity of alleles for various agronomic and seed quality traits, and thus were selected for this GWAS study to identify loci controlling multiple traits.

GBS identified 16,877 good quality SNPs, of which 15,609 were distributed across seven chromosomes of pea and 1268 were non-chromosomal SNPs. LD patterns and population structure are important for association mapping (Flint-Garcia et al., 2003), thus we analyzed the LD of 135 GWAS accessions by chromosome. The LD decay estimates of the 7 pea chromosomes varied from 0.03 to 0.18 Mb. Siol et al. (2017) reported that LD decays steeply in pea, and the median r2 value was less than 0.05 at a genetic distance of ~3 cM. The clustering of 135 accessions into nine major groups (K = 9) partially independent of their geographical origin reflects the use by pea breeders of genetic variation from diverse sources. Siol et al. (2017) grouped 917 *Pisum* accessions into 16 clusters of which spring and winter accessions represented 10 and 4 clusters, respectively.
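The r2 statistic referred to above measures how strongly the alleles at two loci are correlated across accessions. A minimal sketch is given below, assuming fully homozygous lines with each SNP coded 0 or 1 and missing calls coded as `None`, in which case r2 is the squared Pearson correlation of the two allele vectors. The source does not state how its LD estimates were computed, so the function and data here are purely illustrative.

```python
from typing import Optional, Sequence


def ld_r2(a: Sequence[Optional[int]], b: Sequence[Optional[int]]) -> float:
    """Squared correlation between two SNPs coded 0/1 across the same accessions.

    Accessions with a missing call (None) at either SNP are skipped.
    """
    pairs = [(x, y) for x, y in zip(a, b) if x is not None and y is not None]
    n = len(pairs)
    mean_a = sum(x for x, _ in pairs) / n
    mean_b = sum(y for _, y in pairs) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in pairs)
    var_a = sum((x - mean_a) ** 2 for x, _ in pairs)
    var_b = sum((y - mean_b) ** 2 for _, y in pairs)
    if var_a == 0 or var_b == 0:
        raise ValueError("r2 is undefined when a SNP is monomorphic")
    return cov * cov / (var_a * var_b)


# Hypothetical allele calls for two SNP pairs in four accessions.
print(ld_r2([0, 0, 1, 1], [0, 0, 1, 1]))  # alleles always co-occur
print(ld_r2([0, 0, 1, 1], [0, 1, 0, 1]))  # alleles are independent
```

LD decay is then usually summarized by relating such pairwise r2 values to the physical distance between the two SNPs on the same chromosome.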

The genetic diversity represented by the pea GWAS panel was used for identification of MTAs. In a previous GWAS study of pea, using 175 pea accessions and genotyping based on a 13.2K SNP array, Desgroux et al. (2016) detected 52 loci associated with *Aphanomyces* root rot resistance which included novel loci that validated the reported major and minor QTLs. They also confirmed the linkage between *Aphanomyces* resistance alleles and late flowering alleles, and reported the break of linkage between resistance alleles and colored flowers.

The traits selected for this GWAS study included agronomic traits (DTF, DTM, lodging resistance, seed yield and seed weight), seed characteristics (seed dimpling and seed shape) and seed quality traits (fiber, protein and starch concentrations), all of which are important targets for pea breeding globally. We identified QTLs for all of these traits in our previous study (Gali et al., 2018) using multiple recombinant inbred line (RIL) populations derived from bi-parental crosses. The current research is expected to expand the understanding of genetic loci governing these traits. Genetic relatedness (or kinship) and population structure are known as the major confounding factors that may lead to spurious associations in GWAS (Yu et al., 2006). Thus, upon verification of Q-Q plots, we used the MLM method with the combination of Q and K matrices for association analysis, an approach that has been used in many plant species (Hao et al., 2012; Huang and Han, 2014).

Using the pea GWAS panel, MTAs were identified for all the traits in repeated tests. Flowering time is one of the key determinants of pea adaptation to different ecological and geographical regions, thus the pea GWAS panel is an ideal population for identification of loci controlling flowering time. Over 20 loci related to flowering time and inflorescence development have been identified in pea and the interactions of these loci determine the flowering time, of which *HIGH RESPONSE* (*HR*), *STERILE NODES* (*SN*), and *LATE FLOWERING* (*LF*) loci are important (reviewed by Weller and Ortega, 2015). In the pea GWAS panel, we identified nine loci for flowering time and five loci for maturity time in repeated tests illustrating the diverse nature of the panel.

Fargo (B) 2013 Sutherland, (C) 2014 Sutherland, (D) 2015 Sutherland, (E) 2016 Rosthern, (F) 2016 Sutherland, (G) 2017 Rosthern, and (H) 2017 Sutherland. The Manhattan plots are based on association of 15608 chromosomal and 1269 non-chromosomal SNPs with plant height of 135 pea accessions in the multi-year, multi-locational trials.

Major and minor QTLs were identified for plant height in pea in previous studies. Tar'an et al. (2003) reported three major QTLs and Hamon et al. (2013) identified three minor QTLs. Gali et al. (2018) identified a major QTL for plant height on LG3, in three RIL populations, which explained 33-65% of the phenotypic variance. Ferrari et al. (2016) also reported QTL for plant height on LG3. Similarly, in the pea GWAS panel, we identified four loci on chromosome 5 (LG3) associated with plant height. These four loci together represented a region of ~7.5 million base pairs on chromosome 5 and previously reported SNP marker Psc7220p181 (Gali et al., 2018) is in proximity of this locus. The pea GWAS panel has greater genetic variation for plant height, compared to the RIL populations, with over 3-fold difference between minimum and maximum plant height. Thus, by capturing the diversity for this trait in the GWAS panel, the major loci for plant height were confirmed to be on chromosome 5 (LG3).

Major QTLs explaining 58% (Tar'an et al., 2003), 50% (Smitchger, 2017), and >30% (Gali et al., 2018) of phenotypic variance for lodging resistance were identified in bi-parental mapping populations. Ferrari et al. (2016) identified QTLs for lodging resistance on LG3 and LG4. In the current GWAS study, in addition to a locus on chromosome 5 (LG3), loci on chromosomes 1, 2, and 3 (LGs 6, 1 and 5) were also identified for association with lodging resistance. Identification of these additional loci could be due to the wide range of diversity for lodging resistance in the GWAS panel, as the individual accessions ranged from a lodging score of 1.0 to 9.0 on a 1-9 rating scale. Co-localization of QTLs of plant height and lodging resistance was reported in previous studies (Tar'an et al., 2003; Gali et al., 2018), but in the current study the loci associated with these two traits were not co-localized.

We identified two loci on chromosome 1 (LG6) for association with grain yield in three of the ten trials conducted using the pea GWAS panel. The locus represented by the SNP marker Chr1LG6\_366513463 was also associated with DTF. In previous studies based on RILs, multiple QTLs for grain yield were identified on multiple linkage groups (Krajewski et al., 2012; Gali et al., 2018; Tar'an et al., 2004). Since the genetic variation for grain yield is contributed by many loci each contributing a minor portion of the variance for this trait, or largely affected by GxE interactions, it is possible that in the pea GWAS panel we could not identify multiple loci for this trait in repeated tests.

Using the pea GWAS panel, four loci were identified for association with seed weight. One of these loci is on chromosome 1 (LG6) and the other three are located on scaffolds that could not be positioned on the assembled chromosomes. In comparison, we earlier reported major QTLs for seed weight on LG3, LG4 and LG6 (Gali et al., 2018). For seed dimpling, two loci on chromosome 1 (LG6) and one locus on chromosome 3 (LG5) were associated with the trait in repeated tests, as compared to the identified key locus on LG5 (Gali et al., 2018). The loci identified for seed shape in repeated trials were positioned on chromosomes 3, 6 and 7 (LGs 5, 2 and 7, respectively), and support the earlier reported major QTLs on LG2 and LG5 (Gali et al., 2018). In the current study, the four SNP markers identified for association with seed shape were also associated with either seed starch or fiber concentrations.

For all the seed quality traits tested, i.e. seed starch, fiber and protein concentrations, multiple associated markers distributed on different chromosomes were identified. Markers distributed on chromosomes 2, 3, 5 and 7 (LGs 1, 5, 3 and 7) were associated with seed starch concentration. Loci for this trait are known to be positioned on LGs 2, 5 and 7 in PR-07 mapping population (Gali et al., 2018). The markers associated with acid and neutral detergent fiber concentrations were on chromosomes 2, 3, 5, 6 and 7. These traits are known to be controlled by multiple loci (Gali et al., 2018). SNP markers associated with seed protein concentration were on chromosomes 3 (LG5) and 5 (LG3). QTLs for seed protein concentration on LG3 are known in PR-07 mapping population and the loci identified on chromosome 3 (LG5) are additional. Overall, this GWAS study identified new MTAs for seed quality traits.

Overall, detection of multiple MTAs in the GWAS panel compared to RIL populations is as expected because of the ability to detect a range of genes controlling the phenotype in this panel, while QTL detection in RIL populations is limited to the alleles segregating from the two parents. The increased resolution in the GWAS panel is also a result of the historical recombination in this panel, rather than the more limited recombination in the progeny of a bi-parental population. Overall the SNP markers identified in this study often corresponded to the loci reported for the same traits at the linkage group level. However, the current markers differed from the reported markers when compared for base pair position within the same linkage group and did not represent the exact same locus. The identified MTAs are valuable for pea breeders to identify sources of genetic variation for these traits. The average phenotypic variance explained by identified MTAs is ≤10%, and it has to be noted that most agronomic traits are controlled by multiple genes each with minor effect.

Some of the trait-linked markers identified in this study using diverse germplasm are useful to validate the QTL regions identified in earlier studies up to the linkage group level. The sequences of flanking markers of previously reported QTLs (Gali et al., 2018) were used to identify the corresponding regions in the pea genome assembly used in this study. Other than one QTL for plant height, the markers identified in this study were different than the previously reported QTLs in comparison of base pair positions, though they were on the same linkage group. This is possibly because of the greater phenotypic diversity in the GWAS population than in the previous bi-parental populations. We will validate the markers identified in this study with those identified in earlier studies both by genotyping and *in silico* experiments in future studies and explore the candidate genes within the genomic regions of identified loci.

In this study, we performed a GWAS to detect genome regions controlling quantitative traits, using 16,877 SNP markers in a genetically diverse panel of 135 pea germplasm accessions. We identified multiple significant loci associated with agronomic and seed traits of pea. SNP markers identified for association with plant height (Chr5LG3\_566189651 and Chr5LG3\_572899434), lodging resistance (Chr2LG1\_409403647), yield (Chr1LG6\_57305683 and Chr1LG6\_366513463), seed weight (Chr1LG6\_176606388), seed starch concentration (Chr2LG1\_457185, Chr3LG5\_234519042 and Chr7LG7\_8229439), and seed protein concentration (Chr3LG5\_194530376) can be of potential use for marker-assisted selection in future pea breeding. The loci identified in this study can be used for further analysis to identify the causal gene(s), to select genetic variation, for marker-assisted trait introgression, as well as to pyramid multiple genes in pea through marker-assisted breeding. The genotypic data should be a useful resource for the detection of other agriculturally important loci for many other traits using association analysis.

#### DATA AVAILABILITY STATEMENT

The datasets generated for this study can be found in the ENA https://www.ebi.ac.uk/ena/browser/view/PRJEB35147.

#### AUTHOR CONTRIBUTIONS

TW, BT, and KG conceptualized the study. AS, KM, MH, AM, PS, RM, and CD conducted the field trials for phenotyping of GWAS panel. TW and AS co-ordinated the trials at different locations. ET conducted the statistical analysis for phenotypic data. KG genotyped the GWAS panel. KG and VL conducted genotypic and association analysis. KG drafted the manuscript with suggestions from TW. All authors contributed to the manuscript review.

### FUNDING

Funding for this research from Saskatchewan Ministry of Agriculture and Saskatchewan Pulse Growers is gratefully acknowledged.

#### ACKNOWLEDGMENTS

The authors of this manuscript are grateful for the financial support of the Saskatchewan Ministry of Agriculture, and Saskatchewan Pulse Growers, as well as the technical expertise of the pulse crop breeding staff at the University of Saskatchewan. PS acknowledges the financial support from Palacký University grant Agency IGA 2017\_001, 2018\_001 and 2019\_004.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01538/full#supplementary-material

#### REFERENCES

mapping for seed mineral concentrations and contents in pea (*Pisum sativum* L.). *BMC Plant Biol.* 17, 43. doi: 10.1186/s12870-016-0956-4


**Conflict of Interest:** Author VL was employed by company AgriGenome Labs Pvt. Ltd., Hyderabad, India.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Gali, Sackville, Tafesse, Lachagari, McPhee, Hybl, Mikić, Smýkal, McGee, Burstin, Domoney, Ellis, Tar'an and Warkentin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Physiological Traits for Shortening Crop Duration and Improving Productivity of Greengram (Vigna radiata L. Wilczek) Under High Temperature

*Partha Sarathi Basu1\*, Aditya Pratap2, Sanjeev Gupta2\*, Kusum Sharma1, Rakhi Tomar2 and Narendra Pratap Singh2*

1 Division of Basic Science, ICAR - Indian Institute of Pulses Research, Kanpur, India, 2 Division of Crop Improvement, ICAR - Indian Institute of Pulses Research, Kanpur, India

#### Edited by:

Matthew Nicholas Nelson, Commonwealth Scientific and Industrial Research Organisation, Australia

#### Reviewed by:

Giovanna Aronne, University of Naples Federico II, Italy Kamrun Nahar, Sher-e-Bangla Agricultural University, Bangladesh

\*Correspondence:

Partha Sarathi Basu psbsu59@gmail.com Sanjeev Gupta saniipr@rediffmail.com

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 30 March 2019 Accepted: 30 October 2019 Published: 04 December 2019

#### Citation:

Basu PS, Pratap A, Gupta S, Sharma K, Tomar R and Singh NP (2019) Physiological Traits for Shortening Crop Duration and Improving Productivity of Greengram (Vigna radiata L. Wilczek) Under High Temperature. Front. Plant Sci. 10:1508. doi: 10.3389/fpls.2019.01508

Greengram is an important protein-rich food legume crop. During the reproductive stage, high temperatures cause flower drop, induce male sterility, impair anthesis, and shorten the grain-filling period. Initially, 116 genotypes were evaluated for 3 years in two locations, and based on flowering, biomass, and yield attributes, they were grouped into four major clusters. A panel of 17 contrasting genotypes was selected for their heat tolerance in high-temperature greenhouses. The seedlings of the selected genotypes were exposed to heat shock in the range 37°C–52°C and their recovery after heat shock was assessed at 30°C. The seedlings of EC 398889 turned completely green and rejuvenated, while those of LGG 460 failed to recover; therefore, EC 398889 and LGG 460 were identified as heat-tolerant and heat-sensitive genotypes, respectively. Except for EC 398889, the remaining genotypes could not survive after heat shock. Fresh seeds of EC 398889 and LGG 460 were planted in the field, and pollen fertility and sucrose-synthase (SuSy) activity in grains were assessed at high temperatures. The pollen germination and SuSy activity were normal even at temperatures beyond 40°C in EC 398889, and high SuSy activity enabled faster grain filling than in LGG 460. The precise phenotyping demonstrated significant differences in the light-temperature response of photosynthesis, chlorophyll fluorescence imaging of quantum yield (Fv/Fm), and electron transport rate (ETR) between heat-tolerant (EC 398889) and heat-sensitive (LGG 460) genotypes. Molecular profiling of selected accessions showed polymorphism with 11 SSR markers, and the markers CEDG147, CEDG247, and CEDG044 distinguished tolerant and sensitive groups of accessions.

Keywords: thermo-tolerance, acquired thermotolerance, chlorophyll fluorescence, photosynthesis, sucrose synthase

# INTRODUCTION

Greengram (*Vigna radiata* L. Wilczek), also known as mungbean, is an important grain legume containing a high amount of digestible protein, amino acids, sugar, minerals, soluble dietary fibres, and vitamins. It is cultivated across seasons, in different environments, and in variable soil conditions in South and South-East Asia, Africa, South America, and Australia (Parihar et al., 2017). The productivity and adaptability of greengram are adversely affected by several abiotic stresses including heat, drought, salinity, and water-logging, which affect crop growth and development by altering physiological processes and the plant-water relationship (Dreesen et al., 2012; Bita and Gerats, 2013; Suzuki et al., 2014; Kaur et al., 2015; Zandalinas et al., 2017; Landi et al., 2017). Several studies have reported a reduction in growth and development of legumes because of high-temperature stress (Tzudir et al., 2014; Hanumantha Rao et al., 2016). Greengram thrives most effectively at temperatures between 30°C and 40°C; however, significant flower shedding occurs at temperatures beyond 40°C (Zinn et al., 2010; Sita et al., 2017). Rainey and Griffiths (2005) reported that the abscission of reproductive organs is the primary determinant of yield under heat stress in several grain legumes. The production is considerably influenced by changes in the photoperiod and temperature across the growing regions of greengram extending from low to high latitudes. Because greengram is a quantitative short-day plant (Chauhan and Williams, 2018), short day length at low latitude hastens flower initiation, and the plants rapidly reach the reproductive phase without adequate vegetative biomass production. By contrast, long photoperiod at high latitudes delays the onset of the reproductive phase, but the biomass is adequate and has a high leaf area index.

Crops grown at high latitudes are often exposed to high temperatures beyond the threshold tolerance limit (40°C). The interactive effects of photoperiod and temperature in greengram are inadequately understood although both factors are crucial determinants of grain yield; therefore, photo-thermo insensitivity is a major attribute in breeding strategies for the development of greengram varieties with higher stability across diverse climatic conditions. Generally, a higher mean temperature hastens flowering and a lower mean temperature delays flowering in all photoperiods (Sharma and Dhanda, 2014; Sharma et al., 2016). Singh and Singh (2011) and Lateef et al. (2018) reported temperature × flowering interactions in greengram with high mean temperatures (24°C to 28°C) and long photoperiods (15 to 16 h). Grain yield reduction under heat stress in several plant species has been reported to be associated with a decrease in photosynthetic capacity because of altered membrane stability (Savchenko et al., 2002; Salvucci and Crafts-Brandner, 2004; Zhang et al., 2009; Zhang and Sharkey, 2009; Egorova et al., 2011; Kumar et al., 2011; Horváth et al., 2012; Bita and Gerats, 2013; Kumar et al., 2013; Rakavi and Sritharan, 2019) and enhanced maintenance respiration (Reynolds et al., 2007) along with a reduction in radiation-use efficiency. However, photosynthesis is the most sensitive physiological process impaired by heat stress (Crafts-Brandner and Salvucci, 2002; Marchand et al., 2005; Kepova et al., 2005; Yang et al., 2006; Wang et al., 2009).

A decrease in photosynthesis at high temperatures could result from structural and functional disruptions of chloroplasts, reduction of chlorophyll, inactivation of chloroplast enzymes (Dekov et al., 2000; Langjun et al., 2006) or both stomatal and nonstomatal limitation (Wahid et al., 2007). Oxidative stress can cause lipid peroxidation and consequently membrane injury, protein degradation, and enzyme inactivation (Meriga et al., 2004).

High temperatures adversely affect starch and sucrose synthesis through a reduction in the activity of sucrose phosphate synthase and ADP-glucose pyrophosphorylase (Rodriguez et al., 2005; Zhao, 2013). Several reports have indicated that reproductive failure under heat stress could be due to impaired sucrose metabolism in the leaves, developing grains, and anthers, as well as the inhibition of sucrose transporters, which reduces the availability of triose phosphates to the developing pollen grains (Kaushal et al., 2013; Kaushal et al., 2016). Crops exposed to high temperature are often subjected to oxidative stress producing reactive oxygen species (ROS), which are highly toxic to cellular functions in plants because they damage nucleic acids and cause protein oxidation and lipid peroxidation; this oxidative damage eventually causes cell death (Suzuki et al., 2012; Tuteja et al., 2012). ROS toxicity during various stresses is considered to be one of the major causes of low crop productivity worldwide (Vadez et al., 2012). An increase in the activity of antioxidant enzymes, such as guaiacol peroxidase (GPX) and catalase (CAT), plays a significant role in minimizing the toxic effects of stress-induced ROS production (Anderson and Sonali, 2004; Hassan and Mansoor, 2014). The present investigation is an attempt at large-scale screening of germplasm across diverse climates, identifying lines with photo-thermo insensitivity, thermotolerance, and high grain-filling rates, as well as at deciphering the mechanisms of heat tolerance in lines with high production potential.

# MATERIALS AND METHODS

# Field Trial

A set of 116 greengram genotypes, including exotic lines and cultivars, was grown during summer (April–May) for 3 consecutive years from 2015 to 2017 at two contrasting growing regions at different latitudes, namely, Vamban, Tamil Nadu, India (10.20° N, 78.50° E; day length 11:30 to 12:45 h) and Kanpur (26.4° N, 80.3° E; day length 12:30 to 14:00 h), in an augmented design with four checks, namely, Samrat, IPM 99-125, IPM 02-3, and IPM 02-14, which were replicated at an interval of 20 test genotypes. The recommended package of practices was followed for successful crop growth. The performance of genotypes and their grouping was assessed on the basis of phenology, biomass, pod fill duration, harvest index, and grain yield, using pooled data of the adjusted mean of each trait over 3 years for cluster analysis. The selected 17 high-yielding and stable genotypes were sown in three replications under natural field conditions at Kanpur for detailed study, whereas for precision phenotyping, the plants were grown in a controlled environmental chamber (High Point, Taiwan). The naturally-lit greenhouse experiment conducted using 17 high-yielding and 11 low-yielding greengram genotypes led to the identification of two degrees of thermotolerance when the plants were grown under 45°C and 25°C maximum and minimum temperature, respectively, and 14 h day length. For all the traits phenotyped, replicated samples (3–5) were used and the significance levels of treatment means were worked out using a factorial analysis of variance. Two contrasting genotypes, namely, EC 398889 with a high pod bearing capacity and LGG 460 without pods, were selected for further detailed studies to decipher their differential sensitivity towards high temperature and ability to set pods at high temperature.

#### Physiological Characterization of Contrast Genotypes for Heat Tolerance

The physiological characterization of the two selected contrasting photoinsensitive genotypes with high and low yields was conducted to assess their sensitivity to heat stress using different parameters, namely, membrane stability, acquired thermotolerance (ATT), chlorophyll index, chlorophyll fluorescence, pollen germination, sucrose synthase activity for sink strength, and protein and molecular profiling.

## Membrane Stability Test

The fully expanded young leaves of the photoinsensitive genotypes EC 398889 and LGG 460 grown in a naturally-lit greenhouse were sampled for the membrane stability test. In the test, electrolyte leakage was assessed after treatment using a conductivity meter (Hanna, USA). The treatment was applied for 1 h at 40°C (C1) followed by 100°C (C2), and the electrical conductivity of the solution at the two temperatures was measured separately. The relative membrane stability index was calculated using the formula given by Blum and Ebercon (1981), as Membrane stability or injury index = C1/C2, where C1 = electrical conductivity (EC, μS) at the test temperature of 40°C for 1 h, and C2 = electrical conductivity (EC, μS) at 100°C for 1 h.
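The index defined above is a simple ratio of the two conductivity readings. A minimal sketch of the calculation follows; the function name and the example readings are hypothetical and are not data from the study.

```python
def membrane_index(c1_us: float, c2_us: float) -> float:
    """Ratio C1/C2 of electrical conductivity (in microsiemens).

    c1_us: reading after 1 h at the 40 degC test temperature.
    c2_us: reading after 1 h at 100 degC (total electrolyte release).
    """
    if c2_us <= 0:
        raise ValueError("C2 must be a positive conductivity reading")
    return c1_us / c2_us


# Hypothetical readings for one leaf sample.
index = membrane_index(120.0, 480.0)
print(f"C1/C2 = {index:.2f}")
```

Since C2 represents leakage after the membranes are fully disrupted, a smaller ratio means that less of the total electrolyte pool leaked at 40°C.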

# Chlorophyll Estimates

For instant chlorophyll estimation in plants grown at a high temperature, a noninvasive technique was used to assess the chlorophyll status or "greenness index" with a Minolta Model 502 Soil Plant Analysis Development (SPAD) chlorophyll meter.

## Acquired Thermotolerance

Acquired thermotolerance (ATT) of the seedlings of the heat-sensitive (LGG 460) and heat-tolerant (EC 398889) greengram genotypes was assessed by the methods described by Porter et al. (1994). The germinating seedlings were subjected to temperature shock treatment from 37°C to 52°C in increments of 2°C, with 2 h of incubation at each temperature. After reaching the peak temperature (52°C), the treatments were reversed to 37°C in descending order. Cell viability was assessed using the 2,3,5-triphenyl tetrazolium chloride (TTC) reduction assay on the seedlings after treatment at normal (37°C) and high (52°C) temperatures. After heat shock for 2 h, the cells were cooled to room temperature, and 1% TTC solution was added to the cells, followed by overnight incubation. A purple color developed because of the formation of formazan in the tissues that remained viable and could restore respiration. The level of ATT was determined by measuring the percentage reduction of TTC to formazan using the following formula:

$$\text{ATT (\%)} = \frac{\text{OD}_{37\text{–}52\,^{\circ}\text{C}}}{\text{OD}_{37\,^{\circ}\text{C}}} \times 100$$
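The ATT formula can be illustrated with a short Python sketch. It assumes that the numerator is the optical density (OD) of the formazan extract from seedlings taken through the stepwise 37°C to 52°C treatment and that the denominator is the OD from seedlings kept at 37°C. This reading of the subscripts, the function name, and the example OD values are assumptions made for illustration only.

```python
def acquired_thermotolerance(od_heat_treated: float, od_control: float) -> float:
    """Return ATT (%) = (OD of heat-treated seedlings / OD of 37 °C control) x 100.

    Assumption: OD is the absorbance of the formazan extract from the TTC assay.
    """
    if od_control <= 0:
        raise ValueError("Control OD must be positive")
    return od_heat_treated / od_control * 100.0


# Hypothetical absorbance values, not measurements from this study.
att = acquired_thermotolerance(od_heat_treated=0.45, od_control=0.60)
print(f"ATT = {att:.1f}%")  # ATT = 75.0%
```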

### Specific Leaf Area

A young leaf disc of 1-cm diameter was excised from each of 10 plants of the 17 high-yielding and 11 low-yielding selected field-grown greengram genotypes during the podding stage, when the average maximum and minimum temperatures reached approximately 40°C/30°C. The ten leaf discs of each genotype were dried and weighed. The specific leaf area (SLA) was calculated as the total area of the ten leaf discs divided by their total dry weight and expressed as cm² g⁻¹. The SLA values were regressed on the SPAD chlorophyll meter reading (SCMR) values of the same leaf to obtain the relationship.
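The SLA calculation described above can be sketched as follows. The disc diameter (1 cm) and the number of discs (10) come from the text; the function name and the dry weight used in the example are hypothetical.

```python
import math


def specific_leaf_area(disc_diameter_cm: float, n_discs: int,
                       total_dry_weight_g: float) -> float:
    """Return SLA (cm^2 per g): total disc area divided by total dry weight."""
    if total_dry_weight_g <= 0:
        raise ValueError("Dry weight must be positive")
    disc_area_cm2 = math.pi * (disc_diameter_cm / 2.0) ** 2
    return n_discs * disc_area_cm2 / total_dry_weight_g


# Ten 1-cm discs; the dry weight of 0.035 g is a hypothetical value.
sla = specific_leaf_area(disc_diameter_cm=1.0, n_discs=10, total_dry_weight_g=0.035)
print(f"SLA = {sla:.1f} cm2/g")  # SLA = 224.4 cm2/g
```

For a fixed disc area, a heavier set of discs gives a lower SLA, which is the direction reported for the heat-tolerant genotype.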

# Carbon Isotope Discrimination

At 50 days after sowing, replicated samples of fully turgid green leaves adjacent to the podding cluster were excised from the 17 high-yielding and 11 low-yielding field-grown genotypes (pooled samples). The leaves of each genotype were dried gradually at temperatures below 80°C in a hot air oven for 3 days and were finely powdered in a mill. Carbon isotope composition was determined on a 1 mg sample with an Isotope Ratio Mass Spectrometer (Thermo Finnigan, Bremen, Germany) at the University of Agricultural Sciences, Bangalore, India. Carbon isotope discrimination (Δ) was calculated according to Farquhar et al. (1989). The values of delta carbon were regressed on the SLA values of the same leaf to obtain the relationship.
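The text cites Farquhar et al. (1989) without restating the expression. A commonly used form is Δ = (δa − δp) / (1 + δp/1000), with the isotope compositions of air (δa) and plant material (δp) in per mil. The sketch below assumes this form and an atmospheric value of −8‰, which is a widely used approximation and not a value reported in this study; the plant value in the example is hypothetical.

```python
def carbon_isotope_discrimination(delta_plant_permil: float,
                                  delta_air_permil: float = -8.0) -> float:
    """Return discrimination (per mil) as (d_air - d_plant) / (1 + d_plant / 1000).

    Assumption: delta_air defaults to -8 per mil, an approximate atmospheric value.
    """
    return (delta_air_permil - delta_plant_permil) / (1.0 + delta_plant_permil / 1000.0)


# Hypothetical leaf isotope composition of -28 per mil.
big_delta = carbon_isotope_discrimination(delta_plant_permil=-28.0)
print(f"Discrimination = {big_delta:.2f} per mil")  # Discrimination = 20.58 per mil
```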

# Fluorescence Image Analysis

Replicated leaf samples collected at the flowering stage from well-irrigated field-grown plants of EC 398889 and LGG 460 were used for chlorophyll fluorescence studies. The measurements were conducted immediately after sampling using a fluorescence imaging system (Walz Mess- und Regeltechnik, Germany). Initially, uniform areas of specific geometry on a single leaf of each genotype were selected to obtain the ETR. The leaves were dark-adapted before the light curve was recorded, and the initial fluorescence values, Fo and Fm, were used for further calculation using the following formula:

$$\text{ETR} = \text{Quantum yield} \times \text{PAR} \times 0.5 \times \text{absorptivity}$$

where

$$\text{ETR} = \text{Photosynthetic electron transport rate}; \quad \text{PAR} = \text{Photosynthetically active radiation}$$

The absorptivity parameter describes the fraction of incident light that is absorbed. The factor 0.5 accounts for the fact that only half of the absorbed quanta are distributed to PS II (under steady-state conditions). The light curve of each selected area was obtained by increasing the irradiance stepwise until the ETR became light saturated.
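The ETR expression above can be evaluated directly, as in this minimal Python sketch. The function name is illustrative, and the example inputs (quantum yield 0.6, PAR 500 µmol photons m⁻² s⁻¹, absorptivity 0.84) are hypothetical; 0.84 is a default often assumed for leaves and is not a value stated in this study.

```python
def electron_transport_rate(quantum_yield: float, par: float,
                            absorptivity: float = 0.84) -> float:
    """Return ETR = quantum yield x PAR x 0.5 x absorptivity.

    The factor 0.5 assumes that half of the absorbed quanta reach PS II.
    Assumption: absorptivity defaults to 0.84, a commonly used leaf value.
    """
    if not 0.0 <= absorptivity <= 1.0:
        raise ValueError("Absorptivity is a fraction between 0 and 1")
    return quantum_yield * par * 0.5 * absorptivity


# Hypothetical inputs for one point on a light curve.
etr = electron_transport_rate(quantum_yield=0.6, par=500.0)
print(f"ETR = {etr:.1f}")  # ETR = 126.0
```

Repeating the calculation at each irradiance step gives the light curve, which flattens once the quantum yield falls in proportion to the increase in PAR.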

### Gaseous Exchange

The plants of EC 398889 and LGG 460 were raised in two chambers maintained at maximum/minimum temperature regimes of 40°C/30°C and 25°C/18°C with a 14-h light period at 450 µmol photons m⁻² s⁻¹ in a controlled growth chamber (Hi-Point, Taiwan). The plants grown at 25°C/18°C were considered to be grown at a low temperature (LT), while those grown at 40°C/30°C were considered to be grown at a high temperature (HT). Photosynthesis and other gas exchange parameters of the LT- and HT-grown plants were measured using a portable photosynthesis system (Model LI-6400XT, LI-COR, USA) under a saturating light intensity of 1500 µmol photons m⁻² s⁻¹.

# Pollen Germination Test

The pollen germination test was conducted on the field-grown greengram genotypes EC 398889 and LGG 460. Excised flowers of these genotypes were placed over moistened filter paper and exposed to different controlled temperature regimes (29°C, 32°C, 35°C, 37°C, 39°C, 41°C, and 43°C) for 2 h for acclimatization. Germination of fresh pollen grains was assessed using the sucrose hanging-drop culture. A drop of germination medium (15% aqueous sucrose solution containing 200 mg H3BO3, 100 mg Ca(NO3)2, 100 mg MgSO4, 100 mg KNO3, and 50 mg EDTA) was placed on a coverslip, and pollen was dusted onto the drop. The coverslip was then inverted and placed over a concave depression on a slide, with glycerol used to seal the coverslip and prevent desiccation. The slides were then incubated for 24 h at 29°C, 32°C, 35°C, 37°C, 39°C, 41°C, and 43°C. Following this, the germinating pollen tubes were stained using 10% acetocarmine solution and viewed under a microscope (Leica DM 2000, Germany).

# Sucrose Synthase Activity

Freshly developed pods of the field-grown genotypes EC 398889 and LGG 460 were excised when they attained a length of approximately 1 cm. Sucrose synthase activity was determined in developing grains at various stages by the method described by Wang et al. (1993) with slight modifications. Tissue samples (approximately 2 g of fresh tissue) were cut into small pieces and homogenised in extraction buffer. To assay enzyme activity, an aliquot of the extract was desalted by a microcentrifuge procedure using Sephadex G-25 columns. The solution thus obtained was incubated for 15 min at 30°C with 20 μM of fructose and 20 μM of UDP-glucose in 90 μL of 50 mM HEPES buffer (pH 8.5) containing 15 mM MgCl2, and the reaction was stopped by adding 120 μL of 1 N NaOH. The amount of enzyme solution and the reaction time were previously determined to be in the linear range of the reaction. The sucrose synthase activity was assayed in the forward direction only.

# Protein Profiling

Seedlings were allowed to grow for 3 weeks under controlled conditions in a plant growth chamber with an illumination of 460 μmol photons m⁻² s⁻¹ and a 14-h day length. Both genotypes (EC 398889 and LGG 460) were grown in two temperature regimes, namely, 30°C/20°C and 43°C/35°C. A standard protocol was followed. Accordingly, the treated leaf tissue (0.5 g) was homogenized in buffer for protein extraction. The protein concentration in the supernatants of the samples was estimated following the method described by Peterson (1977). A 12.5% separating gel containing 375 mM Tris-HCl, pH 8.8, 0.1% (w/v) SDS, 0.05% (w/v) ammonium persulfate, and 0.4 μL ml⁻¹ TEMED was used for resolving the polypeptides. Protein markers containing polypeptides of different molecular weights were run along with the protein samples extracted from the test samples. Approximately 15–30 µg of protein sample was loaded in each well.

# Molecular Profiling

Total genomic DNA was isolated according to the method of Doyle and Doyle (1987) with slight modifications (Gupta et al., 2013). The quantity and quality of the extracted DNA were checked by comparison with 300 ng of standard λ DNA. The working DNA sample was diluted to a standard concentration of 25 ng/µL. The DNA samples were screened with 79 SSR primer pairs derived from adzuki bean (Wang et al., 2004) to detect polymorphism among groups of heat-tolerant, moderately tolerant, and sensitive greengram lines. Polymerase chain reaction was carried out following the standard procedure. The PCR products were resolved by electrophoresis on a 3% agarose gel for 3 h in 1× TAE buffer, stained with ethidium bromide, and photographed using a Gel Documentation System (Uvitech, Cambridge, UK). Polymorphic markers were distinguished on the basis of the presence or absence of the amplified product and the difference in allele size by comparison with a 100-bp DNA ladder.

# STATISTICAL ANALYSIS

The observations made during field evaluation were subjected to statistical analysis for an augmented design as described by Federer (1961). The analysis takes into account the variability among blocks, measured using the standard check varieties, against which the values of the entries were compared. The significance of differences among treatment means (least significant difference) was estimated using factorial analysis of variance (ANOVA), along with the standard error of means and the standard deviation. The genotypes were grouped into different clusters by Ward's method using squared Euclidean distances in the statistical software SPSS version 16.0 (SPSS, Chicago, USA).

# RESULTS

To identify greengram genotypes with adequate thermotolerance, yield potential and yield stability across different locations and temperature regimes were chosen as the criteria (**Figure 1**). The test population of greengram at Vamban and Kanpur was classified into four major clusters and 11 subclusters that were distinctly different from each other in phenological and yield-attributing traits (**Table 1**). Three years of experimentation resulted in the identification of contrasting greengram genotypes with stable high and low yields (**Table 2**). All 116 genotypes showed earlier flowering (25–32 days) and pod setting, lower mean biomass (700–3600 kg/ha), and lower grain yield (343–1745 kg/ha) at Vamban than at Kanpur (**Table 2**). The days to first flower, biomass, and harvest index differed considerably between Vamban and Kanpur in most of the test genotypes (**Table 2**). Out of 116 genotypes, 17 accessions had high and stable yields, while 11 had stable low yields across the two locations and 3 years of experiments, although the genotype × location × year interaction was highly significant at the 1% level (**Table 2**). Therefore, it became necessary to investigate the heat tolerance of the 17 high-yielding and 11 low-yielding greengram genotypes under controlled greenhouse conditions at an HT regime (max/min 45°C/25°C) throughout the entire period of crop growth. This greenhouse experiment led to the identification of two contrasting genotypes, namely, LGG 460 (heat-sensitive genotype) and EC 398889 (heat-tolerant genotype) (**Figure 2**). Based on its ability to set pods at HT, EC 398889 was tentatively designated a heat-tolerant genotype, while LGG 460, which failed to form pods, was assumed to be a heat-sensitive genotype. The membrane stability index was moderately higher in EC 398889 than in LGG 460, and the chlorophyll index, as assessed by SCMR (SPAD chlorophyll meter reading), also remained higher in EC 398889 at the HT regime (**Table 2**). ATT was considerably higher in EC 398889 (76.8%) than in LGG 460 (34.5%).
This was further validated by the TTC test, as indicated by the differences in the intensity of purple formazan formation when TTC solution was added to the seedlings of EC 398889 and LGG 460 after heat shock treatment at 52°C for 2 h (**Figures 3A**, **D**). Stepwise heat shock treatment from 37°C to 52°C in increments of 2°C and its reversal to normal temperature provided crucial evidence of the adaptation of EC 398889 to HT, because this genotype showed TTC-positive staining, greening, and normal restoration of plant growth after severe heat shock (**Figures 3B**, **C**), while LGG 460 completely lost seed viability after HT shock (**Figures 3E**, **F**).

Molecular profiling of the identified genotypes was conducted using 79 SSR markers, of which 11 were found to be polymorphic. These markers exhibited considerable genetic variability among different genotypes. Three of the 11 polymorphic primers exhibited clear differentiation between heat-tolerant and heat-sensitive genotypes. The marker CEDG 147 distinguished heat-tolerant from heat-sensitive accessions (**Figure 4**). It was amplified at 300 bp in the tolerant genotype EC 398889 and at 285 bp in the sensitive genotype LGG 460. Similarly, another marker, CEDG 247, also distinguished the two genotypes at 161 bp and 168 bp, respectively (**Figure 4**). Furthermore, CEDG 044 distinguished between the tolerant and sensitive genotypes at 192 bp in EC 398889 and at 162 bp in LGG 460 (**Figure 4**).

SDS-PAGE of leaf protein extracted from the heat-sensitive (LGG 460) and heat-tolerant (EC 398889) greengram genotypes grown in a controlled environment chamber at 25°C/18°C and 43°C/35°C with a 14-h photoperiod was conducted to identify the differences in the protein profiles of these two contrasting genotypes. An additional protein band between 91 and 137 kDa was detected in the genotype EC 398889 under heat shock (shown in circle) (**Figure 5**).

Light-saturated rates of photosynthesis (Pmax) in LT-adapted plants of LGG 460 showed a progressive reduction from 20°C to 40°C, and Pmax drastically declined in HT-grown plants of LGG 460 (**Figure 6A**). By contrast, the LT-grown EC 398889 showed no reduction in Pmax within the test temperature range of 20°C–40°C, while Pmax progressively increased with increasing test temperatures from 20°C to 40°C in HT-grown plants of EC 398889 (**Figure 6A**). Stomatal conductance in both LT- and HT-grown plants of LGG 460 decreased with a progressive increase in the test temperature from 20°C to 40°C (**Figure 6**). However, despite the reduction in stomatal conductance, the Pmax in LT- and HT-grown plants of EC 398889 did not decrease proportionately. Instead, photosynthesis increased with a progressive increase in the test temperature from 20°C to 40°C (**Figures 6A**, **B**). The transpiration rate in HT-grown LGG 460 increased with the test temperature (**Figure 6C**); the combination of negative photosynthesis (**Figure 6A**) and high transpiration (**Figure 6C**) in HT-grown LGG 460 appeared detrimental to the genotype in terms of negative carbon gain and greater water loss. By contrast, in spite of a substantial reduction in stomatal conductance in HT-grown EC 398889 (**Figure 6B**) and a relatively higher transpiration rate (**Figure 6C**), the Pmax remained considerably higher, which indicated an enhanced capacity for photosynthesis in EC 398889 at HTs.
The light response of the photosynthetic electron transport rate (ETR) in EC 398889 and LGG 460 is shown in **Figure 6D**. The nonstomatal components of photosynthesis were assessed using quantum yield (Fv/Fm) and its final conversion into ETR, to target the possible sites of action of HT at the chloroplast level. The interactive effect of genotype × temperature × irradiance level was significant; LGG 460 leaves treated at 40°C showed complete inhibition of photosynthetic ETR at all irradiance levels, while partial inhibition was noted in the heat-treated leaves of EC 398889 (**Figure 6D**).

Higher light harvesting efficiency was observed in the heat-tolerant genotype EC 398889 than in LGG 460 in the field-grown crop. Fresh leaf area per unit dry matter weight, known as SLA, was considerably lower in EC 398889 than in LGG 460, along with a higher SCMR (SPAD chlorophyll meter reading) value (**Figure 7**) at the podding stage, which indicated a higher light harvesting capacity and higher production of dry matter per unit leaf area in EC 398889 than in LGG 460. SLA was negatively correlated with SCMR (**Figure 7A**) and positively correlated with delta carbon (**Figure 7B**), which indicated that a lower SLA was associated with lower delta carbon (carbon isotope discrimination) and further suggested a higher water-use efficiency (WUE) in the heat-tolerant genotype EC 398889 than in LGG 460. A lower SLA together with lower delta carbon values proved to be distinctive physiological attributes contributing to the tolerance of EC 398889, allowing it to adapt effectively by escaping terminal heat stress (**Figure 7**). Fluorescence imaging of dark- and light-adapted leaves was performed in both genotypes treated at 30°C and 43°C for 1 h, and images were captured to investigate the changes in fluorescence parameters, such as minimal fluorescence (F0), maximal fluorescence (Fm), and quantum yield denoted by the

#### TABLE 1 | Clustering of greengram genotypes based on yield attributing traits across two locations and over three years of trials.

TABLE 2 | Categorization of greengram genotypes into Group 1, 2, and 3 based on yield performance across two locations and over 3 years of trials.


Interaction of Genotype x Location x Year : Significant at ≤0.01.

Bold text indicates the greengram genotype (EC 398889) with consistently stable high yield and the genotype (LGG 460) with consistently stable low yield at two locations over three years of trials.

Harvest index (%) 7-34 17-59

FIGURE 2 | Performance of the heat-tolerant (EC 398889) and heat-sensitive (LGG 460) genotypes at high temperature (45/25°C max/min) and 14 h day length. The heat-tolerant genotype showed pod formation, while the sensitive genotype remained without pods at high temperature.

ratio of variable fluorescence (Fv) to maximal fluorescence (Fm), Fv/Fm, as affected by temperature change (**Figure 8**). Fluorescence images of heat-shocked leaves (43°C) were compared with those of leaves treated at normal temperature (30°C), which served as checks. The numerical values of the fluorescence parameters, along with changes in the colour code, were interpreted as the degree of damage to the photosynthetic system caused by HT. A more damaging effect was observed in light-adapted leaves treated at

FIGURE 3 | Seedling viability and regeneration after heat shock at 52°C in the heat-tolerant (EC 398889) (A–C) and heat-sensitive (LGG 460) (D–F) genotypes. (A and D) Results of the TTC test; (B and E) amount of chlorophyll accumulation; (C and F) rejuvenation or failure of normal plant growth after heat treatment.

(IPM-02-03); 5.(IPM-02-14); 6. (LGG 460); 7. (Kopergaon); 8. (NSB 007)

HT 43°C as quantum yield (Fv/Fm) declined from 0.62 (**Figure 8I**) to 0.045 (**Figure 8L**), and these changes in the fluorescence parameters were also substantiated by changes in the false colour code (colour bar indicating high to low values from right to left). The reduction in Fv/Fm in the heat-shocked leaves of LGG 460 was more than 90%, indicating that heat stress >40°C could be lethal or detrimental to the ability of LGG 460 to sustain photosynthesis (**Figure 8L**). By contrast, the better heat-adapted genotype EC 398889 showed a reduction in quantum yield (Fv/Fm) from 0.66 to 0.24 after the heat shock, a partial inhibition of approximately 63%, as depicted by the Fv/Fm images (**Figure 8L**). The results revealed that the threshold temperature at which the photosynthetic system changes irreversibly in greengram could be 43°C; however, genetic diversity for the heat tolerance trait was evident in the present investigation.
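The percentage inhibition implied by the Fv/Fm values quoted above can be checked with a few lines of Python. The helper name is illustrative. The first pair of values is the decline from Figure 8I to Figure 8L, which the text attributes to LGG 460, and the second pair is the decline reported for EC 398889.

```python
def percent_reduction(before: float, after: float) -> float:
    """Return the relative decline from `before` to `after` as a percentage."""
    if before <= 0:
        raise ValueError("The initial value must be positive")
    return (before - after) / before * 100.0


# Fv/Fm values as quoted in the text for control versus heat-shocked leaves.
sensitive = percent_reduction(before=0.62, after=0.045)
tolerant = percent_reduction(before=0.66, after=0.24)
print(f"0.62 -> 0.045: {sensitive:.1f}% reduction")  # 92.7% reduction
print(f"0.66 -> 0.24:  {tolerant:.1f}% reduction")   # 63.6% reduction
```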

Sink strength under stress is a crucial factor determining grain yield. To investigate sink efficiency, a key enzyme, sucrose synthase (SuSy), was targeted, and its activity was measured at different developmental stages in both genotypes grown during summer. Significant differences in SuSy activity at different

398889 adapted to high temperature regime (as shown in circle).

developmental stages were noticed. The SuSy activity in LGG 460 remained low from anthesis until Day 10 after anthesis and increased thereafter. However, in the tolerant genotype EC 398889, SuSy activity was already extremely high by the 5th day after anthesis, attained its maximum on Day 8 or 9, and declined to an extremely low level when the pods were near maturity (**Figure 9**). Thus, pod fill duration appeared to be regulated by the time-dependent activation state of SuSy, and the two genotypes could be differentiated by the early or late upregulation of SuSy after anthesis. The effect of HT on pollen germination was also investigated in the contrasting greengram genotypes. With a progressive increase in temperature beyond 35°C, the length of the pollen tubes decreased, the diameter of the tubes increased, and the pollen sap became denser and more viscous, which resulted in poor mobility of the pollen sap or a slowdown of cytoplasmic streaming (**Figure 10**). Anthesis and fertilization might also have been affected by the physiological changes that occur at HTs. Abnormalities such as coiling of pollen tubes, emergence of multiple tubes, or bursting of pollen cell sap from multiple sites were observed in the sensitive line LGG 460 beyond 37°C (**Figure 10**). Most of the physiological features of pollen germination that were affected by HT remained similar in both genotypes. However, the genotype EC 398889, which is better adapted to HTs than LGG 460, showed normal growth and fertile pollen even at 43°C, unlike LGG 460, which showed complete pollen sterility at 43°C (**Figure 10**).

# DISCUSSION

The panel of 116 diverse greengram genotypes was classified into four broad clusters, which revealed wide genetic diversity in the population under study. Genotype EC 398889, belonging to cluster 1, and LGG 460, belonging to cluster 3, were only distantly related in terms of phenology and yield-attributing traits (**Table 1**). The large variation in grain yield between the experimental sites, Vamban and Kanpur in India, suggested the presence of ample genetic variability for various yield and yield-contributing traits (**Table 1**). The mean grain yield of the 116 genotypes at Vamban was almost half of that at Kanpur, although Vamban had more favorable temperature (38°C/21°C maximum/minimum) and humidity (approximately 70%) during the crop growth period than Kanpur, and no other abiotic factor could have affected the yield except for the day length, which was 1 to 2 h shorter at Vamban than at Kanpur. This raises the question of whether the temperature regime during the crop season, the day length at Vamban, or both played a crucial role in determining grain yield. Further questions are as follows: What caused Kanpur to consistently record higher yields than Vamban? Should different breeding strategies be adopted to develop greengram varieties? What traits are crucial for improving yield as well as enhancing yield stability across environments? The duration of specific growth stages appeared to be directly related to temperature, because the early growth stages of the crop at Vamban experienced higher temperatures (maximum >35°C and minimum >25°C) coupled with a shorter photoperiod (11:30 h) than at Kanpur.
Consequently, the reproductive stage was attained considerably earlier at Vamban, where flower initiation occurred 3 to 10 days earlier than at Kanpur (**Table 2**). These results corroborate earlier reports indicating that a high mean temperature hastens flowering and a low mean temperature delays flowering at all photoperiods. Flowering is often progressively delayed in greengram when the photoperiod is extended (Pratap et al., 2013; Pratap et al., 2014). These reports suggest that the long day length at Kanpur (12–14 h) delayed flowering, while the short day length at Vamban (11–13 h) induced early flowering and maturity. Although greengram possesses an indeterminate growth habit characterized by alternate flushes of flowers followed by vegetative growth, early flowering tends to shorten the crop cycle and favor early maturity with a substantial yield penalty. Kanpur had drier weather (40%–50% RH) and a longer photoperiod, which contributed to higher biomass or vegetative growth at the early stages, and its maximum temperature exceeded 40°C (**Figure 1**). These results suggested that crops grown at Kanpur were adequately supported by vegetative biomass accumulated before flowering, and this could be one of the major yield determinants, as is evident from the high yield at Kanpur even though terminal heat stress, exceeding 40°C, was more severe there than at Vamban. For yield improvement in greengram under these two contrasting environments, a two-pronged strategy should be developed. Because Vamban does not experience terminal heat stress during the reproductive phase of the crop, productivity there could be substantially enhanced by introducing photoinsensitive and thermoinsensitive varieties, thus allowing substantial biomass to support grain filling.
By contrast, heat-tolerant varieties are necessary for higher latitudes like Kanpur, where recurrent heat episodes are a regular feature.

FIGURE 6 | Response of net photosynthetic rate (A), stomatal conductance (B), and transpiration rate (C) to increasing temperatures (20°C, 30°C, and 40°C) in plants preadapted to low temperature (LT) or high temperature (HT) conditions in the heat-tolerant (EC 398889) and heat-sensitive (LGG 460) genotypes. (D) Light response of the photosynthetic electron transport rate (ETR) at 30°C and 40°C in the heat-tolerant (EC 398889) and heat-sensitive (LGG 460) genotypes. (A–C) Each value represents the mean of three replications, with the standard error of the mean (SEm) shown by an error bar. Analysis of variance using a two-factor factorial design (genotype, G, and temperature, T) showed significant interaction effects (G×T) at P ≤ 0.01 on photosynthesis, stomatal conductance, and transpiration, with CD values of 0.76\*\*, 0.32\*\*, and 0.15\*\*, respectively, for treatment mean comparison. For (D), three factors, namely, genotype, G (2), temperature, T (2), and irradiance level, L (13), were taken into account to test the significance of the interaction among these factors (G×T×L), which was shown by a CD value of 2.1\*\* (P ≤ 0.01) for treatment mean comparison.

FIGURE 7 | Relationship among specific leaf area (SLA), SPAD chlorophyll meter reading (SCMR), and carbon isotope discrimination (Δ¹³C) in selected greengram genotypes. The heat-tolerant genotype (EC 398889) showed lower SLA (A) and Δ¹³C (B) values, indicating higher photosynthate partitioning and water-use efficiency than the heat-sensitive genotype (LGG 460) (A, B). Each value represents the mean of five replications.

Controlled environment studies are crucial for determining the effect of a specific environmental factor on yield and yield-contributing traits by eliminating the effects of other factors. In the present study, the controlled greenhouse conditions allowed the crop to grow at an HT regime (45°C/25°C) with a 14 h day length. Based on the results of field trials over 3 years at two locations, 17 genotypes with high stable yield and 11 with stable but low yield were selected and evaluated in greenhouses (**Table 2**). EC 398889 showed the highest yield among the 17 genotypes putatively identified as having stable high yield, while the lowest yield was recorded in LGG 460 (**Table 2**).

The membrane stability index, as well as the chlorophyll content or greenness index, remained higher in EC 398889 than in LGG 460 (**Table 3**). Under stress conditions, the sustained function of cellular membranes is considered crucial for maintaining cellular processes such as photosynthesis and respiration (Blum, 1998). The integrity and function of cell membranes are sensitive to HT, as heat stress alters the structure of membrane proteins, leading to increased membrane permeability, as is evident from the increased loss of electrolytes in the test leaf samples (**Table 3**). Increased solute leakage is closely associated with cell membrane thermostability (Ilık et al., 2018), and various attempts have been made to use this method as an indirect measure of heat tolerance in diverse plant species such as food

FIGURE 9 | Sucrose synthase (SuSy) activity in developing grains of field-grown greengram genotypes (heat-tolerant EC 398889 and heat-sensitive LGG 460) at different pod development stages. Each value represents the mean of three replications. Treatment means (genotype, G, and days after anthesis, D) were compared using the CD value of 751.1\*\* for the genotype × days (G×D) interaction, which was significant at P ≤ 0.01.

legumes (Srinivasan et al., 1996), soybean (Scafaro et al., 2010), potato, cotton, and tomato (Rahman et al., 2004; Hu et al., 2010), and wheat (Blum et al., 2001).

The genotype EC 398889 was characterised by higher acquired thermotolerance (76.8%) than LGG 460 (34.5%). This was inferred from the ability of seedlings exposed to 37°C and 52°C for 2 h to reduce TTC (**Figure 3**). In addition, the heat-tolerant genotype had the unique ability to start accumulating chlorophyll in its cotyledonary leaves and then to regenerate new leaves after severe heat shock (52°C). Its seedlings gradually revived into normal plants after the series of heat episodes from 37°C to 52°C (**Figures 3B**, **C**), although the readjustment of physiological processes toward normal took a long time. Seedlings that turned green and generated new leaves were scored as survivors. Thus, the TTC and chlorophyll accumulation tests were found to be appropriate for monitoring the sensitivity of a genotype to high-temperature stress. By contrast, the heat-sensitive genotype LGG 460 failed to revive after episodic heat stress and completely lost cell viability, as the TTC test was negative (**Figure 3D**). Our findings are in accordance with earlier reports indicating a strong association between higher membrane thermostability and cell viability after heat stress treatment of seedlings, and the technique has been widely used for the assessment of HT tolerance (Gupta et al., 2010). The TTC reduction assay measures the level of mitochondrial respiration activity, which serves as an indicator of cell viability (Berridge et al., 2005). Variability was detected among the 56 genotypes for acquired thermotolerance, ranging from 14.1% to 61.3%.

The development of candidate gene markers for crucial heat tolerance genes may allow for the development of new cultivars with increased abiotic stress tolerance using marker-assisted selection (Pratap et al., 2015; Pratap et al., 2017; Jespersen et al., 2017). Molecular profiling of greengram accessions was also done using 79 SSR markers, of which 11 were polymorphic. These markers exhibited a large amount of genetic variability among different accessions. Among the polymorphic primers, three markers showed a clear differentiation between heat-tolerant and heat-susceptible genotypes. The marker CEDG 147 distinguished the tolerant and susceptible groups of accessions and amplified at 300 bp in the heat-tolerant genotype EC 398889 and at 285 bp in the sensitive genotype LGG 460. Similarly, another marker, CEDG 247, also distinguished the heat-tolerant and heat-susceptible genotypes at 161 and 168 bp, respectively. Likewise, marker CEDG 044 distinguished between the tolerant and sensitive genotypes at 192 and 162 bp, respectively (**Figure 4**).

SDS-PAGE was performed on leaf protein extracted from LGG 460 and EC 398889 grown in a controlled environment chamber at 25°C/18°C and 43°C/35°C with a 14 h photoperiod. An additional protein band between 91 and 137 kDa was detected in the genotype EC 398889 (shown in the circle), whereas this band was absent in LGG 460 (**Figure 5**). Although this additional band was not thoroughly characterized, the expression of this protein might play a role as a protective mechanism. This protein band was extremely close to a size of approximately 100–105 kDa. The result also showed the expression of one heat shock protein (HSP) with a molecular size of 101 kDa (**Figure 5**) in the heat-tolerant greengram genotype EC 398889, which was consistent with earlier studies (Yoshida et al., 2011). The expression of various HSPs is known to be an adaptive strategy in heat tolerance. Some HSPs and heat shock factors (HSFs), such as Hsp101, HSA32, HSFA1, and HSFA3, are critical for thermotolerance and play a crucial role in stress signal transduction, in protecting and repairing damaged proteins and membranes, and in protecting photosynthesis as well as regulating the cellular redox state (Wang et al., 2004; Chi et al., 2019). Hsp101 has been considered to be a molecular chaperone that imparts heat tolerance to plants (Schlesing et al., 1982; Suk and Elizabeth, 2001); furthermore, it has special significance in maintaining the proper conformation of proteins and facilitates the survival of organisms under high-temperature stress. HSPs are induced by heat and are strongly linked to heat tolerance (Yıldız and Terzioğlu, 2006). Different classes of HSPs play different roles in protection from stress; however, most HSPs serve as chaperones.

The genotype LGG 460 could not adapt to the high thermal regime (maximum/minimum) of 40°C/30°C in a controlled environment because photosynthesis (Pmax) was inhibited completely compared with LT-grown plants (25°C/18°C) (**Figure 6A**). Pmax was more adversely affected (**Figure 6A**) than stomatal conductance (**Figure 6B**) and transpiration (**Figure 6C**) in the HT-grown plants, indicating that nonstomatal components were likely responsible for inhibiting photosynthesis. By contrast, the genotype EC 398889 not only adapted well to the high thermal regime of 40°C/30°C, but its Pmax also progressively increased when photosynthesis was measured from lower (20°C) to higher (40°C) test temperatures (**Figure 6A**), and the temperature response of Pmax was proportionate to the relative changes in the temperature response of stomatal conductance (**Figure 6B**) and transpiration (**Figure 6C**). Higher photosynthesis with low stomatal conductance and transpiration rate enabled more carbon gain over water loss; hence, WUE increased in heat-tolerant genotypes when subjected to stress (**Figure 6**). By contrast, the sensitive genotype LGG 460 faced a different situation: photosynthesis was reduced at HT, and this reduction was associated with an increase in stomatal conductance and transpiration. Hence, no carbon was gained per unit of water lost, which suggested that the plants encountered multiple stresses, such as HT, light intensity, and drought (**Figure 8**). Photosynthesis is sensitive to HT (Kim and Portis, 2005), and the ability to sustain leaf gas exchange under heat stress is directly correlated with heat tolerance (Bita and Gerats, 2013). The reduction of active Rubisco and Rubisco activase could be responsible for the inhibition of photosynthesis (Maestri et al., 2002; Morales et al., 2003; Salvucci and Crafts-Brandner, 2004; Ristic et al., 2009), as carbon fixation is affected by Rubisco limitation.
The stroma and thylakoid membrane system are the most sensitive and primary target sites of heat injury (Maestri et al., 2002; Morales et al., 2003; Wise et al., 2004). Photosynthesis is the most thermosensitive plant function (Kim and Portis, 2005); hence, supraoptimal temperatures adversely affect photosynthesis. Photosynthesis can occur optimally over a wide temperature range of 15°C–35°C, but it is adversely affected at temperatures exceeding 40°C. Chloroplast stroma and thylakoid membranes are damaged by HTs (Wang et al., 2010). Photosystem (PS) II in the light reaction (Heckathorn et al., 2002) and Rubisco (ribulose-1,5-bisphosphate carboxylase/oxygenase) activase in the Calvin cycle (Crafts-Brandner and Salvucci, 2000) are both thermo-labile. Heat stress thus impairs the electron transport chain and affects the activation and activity of the enzyme Rubisco (Ahmad et al., 2010). Although PSI and PSII are both adversely affected by HTs, PSII is more sensitive to heat stress than PSI (Moustaka et al., 2018).
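The contrast in carbon gain per unit of water lost described for the two genotypes can be made concrete with instantaneous water-use efficiency, commonly taken as net photosynthesis divided by transpiration. The sketch below uses hypothetical gas-exchange readings chosen only to show the direction of the effect; none of the numbers are measurements from this study.

```python
def instantaneous_wue(photosynthesis: float, transpiration: float) -> float:
    """Carbon gained per unit water lost (umol CO2 per mmol H2O).

    photosynthesis: net assimilation rate (umol CO2 m-2 s-1)
    transpiration:  transpiration rate (mmol H2O m-2 s-1)
    """
    if transpiration <= 0:
        raise ValueError("transpiration must be positive")
    return photosynthesis / transpiration


# Hypothetical readings at high temperature (not data from the study).
tolerant_wue = instantaneous_wue(photosynthesis=20.0, transpiration=4.0)
sensitive_wue = instantaneous_wue(photosynthesis=2.0, transpiration=8.0)
print(tolerant_wue, sensitive_wue)
```

High assimilation with restrained transpiration yields a high ratio, whereas reduced assimilation with rising transpiration drives the ratio towards zero.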

The first distinct change in both the structure and function of photosystem II (PSII) was reported to occur at 40°C–50°C in barley (Lípová et al., 2010). The first temperature-induced transient changes were shown at 42°C to 48°C, with a disruption of the PSII donor side and a corresponding loss of oxygen evolution (Cramer et al., 1981), followed by changes in thylakoid membranes at about 60°C and loss of electron transport through PSII (Smith et al., 1989), representing a denaturation of the PSII reaction centers. At about 75°C, denaturation of the light-harvesting complex of PSII (LHCII) has been observed (Smith et al., 1989).

In the present study, inhibition of photosynthesis at HT was assessed through gaseous exchange as well as chlorophyll fluorescence imaging, which indicate the effects of stress on PS II photosynthetic membrane system and ETR. The light response of ETR at two pretreatment temperatures, namely, 30°C and 40°C, is shown in **Figure 6D**. The ETR in HT pretreated leaves (40°C) of EC 398889 never declined to zero with progressive increase in irradiance levels (**Figure 6D**). However, heat-sensitive genotype LGG 460 showed complete reduction of ETR at all levels of irradiance when pretreated at 40°C. Reduced electron transport and damaged photosystems caused by high temperature have been reported in poplar by Song et al. (2014).

The genotype EC 398889 had low SLA (leaf area g−1 leaf weight). Furthermore, it had a high SCMR or greenness index, which suggested higher chlorophyll levels within a smaller leaf surface area, which enabled the plant to absorb more solar radiation per unit area of leaf in comparison with genotype LGG 460 (**Figure 7**). More chlorophyll per unit of leaf area in EC 398889 was likely to enhance photosynthesis than in the genotypes having higher SLA and low SCMR, such as LGG 460 (**Figure 7**). SLA was also positively correlated with delta carbon indicating that lower values of delta carbon are associated with low SLA values. High radiation-use efficiency and high WUE are attributed to low SLA coupled with low delta carbon values, as exhibited by EC 398889 (**Figure 7**). SLA has been reported to be associated with variation in photosynthetic capacity and chlorophyll density (Nageswara Rao et al., 2001; Kalariya et al., 2015). SCMR contributes to high photosynthesis and ultimately to increased yield (Arunyanark et al., 2009). Koolachart et al. (2013) reported that SLA indicates high chlorophyll content in leaves that contribute to high photosynthesis and yield.
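Specific leaf area, as defined above, is leaf area per unit leaf weight. A minimal sketch with hypothetical leaf measurements (not taken from the study) shows why a heavier leaf of the same area has a lower SLA:

```python
def specific_leaf_area(leaf_area_cm2: float, leaf_weight_g: float) -> float:
    """SLA in cm2 of leaf area per g of leaf weight."""
    if leaf_weight_g <= 0:
        raise ValueError("leaf weight must be positive")
    return leaf_area_cm2 / leaf_weight_g


# Hypothetical leaves with equal area but different weight.
thick_leaf_sla = specific_leaf_area(leaf_area_cm2=150.0, leaf_weight_g=0.75)
thin_leaf_sla = specific_leaf_area(leaf_area_cm2=150.0, leaf_weight_g=0.5)
print(thick_leaf_sla, thin_leaf_sla)
```

The thicker (heavier) leaf gives the lower value, which is the pattern the text associates with EC 398889.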

The fluorescence parameters (F0, Fm, and Fv/Fm) were altered by heat treatment at 43°C in dark- and light-adapted leaves of the heat-tolerant and heat-sensitive genotypes. The modification of chlorophyll fluorescence in response to heat stress has been reported in numerous crops, and the heat tolerance of plant species can be quantified by measuring chlorophyll fluorescence (Willits and Peet, 2001). Complete inhibition of the quantum yield (Fv/Fm) of photosystem II was observed in light-adapted leaves pretreated at 43°C in genotype LGG 460 (**Figure 8L**); however, light-adapted leaves of EC 398889 that were subjected to the same treatment showed a reduction in quantum yield (Fv/Fm) of approximately 64% (**Figure 8L**) compared with dark-adapted leaves (**Figure 8I**) immediately after heat shock. This finding suggested that light is an additional stress, and when leaves are exposed to heat shock, it becomes more detrimental to photosynthesis. The relative assessment of fluorescence images, particularly for quantum yield (Fv/Fm) after heat treatment, revealed that light-adapted leaves of the heat-tolerant greengram genotype EC 398889 exhibited higher quantum yield than the heat-sensitive genotype LGG 460. The photosynthetic system partially or completely collapsed in LGG 460 because no quantum yield (Fv/Fm) images were obtained with light-adapted leaves (**Figure 8L**). The fluorescence images combined with the light curve of ETR strongly suggested differential sensitivity of photosynthesis in the two contrasting genotypes (**Figure 8**).
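For readers less familiar with these parameters, the quantum yield is derived from the minimal (F0) and maximal (Fm) fluorescence as Fv/Fm = (Fm − F0)/Fm. The sketch below uses hypothetical F0 and Fm readings, chosen only so that the relative reduction comes out near the 64% reported above; they are not measurements from the study.

```python
def fv_fm(f0: float, fm: float) -> float:
    """Quantum yield of PSII: (Fm - F0) / Fm."""
    if fm <= 0:
        raise ValueError("Fm must be positive")
    return (fm - f0) / fm


def percent_reduction(reference: float, treated: float) -> float:
    """Relative reduction of `treated` with respect to `reference`, in percent."""
    return 100.0 * (reference - treated) / reference


# Hypothetical readings (arbitrary fluorescence units).
dark_adapted = fv_fm(f0=200.0, fm=1000.0)
light_adapted_after_heat = fv_fm(f0=356.0, fm=500.0)
reduction = percent_reduction(dark_adapted, light_adapted_after_heat)
print(round(dark_adapted, 3), round(light_adapted_after_heat, 3), round(reduction, 1))
```

A complete collapse, as described for LGG 460, corresponds to Fm approaching F0, so that Fv/Fm approaches zero.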

The images of effective PS II quantum yield (YII) captured under high temperature and irradiance level were able to distinguish heat tolerant and susceptible genotypes. Similarly, the light response of electron transport rate (ETR) was also able to distinguish the genotypes based on their sensitivity to heat stress. Overall, this investigation indicates the suitability of chlorophyll fluorescence imaging system technique for precise phenotyping of greengram based on their sensitivity to heat stress. The findings are in accordance with earlier reports in rice (Pradhan et al., 2019) and wheat (Brestic et al., 2012).

Differential degree of membrane thermostability may distinguish the genotypes towards different sensitivity to heat stress. Chen et al. (2018) reported that chloroplast-targeted AtFtsH11 protease plays critical roles for maintaining the thermostability and structural integrity of photosystems under high temperatures. Therefore, the photosynthetic efficiency may be modified under heat stress by improving FtsH11 protease in photosystems, hence, to improve plant productivity. Sucrose synthesis in developing grains plays a crucial role in sink development and also determines the sink strength in several crops. It also acts as a signal molecule for promoting the conversion of transported sugar into starch. In the present study, sucrose synthase activity at different developmental stages differed among the test genotypes (**Figure 9**). The activity of sucrose synthase in developing grains of LGG 460 remained low and followed a long lag phase till day 10 of pod setting, while the genotype EC 398889 and a variety named "Virat," which was derived using this genotype as the male parent showed a sharp increase in the activity of sucrose synthase after day 5 of pod setting with a concomitant increase in the sucrose content in developing grains. The early activation of sucrose synthase in the test genotypes EC 398889 appeared to be responsible for rapid grain filling and pod development and is likely to be associated with early pod maturity. The availability of photosynthates and sucrose, the transportable sugar, could also be responsible for long lag phase kinetics of sucrose synthase activity in LGG 460 because decrease in photosynthesis also limits sucrose transport to the sink, which might have influenced sink development at HT. Thus, sink development is inhibited in heat-sensitive genotypes. Hence, the first step in the conversion of sucrose to starch is likely to be primarily catalysed by sucrose synthase. 
These results also suggested that sucrose synthase activity could be considered a marker for sink strength. The enzymes responsible for metabolising sucrose may regulate sucrose import into the sink. High activities of sucrose-metabolising enzymes could increase the sucrose gradient; consequently, large amounts of sugar are imported for metabolism and storage. Wang et al. (1993) emphasised the importance of sucrose synthase rather than acid invertase as the dominant enzyme in metabolising imported sucrose in a growing sink. Sucrose synthase is responsible for the breakdown of sucrose, thus providing intermediates for the synthesis of starch and other polysaccharides. Reduced sucrose metabolism under high temperatures has been attributed to the changes in sucrose synthase and invertase (Dai et al., 2015).

Many legumes and cereals exhibit a high sensitivity to heat stress during flowering. One of the major yield determinants in greengram is pollen fertility and flower shedding at a HT. The length and size of the pollen tube and density of the pollen sap appeared to be altered by a progressive increase in the temperature beyond 37°C (**Figure 10**). The effect of HT on pollen germination was characterized by transformation of pollen sap into a dense and viscous fluid that probably hinders the smooth movement of male gametes. In addition to a reduction in the length of pollen tubes, no other pollen abnormalities were observed in the heat-tolerant genotype EC 398889 up to 40°C (**Figure 10**). By contrast, multiple abnormalities were detected in pollen tubes of the heat-sensitive genotype LGG 460, where the emergence of multiple tubes, and their bursting and coiling were observed; eventually, the pollen failed to germinate at temperatures exceeding 40°C (**Figure 10**). Earlier reports on rice have also indicated that an increase in temperature could limit yield by affecting pollen germination and grain formation (Endo et al., 2009; Wassmann et al., 2009; Chakrabarti et al., 2010). The male gametophyte is particularly sensitive to HTs at all stages of development, while the pistil and the female gametophyte are considered to be more tolerant (Hedhly, 2011). The sensitivity of pollen grains to temperature damage could be considered a crucial parameter for predicting rice yield in warmer climates. In legumes, heat stress during post-anthesis results in poor pollen germination on the stigma and reduced pollen tube growth in the style (Talwar et al., 1999). Under HT (30°C), flower sterility has been correlated with diminished anther dehiscence, poor shedding of pollen, poor germination of pollen grains on the stigma, reduced elongation of pollen tubes, and reduced *in vivo* pollen germination (Fahad et al., 2015; Fahad et al., 2016). 
High temperature decreases pollen viability and leads to sterile pollen and a decrease in pod set and yield (Hasanuzzaman et al., 2013). Because pollen is highly sensitive to high temperature, crop yield is affected when the temperature rises during pollen development (Ploeg Van der and Heuvelink, 2005). The reduction in photosynthesis observed in the present study in the heat-sensitive genotype LGG 460 under high temperature might restrict the accumulation of the desired level of essential carbohydrates, such as sucrose, hexoses, and starch, in the developing pollen; as a result, pollen germination or fertility is adversely affected. The role of sugars and invertase/sucrose synthase activity in anther development and pollen germination has been reported in several crops (Goetz et al., 2001; Castro and Clement, 2007; Pressman et al., 2012; García et al., 2013; Le Roy et al., 2013; Singh and Knox, 2013). In the present study, the photosensitive character was eliminated through a series of field and controlled environment trials, and eventually putative photoinsensitive lines were selected for evaluating their thermotolerance. These contrasting greengram lines, namely, EC 398889 and LGG 460, proved to be extremely valuable germplasm resources for crossing programmes.

# CONCLUSION

The changes in photothermoperiods across locations and under high-temperature stress during the reproductive stage have been considered to be major yield destabilizing factors in greengram. Three consecutive years of field experiments using a set of 116 genotypes at two locations, namely, Vamban and Kanpur in India, differing in day length and thermal regimes led to the identification of a few promising genotypes with stable high or low yield, depending on their relative insensitivity towards photothermoperiods based on two-location data. Thermotolerance of selected genotypes was assessed under HT conditions simulated in a naturally-lit greenhouse. After vigorous testing, two contrasting genotypes were found to differ primarily in pod setting and grain yield at HT. The genotypes EC 398889 and LGG 460 exhibited the highest and lowest yield, respectively, based on their heat tolerance in addition to the photoinsensitivity they exhibited in multilocation trials. Heat tolerance and underlying mechanisms have been deciphered in these genotypes involving cellular thermal stability, ATT, chlorophyll fluorescence, pollen germination and photosynthesis, WUE, pollen fertility, and sink capacity. The results showed that source (leaf) efficiency could be enhanced by increasing the amount of chlorophyll per unit leaf, which means reducing SLA, which will improve WUE and minimise stomatal conductance and transpiratory water loss. The threshold temperature for tolerance of photosynthesis in greengram was found to be 43°C based on photosynthetic ETR and other fluorescence parameters. Beyond 43°C, often irreversible changes occur in the photosynthetic system. Faster activation of sucrose synthase in developing grains immediately postanthesis supported rapid grain filling and hastened pod maturity before the onset of HT. Therefore, modifications are necessary both at source and sink levels to improve productivity of greengram under changing climates.

# DATA AVAILABILITY STATEMENT

The datasets for this article are not publicly available. The data are important and will be made publicly available once published. Requests to access the datasets should be directed to psbsu59@gmail.com.

# AUTHOR CONTRIBUTIONS

PB: Conception, designing, interpretation, drafting the manuscript, and conducting the experiments. AP, SG, and NS: Compilation of the results, editing and approval of final version. KS and RT: Conducted laboratory based experiments.

# FUNDING

The work was accomplished with financial support of the ICAR-National Innovations on Climate Resilient Agriculture (NICRA), Indian Council of Agricultural Research, New Delhi, India.

# ACKNOWLEDGMENTS

The authors are thankful to Indian Council of Agricultural Research, New Delhi for financial support. Contributions of Mr. Srimanta Dey, Media & Information Unit, ICAR, Krishi Bhawan, New Delhi for composing and editing the images and pictures and Dr. Hemant Kumar for statistical analysis are thankfully acknowledged.



**Conflict of Interest:** The research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Basu, Pratap, Gupta, Sharma, Tomar and Singh. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Genotype × Environment Studies on Resistance to Late Leaf Spot and Rust in Genomic Selection Training Population of Peanut (Arachis hypogaea L.)

*Sunil Chaudhari1, Dhirendra Khare2, Sudam C. Patil3, Subramaniam Sundravadana4, Murali T. Variath1, Hari K. Sudini1, Surendra S. Manohar1, Ramesh S. Bhat5 and Janila Pasupuleti1\**

#### Edited by:

Jose C. Jimenez-Lopez, Consejo Superior de Investigaciones Científicas (CSIC) Granada, Spain

#### Reviewed by:

Suvendu Mondal, Bhabha Atomic Research Centre (BARC), India Pei Xu, China Jiliang University, China Muthukrishnan Sathiyabama, Bharathidasan University, India

> \*Correspondence: Janila Pasupuleti p.janila@cgiar.org

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 15 May 2019 Accepted: 25 September 2019 Published: 04 December 2019

#### Citation:

Chaudhari S, Khare D, Patil SC, Sundravadana S, Variath MT, Sudini HK, Manohar SS, Bhat RS and Pasupuleti J (2019) Genotype × Environment Studies on Resistance to Late Leaf Spot and Rust in Genomic Selection Training Population of Peanut (Arachis hypogaea L.). Front. Plant Sci. 10:1338. doi: 10.3389/fpls.2019.01338

1 Crop Improvement- Asia Program, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India, 2 Department of Plant Breeding and Genetics, Jawaharlal Nehru Krishi Vishwa Vidyalaya (JNKVV), Jabalpur, India, 3 Oilseeds Research Station, Mahatma Phule Krishi Vidyapeeth (MPKV), Jalgaon, India, 4 Coconut Research Station, Tamil Nadu Agricultural University (TNAU), Coimbatore, India, 5 Department of Biotechnology, University of Agricultural Sciences, Dharwad, India

Foliar fungal diseases especially late leaf spot (LLS) and rust are the important production constraints across the peanut growing regions of the world. A set of 340 diverse peanut genotypes that includes accessions from gene bank of International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), elite breeding lines from the breeding program, and popular cultivars were screened for LLS and rust resistance and yield traits across three locations in India under natural and artificial disease epiphytotic conditions. The study revealed significant variation among the genotypes for LLS and rust resistance at different environments. Combined analysis of variance revealed significant environment (E) and genotype × environment (G×E) interactions for both the diseases indicating differential response of genotypes in different environments. The present study reported 31 genotypes as resistant to LLS and 66 to rust across the locations at 90 DAS with maturity duration 103 to 128 days. Twenty-eight genotypes showed resistance to both the diseases across the locations, of which 19 derived from A. cardenasii, five from A. hypogaea, and four from A. villosa. Site regression and Genotype by Genotype x Environment (GGE) biplot analysis identified eight genotypes as stable for LLS, 24 for rust and 14 for pod yield under disease pressure across the environments. Best performing environment specific genotypes were also identified. Nine genotypes resistant to LLS and rust showed 77% to 120% increase in pod yield over control under disease pressure with acceptable pod and kernel features that can be used as potential parents in LLS and rust resistance breeding. Pod yield increase as a consequence of resistance offered to foliar fungal diseases suggests the possibility of considering 'foliar fungal disease resistance' as a must-have trait in all the peanut cultivars that will be released for cultivation in rainfed ecologies in Asia and Africa. 
The phenotypic data of the present study will be used for designing genomic selection prediction models in peanut.

Keywords: G x E, GGE, genomic selection, peanut, training population

# INTRODUCTION

Peanut (*Arachis hypogaea* L.) is an important annual food, feed, and oilseed crop grown in nearly 114 tropical and subtropical countries, covering an area of 27.66 m ha, with an annual production of 43.98 m tonne and a productivity of 1590 kg/ha (FAOSTAT, 2016). The productivity of peanut in Asia (2186 kg/ha) and Africa (903 kg/ha) is quite low in comparison to America (3381 kg/ha), Europe (3102 kg/ha), and Australia and New Zealand (2825 kg/ha) (FAOSTAT, 2016). Exposure to various biotic and abiotic stresses, poor agronomic management practices, non-availability of quality seeds of released varieties, and socio-economic issues are some key factors for the low productivity in Asia and Africa. Among the biotic stresses, foliar diseases such as late leaf spot (LLS) (caused by *Phaeoisariopsis personata* Berk. & Curtis) and rust (caused by *Puccinia arachidis* Speg.) are economically most important. A reduction of nearly 50–70% in pod yield and an adverse effect on seed quality have been reported due to combined infection by rust and LLS (Miller et al., 1990; Grichar et al., 1998). Plants susceptible to LLS exhibit complete defoliation under high disease pressure, leading to low yield. Leaf rust also has considerable economic importance in many peanut growing regions of the world. The losses due to rust can vary from 40% to 70% under favorable conditions and in the presence of susceptible cultivars (Subrahmanyam et al., 1985; Dwivedi et al., 2002). The disease can be particularly severe when it occurs together with LLS.
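The global productivity figure follows directly from the quoted area and production. The short check below uses only the numbers given in the text; the variable names and the choice of America as the reference region for the comparison are illustrative.

```python
# Figures quoted in the text (FAOSTAT, 2016).
area_million_ha = 27.66
production_million_tonnes = 43.98

# tonnes per hectare converted to kg per hectare
productivity_kg_per_ha = production_million_tonnes / area_million_ha * 1000
print(round(productivity_kg_per_ha))  # 1590

# Regional productivity (kg/ha) relative to America, as a rough yield-gap indicator.
regional = {"Asia": 2186, "Africa": 903, "America": 3381}
relative_to_america = {
    region: value / regional["America"] for region, value in regional.items()
}
for region, share in relative_to_america.items():
    print(region, f"{share:.0%}")
```

On these figures, productivity in Asia is roughly two-thirds, and in Africa roughly a quarter, of that in America.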

Identifying disease resistant genotypes and introgressing trait into the improved genetic background is one of the most effective and eco-friendly measures to enhance production and productivity under resource-limited farming systems especially in semi-arid regions of developing countries. In the past, several efforts were made to identify sources of resistance to LLS (Subrahmanyam et al., 1985; Gorbet et al., 1990) and rust (Wynne et al., 1991; Subrahmanyam et al., 1989) in peanut. Majority of identified resistant sources belong to subspecies *fastigiata* var. *fastigiata* and are landraces from South America (Subrahmanyam et al., 1989). Wild *Arachis* species, in contrast, have shown variation ranging from immune to highly resistant reaction to LLS (Abdou et al., 1974; Subrahmanyam et al., 1985). However, the use of wild species in resistance breeding programs remained limited due to cross-compatibility barriers, the occurrence of linkage drag, late maturity, and undesirable pod and seed features.

Foliar fungal disease screening under field conditions is cumbersome, time-consuming, resource intensive, and often demanding when large numbers of individuals from segregating generations must be evaluated. The efficiency and accuracy of selection depend largely on the environment of disease development and the evaluation techniques. Genomic selection (GS) is an emerging approach to increase selection intensity, accuracy, and genetic gains in breeding programs for improving complex polygenic traits by increasing the frequency of favorable alleles in advanced generations with the help of genomic estimated breeding values (GEBVs) predicted using whole-genome marker profile data and multi-environment phenotypic data (Meuwissen et al., 2001). An earlier study using a Marker Assisted Backcrossing (MABC) approach to introgress a major quantitative trait locus (QTL) explaining 80% of the Phenotypic Variation (PV) for rust resistance and 65% of the PV for LLS resistance revealed that phenotyping for disease resistance together with selection for the QTL of interest is needed to derive lines with the desired level of resistance (Janila et al., 2016). Therefore, GS may be a valuable approach for improving resistance to foliar fungal diseases in peanut, as it enables simultaneous selection of several genomic regions based on GEBVs. To implement GS, multi-environment phenotypic and genome-wide marker data on a diverse set of genotypes, called the genomic selection training population (GSTP), are used to train a prediction model, which is then applied to a new set of selection candidates that have been genotyped with genome-wide markers. GS using only molecular information prior to phenotyping will be useful for increasing the rate of genetic gain by reducing the breeding cycle time and increasing the selection intensity and accuracy.
Therefore, the present study aimed to evaluate the GSTP for resistance to LLS and rust across different environments; the resulting data will be used for constructing GS prediction models in peanut. The present study is the first comprehensive field evaluation of the GSTP against rust and LLS. The screening of this diverse set of genotypes also identified genotypes resistant to both diseases, which can be used in future breeding programs.
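The train-then-predict workflow of GS described above can be illustrated with a toy example. The text does not specify which prediction model will be used, so the sketch below substitutes ridge regression on marker allele counts purely as a common, simple stand-in; all data are simulated, and the function names, the penalty `lam`, and the population sizes are assumptions made for illustration.

```python
import numpy as np


def ridge_marker_effects(markers, phenotypes, lam):
    """Estimate marker effects by ridge regression on centered data.

    Solves (X'X + lam * I) b = X'y, where X holds centered allele counts.
    """
    X = np.asarray(markers, dtype=float)
    y = np.asarray(phenotypes, dtype=float)
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean
    n_markers = X.shape[1]
    beta = np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_markers), Xc.T @ (y - y_mean))
    return beta, x_mean, y_mean


def predict_gebv(markers, beta, x_mean, y_mean):
    """Predict breeding values of genotyped candidates from marker effects."""
    return y_mean + (np.asarray(markers, dtype=float) - x_mean) @ beta


# Simulated training population: 40 genotypes scored at 10 markers (0/1/2 counts).
rng = np.random.default_rng(0)
train_markers = rng.integers(0, 3, size=(40, 10))
true_effects = rng.normal(size=10)
train_pheno = train_markers @ true_effects + rng.normal(scale=0.1, size=40)

beta, x_mean, y_mean = ridge_marker_effects(train_markers, train_pheno, lam=1.0)

# Selection candidates are genotyped only; their phenotypes are predicted.
candidates = rng.integers(0, 3, size=(5, 10))
gebv = predict_gebv(candidates, beta, x_mean, y_mean)
fitted = predict_gebv(train_markers, beta, x_mean, y_mean)
print(gebv.shape)
```

The training population supplies both phenotypes and marker profiles, whereas the candidates need only marker profiles, which is what allows selection before phenotyping.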

# MATERIALS AND METHODS

# Plant Material

A set of 340 peanut genotypes, of which 227 belonged to subspecies *fastigiata* and 113 to subspecies *hypogaea*, and differing for morphological and economically important traits constituted a genomic selection training population (GSTP) at ICRISAT. Among the 227 genotypes of ssp. *fastigiata,* 212 genotypes belong to botanical variety *vulgaris* (Spanish bunch), 10 to *fastigiata* (Valencia), four to *peruviana* and a single genotype to *aequatoriana*; while among the 113 genotypes of ssp. *hypogaea*, 111 genotypes belong to botanical variety *hypogaea*, one to *hirsuta* and one to unknown botanical type. A total of 51 genotypes were taken from 20 different countries whereas 289 were developed/originated at 11 major peanut breeding centers of India. Among these, 189 genotypes were contributed by ICRISAT and 63 by University of Agricultural Sciences, Dharwad. The details of genotypes such as subspecies, botanical variety, market type, origin, and pedigree are given in **Supplementary Table 1**.

# Experimental Design

The experiment was conducted at three locations in India, *viz.,* International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Patancheru, Telangana (17°53′N, 78°27′E, 545.0 m MSL), Oilseed Research Station (ORS), Mahatma Phule Krishi Vidyapeeth, Jalgaon, Maharashtra (21°03′N, 75°34′E, 201.2 m MSL), and Coconut Research Station (CRS), Tamil Nadu Agricultural University (TNAU), Aliyarnagar, Tamil Nadu (10°29′N, 76°58′E, 288.0 m MSL), during the rainy season of 2015 for multilocation evaluation of the GSTP against two major foliar fungal diseases (rust and LLS) and for pod yield under disease pressure. Nutritional quality traits were assessed during post-rainy 2015–16 at ICRISAT, Patancheru. Two of the evaluation sites, ORS, Jalgaon and CRS, Aliyarnagar, are natural disease hotspots for LLS and rust, respectively. At ICRISAT, Patancheru, natural infection was supplemented with artificial infection created by inoculating the pathogens through the infector row technique. The trials were planted in an Alpha Lattice Design with two replications in all environments. Each replication was divided into 20 equal-sized homogeneous blocks with a block size of 17 plots to reduce heterogeneity in the experiments by eliminating the inter-block effect. Single-row plots were planted with a row length of 4 m and with inter- and intra-row spacing of 30 and 10 cm, respectively. Sowing was done on the broad bed system as recommended for peanut cultivation, with 4 rows per bed. Standard agronomic management practices were followed in each environment: 60 kg phosphorus pentoxide (P2O5) as a basal application, pre-emergence application of Pendimethalin (@1 kg active ingredient per ha) for weed control, and irrigation soon after planting and subsequently when needed. No disease symptoms were observed during the post-rainy season; hence, management practices were not adopted for either of the diseases.
Gypsum (@400 kg/ha) was applied to the experimental field at the peak flowering stage and protection was taken against insects, whereas no protection measures were applied to control foliar fungal diseases.
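The layout figures above are internally consistent: 20 blocks of 17 plots accommodate all 340 genotypes once per replication. The sketch below checks this arithmetic and allocates entries to incomplete blocks by plain randomization. It is not an alpha-lattice generator (a true alpha design also controls how often pairs of entries share a block across replications); the function name and seeds are hypothetical.

```python
import random

N_GENOTYPES = 340
BLOCKS_PER_REP = 20
PLOTS_PER_BLOCK = 17
ROW_LENGTH_M = 4.0
ROW_SPACING_M = 0.30    # inter-row spacing
PLANT_SPACING_M = 0.10  # intra-row spacing


def randomize_replication(n_entries, n_blocks, block_size, seed):
    """Randomly split entries 1..n_entries into incomplete blocks.

    Simple randomization only; it does not reproduce alpha-lattice constraints.
    """
    if n_blocks * block_size != n_entries:
        raise ValueError("blocks x block size must equal the number of entries")
    rng = random.Random(seed)
    entries = list(range(1, n_entries + 1))
    rng.shuffle(entries)
    return [entries[b * block_size:(b + 1) * block_size] for b in range(n_blocks)]


rep1 = randomize_replication(N_GENOTYPES, BLOCKS_PER_REP, PLOTS_PER_BLOCK, seed=1)
rep2 = randomize_replication(N_GENOTYPES, BLOCKS_PER_REP, PLOTS_PER_BLOCK, seed=2)

plot_area_m2 = ROW_LENGTH_M * ROW_SPACING_M
plants_per_row = round(ROW_LENGTH_M / PLANT_SPACING_M)
print(len(rep1), len(rep1[0]), plants_per_row)
```

Under the stated spacing, each single-row plot occupies about 1.2 m² and holds about 40 plants at full stand.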

#### Field Evaluation of GSTP for Resistance to LLS and Rust

At Aliyarnagar and Jalgaon, which are natural hotspots for rust and LLS, infector rows of the highly susceptible cultivar TMV 2 were planted after every four broad beds to maintain uniform disease pressure. At ICRISAT, artificial disease screening was used, with infector rows of TMV 2 after every four broad beds and along the borders to create optimum disease pressure for screening. For artificial inoculation, urediniospores of *Puccinia arachidis* (rust pathogen) and conidia of *Phaeoisariopsis personata* (LLS pathogen) were collected separately using a cyclone spore collector (Fischer Scientific Co., USA) from naturally infected leaf lesions of the susceptible cultivar TMV 2. The inocula were stored at −20°C. Ten days before field planting, the susceptible peanut cultivar TMV 2 was planted in polybags in the greenhouse. Thirty-five-day-old TMV 2 seedlings raised in the greenhouse were inoculated separately by spraying with urediniospores of rust and conidia of LLS at 5 × 10<sup>4</sup> spores ml<sup>−1</sup>. The non-ionic detergent Tween 20 was added to the spore suspension as a surfactant at the rate of 0.05% of the suspension. Water was sprinkled in and around the inoculated plants in the polybags, and the plants were covered with polyethylene sheets during the nights for 7 days to maintain high humidity (95%). Severe rust and LLS developed on these plants in two weeks. The infected plants in polybags were transplanted into the infector rows of the trial at one-meter intervals around 50 days after sowing (DAS). Conidia of LLS and urediniospores of rust were sprayed at a concentration of 5 × 10<sup>4</sup> spores ml<sup>−1</sup> on the infector rows of the trial. Sprinkler irrigation was provided to the trial daily for 30 min for a period of one month starting from the day of field inoculation with the pathogen to promote disease development (Sudini et al., 2015).
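Preparing a spray suspension at the stated concentration is a standard dilution calculation (C1·V1 = C2·V2). The sketch below assumes the 0.05% Tween 20 rate is volume per volume and uses a hypothetical stock concentration and spray volume; only the target of 5 × 10⁴ spores per ml and the 0.05% rate come from the text.

```python
TARGET_SPORES_PER_ML = 5e4  # concentration stated in the text
TWEEN_FRACTION = 0.0005     # 0.05%, assumed here to be volume per volume


def stock_volume_ml(stock_spores_per_ml, final_volume_ml, target=TARGET_SPORES_PER_ML):
    """Volume of stock suspension needed, from C1 * V1 = C2 * V2."""
    if stock_spores_per_ml < target:
        raise ValueError("stock is more dilute than the target concentration")
    return target * final_volume_ml / stock_spores_per_ml


def tween_volume_ml(final_volume_ml):
    """Volume of Tween 20 for the final suspension at the assumed v/v rate."""
    return TWEEN_FRACTION * final_volume_ml


# Hypothetical case: a stock counted at 1e6 spores per ml, 1000 ml of spray needed.
stock_ml = stock_volume_ml(stock_spores_per_ml=1e6, final_volume_ml=1000.0)
tween_ml = tween_volume_ml(1000.0)
print(stock_ml, round(tween_ml, 3))
```

In this hypothetical case, 50 ml of stock is made up to 1000 ml with water, with about 0.5 ml of Tween 20 added.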

# Observations

Visual disease scoring on the modified 1 to 9 point scale for LLS and rust given by Subrahmanyam et al. (1995) was used for recording disease scores at three crop growth stages, *viz.,* 75, 90, and 105 DAS, or at harvest for entries maturing in <105 days. This is a standard procedure for recording disease scores for genotypes of the medium maturity group (100 to 130 days). The disease severities corresponding to the rust and LLS scores are 1 = 0%; 2 = 1–5%; 3 = 6–10%; 4 = 11–20%; 5 = 21–30%; 6 = 31–40%; 7 = 41–60%; 8 = 61–80%; and 9 = 81–100%. Based on the disease severity scores at 90 and 105 DAS, genotypes were categorized as resistant (≤3), moderately resistant (4–5), susceptible (6–7), or highly susceptible (>7) (Sudini et al., 2015). Genotypes with the lowest severity ratings for LLS and rust at ICRISAT and Aliyarnagar were selected for evaluating disease progress at 75, 90, and 105 DAS and were compared with the resistant and susceptible checks. Days to maturity, hundred kernel mass, and pod yield per hectare were also recorded across the environments. Haulm yield per plant was recorded only at ICRISAT during rainy 2015.
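The scale and the reaction categories above translate directly into two lookup functions. In the sketch below, the class boundaries and category thresholds come from the text; the function names are illustrative, and assigning fractional severities (for example 0.5%) to the next class up is an assumption.

```python
# Upper bound of disease severity (%) for each score on the 1-9 scale in the text.
SEVERITY_UPPER_BOUNDS = [
    (0, 1), (5, 2), (10, 3), (20, 4), (30, 5), (40, 6), (60, 7), (80, 8), (100, 9),
]


def severity_to_score(severity_pct: float) -> int:
    """Map percent disease severity to the 1-9 score.

    Fractional values between classes fall into the next class up (an assumption).
    """
    if not 0 <= severity_pct <= 100:
        raise ValueError("severity must be between 0 and 100")
    for upper_bound, score in SEVERITY_UPPER_BOUNDS:
        if severity_pct <= upper_bound:
            return score
    raise AssertionError("unreachable for valid input")


def reaction_category(score: int) -> str:
    """Categorize a 1-9 score using the thresholds given in the text."""
    if score <= 3:
        return "resistant"
    if score <= 5:
        return "moderately resistant"
    if score <= 7:
        return "susceptible"
    return "highly susceptible"


for pct in (0, 8, 25, 45, 90):
    score = severity_to_score(pct)
    print(pct, score, reaction_category(score))
```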

# Statistical Analysis

Standard statistical procedures were adopted for data analysis. Individual and combined analyses of variance (ANOVA) were computed with a general linear mixed model using the proc glm function of SAS version 9.2 (SAS Institute Inc, 2008). Best linear unbiased predictions or adjusted means were estimated for every trait except the disease severity scores of rust and LLS, for which the higher of the scores from the two replications was taken as the final score of a genotype. Homogeneity of error variances was tested for disease severity scores and yield traits using Levene's test (Steel and Torrie, 1980). Genotypes with a disease severity score of ≤3 for either disease at ICRISAT\_R15 were selected to check the stability of their reaction to both diseases and of their pod yield across the environments. The stability analysis of the 110 selected genotypes for disease reaction to LLS and rust at 90 DAS used the data recorded in rainy season 2015 across the three locations, whereas for pod yield the data recorded during post-rainy 2015–16 were also included.
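Two of the steps described above lend themselves to a short illustration: taking the higher of the two replicate scores as the final score, and Levene's test for homogeneity of variances. The sketch below is not the SAS procedure used in the study. It computes Levene's W statistic from absolute deviations about each group mean; the statistic would then be compared with an F distribution with (k − 1, N − k) degrees of freedom. All data values and genotype labels are hypothetical.

```python
from statistics import mean


def final_scores(rep1, rep2):
    """Final disease score per genotype: the higher of the two replicate scores."""
    return {g: max(rep1[g], rep2[g]) for g in rep1}


def levene_w(groups):
    """Levene's W statistic using absolute deviations from each group's mean."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    # absolute deviations of every observation from its own group mean
    z = [[abs(y - mean(g)) for y in g] for g in groups]
    z_means = [mean(zi) for zi in z]
    z_grand = sum(sum(zi) for zi in z) / n_total
    between = sum(len(zi) * (zm - z_grand) ** 2 for zi, zm in zip(z, z_means))
    within = sum((v - zm) ** 2 for zi, zm in zip(z, z_means) for v in zi)
    return (n_total - k) / (k - 1) * between / within


# hypothetical replicate scores for three genotypes
rep1 = {"G1": 3, "G2": 6, "G3": 8}
rep2 = {"G1": 4, "G2": 5, "G3": 8}
print(final_scores(rep1, rep2))

# hypothetical residuals grouped by environment
print(round(levene_w([[1, 2, 3], [10, 20, 30]]), 4))
```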

Site regression analysis (commonly known as GGE biplot) was used to illustrate the genotype plus genotype-by-environment variation using principal component (PC) scores from singular value decomposition (SVD) (Yan et al., 2000). GGE biplots with the average-environment coordination (AEC) view and the polygon view were drawn to examine the performance of all genotypes within a specific environment and to simultaneously select genotypes based on stability and mean performance. The GGE model based on the SVD of the first two PCs is given by:

$$Y_{ij} - \mu - \beta_{j} = \lambda_1 \xi_{i1} \eta_{j1} + \lambda_2 \xi_{i2} \eta_{j2} + \varepsilon_{ij}$$

where *Y<sub>ij</sub>* is the mean performance of genotype *i* in environment *j*, µ is the grand mean, β<sub>*j*</sub> is the main effect of environment *j*, λ<sub>1</sub> and λ<sub>2</sub> are the singular values of the first and second PC, ξ<sub>*i*1</sub> and ξ<sub>*i*2</sub> are the eigenvectors for genotype *i*, η<sub>*j*1</sub> and η<sub>*j*2</sub> are the eigenvectors for environment *j*, and ε<sub>*ij*</sub> is the residual effect. A simple scatter plot was also drawn to compare the environment-centered incidence scores of genotypes in two environments. All analyses were performed using GenStat software, 15th edition (VSN International, Hemel Hempstead, UK).
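The decomposition in the model above can be illustrated numerically. The sketch below is an assumption-laden illustration using NumPy, not the GenStat routine used in the study: it environment-centers a genotype × environment table (removing µ + β<sub>*j*</sub>), applies SVD, and splits each singular value symmetrically between the genotype and environment scores, which is one of several scaling conventions. The data matrix is hypothetical.

```python
import numpy as np


def gge_biplot_scores(y):
    """Return genotype scores, environment scores and % variation for PC1 and PC2.

    y: genotypes x environments array of mean values.
    """
    y = np.asarray(y, dtype=float)
    # subtracting each environment (column) mean removes mu + beta_j
    centered = y - y.mean(axis=0, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = 100 * s**2 / np.sum(s**2)
    # symmetric scaling: sqrt(lambda) goes to both genotype and environment scores
    geno = u[:, :2] * np.sqrt(s[:2])
    env = vt[:2, :].T * np.sqrt(s[:2])
    return geno, env, explained[:2]


# hypothetical disease scores: 5 genotypes (rows) x 3 environments (columns)
scores = np.array([[2., 3., 1.],
                   [8., 9., 5.],
                   [4., 6., 2.],
                   [3., 2., 2.],
                   [6., 7., 3.]])
geno, env, explained = gge_biplot_scores(scores)
print("PC1 + PC2 explain %.2f%% of G + GE variation" % explained.sum())
```

The product `geno @ env.T` is the rank-two approximation of the environment-centered table, so the residual of that product corresponds to ε<sub>*ij*</sub> in the model.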

# RESULTS

#### Analysis of Variance and Genetic Variability Parameters for Disease Resistance

Individual environment ANOVA revealed significant genotypic differences (*p* < 0.001) for LLS and rust disease scores at 90 DAS, days to maturity, and pod yield per hectare under disease pressure (rainy 2015), and for pod yield in the absence of disease pressure (post-rainy 2015–16) (data not presented). Combined ANOVA showed significant genotypic differences along with significant environment and genotype × environment (G×E) interaction (GEI) effects (*p* < 0.001) for LLS and rust disease scores at 90 DAS, days to maturity, and pod yield per hectare under disease pressure and disease-free conditions. The environmental variance was high for both diseases. The genotypic variance was high compared to the G×E interaction variance (**Table 1**).

The estimates of genetic variability parameters revealed high genetic variability for rust and LLS at 75, 90, and 105 DAS (**Table 2**). In general, the phenotypic coefficient of variation (PCV) was higher than the genotypic coefficient of variation (GCV) in individual environments and in the pooled analysis. The GCV and PCV values were moderate at 90 DAS and low at 105 DAS. High estimates of broad-sense heritability for rust (82.0%) and LLS (80.9%) disease scores at 90 DAS, coupled with high genetic advance as percent of mean (GAM) (28.2% for rust and 21.2% for LLS), were observed across the environments.
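For readers unfamiliar with these parameters, the conventional textbook definitions can be sketched as follows. The formulas are the commonly used ones and are an assumption here, since the text does not state how the parameters were computed. The selection differential k = 2.06 (5% selection intensity) and the variance components in the example are likewise hypothetical.

```python
from math import sqrt


def variability_parameters(var_g, var_p, grand_mean, k=2.06):
    """GCV, PCV, broad-sense heritability and GAM (all in percent).

    var_g: genotypic variance; var_p: phenotypic variance;
    k: standardized selection differential (2.06 at 5% selection intensity).
    """
    h2 = var_g / var_p
    genetic_advance = k * h2 * sqrt(var_p)
    return {
        "GCV": 100 * sqrt(var_g) / grand_mean,
        "PCV": 100 * sqrt(var_p) / grand_mean,
        "h2_bs": 100 * h2,
        "GAM": 100 * genetic_advance / grand_mean,
    }


# hypothetical variance components for a disease score with grand mean 4.0
params = variability_parameters(var_g=0.8, var_p=1.0, grand_mean=4.0)
for name, value in params.items():
    print(f"{name}: {value:.1f}%")
```

Because the phenotypic variance includes the genotypic variance, PCV can never be smaller than GCV under these definitions, which agrees with the pattern reported above.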

#### Disease Reaction of Genotypes Against LLS

The disease pressure was high for LLS and rust at Aliyarnagar and ICRISAT as observed by the disease severity score of ≥8 for the susceptible cultivar TMV 2 at 90 DAS. Moderate disease pressure was observed at Jalgaon wherein a disease severity score of 5 was recorded on TMV 2 at 90 DAS for LLS and rust.

The disease scores of genotypes for LLS at ICRISAT varied from 1 to 6 at 75 DAS, 2 to 9 at 90 DAS, and 4 to 9 at 105 DAS. At Aliyarnagar they varied from 1 to 4 at 75 DAS, 2 to 8 at 90 DAS, and 3 to 9 at 105 DAS, whereas at Jalgaon they varied from 1 to 3 at 75 DAS, 1 to 6 at 90 DAS, and 2 to 8 at 105 DAS (**Table 2**). Because disease pressure at Jalgaon was only moderate, the genotypes there were not categorized into resistant and susceptible groups. At Aliyarnagar, out of 340 genotypes of GSTP, 67 were resistant (R), 167 moderately resistant (MR), 104 susceptible (S), and two highly susceptible (HS) to LLS at 90 DAS, whereas five genotypes exhibited R, 35 MR, 126 S, and 174 HS reaction to LLS at 105 DAS (**Figure 1**). Of the five resistant lines, four matured in >115 days, whereas one line, SPS 7, matured in 104 days. At ICRISAT, nine genotypes were R, 67 MR, 148 S, and 116 HS to LLS at 90 DAS (**Figure 1**). Of the nine resistant lines, only one (ICGV 86699) matured in <100 days, whereas the other eight matured in >120 days with disease scores of 4 to 5 at 105 DAS. None of the genotypes showed a resistant reaction to LLS up to 105 DAS at ICRISAT, while 19 genotypes had MR, 47 S, and 274 HS reaction to LLS at 105 DAS (**Figure 1**).

The pooled LLS scores varied from 1 to 4 at 75 DAS, 2 to 7 at 90 DAS, and 4 to 8 at 105 DAS. Thirty-one genotypes showed R, 162 MR, and 147 S reaction against LLS at 90 DAS across the environments (**Figure 1** and **Supplementary Table 3**). None of the genotypes of GSTP showed R reaction against LLS up to 105 DAS, while 38 were identified as MR, 176 as S, and 126 as HS to LLS at 105 DAS across the environments (**Figure 1**). Of the moderately resistant genotypes, seven matured in ≤120 days, whereas the other 31 matured in >120 days.

At ICRISAT, 283 out of 340 genotypes matured in <120 days, whereas the remaining 57 genotypes matured in >120 days. Of the 283 genotypes, ICGV 86699 showed resistance to LLS at 90 DAS with a disease score of 2, whereas four other lines, ICGVs 01273 and 00362, SPS 2, and SPS 8, were moderately resistant with a disease score of 4 for LLS at 90 DAS. Nineteen genotypes showed resistance to rust with a score of ≤3 at 90 DAS. Of the 57 genotypes that matured later (>120 days), eight recorded a disease score of ≤3 at 90 DAS and 4 to 5 at 105 DAS. Sixteen genotypes were moderately resistant to LLS with a disease score of 4 to 5 at 105 DAS. However, nine genotypes showed a resistant reaction to rust, with a disease score of ≤3 at 90 DAS and 3 to 5 at 105 DAS.



Where \*\* represents significant at 1% probability level.

dfa, Degree of freedom for LLS75, LLS90, LLS105, Rust75, Rust90, Rust105; dfb, Degrees of freedom for days to maturity and pod yield per hectare; LLS75, LLS90, and LLS105, Disease severity score of late leaf spot recorded at 75, 90, and 105 days after sowing, respectively, and Rust75, Rust90, and Rust105, Disease severity score of rust recorded at 75, 90 and 105 days after sowing, respectively; DM, Days to maturity; PYH, Pod yield per hectare; ENV, Environment; REP, Replication.

TABLE 2 | Mean, range, and genetic parameters for disease severity scores to LLS and rust on GSTP of peanut evaluated across the locations during rainy season 2015.


LLS75, LLS90, and LLS105 = Disease severity score of late leaf spot at 75, 90 and 105 days, respectively; Rust75, Rust90, and Rust105 = Disease severity score of rust at 75, 90 and 105 days, respectively; Min, Minimum; Max, Maximum; GCV, Genotypic co-efficient of variation (%); PCV, Phenotypic co-efficient of variation (%); h2 bs, Heritability in broad sense (%); GAM, Genetic advance as percent of mean (%).

#### Disease Reaction of Genotypes Against Leaf Rust

The disease severity scores of genotypes for rust at Aliyarnagar varied from 1 to 5 at 75 DAS, 1 to 8 at 90 DAS, and 2 to 9 at 105 DAS. At Jalgaon, the rust score varied from 1 to 3 at 75 DAS, 1 to 6 at 90 DAS, and 2 to 8 at 105 DAS. Under artificial disease pressure at ICRISAT, the disease severity scores of genotypes for rust varied from 1 to 6 at 75 DAS, 2 to 8 at 90 DAS, and 3 to 9 at 105 DAS (**Table 2**). At Aliyarnagar, out of 340 genotypes of GSTP, 87 exhibited R, 96 MR, 154 S, and 3 HS reaction against rust at 90 DAS, whereas 11 genotypes showed R, 38 MR, 151 S, and 140 HS reaction against rust at 105 DAS (**Figure 2**). Under artificial disease pressure at ICRISAT, 51 genotypes were R, 75 MR, 166 S, and 48 HS to rust at 90 DAS (**Figure 2**). Three genotypes showed a resistant reaction against rust up to 105 DAS, while 43 genotypes were MR, 69 S, and 225 HS to rust at 105 DAS under artificial disease pressure at ICRISAT during rainy 2015 (**Figure 2**).

The disease severity scores of genotypes for rust across the environments varied from 1 to 4 at 75 DAS, 2 to 7 at 90 DAS, and 3 to 8 at 105 DAS. Out of 340 genotypes, 66 exhibited R, 138 MR, and 136 S against rust at 90 DAS across the environments (**Figure 2** and **Supplementary Table 4**). However, eight genotypes showed R, 59 MR, 173 S, and 100 HS reaction against rust across the environments at 105 DAS (**Figure 2**).
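The category counts reported in this and the preceding subsection can be cross-checked against the panel size of 340 genotypes. The sketch below simply tabulates the counts given in the text; where a category is not mentioned (HS in the pooled analysis at 90 DAS, R for LLS at 105 DAS at ICRISAT and pooled) it is entered as zero, which is an assumption consistent with the totals.

```python
PANEL_SIZE = 340

# (disease, environment, DAS) -> counts of (R, MR, S, HS) as reported in the text
REPORTED = {
    ("LLS", "Aliyarnagar", 90): (67, 167, 104, 2),
    ("LLS", "Aliyarnagar", 105): (5, 35, 126, 174),
    ("LLS", "ICRISAT", 90): (9, 67, 148, 116),
    ("LLS", "ICRISAT", 105): (0, 19, 47, 274),
    ("LLS", "Pooled", 90): (31, 162, 147, 0),
    ("LLS", "Pooled", 105): (0, 38, 176, 126),
    ("Rust", "Aliyarnagar", 90): (87, 96, 154, 3),
    ("Rust", "Aliyarnagar", 105): (11, 38, 151, 140),
    ("Rust", "ICRISAT", 90): (51, 75, 166, 48),
    ("Rust", "ICRISAT", 105): (3, 43, 69, 225),
    ("Rust", "Pooled", 90): (66, 138, 136, 0),
    ("Rust", "Pooled", 105): (8, 59, 173, 100),
}


def resistant_share(counts):
    """Percentage of the panel classified as resistant (R)."""
    return 100 * counts[0] / sum(counts)


for key, counts in REPORTED.items():
    status = "ok" if sum(counts) == PANEL_SIZE else "MISMATCH"
    print(key, counts, f"R = {resistant_share(counts):.1f}%", status)
```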

#### Stability of Disease Reaction Across the Environments

Out of 340 genotypes of GSTP evaluated for resistance to rust and LLS along with yield traits, 109 genotypes which had ≤3 disease severity score for rust and LLS at ICRISAT and Aliyarnagar along with a susceptible check (TMV 2) were subjected to stability analysis to identify stable sources of disease resistance and pod yield. The GGE biplot graphically explains genotype main effect along with genotype × environment interaction using first two principal components (PC1 and PC2) derived from SVD of the environment-centered data. The first two PCs in the biplot (PC1 and PC2) explained 87.51% and 89.94% of the total variation due to genotype main effect and GEI for LLS and rust at 90 DAS, respectively (**Figure 3**).

#### (a) Polygon View of GGE Biplot for LLS and Rust Scores at 90 DAS

The polygon view of a biplot is the best way to visualize the interaction patterns between genotypes and environments and to show the presence or absence of crossover GEI. Visualizing this "which won where" pattern of multi-environment trial (MET) data is necessary for studying the possible existence of different mega-environments in the target environment. In the biplots presented in **Figures 3A, B**, a polygon was formed by connecting the vertex genotypes with straight lines, with the rest of the genotypes placed within the polygon. For LLS score at 90 DAS, the vertex genotypes were 262, 238, 3, 73, 186, 269, 82, 321, 256, and 268 (**Figure 3A**). These genotypes were the best or the poorest for disease resistance in some or all of the environments because they were farthest from the origin of the biplot. In the polygon view of the MET data from three environments, the genotypes fell in four sections and the test environments fell in two sections. The first section contains the test environments Aliyarnagar and Jalgaon. The vertex genotype for this section was genotype 73 (TMV 2), which is susceptible to LLS, whereas genotype 262 (ICGV 86699), plotted farthest on the left side, had the lowest disease scores across the environments. The second section contains the environment ICRISAT\_R15 (ICRISAT rainy season 2015), with genotype 321 (ICG 13895) as the highest-scoring genotype for LLS.

Similarly, for rust score at 90 DAS the vertex genotypes were 296, 109, 305, 174, 73, 186, 82, and 268 (**Figure 3B**). These genotypes were the best or the poorest genotypes for rust resistance in some or all of the environments because they were farthest from the biplot origin on either of the sides. For rust, the genotype 73 (TMV 2) plotted farthest on the right side of the biplot indicating its high susceptibility, whereas genotypes 236 (ICGV 99052) and 301 (ICG 11426) which plotted farthest on the left side of biplot were resistant across the environments.

#### (b) Mean Performance and Stability of Genotypes for LLS and Rust Score at 90 DAS

FIGURE 2 | Categorization of genotypes based on reaction against rust at 90 and 105 DAS at Aliyarnagar, ICRISAT and pooled across the environments during rainy 2015.

The ranking of 109 genotypes of GSTP based on their disease severity score and stability performance for LLS and rust is presented in **Figures 3C, D,** respectively. The line passing through the biplot origin is called the average environment axis (AEA), which is defined by the average PC1 and PC2 scores of all environments. A concentric circle drawn on the AEA is called the AEC. Genotypes closer to the concentric circle have higher mean performance. The line that passes through the biplot origin and is perpendicular to the AEA represents the stability of genotypes. Distance in either direction away from the biplot origin on this axis indicates greater GEI and reduced stability. The genotypes on the right side of this perpendicular line had disease severity scores greater than the mean across the environments, and the genotypes on the left side had scores lower than the mean. Genotypes plotted on the left side of the biplot with the shortest vector from the AEA are therefore the better genotypes. For selection, the stable resistant genotypes are those with both the lowest disease severity score and the shortest vector length from the AEA. Genotypes 71 (GPBD 4), 238 (ICGV 00248), 84 (ICGV 06142), 152 (ICGV 02411), 237 (ICGV 00246), 246 (ICGV 00068), 293 (SPS 11), and 301 (ICG 11426) were found to be stable resistant genotypes based on their LLS disease score and vector length from the AEA (**Figure 3C**). Genotype 262 (ICGV 86699) had the lowest LLS disease score of all genotypes but a greater vector length from the AEA.

For rust, genotypes 236 (ICGV 99052), 301 (ICG 11426), 235 (ICGV 99051), 262 (ICGV 86699), 71 (GPBD 4), 27 (ICGV 06422), 30 (ICGV 07223), 32 (ICGV 07235), 77 (ICGV 05100), 84 (ICGV 06142), 152 (ICGV 02411), 153 (ICGV 05155), 229 (ICGV 00362), 237 (ICGV 00246), 238 (ICGV 00248), 239 (ICGV 01361), 252 (ICGV 99160), 253 (ICGV 02323), 260 (ICGV 87846), 288 (SPS 2), 291 (SPS 7), 293 (SPS 11), 296 (SPS 21), and 303 (ICGV 02446) can be considered stable resistant genotypes based on their low disease score and short vector length (**Figure 3D**). Genotypes 109 (49 M-16) and 268 (ICGV 05032) also had lower mean disease scores for rust, but their greater vector length from the AEA indicates that they are unstable. Genotype 109 recorded a high disease score at Aliyarnagar, whereas genotype 268 recorded a high disease score at ICRISAT.
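The "mean versus stability" reading of the biplot described in this subsection amounts to projecting each genotype marker onto the AEA and onto the line perpendicular to it. The following is a minimal sketch under stated assumptions: the PC scores are hypothetical, the genotype labels are placeholders, and the AEA direction is taken as the vector from the origin to the mean of the environment markers.

```python
from math import hypot


def aea_coordinates(genotype_scores, environment_scores):
    """Return {genotype: (position_along_AEA, distance_from_AEA)}.

    genotype_scores: {name: (PC1, PC2)}; environment_scores: list of (PC1, PC2).
    """
    n = len(environment_scores)
    ax = sum(e[0] for e in environment_scores) / n
    ay = sum(e[1] for e in environment_scores) / n
    length = hypot(ax, ay)
    ux, uy = ax / length, ay / length          # unit vector along the AEA
    out = {}
    for name, (x, y) in genotype_scores.items():
        along = x * ux + y * uy                # approximates the mean score
        across = -x * uy + y * ux              # offset on the perpendicular axis
        out[name] = (along, abs(across))       # larger offset = less stable
    return out


# hypothetical PC scores for a disease-score biplot (lower score = more resistant)
environments = [(1.0, 0.5), (1.0, -0.5)]
genotypes = {"stable_resistant": (-2.0, 0.1),
             "unstable_resistant": (-2.0, 1.5),
             "susceptible_check": (3.0, 0.2)}
for name, (mean_pos, instability) in aea_coordinates(genotypes, environments).items():
    print(f"{name}: along AEA = {mean_pos:.2f}, distance from AEA = {instability:.2f}")
```

For a disease score, the genotypes of interest are those with the most negative position along the AEA and the smallest distance from it.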

#### (c) Relationship Among Test Environments

The interrelationships among the test environments for LLS and rust are summarized in **Figures 3E, F**, respectively. The lines that connect the biplot origin and the markers for the environments are called environment vectors. The cosine of the angle between the vectors of two environments approximates the correlation coefficient between them. Acute angles indicate a positive correlation, obtuse angles a negative correlation, and right angles no correlation. A short vector may indicate that the test environment is not related to the other environments. Based on the angles between environment vectors, all three environments (Aliyarnagar, Jalgaon, and ICRISAT\_R15) were positively correlated with each other for LLS and rust because the angles formed between them were acute (< 90°). The position of the environments on the biplot revealed that ICRISAT\_R15 was the most suitable environment for screening genotypes for LLS and rust, followed by Aliyarnagar. Jalgaon was a poor environment: it plotted nearer to the biplot origin, indicating that genotypes recorded lower disease scores there. The ranking of environments with respect to the ideal test environment (**Figures 3E, F**) also revealed that ICRISAT\_R15 and Aliyarnagar plotted on the border of the inner circle of the biplot, indicating that both had similar disease pressure and are ideal for cultivar evaluation against LLS and rust.
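The relationship between the angle of two environment vectors and their correlation can be made concrete with a few lines of code. The environment coordinates below are hypothetical and do not correspond to the markers in Figures 3E and 3F.

```python
from math import acos, degrees, hypot


def vector_angle(a, b):
    """Return (angle in degrees, cosine) between two environment vectors."""
    cos_t = (a[0] * b[0] + a[1] * b[1]) / (hypot(*a) * hypot(*b))
    cos_t = max(-1.0, min(1.0, cos_t))   # guard against rounding outside [-1, 1]
    return degrees(acos(cos_t)), cos_t


def relationship(cos_t):
    """Qualitative reading of the cosine, as used for biplot interpretation."""
    if cos_t > 0:
        return "positive correlation (acute angle)"
    if cos_t < 0:
        return "negative correlation (obtuse angle)"
    return "no correlation (right angle)"


# hypothetical (PC1, PC2) markers for three test environments
env = {"E1": (2.0, 0.8), "E2": (1.5, -0.6), "E3": (0.4, 0.1)}
names = list(env)
for i, first in enumerate(names):
    for second in names[i + 1:]:
        angle, cos_t = vector_angle(env[first], env[second])
        print(f"{first}-{second}: {angle:.1f} deg, cos = {cos_t:.2f}, {relationship(cos_t)}")
```

The vector length, `hypot(PC1, PC2)`, is the quantity the text relates to how well an environment discriminates among genotypes; in this toy example E3 has the shortest vector.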

#### Stability for Pod Yield

The partitioning of GEI through GGE biplot analysis showed that PC1 and PC2 together accounted for 81.20% of the GGE mean sums of squares for pod yield per hectare (**Figure 4**). The vertex genotypes in the biplot were 79, 24, 293, 3, 267, 165, 328, 334, 321, 34, and 335, indicating that these genotypes were the best or the poorest for pod yield per hectare in some or all of the environments, depending on their direction from the origin (**Figure 4**). The polygon view of the MET data from four environments showed that the genotypes fell in four sections, whereas the test environments fell in two sections. The first section contains the test environments Aliyarnagar and Jalgaon, whereas the second section contains ICRISAT\_R15 and ICRISAT\_PR15 (ICRISAT post-rainy season 2015–16). Among these four environments, ICRISAT\_PR15 was farthest from the biplot origin, followed by Jalgaon, Aliyarnagar, and ICRISAT\_R15. These distances indicate that the genotypes performed best at ICRISAT\_PR15, followed by Jalgaon.

The ranking biplot of genotypes based on mean pod yield and stability revealed that genotypes 154 (ICGV 06100), 26 (ICGV 05163), 153 (ICGV 05155), 30 (ICGV 07223), 32 (ICGV 07235), 253 (ICGV 02323), 266 (ICGV 06099), 37 (ICGV 07120), 152 (ICGV 02411), 25 (ICGV 05161), 45 (ICGV 03043), 1 (ICGV 06423), 42 (ICGV 01273), and 27 (ICGV 06422) were superior and stable performers across the environments. Genotype 3 (ICGV 07247), followed by 24 (ICGV 03064), 293 (SPS 11), 180 (ICGV 01276), 247 (ICGV 01495), 84 (ICGV 06142), 43 (ICGV 01274), 76 (ICGV 03042), 109 (49 M-16), and 268 (ICGV 05032), were also high-yielding genotypes, but their greater vector length from the AEA indicates unstable performance for pod yield per hectare (**Figure 4**). Among these, 3 (ICGV 07247), 180 (ICGV 01276), 247 (ICGV 01495), and 109 (49 M-16) plotted near Aliyarnagar and Jalgaon, indicating specific adaptation to these environments, whereas 24 (ICGV 03064), 293 (SPS 11), 84 (ICGV 06142), 43 (ICGV 01274), and 76 (ICGV 03042) plotted towards ICRISAT\_PR15, indicating superior performance at ICRISAT during the post-rainy season compared to other genotypes (**Figure 4**).

# DISCUSSION

In the present study, significant genotype, environment, and G × E interaction effects were observed for disease scores of LLS and rust at all three stages of growth (75, 90, and 105 DAS), indicating the polygenic nature of resistance and the role of genotype, environment, and their interaction in disease infection, establishment, and spread. The diversity of the locations and the differences in environmental conditions are reflected in the mean squares due to environment in the ANOVA, indicating that the environment plays an important role in these disease traits. The mean squares due to the error term represent the unexplained variation in the experiment. The negligible error mean squares for disease traits could be attributed to the precision with which the experiment and analysis were conducted, the robustness of the experimental design in partitioning the total variation into different sources of variation, and the unit of measurement. While stable resistance across the growing environments can be identified from the present study, the significant G, E, and G × E interaction effects for resistance to LLS and rust suggest both the possibility of identifying resistance with specific adaptation to a target environment and the need to deploy specifically adapted varieties in future for more effective genetic control of these diseases. Significant environment and G × E interaction effects on rust and LLS resistance (Iwo and Olorunju, 2009; Mothilal et al., 2010a) and on pod yield in peanut are also evident from earlier studies (Makinde and Ariyo, 2011; Upadhyaya et al., 2014). The complex nature of inheritance, including the role of polygenes with additive effect for LLS (Nevill, 1982; Jogloy et al., 1987; Wambi et al., 2014) and rust (Singh et al., 1984), and the involvement of maternal genes in the inheritance of LLS were also reported (Janila et al., 2013; Narasimhulu et al., 2013). The comparison of mean pod yield of susceptible genotypes (SG) and resistant genotypes (RG) at Aliyarnagar (996 kg in SG and 1981 kg in RG), ICRISAT\_R15 (1312 kg in SG and 2329 kg in RG), and Jalgaon (1579 kg in SG and 1606 kg in RG) showed a yield penalty due to the incidence of LLS and rust. Both diseases cause serious damage to the crop, with pod yield losses of up to 70% in commonly grown susceptible cultivars (Harrison, 1973; Subrahmanyam and McDonald, 1987).
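The yield penalty implied by the mean pod yields quoted above can be expressed as the percentage by which the susceptible group falls short of the resistant group. The calculation below uses only the means given in the text; expressing the gap relative to the resistant-group mean is a choice made here for illustration, not a statistic reported by the study.

```python
# mean pod yield (kg) of susceptible (SG) and resistant (RG) genotypes, from the text
MEAN_POD_YIELD = {
    "Aliyarnagar": (996, 1981),
    "ICRISAT_R15": (1312, 2329),
    "Jalgaon": (1579, 1606),
}


def yield_gap_percent(susceptible, resistant):
    """Shortfall of the susceptible group as a percentage of the resistant group."""
    return 100 * (resistant - susceptible) / resistant


for site, (sg, rg) in MEAN_POD_YIELD.items():
    print(f"{site}: {yield_gap_percent(sg, rg):.1f}% lower yield in susceptible genotypes")
```

The gap is large at the two high-pressure sites and small at Jalgaon, where disease pressure was moderate.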

Out of 340 genotypes, a total of 31 (9.1%) were resistant to LLS across the environments. Of these, 15 were from ssp. *fastigiata* var *vulgaris* (Spanish bunch), two from ssp. *fastigiata* var *fastigiata* (Valencia), and 14 from ssp. *hypogaea* var *hypogaea* (Virginia bunch). Sixty-six genotypes exhibited a resistant reaction against rust across the environments, of which 39 were from ssp. *fastigiata* var *vulgaris*, 26 from ssp. *hypogaea* var *hypogaea*, and a single genotype from ssp. *fastigiata* var *peruviana*. A total of 28 genotypes showed a resistant reaction against both diseases, with a disease severity score of ≤3 across the environments at 90 DAS. Among these, 15 genotypes were of the Spanish bunch type, whereas 13 were of the Virginia bunch type. Nine of the 28 genotypes, *viz.,* SPS 11, ICGVs 05163, 01274, 06142, 07235, 02323, 02411, 03043, and 49 M-16, recorded pod yields of >2500 kg per hectare, equivalent to a 77% to 120% increase over the control (**Table 3**).

TABLE 3 | Superior performing genotypes for LLS and rust resistance and other yield traits across the environments during rainy season 2015.

Where LLS90 and Rust90, Disease severity score of late leaf spot and rust across the environments at 90 days after sowing, respectively; HKM, Hundred kernel mass (g); DM, Days to maturity; PYH, Pod yield per hectare (kg); HYPP, Haulm yield per plant (g); SC, Susceptible check; RC, Resistant check.

Of the 69 genotypes resistant to rust, LLS, or both, 45 derive their resistance from *A. cardenasii*, 18 from *A. hypogaea*, and 6 from *A. villosa*. A total of 14 genotypes matured in <120 days, of which the *A. villosa*-derived genotypes had a high level of resistance to both diseases (3.00 and 2.50 for LLS and rust at 90 DAS, respectively), followed by *A. cardenasii* (3.67 and 2.77 for LLS and rust at 90 DAS, respectively) and *A. hypogaea* (3.53 and 3.13 for LLS and rust at 90 DAS, respectively). Similarly, among the 55 genotypes that matured in >120 days, the *A. villosa*-derived genotypes had a high level of disease resistance (4.00 and 3.92 for LLS and rust at 105 DAS, respectively), followed by *A. cardenasii* (5.11 and 4.34 for LLS and rust at 105 DAS, respectively) and *A. hypogaea* (5.62 and 4.62 for LLS and rust at 105 DAS, respectively) (**Supplementary Table 2**). In 44 of the 66 rust-resistant genotypes, resistance was derived from *A. cardenasii*, in 16 from *A. hypogaea*, and in six from *A. villosa*. Among the 31 genotypes resistant to LLS, 20 had *A. cardenasii* as the source of resistance, seven had *A. hypogaea*, and four had *A. villosa* (SPS 2, SPS 8, SPS 11, and SPS 20). A mutant line, M 28-2, belonging to *A. hypogaea*, was used as a source of resistance to develop 49 M-16 and 49 M-1-1 (**Supplementary Table 2**). At the Dharwad center, GPBD 4, a popular resistant cultivar, was derived from a cross between KRG 1 and ICGV 86855. KRG 1 is an early maturing, Spanish bunch local cultivar susceptible to foliar diseases, developed at the Regional Research Station, Raichur, Karnataka, through selection from material introduced from Argentina. ICGV 86855 (*A. hypogaea* x *A. cardenasii*) is a Virginia bunch interspecific derivative, resistant to rust and late leaf spot, developed at ICRISAT, Patancheru, India (Gowda et al., 2002a). GPBD 4 and ICGV 86699, both derivatives of *A. cardenasii*, are the most commonly used sources in rust and LLS resistance breeding programs in India.
The identification of resistant lines among the derivatives of *A. villosa* and from mutagenesis opens the possibility of widening the genetic base of resistance to both diseases in peanut, which has so far relied largely on *A. cardenasii*. Among the 28 genotypes that showed resistance to both diseases, 20 are advanced breeding lines developed at ICRISAT, seven are from the University of Agricultural Sciences, Dharwad, and a single line is from the mini-core collection, indicating accumulation of favorable alleles for resistance to rust and LLS in the breeding populations. Therefore, recycling the resistant advanced breeding lines results in enhanced genetic gain for resistance to rust and LLS. Resistance in peanut is often associated with long maturity duration (Nigam, 2000). In the present study, the regression analysis showed a negative association between disease resistance and maturity duration, with low coefficients of determination for LLS (0.18 and 0.24) and rust (0.17 and 0.24) at 90 and 105 DAS, respectively (**Figures 5A**–**D**). For lines that mature in <120 days, the disease score at 90 DAS can be used for comparison among lines, whereas the disease score at 105 DAS should also be considered for lines that mature in >120 days. Of the five resistant lines identified at Aliyarnagar, four matured in >115 days, whereas one line (SPS 7) matured in 104 days. Similarly, of the nine lines resistant to LLS at 90 DAS, only one (ICGV 86699) matured in <100 days, whereas the other eight matured in >120 days with disease scores of 4–5 at 105 DAS, indicating that these lines could resist LLS until 105 DAS. The identified rust- and LLS-resistant genotypes belonged to the early and medium maturity groups (100 to 130 days) and had desirable pod and seed features (**Figures 5A**–**D**; **Table 3**). Hence, they can be used directly in resistance breeding in peanut. Combining foliar fungal disease resistance with early maturity has been a challenge in peanut breeding programs, so early and medium maturing sources of resistance are preferred by breeders to combine disease resistance with early maturity and high pod yield potential. A few genotypes with early maturity and tolerance to LLS were reported earlier (Branch and Culbreath, 1995).
In the present study, GPBD 4, ICGVs 06142, 02411, 00246, 00248, 00068, 86699, SPS 11, and SPS 20 showed multiple disease resistance, with the lowest scores for rust and LLS across the environments, and belong to the early and medium maturity groups. Genotypes with multiple disease resistance were reported earlier by Gowda et al. (2002a; 2002b), Narasimhulu et al. (2013), and Sudini et al. (2015). The significant G × E interaction effects also create the need to identify stable sources of resistance that perform well under a wide range of environments and/or to identify resistance with specific adaptation to a target environment. Because resistance is polygenic and subject to background effects, transferring resistance genes into different backgrounds through conventional breeding is quite difficult (Janila et al., 2013). Hence, it is suggested to use modern tools such as genomic selection to overcome these limitations and realize a higher rate of genetic gain in breeding programs.
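The coefficients of determination cited above (0.17 to 0.24) mean that maturity duration accounts for only about a fifth of the variation in disease score. A simple linear regression of score on days to maturity can be sketched as follows; the data points are hypothetical and serve only to show how the slope and R² are obtained.

```python
def linear_fit(x, y):
    """Ordinary least squares fit of y on x; returns (slope, intercept, r_squared)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# hypothetical days to maturity and disease scores at 90 DAS for ten genotypes
days_to_maturity = [100, 104, 108, 110, 115, 118, 120, 124, 127, 130]
disease_score = [7, 4, 6, 8, 5, 3, 6, 4, 5, 3]

slope, intercept, r2 = linear_fit(days_to_maturity, disease_score)
print(f"slope = {slope:.3f} score units per day, R^2 = {r2:.2f}")
```

A negative slope with a low R² is the pattern the text describes: later-maturing lines tend to score lower, but maturity alone is a weak predictor of the disease score.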

An important objective of resistance breeding is to identify genotypes with durable resistance irrespective of the environment. Horizontal or quantitative resistance is governed by many small-effect QTLs or genes with additive effects on the resistance mechanism and thus offers more durable resistance than vertical, major-gene resistance. Such durable resistance has been reported for rust resistance in wheat (Johnsons, 1978), leaf rust of barley (Parlevliet, 1975), and stem and leaf rust resistance of wheat (Singh and Rajaram, 1992). Similarly, the genetics of LLS and rust resistance in peanut indicate quantitative inheritance with additive effects of minor genes (Janila et al., 2013). Hence, for durable resistance, breeders should focus on selecting minor genes along with major ones. Molecular-assisted selection can assist in the selection of major genes. Major QTLs for LLS and rust, governing 67% and 80% of phenotypic variation, were identified in peanut (Khedikar et al., 2010; Sujay et al., 2012; Kolekar et al., 2016) and used to introgress LLS and rust resistance into elite varieties (Janila et al., 2016). SNPs for LLS and rust have also been developed and are under validation for use in molecular breeding (Pandey et al., 2017). However, to achieve the desired impact, both major and minor QTLs need to be identified. Several approaches, such as marker-assisted recurrent selection and genomic selection (GS), have been proposed to capture the minor genes and improve durable resistance. The multi-environment LLS and rust phenotyping data presented in this paper will be combined with whole genome sequencing data to develop GS prediction models that can be used to detect small-effect QTLs (Meuwissen et al., 2001). The GS model can then be used in a breeding program to select genotypes for crossing nurseries and individuals in early generations based on their genomic estimated breeding values (GEBV) without laborious phenotypic work. GS aims to capture the effect of every minor QTL and to increase the frequency of favorable alleles in individuals of advanced generations. GS is one of the important genomic tools that can increase selection intensity and accuracy, which is required to accelerate genetic gains for complex traits. Considering GEI in GEBV will help to obtain end products adapted to a wide range of environments and can also be useful for predicting the performance of genotypes in untested environments (Schulz-Streeck et al., 2013).
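How a genomic selection model turns marker data into GEBV can be illustrated with ridge regression, which shrinks all marker effects equally. This is only one of several possible GS models and is an assumption made here for illustration; the text does not specify which model will be used. The marker matrix, phenotypes, and shrinkage parameter `lam` below are simulated or hypothetical.

```python
import numpy as np


def ridge_marker_effects(markers, phenotypes, lam=1.0):
    """Estimate marker effects by ridge regression on centered data."""
    x = np.asarray(markers, dtype=float)
    y = np.asarray(phenotypes, dtype=float)
    x_mean, y_mean = x.mean(axis=0), y.mean()
    xc = x - x_mean
    n_markers = xc.shape[1]
    # solve (X'X + lam * I) b = X'(y - mean) for the marker effects b
    effects = np.linalg.solve(xc.T @ xc + lam * np.eye(n_markers),
                              xc.T @ (y - y_mean))
    return effects, x_mean, y_mean


def predict_gebv(new_markers, effects, x_mean, y_mean):
    """GEBV of unphenotyped lines: training mean plus summed marker effects."""
    return y_mean + (np.asarray(new_markers, dtype=float) - x_mean) @ effects


# simulated training population: 60 lines, 10 markers coded 0/1/2
rng = np.random.default_rng(0)
train_markers = rng.integers(0, 3, size=(60, 10))
true_effects = rng.normal(size=10)
train_scores = train_markers @ true_effects + 0.1 * rng.normal(size=60)

effects, x_mean, y_mean = ridge_marker_effects(train_markers, train_scores, lam=1.0)
candidates = rng.integers(0, 3, size=(5, 10))
print(np.round(predict_gebv(candidates, effects, x_mean, y_mean), 2))
```

In practice the number of markers far exceeds the number of lines, and the shrinkage parameter is estimated from the data or tuned by cross-validation rather than fixed as it is here.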

The significant G × E interaction effects indicate the need to identify stable sources of resistance that perform well under a wide range of environments. The GGE biplot analysis of disease severity for LLS and rust at 90 DAS explained a high proportion (~90%) of the variation due to genotype main effect and GEI. The ranking of 109 genotypes of GSTP based on their disease severity score and stability performance identified eight genotypes as stable for resistance to LLS and 24 as stable for resistance to rust across the environments.

The position of the environments on the biplot revealed a positive correlation among environments. The results indicated ICRISAT\_R15 as the best environment for screening for LLS and rust, followed by Aliyarnagar and Jalgaon. This could be attributed to the better artificial foliar disease screening facilities available at ICRISAT. The moderate disease pressure at Jalgaon could be attributed to unfavourable environmental conditions such as low humidity (< 85%) and lack of rain during disease infection, establishment, and spread. Environmental factors, especially relative humidity, temperature, and rainfall, play an important role in the infection and establishment of rust and LLS (Nigam et al., 1991; Cu and Phipps, 1993). Conidial production by *Phaeoisariopsis personata* requires a relative humidity of ≥95% for at least 4 h per day, and the highest numbers of conidia were produced when lesions were subjected to ≥95% relative humidity for 16 h or more (Alderman and Nutter, 1994). In addition, sowing at Jalgaon (23 June 2015) was nearly 15 days earlier than at Aliyarnagar (07 July 2015) and ICRISAT (10 July 2015). Hence, the genotypes could have escaped the peak disease period, resulting in low infection. The significant influence of sowing time on disease severity of rust and LLS was reported earlier by Naidu and Vasanthi (1995).

The ultimate aim of the breeder is to develop genotypes that combine high and stable yield with disease resistance across environments and locations. In the present study, biplot analysis identified stable genotypes for pod yield that performed consistently across the environments, as well as genotypes that are well adapted to a specific environment. Finding environment-specific adaptability is also important for developing cultivars with region-specific adapted traits for a targeted region. The genotypes that are stable across environments can be released after evaluation and comparison with popular national checks. Genotypes with stable yield performance were reported earlier by Mothilal et al. (2010b), Hariprasanna et al. (2008), and Pradhan et al. (2010). The biplot for pod yield per hectare shows that, among the four environments, ICRISAT\_PR15 plotted separately, indicating that the performance of genotypes during the post-rainy season differed from that during the rainy season at ICRISAT. The superior performance of genotypes during the post-rainy season can be attributed to disease-free conditions. In the present study, most of the stable genotypes for yield and its contributing traits are improved breeding lines. The genotypes from the mini-core and reference set collections do not possess high yield potential but can contribute desirable genes or QTLs for other traits such as disease resistance and nutritional quality (Upadhyaya et al., 2014; Patil et al., 2014). Different germplasm lines with disease resistance and nutritional quality traits were identified earlier in the mini-core collection (Upadhyaya et al., 2005).

# CONCLUSIONS

The present study evaluated ICRISAT's GSTP of peanut for resistance to late leaf spot and rust. The GSTP comprised 340 genotypes, including trait-specific advanced breeding lines from ICRISAT and UAS, Dharwad, lines from ICRISAT's mini-core and reference set collections, and popular varieties cultivated in India. The study identified genotypes resistant to LLS and rust under natural and artificial disease epiphytotic conditions. The resistant genotypes are also useful for recycling as elite parents in peanut breeding programs. The hurdle of late maturity associated with resistance to LLS and rust can be overcome using the early maturing sources (ICGV 86699, ICGV 01274 and SPS 8) identified in the study. The majority of the lines in the GSTP were evaluated for LLS and rust for the first time, and the extensive variability in early and medium maturing lines is a positive step for the improvement of LLS and rust resistance in peanut. High heritability coupled with high GAM for resistance across the environments results in a greater response to selection. Understanding the mechanisms of resistance in the genotypes identified for specific and wide adaptation will enable peanut breeders to diversify the genetic base of resistance to foliar fungal diseases. Significant differences in resistance among the studied environments and the influence of G × E interactions on resistance suggest that deploying resistance to LLS and rust specific to the target ecology will be beneficial. The extensive losses of pod and haulm yield and quality caused by LLS and rust across rainfed production environments, and the pod yield increase resulting from resistance to foliar fungal diseases, suggest that 'foliar fungal disease resistance' could be considered a must-have trait in all groundnut cultivars released for cultivation in rainfed ecologies in Asia and Africa.

# DATA AVAILABILITY STATEMENT

All datasets generated for this study are included in the article/**Supplementary Files**.

# AUTHOR CONTRIBUTIONS

JP and SC designed the experiment. SC carried out the experiment at ICRISAT, Patancheru, collected and analyzed data, prepared tables, interpreted the results, and wrote the manuscript. JP constituted the GSTP based on the available historical data. DK and JP prepared the work plan and critically reviewed the manuscript. SP and SS conducted trials and recorded data at Jalgaon and Aliyarnagar, respectively. MV helped in drafting and critically revising the manuscript. HS prepared the disease screening nursery and spore inoculum for both fungi. SM helped in generating the field layout, multiplying and sending seeds, and providing technical resources during the course of the experiment. RB shared some of the genotypes for the study. All authors have read and approved the final manuscript.


# FUNDING

The authors are thankful to CRP-Grain Legumes and Dryland Cereals (CRP-GLDC) for financing the research work and Bill and Melinda Gates Foundation for providing the scholarship to the first author and financial support to conduct this experiment.

# ACKNOWLEDGMENTS

The authors are thankful to CRP-Grain Legumes and Dryland Cereals (CRP-GLDC) for financing the research work; the Bill and Melinda Gates Foundation for providing a scholarship to the first author (through the Tropical Legume-II project); and the Coconut Research Station, TNAU, Aliyarnagar, Tamil Nadu and the Oilseeds Research Station, MPKV, Jalgaon, Maharashtra for providing field and technical support to conduct trials at their respective locations. The authors acknowledge receipt of lines from UAS, Dharwad. The work was carried out under CRP-GLDC.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01338/full#supplementary-material




**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Chaudhari, Khare, Patil, Sundravadana, Variath, Sudini, Manohar, Bhat and Pasupuleti. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# High Temperatures During the Seed-Filling Period Decrease Seed Nitrogen Amount in Pea (Pisum sativum L.): Evidence for a Sink Limitation

#### Annabelle Larmure\* and Nathalie G. Munier-Jolain

Agroécologie, AgroSup Dijon, INRA, Univ. Bourgogne Franche-Comté, Dijon, France

#### Edited by:

Penelope Mary Smith, La Trobe University, Australia

#### Reviewed by:

Eva Stoger, University of Natural Resources and Life Sciences Vienna, Austria
Petr Smýkal, Palacký University Olomouc, Czechia

#### \*Correspondence:

Annabelle Larmure annabelle.larmure@agrosupdijon.fr

#### Specialty section:

This article was submitted to Plant Physiology, a section of the journal Frontiers in Plant Science

Received: 29 March 2019 Accepted: 15 November 2019 Published: 20 December 2019

#### Citation:

Larmure A and Munier-Jolain NG (2019) High Temperatures During the Seed-Filling Period Decrease Seed Nitrogen Amount in Pea (Pisum sativum L.): Evidence for a Sink Limitation. Front. Plant Sci. 10:1608. doi: 10.3389/fpls.2019.01608

Higher temperatures induced by the on-going climate change are a major cause of yield reduction in legumes. Pea (Pisum sativum L.) is an important annual legume crop grown in temperate regions for its high seed nitrogen (N) concentration. In addition to yield, seed N amount at harvest is a crucial characteristic because pea seeds are a source of protein in animal and human nutrition. However, little is known about the impacts of high temperatures on the plant N partitioning that determines seed N amount. Therefore, this study investigates the response of seed dry matter and N fluxes at the whole-plant level (plant N uptake, partitioning in vegetative organs, remobilization, and accumulation in seeds) to a range of air temperatures (from 18.4 to 33.2°C) during the seed-filling period. As pea is a legume crop, plants relying on two different N nutrition pathways were grown in a glasshouse: N2-fixing plants or NO3<sup>−</sup>-assimilating plants. Labeled nitrate (15NO3<sup>−</sup>) and intra-plant N budgets were used to quantify N fluxes. High temperatures decreased seed-filling duration (by 0.8 day per °C), seed dry-matter and N accumulation rates (by 0.8 and 0.032 mg seed<sup>−1</sup> day<sup>−1</sup> per °C, respectively), and N remobilization from vegetative organs to seeds (by 0.053 mg seed<sup>−1</sup> day<sup>−1</sup> per °C). Plant N2 fixation decreased with increasing temperature, while plant NO3<sup>−</sup> assimilation increased. However, the additional plant N uptake in NO3<sup>−</sup>-assimilating plants was never allocated to seeds, and a significant quantity of N was still available at maturity in vegetative organs, whatever the plant N nutrition pathway. Thus, we concluded that seed N accumulation under high temperatures is sink limited, owing to a shorter seed-filling duration and a reduced seed dry-matter accumulation rate. Consequently, sustaining seed sink demand and preserving the photosynthetic capacity of stressed plants during the seed-filling period should be promising strategies to promote N allocation to seeds from vegetative parts and thus to maintain crop N production under the abiotic constraints exacerbated in the field by the on-going climate change.

Keywords: high temperatures, Pisum sativum L., seed N amount, N partitioning, 15N labeling, seed-filling, plant N uptake

# INTRODUCTION

Temperature is one of the main environmental factors explaining the variations in seed yield and quality in annual crop plants, especially legumes (Wheeler et al., 2000; Peng et al., 2004; Schlenker and Roberts, 2009; Asseng et al., 2011; Sita et al., 2017). The observed global increase in temperature (1.0°C of global warming above pre-industrial levels) is projected to continue at 0.2°C per decade due to past and ongoing emissions (including greenhouse gases) (IPCC, 2018). High temperatures are thus expected to be more frequent during the reproductive period of crops in temperate climates. They are already a major cause of the recent yield stagnation in Europe and of the decline projected under climate change (Brisson et al., 2010; Supit et al., 2012; Trnka et al., 2012).

Pea (Pisum sativum L.) is an important annual legume crop grown in temperate regions for its high seed nitrogen (N) concentration. Including legumes in rotations leads to environmental benefits thanks to their unique capacity to acquire N via atmospheric N2 symbiotic fixation (Jensen and Hauggaard-Nielsen, 2003; Nemecek et al., 2008; Siddique et al., 2012). However, to extend the pea crop area in Europe, pea yield and seed protein concentration should be increased as well as their stability, especially in fluctuating climatic conditions (Siddique et al., 2012; Vadez et al., 2012).

Nitrogen yield (product of the yield and the seed N concentration) is a crucial characteristic at harvest in pea because seeds are a source of protein in animal and human nutrition. During the reproductive phase, N partitioning is the key process involved in the modulation of N yield. In most grain crops and, above all, in legumes, newly acquired N is generally low and insufficient to fulfill the high N demand of seeds, consequently endogenous N previously accumulated in vegetative parts is exported to seeds (Sinclair and Wit, 1976; Salon et al., 2001; Malagoli et al., 2005; Schiltz et al., 2005; Kichey et al., 2007; Barraclough et al., 2014). This remobilized N derives from the proteolysis of essential leaf proteins involved in photosynthesis, mostly Rubisco (Gregersen et al., 2008; Masclaux-Daubresse et al., 2008). The resulting decrease in leaf photosynthetic capacity may thus limit yield by shortening the duration of the seed-filling period (Sinclair and Horie, 1989; Munier-Jolain et al., 2008; Bueckert et al., 2015). Nitrogen remobilization not only affects yield, but also N yield since N remobilized from vegetative parts is the major contributor to seed N in most grain crops (Malagoli et al., 2005; Schiltz et al., 2005; Kichey et al., 2007; Araujo et al., 2012).

High temperatures affect plant phenology and carbon metabolism through various processes such as hastening reproductive development (Badeck et al., 2004; Barnabas et al., 2008; Bueckert et al., 2015; Sita et al., 2017), reducing photosynthesis (Guilioni et al., 2003; Kirschbaum, 2004; Sage and Kubien, 2007; Pimentel et al., 2013; Tacarindua et al., 2013), and reducing seed set (Guilioni et al., 2003; Djanaguiraman et al., 2013; Edreira and Otegui, 2013; Tacarindua et al., 2013; Bueckert et al., 2015). Conversely, impacts of high temperatures on assimilate partitioning remain unclear, especially concerning their effect on N remobilization to filling seeds. Some authors reported a decrease in N remobilization from vegetative parts to filling grain in response to heat stress in wheat (Triticum aestivum L.) (Tahir and Nakata, 2005) and in rice (Oryza sativa) (Ito et al., 2009). On the contrary, other authors suggest that high temperatures increase N remobilization from vegetative organs to seeds, causing an acceleration of senescence (Spiertz, 1977; Morison and Lawlor, 1999; Masclaux-Daubresse et al., 2008; Zhao et al., 2011; Wu et al., 2012). Moreover, high temperatures may also affect N uptake of legumes (mainly acquired via N2 fixation), but unfortunately little is known about temperate legume crops (Bordeleau and Prevost, 1994).

Further investigations are thus needed to improve the understanding of the effect of high temperatures on N assimilate partitioning during the seed-filling period and to quantify the impact on seed N yield in legumes. For this purpose, the present study assessed the response of seed dry matter and N fluxes at the whole-plant level (seed N accumulation, N remobilization, plant N uptake, and N amount variation in vegetative organs) to contrasting temperatures ranging from permissive to heat stress during the seed-filling period of pea (Guilioni et al., 2003). We compared N2-fixing and NO3<sup>−</sup>-assimilating plants, the former being more representative of field conditions while the latter allow the use of a 15NO3<sup>−</sup>-labeled nutrient solution to assess N fluxes.

# MATERIALS AND METHODS

#### Plant Material and Growth Conditions

Three different glasshouse experiments (Exp. 1, 2, and 3) were conducted. A single line of spring dry pea (cv. Baccara) was used, so all plants were genetically identical. The characteristics of Baccara are described in Bourion et al. (2002a; 2002b). Pea seeds were sown in 5 L pots at a density of eight plants per pot. Pots were filled with a 1:1 (v/v) mixture of sterilized attapulgite and clay balls (diameter 2–6 mm) in Exp. 1 and 3, or with a mixture of 1/6 vermiculite, 1/3 siliceous sand, and 1/2 clay balls (diameter 2–6 mm) in Exp. 2. After seedling establishment, the plants were thinned to the four most homogeneous per pot. Plant N nutrition relied exclusively on NO3<sup>−</sup> assimilation in Exp. 1 and 2 owing to the high nitrate availability of the nutrient solution (14 meq NO3<sup>−</sup>, P, K, and micronutrients; Table S1). In Exp. 3, plant N nutrition relied exclusively on N2 fixation owing to a nitrate-free nutrient solution (0 meq NO3<sup>−</sup>, P, K, and micronutrients; Table S1) and inoculation. Seedlings were inoculated with 1 mL of Rhizobium leguminosarum bv. viciae, strain P221 (MIAE01212, 10<sup>8</sup> bacteria per plant), the strain usually used in the laboratory because of its good efficiency, in particular with cv. Baccara (Voisin et al., 2013).

Photosynthetically active radiation was provided to the plants by daylight and mercury lamps (MACS 400 W; Mazda, Dijon, France) with a 14-h day length. The air temperature was recorded every 5 min in order to calculate the mean daily temperature. Prior to the different temperature treatments, the glasshouse was maintained at a day/night temperature of 24/16°C.
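The calculation of a mean daily temperature from 5-min records can be sketched as follows. This is a minimal illustration, not the authors' processing script: the function name, the date, and the idealized step change between day and night temperatures are assumptions, while the 24/16°C regime and the 14-h day length come from the text.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def mean_daily_temperature(records):
    """Average air temperature per calendar day.

    records: iterable of (datetime, temperature in degrees C) pairs,
    for example one reading every 5 min.
    Returns a dict mapping each date to its mean temperature.
    """
    by_day = defaultdict(list)
    for timestamp, temp_c in records:
        by_day[timestamp.date()].append(temp_c)
    return {day: mean(temps) for day, temps in sorted(by_day.items())}


# Hypothetical day: 24 degrees C during a 14-h day (06:00 to 20:00), 16 degrees C at night
start = datetime(2019, 1, 1, 0, 0)  # arbitrary date, for illustration only
records = []
for step in range(288):  # 288 five-minute readings cover 24 h
    t = start + timedelta(minutes=5 * step)
    records.append((t, 24.0 if 6 <= t.hour < 20 else 16.0))

daily = mean_daily_temperature(records)
```

With this idealized regime the mean daily temperature is (14 × 24 + 10 × 16) / 24, that is about 20.7°C, which shows why the mean temperatures reported for a treatment lie between its day and night set points.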

#### Temperature Treatments During the Seed-Filling Period

The experiments aimed at testing the effect of temperature during the seed-filling period. As peas are indeterminate plants with a sequential flowering up the stem leading to a wide heterogeneity of seed developmental stages, the temperature treatments started at the beginning of seed filling of the last reproductive node (BSL) and ended when seed physiological maturity was reached at the whole-plant level, as described by Larmure et al. (2005).

At BSL, different sets of pots were randomly transferred to glasshouses maintained at the different day/night temperatures until plant physiological maturity. The air temperature treatments tested in Exp. 1, 2, and 3 ranged approximately from 20/16°C to 35/31°C day/night (Table 1). In Exp. 1 and 2, which monitored NO3<sup>−</sup>-assimilating plants, four and three day/night temperatures were chosen, respectively, to form a range of seven temperatures. In Exp. 3, which monitored N2-fixing plants, four day/night temperatures were chosen to form a temperature range similar to that tested for NO3<sup>−</sup>-assimilating plants.

The temperatures were modified gradually during two acclimatization days to reach the temperature objectives of each treatment. All temperature treatments are described in Table 1 including the average of mean air temperatures actually observed in the glasshouses (ranging from 18.4 to 33.2°C). Plants were maintained at the maximum soil water capacity by providing non-limiting water availability with an automatic watering system.

#### Plant Sampling and Measurements

Prior to the different temperature treatments, seed water concentration was destructively measured at each node twice a week to assess the date of BSL.

For each temperature treatment, randomly chosen pots were harvested: (1) at the beginning of the temperature treatment, (2) during the temperature treatment, and (3) after plant physiological maturity (three pots per treatment for Exp. 1 or five pots per treatment for Exp. 2 and 3). At each sampling date, seeds, leaves, stems, pod walls, and roots were collected separately. Dry matter, seed number, and water concentration were determined as described by Larmure et al. (2005).

In Exp. 1 and 2, total N concentrations and 15N enrichments were determined using a dual inlet mass spectrometer coupled with a CHN analyzer (Sercon, ANCA-GSL-2020). In Exp. 3, total N concentrations were determined with an elemental analyser (Carlo Erba).

#### Determination of N Fluxes

Nitrogen fluxes (seed N accumulation, endogenous-N remobilization, plant exogenous-N uptake, and N amount variation in vegetative organs) were expressed in mg seed<sup>−1</sup> day<sup>−1</sup>. This unit is adequate to depict N partitioning to seeds in plants, because the individual seed N accumulation rate depends on the N available per seed (N from endogenous remobilization and exogenous sources) (Lhuillier-Soundélé et al., 1999; Larmure and Munier-Jolain, 2004). Moreover, this unit allows N fluxes to be compared among plants differing in seed number and vegetative biomass.
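As an illustration of this unit, the sketch below converts a change in the N amount of a plant compartment into a flux per seed per day. It shows one plausible reading of the normalization, not the authors' exact procedure; the function name and the numerical values are hypothetical.

```python
def flux_per_seed_per_day(n_start_mg, n_end_mg, seed_number, days):
    """Rate of change of an N pool in mg N per seed per day.

    A positive value is a net gain of N by the compartment,
    a negative value is a net loss.
    """
    if seed_number <= 0 or days <= 0:
        raise ValueError("seed_number and days must be positive")
    return (n_end_mg - n_start_mg) / (seed_number * days)


# Hypothetical plant with 20 seeds whose leaf N drops from 60 to 42 mg in 3 days
leaf_flux = flux_per_seed_per_day(60.0, 42.0, seed_number=20, days=3)
```

Dividing by seed number puts plants with different sink sizes on a common scale, which is the property the text relies on when pooling experiments that differ in seed number.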

#### Plant 15N Labeling and Calculation of N Fluxes for NO3<sup>−</sup>-Assimilating Plants

15N labeling sessions with NO3<sup>−</sup>-assimilating plants (Exp. 1 and 2) were used to distinguish the remobilization of endogenous 14N stored before labeling from the uptake of exogenous 15N supplied by a 15NO3<sup>−</sup> nutrient solution with 5% 15N APE (atom percent excess) enrichment. Successive 3-day labeling sessions were conducted during the temperature treatments as described by Schiltz et al. (2005). Homogeneous groups of six pots for Exp. 1 or 10 pots for Exp. 2 were constituted and randomly assigned to each labeling session. The first labeling session began at the end of the two acclimatization days. In Exp. 1, three successive labeling sessions were conducted for all temperature treatments, except for the warmest treatment, which permitted only two labeling sessions because of earlier physiological maturity. In Exp. 2, two successive labeling sessions were conducted for all temperature treatments. At the beginning of each session, unlabeled control pots were harvested (three pots for Exp. 1 or five pots for Exp. 2). During the session, labeled pots were supplied for three days with the 15NO3<sup>−</sup> nutrient solution and then harvested (three pots for Exp. 1 or five pots for Exp. 2).

TABLE 1 | Glasshouse experiments characteristics and seed number, individual seed dry weight, seed N concentration and amount, and vegetative organs N concentration at maturity.

Pea plants were exposed to temperature treatments during the seed-filling period, i.e. from the beginning of seed filling of the last reproductive node (BSL) to plant maturity. Mean temperatures during the seed-filling period (with standard error) were assessed as the average of the daily air temperatures observed from BSL to maturity (14-h day length). Values with the same letter are statistically not different at P = 0.05.

For labeled NO3<sup>−</sup>-assimilating plants, N fluxes were assessed using the data of the labeling sessions. Rates of plant N uptake, seed N accumulation, endogenous-N remobilization to filling seeds, and variation of N amount in each vegetative organ during a labeling session were calculated using the total N concentrations and the 15N enrichment of the labeling nutrient solution (5%) as described by Schiltz et al. (2005). Each N flux value represents the mean of the two or three 3-day labeling sessions. Values were obtained from three (Exp. 1) or five (Exp. 2) biological replicates, each consisting of one pot with four plants.
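The principle behind separating exogenous from endogenous N can be sketched with a generic isotope mass balance. This is an assumption-laden illustration rather than the exact procedure of Schiltz et al. (2005): it assumes that the N taken up during a session carries the enrichment of the nutrient solution, and that the N gained by seeds that is not labeled was remobilized from vegetative organs. Function names and numerical values are hypothetical; only the 5% APE of the solution comes from the text.

```python
SOLUTION_APE = 5.0  # atom percent excess of the labeled nutrient solution (from the text)


def exogenous_n(total_n_mg, atom_pct_labeled, atom_pct_control, solution_ape=SOLUTION_APE):
    """N (mg) in a compartment that was taken up from the labeled solution.

    The excess 15N of the labeled sample over the unlabeled control is
    scaled by the enrichment of the solution (simple isotope mass balance).
    """
    sample_ape = atom_pct_labeled - atom_pct_control
    return total_n_mg * sample_ape / solution_ape


def remobilized_n(seed_n_gain_mg, seed_exogenous_n_mg):
    """Endogenous N (mg) remobilized to seeds: total gain minus labeled N."""
    return seed_n_gain_mg - seed_exogenous_n_mg


# Hypothetical seed compartment after one 3-day labeling session
seed_exo = exogenous_n(total_n_mg=100.0, atom_pct_labeled=0.5663, atom_pct_control=0.3663)
seed_remob = remobilized_n(seed_n_gain_mg=30.0, seed_exogenous_n_mg=seed_exo)
```

In this hypothetical case about 4 mg of the 30 mg of N gained by seeds is exogenous and about 26 mg is remobilized. Dividing such amounts by seed number and session length gives fluxes in mg seed<sup>−1</sup> day<sup>−1</sup>.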

#### Calculation of N Fluxes for N2-Fixing Plants

For unlabeled N2-fixing plants (Exp. 3), rates of plant N uptake, seed N accumulation, and variation of N amount in each vegetative organ were assessed as the linear regression coefficients (slopes) of each variable (plant N, seed N, and vegetative organ N amounts, respectively) versus time (expressed in days). Values were obtained from five biological replicates, each consisting of one pot with four plants. Endogenous-N remobilized to filling seeds could not be determined in Exp. 3, which used unlabeled N2-fixing plants.
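Estimating a rate as the slope of a linear regression against time can be sketched with ordinary least squares. The sampling days and N amounts below are hypothetical, and the function name is illustrative.

```python
def slope_vs_time(days, amounts):
    """Ordinary least-squares slope of amounts regressed on days."""
    if len(days) != len(amounts) or len(days) < 2:
        raise ValueError("need two equally long sequences with at least 2 points")
    n = len(days)
    mean_x = sum(days) / n
    mean_y = sum(amounts) / n
    sxx = sum((x - mean_x) ** 2 for x in days)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(days, amounts))
    return sxy / sxx


# Hypothetical seed N amounts (mg per seed) at four sampling dates
sampling_days = [0, 7, 14, 21]
seed_n_mg = [2.0, 4.8, 7.6, 10.4]

rate = slope_vs_time(sampling_days, seed_n_mg)  # mg per seed per day
```

For these perfectly linear hypothetical data the slope is 0.4 mg seed<sup>−1</sup> day<sup>−1</sup>. With real data the replicates at each date would all enter the regression.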

#### Statistical Analysis

The experiments were conducted in a completely randomized design with three (Exp. 1) or five (Exp. 2 and 3) biological replicates. Each biological replicate consisted of one pot with four plants (one single line, cv. Baccara). Data were analyzed using SigmaPlot® 12 (Systat Software, Inc.). All data were subjected to analysis of variance. Differences at P ≤ 0.05 were considered significant.
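For readers unfamiliar with the test, the F statistic of a one-way analysis of variance for a completely randomized design can be sketched in a few lines. This is not the SigmaPlot procedure used by the authors, and the data are hypothetical; the sketch only shows how between-treatment and within-treatment variation are compared.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    groups: list of lists, one list of replicate values per treatment.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, group_means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


# Hypothetical data: three temperature treatments, three pots each
treatments = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_value, df_b, df_w = one_way_anova_f(treatments)
```

The resulting F value would then be compared with the F distribution for the stated degrees of freedom to decide significance at P ≤ 0.05.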

# RESULTS

#### Seed Number, Individual Seed Dry Weight, Seed N Amount, and Seed N Concentration At Maturity

Seed number per plant at maturity was not significantly different among temperature treatments within an experiment (Table 1). Seed N amount at maturity and individual seed dry weight decreased in response to increasing temperatures in all three experiments (Table 1). On the contrary, seed N concentration increased with the increase in temperature in all experiments (Table 1). These changes of seed characteristics at maturity were significant for Exp. 1 and 3, that explored a wider range of mean daily air temperature during the seed-filling period than Exp. 2 (Table 1).

Seed number per plant at maturity was significantly different between experiments: it was lower in Exp. 2 than in Exp. 1 and 3 (Table 1), as was total seed dry matter (Table S2). Consequently, seed N amount at maturity was also lower in Exp. 2 than in Exp. 1 and 3 (Table 1).

#### Response of Seed Dry Matter Accumulation and Seed N Accumulation to the Increase in Temperature

Individual seed dry matter accumulation during the seed-filling period decreased linearly with increasing temperature for both NO3<sup>−</sup>-assimilating and N2-fixing plants (data from the three experiments gathered) by 19.6 mg seed<sup>−1</sup> per °C, from 227.8 mg seed<sup>−1</sup> at 18.4°C to 26.5 mg seed<sup>−1</sup> at 33.2°C (R<sup>2</sup> = 0.95) (Figure 1A). Individual seed dry matter accumulation was assessed as the product of the seed-filling duration and the rate of seed dry matter accumulation during the temperature treatments. Both variables significantly decreased with increasing temperature for the three experiments and for both plant N nutrition pathways (Figures 2A, B). The seed-filling duration was reduced progressively by 0.8 day for each additional °C (Figure 2A). Similarly, the rate of seed dry matter accumulation decreased progressively by 0.8 mg seed<sup>−1</sup> day<sup>−1</sup> per °C, from 19.8 mg seed<sup>−1</sup> day<sup>−1</sup> at 18.4°C to 5 mg seed<sup>−1</sup> day<sup>−1</sup> at 33.2°C (Figure 2B).

Individual seed N accumulation during the temperature treatments decreased linearly with increasing temperature for both NO3<sup>−</sup>-assimilating and N2-fixing plants (data from the three experiments gathered) by 0.76 mg seed<sup>−1</sup> per °C, from 10.3 mg N seed<sup>−1</sup> at 18.4°C to 0.55 mg N seed<sup>−1</sup> at 33.2°C (R<sup>2</sup> = 0.81) (Figure 1B). Seed N accumulation was assessed as the product of the seed-filling duration and the rate of seed N accumulation during the temperature treatments. Both variables significantly decreased with increasing temperature from 18.4°C to 33.2°C, for the three experiments and both plant N nutrition pathways (Figures 2A, C). The rate of seed N accumulation decreased progressively by 0.032 mg seed<sup>−1</sup> day<sup>−1</sup> per °C, from 0.73 mg seed<sup>−1</sup> day<sup>−1</sup> at 18.4°C to 0.10 mg seed<sup>−1</sup> day<sup>−1</sup> at 33.2°C (Figure 2C).

#### Effect of High Temperatures on the Remobilization of Endogenous-N to Filling Seeds by NO3<sup>−</sup>-Assimilating Plants

Endogenous-N remobilization to filling seeds was measured on labeled NO3<sup>−</sup>-assimilating plants in Exp. 1 and 2. The contribution of remobilized N to the rate of seed N accumulation exceeded 82% in both experiments with NO3<sup>−</sup>-assimilating plants (Exp. 1 and 2) for all temperatures (Figures 2B and 3). The temperature increase dramatically decreased the rate of N remobilization to filling seeds, from 0.71 mg seed<sup>−1</sup> day<sup>−1</sup> at 18.4°C to 0 at 33.2°C (Figure 3). The detrimental effect of increasing temperature suggests a complete halt of N remobilization at a temperature of around 33°C (intersection of the regression line with the X-axis in Figure 3).


#### Effects of High Temperatures on the Plant N Uptake by NO3<sup>−</sup>-Assimilating and N2-Fixing Plants

The rate of plant N uptake during seed filling varied between 0.11 and 0.64 mg seed<sup>−1</sup> day<sup>−1</sup>, whatever the plant N nutrition pathway. The range of plant N uptake rates for N2-fixing plants fell within the range for NO3<sup>−</sup>-assimilating plants.

The rate of plant N uptake relying exclusively on NO3<sup>−</sup> assimilation significantly increased from 0.11 mg seed<sup>−1</sup> day<sup>−1</sup> at 18.4°C to 0.64 mg seed<sup>−1</sup> day<sup>−1</sup> at 33.2°C (Figure 4A). Plant N uptake was not significantly modified by the small range of increasing temperature from 21.8 to 26.8°C in Exp. 2, while it increased linearly with increasing temperature from 18.4 to 33.2°C in Exp. 1 (Figure 4A).

Conversely, for N2-fixing plants in Exp. 3, the temperature increase significantly decreased the rate of plant N uptake, following a linear relationship from 0.39 mg seed<sup>−1</sup> day<sup>−1</sup> at 19.9°C to 0.13 mg seed<sup>−1</sup> day<sup>−1</sup> at 31.3°C (Figure 4B).

#### Effects of High Temperatures on the Variation of the N Amount Within the Different Vegetative Organs During the Seed-Filling Period of NO3<sup>−</sup>-Assimilating and N2-Fixing Plants

A net export of N represents a decrease in the N amount of a vegetative organ during the temperature treatment application through the seed-filling period, while a net import represents an increase in the N amount (Figure 5).

Considering NO3<sup>−</sup>-assimilating plants (Exp. 1 and 2), the effect of temperature on the rate of N amount variation during the seed-filling period was significant only in leaves and, to a lesser extent, in stems (Figure 5A). In the leaves, N fluxes switched from N export to N import above approximately 26.3°C (Figure 5A). At the lowest temperature (18.4°C), leaves and stems exported 0.34 and 0.15 mg seed<sup>−1</sup> day<sup>−1</sup>, respectively, while at the highest temperature (33.2°C), leaves and stems imported 0.59 and 0.04 mg seed<sup>−1</sup> day<sup>−1</sup>, respectively (Figure 5A). Thus, the rate of N amount variation during the seed-filling period in leaves was by far the most responsive to temperature among vegetative organs in NO3<sup>−</sup>-assimilating plants (Figure 5A).

Considering N2-fixing plants (Exp. 3), all vegetative organs presented a net export of N whatever the temperature (Figure 5B). The temperature increase (from 19.9 to 31.3 °C) had no significant effect on the rate of the N export whatever the vegetative organ of N2-fixing plants (Figure 5B).

At maturity, N concentrations of vegetative organs (roots, pod walls, stems, and leaves) were above 16 mg g<sup>−1</sup> for both NO3<sup>−</sup>-assimilating and N2-fixing plants, whatever the temperature treatment (Table 1).

FIGURE 4 | Opposite responses to temperature increase of exogenous-N uptake rate in plants during the seed-filling period for NO3<sup>−</sup>-assimilating plants (A) and N2-fixing plants (B). The vertical bars represent SE. The data were fitted with a linear regression.

# DISCUSSION

The present study quantifies and explains, for the first time, the effects of high temperatures on N partitioning to filling seeds in pea, an annual legume crop. Plants differing in seed number between experiments allow us to assess trends representative of various field conditions. The wide range of mean air temperature explored (from 18.4 to 33.2°C) is representative of the present and future climatic conditions expected in the field during the seed-filling period of most annual crops in Western Europe (June-July): mean monthly temperatures above 18°C and an increase in the frequency, intensity, and duration of heat waves (Christensen et al., 2007; Vliet et al., 2012; Xu et al., 2012). This temperature range was similar for the two plant N nutrition pathways tested: 18.4 to 33.2°C for NO3<sup>−</sup>-assimilating plants, which allowed endogenous N fluxes to be measured, and 19.9 to 31.3°C for N2-fixing plants, which are more representative of field conditions. Temperature treatments started when all seeds had begun to fill. At this stage, pea plants no longer had the possibility to adjust the number of seed sinks to assimilate availability, as they can earlier in their development (Ney et al., 1993). Indeed, seed number per plant at maturity did not differ significantly among temperature treatments within an experiment.

#### Decrease in Seed Dry Matter and N Accumulation With Increasing High Temperature, Resulting Effects on Seed N Concentration and N Yield

The rate of individual seed dry matter accumulation and the seed-filling duration in pea were reduced by 0.8 mg seed<sup>−1</sup> day<sup>−1</sup> and 0.8 days, respectively, for each additional °C of mean temperature from 18.4 to 33.2°C. Therefore, individual seed weight decreased with increasing temperature. These results are consistent with previous reports of a reduction in seed weight at high temperatures due to a decrease in the rate of seed fill and an abbreviated seed-filling duration (Singletary et al., 1994; Kim et al., 2011; Bueckert et al., 2015).

Our study demonstrates that seed N accumulation was also reduced, by 0.76 mg seed<sup>−1</sup> for each additional °C of mean temperature from 18.4 to 33.2°C, for both NO3<sup>−</sup>-assimilating and N2-fixing plants. Results showed that, whatever the plant N nutrition pathway, the decrease of seed N accumulation with increasing temperature was due to the reduction of both the rate of individual seed N accumulation and the seed-filling duration. The rate of individual seed N accumulation progressively decreased by 0.032 mg seed<sup>−1</sup> day<sup>−1</sup> for each additional °C of temperature from 18.4 to 33.2°C. Therefore, the amount of N accumulated in seeds significantly decreased with increasing temperatures.

Seed N concentration at maturity is the ratio of seed N and seed dry matter accumulation rates during the seed-filling period. Our results demonstrate that the decrease of the individual seed N rate with increasing high temperatures was lower than that of the individual seed dry matter rate (0.032 and 0.8 mg seed−<sup>1</sup> day−<sup>1</sup> for each additional °C, respectively). Thus, seed N concentration increased with increasing high temperatures. This result is consistent with previous reports of higher seed N concentration when temperatures rise during the seed-filling period (Karjalainen and Kortet, 1987; Tashiro and Wardlaw, 1991; Wardlaw and Wrigley, 1994; Larmure et al., 2005; Farooq et al., 2018).

In Europe, the current and projected warming rate in summer (June to August) is between 4.5 and 6.8°C/century, higher than for other seasons (Rowell, 2005; Xu et al., 2012; Terray and Boe, 2013). Consequently, the ongoing climate warming has caused and will continue to cause severe seed N yield losses in pea without adaptation strategies. From our study, it can be expected that at the field scale, seed N yield in pea could decrease by 1.8 gN m−<sup>2</sup> for each additional °C of mean temperature during the seed-filling period, considering 2,400 seeds m−<sup>2</sup>. From the perspective of French pea production, this represents a loss of more than 13% of recent seed N yield (~13.8 gN m−<sup>2</sup>, calculated with the mean yield and seed N concentration from 2013 to 2017: respectively 3.83 t·ha−<sup>1</sup> and 36.2 mgN·g−<sup>1</sup>; UNIP and ARVALIS, 2013, 2014; Terres Inovia and Terres Univia, 2015, 2016, 2017). Our study enables the identification of plant mechanisms involved in these seed N yield losses in order to provide levers for improving varieties tolerating heat stress.
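The field-scale figures in the preceding paragraph can be checked with simple arithmetic. The sketch below is illustrative only: it assumes that the per-seed loss is 0.76 mg N seed−<sup>1</sup> for each additional °C and that the mean French yield is expressed per hectare (3.83 t ha−<sup>1</sup>). The function and constant names are hypothetical.

```python
# Illustrative check of the field-scale seed N yield figures.
# Assumptions (not taken verbatim from the study): per-seed N loss of
# 0.76 mg per degree C, and mean yield expressed in tonnes per hectare.

SEEDS_PER_M2 = 2400              # seed density considered in the text
N_LOSS_MG_PER_SEED = 0.76        # mg N per seed per additional degree C
MEAN_YIELD_T_PER_HA = 3.83       # mean yield, 2013 to 2017
SEED_N_CONC_MG_PER_G = 36.2      # mean seed N concentration


def n_yield_loss_g_per_m2(loss_mg_per_seed, seeds_per_m2):
    """Seed N yield loss per degree C, in g N per square metre."""
    return loss_mg_per_seed * seeds_per_m2 / 1000.0


def baseline_n_yield_g_per_m2(yield_t_per_ha, n_conc_mg_per_g):
    """Baseline seed N yield in g N per square metre.

    1 t/ha equals 100 g/m2 (1,000,000 g spread over 10,000 m2).
    """
    yield_g_per_m2 = yield_t_per_ha * 100.0
    return yield_g_per_m2 * n_conc_mg_per_g / 1000.0


loss = n_yield_loss_g_per_m2(N_LOSS_MG_PER_SEED, SEEDS_PER_M2)
baseline = baseline_n_yield_g_per_m2(MEAN_YIELD_T_PER_HA, SEED_N_CONC_MG_PER_G)
relative_loss = loss / baseline

print(f"loss per degree C: {loss:.2f} g N m-2")
print(f"baseline N yield: {baseline:.2f} g N m-2")
print(f"relative loss: {relative_loss:.1%}")
```

Under these assumptions the loss is about 1.8 gN m−<sup>2</sup> against a baseline of about 13.9 gN m−<sup>2</sup>, that is, slightly more than 13% for each additional °C.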

#### Nitrogen Sources Availability Does Not Explain the Decrease in Seed N Amount With Increasing High Temperature

Nitrogen for pea seeds comes from two sources: current plant N uptake and N remobilization from vegetative organs (Lhuillier-Soundélé et al., 1999; Schiltz et al., 2005). Nitrogen availability from plant sources is known to determine seed N accumulation (Lhuillier-Soundélé et al., 1999; Martre et al., 2003; Larmure and Munier-Jolain, 2004; Kinugasa et al., 2012). However, our results rule out the possibility that the decrease in seed N accumulation at high temperatures resulted from a limitation in N supply.

Indeed, plant NO3<sup>−</sup> assimilation provides higher N availability under high temperatures (with non-limiting water availability), as plant N uptake of NO3<sup>−</sup>-assimilating plants significantly increased with increasing temperature by 0.032 mg seed−<sup>1</sup> day−<sup>1</sup> for each additional °C of temperature. NO3<sup>−</sup> assimilation may have been enhanced by the increase in plant transpiration with increasing temperature under our non-limiting water conditions, because the transport of water and N solutes from roots to shoots is driven by the evaporative loss of water (Salon et al., 2011). Indeed, the transpiration of well-watered plants is expected to increase by 1–5% for each additional °C of temperature between 5 and 35°C (Kirschbaum, 2004). Contrary to NO3<sup>−</sup> assimilation, plant N2 fixation was reduced under high temperatures: plant N uptake of N2-fixing plants decreased with increasing temperature by 0.022 mg seed−<sup>1</sup> day−<sup>1</sup> for each additional °C of temperature. High temperatures may decrease N2-fixation efficiency by affecting nitrogenase activity and/or nodule longevity (Bordeleau and Prevost, 1994; Hungria and Vargas, 2000), as no nodule production occurs during the seed-filling period of N2-fixing plants (Voisin et al., 2003; Bourion et al., 2007).

Despite the opposite effects of increasing temperature on plant N uptake acquired via N2 fixation or NO3<sup>−</sup> assimilation, a large amount of N was still available at maturity in vegetative organs (leaves, stems, pod walls, and roots), whatever the plant N nutrition pathway and the temperature treatment. N concentrations of vegetative organs at maturity were all above 16 mg g−<sup>1</sup>, much higher than the threshold of non-remobilizable N concentration (Larmure and Munier-Jolain, 2004). This result suggests that the shorter duration of seed-filling at high temperature was not due to a reduction of photosynthetic activity caused by N remobilization from vegetative organs to seeds. Indeed, the present study using a <sup>15</sup>NO3<sup>−</sup>-labeled N source clearly demonstrates a gradual limitation of the rate of endogenous-N remobilization from vegetative organs to filling seeds above 18.4°C. N remobilization was nevertheless the major contributor to the N filling of pea seeds whatever the temperature, consistent with previous observations at non-stressing temperatures in oilseed rape (Brassica napus) and in pea (Malagoli et al., 2005; Schiltz et al., 2005).

#### Sink Strength Determines Plant N Fluxes to Filling Seeds Under Heat Stress Conditions

Our results demonstrate a sink limitation of seed N accumulation by high temperatures (from 18.4 to 33.2°C). In fact, the additional N taken up at high temperature by NO3<sup>−</sup>-assimilating plants and delivered via the xylem was never allocated to seeds but was stored in leaves and, to a lesser extent, in stems. These findings are in line with the observation that the majority of seed N intake is attributable to the phloem (Pate and Hocking, 1978). This hypothesis of sink limitation at high temperature is consistent with (1) the shorter duration of seed-filling with increasing temperature observed in our study, which leads to a progressive premature reduction of seed sink; (2) the decrease of the individual seed dry matter accumulation rate with increasing temperature, which reduces seed sink; and (3) previous studies reporting a decrease in photoassimilate translocation to filling seeds at high temperatures due to reduced sink activity rather than source activity (Ito et al., 2009; Suwa et al., 2010; Kim et al., 2011). Early loss of individual seed sink activity at high temperature may result from a reduction of the activity of starch synthesis-related enzymes in the seed (Ito et al., 2009; Suwa et al., 2010; Yamakawa and Hakata, 2010; Kim et al., 2011). At high temperature, synthesis of hemicelluloses, cellulose, and starch in grain declines while sucrose accumulates (Ito et al., 2009; Suwa et al., 2010; Yamakawa and Hakata, 2010). While increasing temperatures might impede phloem transport, they also might hasten the preferential unloading of carbon (C) along the stem to meet locally increasing respiratory demand (Atkin and Tjoelker, 2003; Sevanto, 2014). The resulting enrichment in N relative to C in the phloem sap reaching the seeds would explain their higher N concentration (Layzell and Larue, 1982).

#### Definition of Plant Senescence Under Heat Stress and Strategies to Develop Cultivars Adapted to Higher Temperatures Due to Climate Change

The original results of our study shed new light on the regulation of N remobilization and the definition of senescence in plants subjected to abiotic stress, such as heat stress. At moderate temperatures, senescence is linked to N remobilization to filling seeds, a mechanism that compensates for the limitation of N uptake by roots (Hebbar et al., 2014). In contrast, this research established that heat-induced senescence (noticeable through the reduction of seed-filling duration) is, surprisingly, not associated with an acceleration of N remobilization to filling seeds. Under high temperature, the shorter duration of seed-filling with increasing temperature more likely results from alterations in various photosynthetic attributes and in the carbon budget than from remobilization of plant N resources to cope with the heat stress (Wahid et al., 2007; Mathur et al., 2014).

Our results demonstrate that seed N yield processes are and will continue to be very frequently sink-limited by high temperatures during the seed-filling period in the warming climate context. It is worth noting that under the current and future climate change context, the increased frequency of early heat waves is, and will often be, associated with water deficit in the field, resulting from decreased precipitation and/or increased evaporation (Dai, 2013; Sehgal et al., 2018). The combined effects of water deficit and heat stress on crops are more severe (Sehgal et al., 2018). Both abiotic constraints were previously reported to enhance assimilate remobilization from source to sink (Pic et al., 2002; Sehgal et al., 2018). On the contrary, our study using labeled nitrate demonstrates that N assimilate remobilization was reduced and most likely sink-limited under heat stress. Consequently, sustaining seed sink demand and preserving photosynthetic attributes of stressed plants during the seed-filling period should be promising strategies to maintain crop N production under the exacerbated combined heat and water-deficit stresses expected in the field due to ongoing climate change. Such improvements will require further investigations to elucidate how sink activity could be modulated under high temperature and water deficit. While water deficit can be mitigated by irrigation (Bueckert et al., 2015), few cultural practices are available to mitigate high-temperature stress. A better understanding of the mechanisms controlling C and N allocation to sinks is required to build robust, sustainable practices.

#### DATA AVAILABILITY STATEMENT

The datasets generated for this study are available on request to the corresponding author.

#### AUTHOR CONTRIBUTIONS

AL and NM-J designed the study. AL collected and analyzed the data. AL and NM-J interpreted the results. Both authors contributed to manuscript writing.

#### FUNDING

This research was supported by INRA, AgroSupDijon and a grant of UNIP (Union Nationale Interprofessionnelle des Plantes Riches en Protéines).

#### ACKNOWLEDGMENTS

The authors are grateful to Vincent Durey, Christian Jeudy, Patrick Mathey, Carole Reibel, Valentine Pelissier, Eric Pimet and Anne-Lise Santoni for their excellent technical assistance, and the greenhouse staff at INRA Dijon for managing the experiments. This research was supported by INRA, AgroSupDijon and a grant of UNIP (Union Nationale Interprofessionnelle des Plantes Riches en Protéines). The experimental protocol and the manuscript were greatly improved thanks to the valuable comments of colleagues from the UMR 1347 Agroécologie: Nathalie Colbach, Christian Jeudy, Delphine Moreau, Marion Prudent, Christophe Salon, Aude Tixier and Anne-Sophie Voisin.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01608/full#supplementary-material


Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Larmure and Munier-Jolain. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Genotype and Environment Effects on Prebiotic Carbohydrate Concentrations in Kabuli Chickpea Cultivars and Breeding Lines Grown in the U.S. Pacific Northwest

George Vandemark<sup>1\*</sup>, Samadhi Thavarajah<sup>2,3</sup>, Niroshan Siva<sup>2</sup> and Dil Thavarajah<sup>2</sup>

<sup>1</sup> Grain Legume Genetics and Physiology Research Unit, Washington State University, Pullman, WA, United States, <sup>2</sup> Plant and Environmental Sciences, Clemson University, Clemson, SC, United States, <sup>3</sup> Revelle College, University of California UC San Diego, La Jolla, CA, United States

Prebiotic carbohydrates are compounds that include simple sugars, sugar alcohols, and raffinose family oligosaccharides, which are fermented by gut bacteria and can influence the species profile of the gut microbiome to reduce obesity and weight gain. Prebiotic carbohydrates are also associated with several health benefits including reduced insulin dependence and incidence of colorectal cancer. Although pulse crops such as chickpea have been important sources of nutrition for human diets for thousands of years, relatively little is known about the profiles of prebiotic carbohydrates in pulse crops. The objectives of this study were to characterize the type and concentration of seed prebiotic carbohydrates in 18 kabuli chickpea genotypes grown in 2017 and 2018 in Idaho and Washington, and partition variance components conditioning these nutritional quality traits in chickpea. Genotype effects were significant for fructose, sucrose, raffinose, and kestose. Environment effects were also significant for several carbohydrates. However, year effects were the greatest sources of variance for all carbohydrates. Concentrations of most carbohydrates were significantly greater in 2017, when there was less precipitation during the growing season coupled with greater heat stress during grain filling than in 2018. This may reflect the role of many of these carbohydrates as osmoprotectants produced in response to heat and water stress. Overall, our results suggest that a survey of more genetically diverse plant materials, such as a chickpea 'mini-core' collection, may reveal genotypes that produce significantly greater concentrations of selected prebiotic carbohydrates and could be used to introduce desirable nutritional traits into adapted chickpea cultivars.

Keywords: biofortification, breeding, chickpea, gut microbiome, nutrition

#### Edited by:

Jose C. Jimenez-Lopez, Experimental Station of Zaidín (EEZ), Spain

#### Reviewed by:

Damián Maestri, National University of Cordoba, Argentina Paola Leonetti, Italian National Research Council, Italy

\*Correspondence:

George Vandemark george.vandemark@ars.usda.gov

Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 24 April 2019 Accepted: 24 January 2020 Published: 21 February 2020

#### Citation:

Vandemark G, Thavarajah S, Siva N and Thavarajah D (2020) Genotype and Environment Effects on Prebiotic Carbohydrate Concentrations in Kabuli Chickpea Cultivars and Breeding Lines Grown in the U.S. Pacific Northwest. Front. Plant Sci. 11:112. doi: 10.3389/fpls.2020.00112

#### INTRODUCTION

Chickpea (Cicer arietinum L.) was one of eight 'founder crops' domesticated 9,000–11,000 years ago by Neolithic communities in riparian zones along the Tigris and Euphrates rivers in what is now Turkey and Syria (Lev-Yadun et al., 2000). Currently chickpea is the third most important pulse crop in terms of global production, after dry bean (Phaseolus vulgaris L.) and dry pea (Pisum sativum L.), with over 14.7 million metric tons produced in 2017 (FAOSTAT, 2019). India is responsible for more than 80% of annual global production, with Myanmar, Ethiopia, Turkey, and Pakistan being other major producers (FAOSTAT, 2019).

Chickpeas can be divided into two major classes, 'kabuli' and 'desi', based on seed characteristics. Desi chickpeas have a 'teardrop' shape and tend to be smaller in size and have thicker and darker seed coats than kabuli chickpeas, which have a rounder shape and tend to be larger and lighter in color (Toker, 2009). Desi chickpeas are typically dehulled to remove seed coats and then split and cooked to produce dhal, or are ground to make flour, whereas kabuli chickpeas are usually cooked whole without removing seed coats and then used for salads, canned, or for making edible spreads such as hummus (Yadav et al., 2007).

The first chickpeas grown commercially in the U.S. were large, light-colored kabuli chickpeas known as 'Spanish White', which were grown in the San Joaquin Valley of southern California (Muehlbauer et al., 1982). Chickpea production began to expand in the 1980s to areas of Idaho and Washington where the predominant cropping system was dryland wheat and barley grown in rotation with lentil and pea. Commercial chickpea production in the U.S. consists almost entirely of kabuli chickpeas (Vandemark et al., 2014a). Currently chickpea is an important component of dryland production systems throughout the U.S. Pacific Northwest and Northern Plains. In 2017 more than 240,000 ha of chickpeas were harvested in the U.S. with a production value greater than \$200 million (NASS, 2019). In 2017, Washington and Idaho together accounted for approximately 51% of total U.S. chickpea production, while Montana and North Dakota together accounted for approximately 43% of total production (NASS, 2019).

Biofortification, a process by which crop plants are developed or managed to have higher concentrations of nutritional factors such as proteins, carbohydrates, or minerals, has been proposed as a way of improving human and animal nutrition (White and Broadley, 2005). Biofortification may be accomplished through management practices, the development of new cultivars with improved nutritional qualities through plant breeding, or a combination of management and genetic approaches (de Benoist et al., 2008). At least three billion people globally suffer from malnutrition caused by dietary deficiencies in iron (Fe) or zinc (Zn) (de Benoist et al., 2008; Wessells and Brown, 2012). Nutritional characterization of chickpea has largely been limited to determining seed concentrations of minerals (Bueckert et al., 2011; Ray et al., 2014; Vandemark et al., 2018) and dietary fiber (Chen et al., 2016). Non-genetic sources of variance including environment, year and their interactions have been found to have greater magnitudes of effect than genetic variance for several important minerals of global concern, including Fe, Mg, and Zn (Ray et al., 2014; Vandemark et al., 2018).

In contrast to health consequences associated with dietary deficiencies, excesses in food consumption, coupled with genetic and environmental factors, have resulted in increases in the global incidence of obesity, coronary artery disease (CAD), and diabetes. Prebiotic carbohydrates are compounds found in many food sources that have been associated with diverse health benefits (Carlson et al., 2018). The definition of 'prebiotic' in the scientific community has evolved over more than 20 years of discussion and research, and the most current definition is 'A nondigestible compound that, through its metabolism by microorganisms in the gut, modulates the composition and/or activity of the gut microbiota, thus conferring a beneficial physiologic effect on the host' (Bindels et al., 2015). Prebiotic carbohydrates include the simple sugars glucose and sucrose, several sugar alcohols (SA) including sorbitol and mannitol, fructooligosaccharides (FOS) such as kestose and nystose, and raffinose family oligosaccharides (RFOs), which include raffinose, stachyose, and verbascose (Peterbauer and Richter, 2001). Prebiotic carbohydrates are fermented by gut bacteria and can influence the species profile of the gut microbiome, including increasing the concentration of Bifidobacteria sp. that are associated with reduced obesity and weight gain (Schwiertz et al., 2010). Fermentation of prebiotic carbohydrates produces short chain fatty acids (SCFA) that are associated with several health benefits including reduced obesity and insulin dependence (Gao et al., 2009) and protection against development of colorectal cancer (Keku et al., 2015).

Significant genotype, location, and year effects have been detected for seed concentrations of several prebiotic carbohydrates in lentil (Lens culinaris L.), including sorbitol, mannitol, and verbascose (Johnson et al., 2013). However, the effects of genetic and non-genetic sources of variance on seed prebiotic carbohydrate concentrations have not been estimated for chickpea. Understanding these effects is essential for developing new chickpea cultivars that produce seed with higher concentrations of selected prebiotic carbohydrates across different environments. The objectives of this study were to characterize concentrations of seed prebiotic carbohydrates in 18 kabuli chickpea genotypes grown in Washington and Idaho and partition variance components conditioning these nutritional quality traits in chickpea.

#### MATERIALS AND METHODS

#### Plant Materials and Field Trials

This study examined 18 cafe kabuli chickpea entries (Table 1), which included five cultivars, Billy Beans, CDC Frontier, CDC Orion, Royal, and Sierra, and 12 breeding lines. All entries were planted at two locations, Genesee, ID (46.55° N, 116.92° W) and Pullman, WA (46.73° N, 117.18° W), in both 2017 and 2018. All seeds were treated before planting with fludioxonil (0.56 g kg−<sup>1</sup>, Syngenta, Greensboro, NC, USA), mefenoxam (0.38 g kg−<sup>1</sup>, Syngenta) and thiabendazole (1.87 g kg−<sup>1</sup>, Syngenta) to control fungal diseases, thiamethoxam (0.66 ml kg−<sup>1</sup>, Syngenta) for insect control, and molybdenum (0.35 g kg−<sup>1</sup>). Approximately 0.5 g Mesorhizobium ciceri inoculant (1 × 10<sup>8</sup> CFU g−<sup>1</sup>; Exceed, Cambridge, MA, USA) was applied to each seed packet one day before planting. Chickpeas were planted at a density of 43 seeds m−<sup>2</sup> in a 1.5 m × 6.1 m block (~430,000 seeds ha−<sup>1</sup>). All yield trials used a randomized complete block design with three replications. Weeds were controlled by a single post-plant/preemergence application of metribuzin (0.42 kg ha−<sup>1</sup>, Bayer Crop Science, Raleigh, NC) and linuron (1.34 kg ha−<sup>1</sup>, NovaSource, Phoenix, AZ, USA). All plots were exclusively rainfed and no supplemental irrigation was applied. Plots at Pullman were evaluated during the growing season for field traits including days to harvest maturity. Plots were mechanically harvested and seed yield (kg ha−<sup>1</sup>) was determined. Hundred seed weight (HSW) was determined for each entry at Pullman by taking the average weight (g) of 100 seeds from each of three replicate plots.

TABLE 1 | Mean# yield, hundred seed weight (HSW) and days to mature for chickpea cultivars and breeding lines grown at Pullman, WA and Genesee, ID in both 2017 and 2018.

\# Means within a column followed by the same letter are not significantly different (Tukey's HSD, α = 0.05).

£ Mean of yield trials conducted at Pullman, WA in 2017 and 2018.

#### Prebiotic Carbohydrates

Ground seed samples (500 mg) were placed in 15-ml polypropylene conical tubes, 10 ml ddH2O was added to each tube, and the tubes were incubated for 1 h at 80°C (Muir et al., 2009). Samples were centrifuged at 3,000×g for 10 min. An aliquot (1 ml) of the supernatant was diluted with 9 ml ddH2O, and the diluted supernatant was filtered through a 13 mm × 0.45 μm nylon syringe filter (Fisher Scientific, Waltham, MA, USA) prior to analysis. Prebiotic carbohydrate concentrations (SA, RFO, and FOS) were measured using high performance anion exchange chromatography (HPAE) (Dionex, ICS-5000, Sunnyvale, CA, USA) as previously described (Feinberg et al., 2009; Johnson et al., 2013). SA (sorbitol and mannitol), RFO (raffinose, stachyose, and verbascose), and FOS (kestose) were identified and quantified using pure standards (> 99%), and concentrations were detected within a linear range of 3 to 1,000 mg g−<sup>1</sup> with a minimum detection limit of 0.2 mg g−<sup>1</sup>. A lab reference (CDC Redberry lentil) was used to ensure the accuracy and reproducibility of detection. The peak areas of the external reference, glucose (100 ppm), SA (3–1,000 ppm), RFO (3–1,000 ppm), and FOS (3–1,000 ppm) were routinely analyzed for method consistency and detector sensitivity, with an error of less than 5%.
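As an illustration of how an instrument reading on the diluted extract relates to a seed concentration, the following sketch back-calculates mg of carbohydrate per g of ground seed from the extraction volumes given above. It is a simplified sketch, not the authors' calculation: it assumes the reading is expressed in mg L−<sup>1</sup> (ppm), that extraction is complete, that the 1 ml + 9 ml step is a 10-fold dilution, and that no moisture correction is applied. The function name and the example reading are hypothetical.

```python
def seed_concentration_mg_per_g(reading_mg_per_l,
                                sample_mass_g=0.5,
                                extract_volume_ml=10.0,
                                dilution_factor=10.0):
    """Back-calculate carbohydrate concentration in ground seed (mg per g).

    reading_mg_per_l : concentration measured in the diluted extract (mg/L)
    sample_mass_g    : mass of ground seed extracted (500 mg in the text)
    extract_volume_ml: volume of water used for extraction (10 ml)
    dilution_factor  : 1 ml of supernatant brought to 10 ml total
    """
    if sample_mass_g <= 0:
        raise ValueError("sample mass must be positive")
    extract_volume_l = extract_volume_ml / 1000.0
    # Mass of analyte in the whole undiluted extract, in mg.
    analyte_mass_mg = reading_mg_per_l * dilution_factor * extract_volume_l
    return analyte_mass_mg / sample_mass_g


# Hypothetical reading of 80 mg/L in the diluted extract.
example = seed_concentration_mg_per_g(80.0)
print(f"{example:.1f} mg per g of seed")
```

With the default volumes, each mg L−<sup>1</sup> in the diluted extract corresponds to 0.2 mg g−<sup>1</sup> in the seed sample.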

#### Resistant Starch

RS concentrations were determined as previously described (McCleary and Monaghan, 2002) using a commercial assay (Megazyme, 2012). Ground samples (500 mg) were incubated with 4 ml of 100 mM sodium malate (pH 6) containing α-amylase (10 mg ml−<sup>1</sup>) and amyloglucosidase (3 U ml−<sup>1</sup>) for 16 h in a water bath (37°C) with 200 strokes/min vertical shaking (Orbit shaker bath, Lab Line Instruments Inc., Melrose Park, IL, USA). After incubation, 4 ml of 95% ethanol were added, and the samples were centrifuged at 1,500×g for 10 min at room temperature. The pellets were re-suspended with 6 ml of ethanol (50% v/v), centrifuged, and decanted. The resuspension and centrifugation processes were done twice. Supernatants from the three centrifugations were pooled and brought to a volume of 100 ml in ddH2O. The pellets were dissolved in 2 ml of potassium hydroxide (2 M) in an ice bath (~0°C) while stirring with a magnetic stirrer for 20 min. The suspensions were diluted with 8 ml of sodium acetate buffer (1.2 M, pH 3.8), and 0.1 ml of 3,300 U ml−<sup>1</sup> amyloglucosidase was then immediately added, followed by incubation at 50°C for 30 min. The suspension was then centrifuged at 1,500×g for 10 min at room temperature. Aliquots (0.1 ml) of both the supernatant containing the RS fractions and the diluted washings containing the soluble starch (SS) fractions were transferred separately to 10-ml glass tubes. A reagent blank was prepared using 0.1 ml sodium acetate buffer (pH 4.5). An aliquot (3 ml) of GOPOD reagent was added to each tube, and the tubes were incubated in a water bath at 50°C for 20 min. Absorbance was measured using a spectrophotometer (Genesys 20, Thermo Scientific, NC, USA) at 510 nm. Starch fractions were calculated as follows:

$$\begin{aligned} \text{RS} &= \frac{X \times \text{Abs}_{\text{sample}}}{\text{Abs}_{\text{glucose}} \times W_{\text{sample}}}, \\ \text{SS} &= \frac{Y \times \text{Abs}_{\text{sample}}}{\text{Abs}_{\text{glucose}} \times W_{\text{sample}}}, \end{aligned}$$

where Abs<sub>sample</sub> and Abs<sub>glucose</sub> are the absorbance values of the sample and of glucose, respectively, each corrected against the reagent blank; W<sub>sample</sub> is the moisture-corrected weight of the sample; and X and Y are the dilution factors for RS and SS, respectively. Regular corn starch (RS concentration 1.0 ± 0.1% (w/w)) was used to verify the data, and batches were checked regularly to ensure an analytical error of less than 10%.
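The two starch-fraction expressions share the same form and differ only in the dilution factor, so they can be sketched as a single function. The example below is a minimal sketch of that calculation; the function name and the numeric inputs are hypothetical, and the units of the result depend on how the dilution factor is defined in the assay protocol.

```python
def starch_fraction(dilution_factor, abs_sample, abs_glucose, w_sample):
    """Starch fraction from blank-corrected absorbances.

    dilution_factor : X for resistant starch (RS) or Y for soluble starch (SS)
    abs_sample      : sample absorbance, corrected against the reagent blank
    abs_glucose     : glucose standard absorbance, corrected against the blank
    w_sample        : moisture-corrected sample weight
    """
    if abs_glucose <= 0 or w_sample <= 0:
        raise ValueError("glucose absorbance and sample weight must be positive")
    return (dilution_factor * abs_sample) / (abs_glucose * w_sample)


# Hypothetical inputs, for illustration only.
rs = starch_fraction(dilution_factor=2.0, abs_sample=0.5,
                     abs_glucose=1.0, w_sample=0.5)
ss = starch_fraction(dilution_factor=20.0, abs_sample=0.5,
                     abs_glucose=1.0, w_sample=0.5)
print(rs, ss)
```

Passing X gives RS and passing Y gives SS; the guard against a zero denominator is an addition for robustness and is not part of the published formula.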

#### Chemicals

Solvents and standards used for high performance anion exchange chromatography (HPAE) and enzymatic assays were purchased from Fisher Scientific (Asheville, NC, USA), Sigma-Aldrich (St. Louis, MO, USA), and VWR International (Satellite Blvd, Suwanee, GA, USA). Distilled and deionized water (ddH2O; NANO-pure Diamond, Barnstead, IA, USA) was used in these analyses.

#### Statistical Analysis

Entries (genotypes) were considered fixed factors and locations (environments), replications (blocks) within locations, and years were considered random factors. Combined ANOVA was conducted across both locations and years to detect effects of genotypes, environments, and their interactions. Entry means were compared between all pairs using Tukey's HSD test (α = 0.05). Pairwise correlations were determined between seed carbohydrate concentrations and yield from data combined across both locations and years, and correlations were also determined between carbohydrate concentrations, HSW and days to mature for data obtained at Pullman, WA in 2017 and 2018. All statistical analyses were performed with JMP software (SAS, Cary, NC, USA).
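For readers who wish to reproduce the pairwise correlation step outside JMP, a Pearson correlation coefficient can be computed as sketched below. This is an illustrative stand-in, not the authors' procedure: it assumes that the reported r values are Pearson coefficients, and the plot-level data are invented for demonstration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squares of deviations from the means.
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical plot-level data: yield (kg/ha) and a carbohydrate
# concentration (mg per 100 g); values are invented for illustration.
plot_yield = [1500.0, 1800.0, 2100.0, 2600.0, 3000.0]
carbohydrate = [95.0, 90.0, 70.0, 60.0, 40.0]

print(f"r = {pearson_r(plot_yield, carbohydrate):.2f}")
```

A negative r for such data would correspond to the pattern in which higher-yielding plots have lower seed carbohydrate concentrations.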

#### RESULTS

#### Chickpea Seed Carbohydrate Concentrations

Mean squares of combined analysis of variance for chickpea seed carbohydrate concentrations are presented in Table 2. Genotype effects were significant for fructose, sucrose, raffinose, and kestose, and were greatest for the simple sugars fructose and sucrose. Environment effects were also significant for several carbohydrates including sorbitol, glucose, fructose, kestose, and soluble starch, and were greatest for fructose, soluble starch, and glucose. Year effects were significant for all carbohydrates and were the greatest sources of variance for all carbohydrates. A significant genotype × environment effect was only observed for fructose. Significant genotype × year effects were observed for fructose and raffinose; however, the magnitudes of these effects were minor in comparison with year effects.

TABLE 2 | Mean squares of combined ANOVA, and coefficient of variation (CV) for concentrations of prebiotic carbohydrates in chickpea cultivars and breeding lines grown in Idaho and Washington# .


\# Study included 18 kabuli genotypes evaluated at two environments (Pullman, WA and Genesee, ID) in 2017 and 2018.

£ Resistant starch concentrations were only determined for samples harvested at Pullman and Genesee in 2017.

\* Significant at P < 0.05.

\*\* Significant at P < 0.001.

\*\*\* Significant at P < 0.0001.


TABLE 3 | Mean# concentrations of prebiotic carbohydrates for chickpea cultivars and breeding lines grown at Pullman, WA and Genesee, ID in both 2017 and 2018.

\# Means within a column followed by the same letter are not significantly different (Tukey's HSD, α = 0.05).

Environment × year effects were significant for all carbohydrates except sucrose, verbascose, and soluble starch. The greatest interaction effect for all carbohydrates was the environment × year effect. A significant genotype × environment × year effect was only observed for fructose.

The most abundant carbohydrate in chickpea seed was sucrose, which on average constituted greater than 1.6% of total seed weight, followed by stachyose and sorbitol (Table 3). Sucrose represented greater than 95% of total simple sugars (sucrose + fructose + glucose). Stachyose represented greater than 50% of total RFO (stachyose + raffinose + verbascose), which was the most abundant class of prebiotic carbohydrates. The least abundant carbohydrates in chickpea seed were fructose and mannitol. Concentrations of glucose and kestose were similar in chickpea seeds. Significant differences between means of chickpea entries were detected only for seed concentrations of sucrose. CA13900023C had a significantly higher sucrose concentration than CA13900046C, but no other significant differences were detected. Soluble starch on average constituted 41% of total seed weight and was approximately 10× more abundant than resistant starch.

Mean concentrations of carbohydrates across locations and years are presented in Table 4. For the majority of carbohydrates, including sorbitol, mannitol, glucose, sucrose, stachyose, raffinose, verbascose, and kestose, mean concentrations at both locations in 2017 were significantly greater than at both locations in 2018. Significant differences in mean concentrations of carbohydrates between Pullman-2017 and Genesee-2017 were only observed for mannitol and raffinose. Significant differences in mean concentrations between Pullman-2018 and Genesee-2018 were observed for several carbohydrates including sorbitol, glucose, fructose, raffinose, kestose, and soluble starch.

#### Correlations Between Carbohydrate Concentrations, Yield, HSW, and Days to Mature

Significant correlations (P < 0.05) between carbohydrate concentrations were observed for the majority of pairwise combinations, so only correlations with r ≥ 0.80 are noted here. The highest positive correlations between carbohydrate concentrations were observed between verbascose and sorbitol (r = 0.93), verbascose and stachyose (r = 0.92), stachyose and sorbitol (r = 0.88), stachyose and sucrose (r = 0.85), and verbascose and sucrose (r = 0.82).

TABLE 4 | Mean# concentrations by location and year of prebiotic carbohydrates for chickpea cultivars and breeding lines grown at Pullman, WA and Genesee, ID in both 2017 and 2018.

\# Means within a column followed by the same letter are not significantly different (Tukey's HSD, α = 0.05).

Correlations between seed carbohydrate concentrations and agronomic traits tended to be weaker than those observed between different carbohydrate concentrations. Correlations between carbohydrate concentrations and HSW or days to flower were either of relatively low magnitude (r < 0.40) or not significant. Correlations between carbohydrate concentrations and days to mature tended to be positive for most carbohydrates and were highest for sorbitol (r = 0.67) and verbascose (r = 0.65). However, significant negative correlations of appreciable magnitude were observed between several carbohydrate concentrations and plot yield. The strongest negative correlations with yield were observed for the RFOs verbascose (r = −0.80) and stachyose (r = −0.77), followed by the sugar alcohols sorbitol (r = −0.66) and mannitol (r = −0.65).
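As an illustration of how pairwise correlations of this kind can be screened against a reporting threshold, the following minimal Python sketch computes Pearson's r for every pair of traits and keeps the pairs whose magnitude is at least 0.80. It is not the analysis procedure used in the study: the function names (`pearson_r`, `strong_pairs`) and the plot-level numbers are hypothetical and invented for illustration, and the use of the absolute value of r (so that strong negative correlations are also kept) is an assumption of the sketch.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares of deviations from the means
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(ss_x * ss_y)


def strong_pairs(traits, threshold=0.80):
    """Return (trait_a, trait_b, r) for every pair with |r| >= threshold."""
    names = sorted(traits)
    result = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson_r(traits[a], traits[b])
            if abs(r) >= threshold:
                result.append((a, b, round(r, 2)))
    return result


# Hypothetical plot-level values, invented for illustration only
example = {
    "stachyose": [1.0, 1.2, 1.4, 1.1, 1.5],
    "verbascose": [0.10, 0.13, 0.15, 0.11, 0.16],
    "yield": [2000, 1700, 1500, 1900, 1400],
}

for trait_a, trait_b, r in strong_pairs(example):
    print(trait_a, trait_b, r)
```

A significance test (the P < 0.05 criterion above) is deliberately left out of the sketch; in practice it would be applied before the magnitude filter.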

#### DISCUSSION

Significant genotype effects were detected for several prebiotic carbohydrates (Table 2). However, non-genetic sources of variance including year effects and environment × year interaction effects were the greatest sources of variance for all carbohydrates (Table 2). These results suggest that only limited gains may be made in these traits using adapted parental materials. Minor genotype effects, or in many cases a lack of significant genotype effects are likely due in part to the relatively narrow genetic base present in the examined chickpea cultivars and breeding lines (Table 1). Three breeding lines are full-sibs derived from CA0469C020C/Dwelley and seven breeding lines share as a parent CA0469C020C, which has resistance to Ascochyta blight and is a full-sib line to CA0469C025C, a germplasm with improved disease resistance and high yield (Vandemark et al., 2014b).

Significant environment effects were detected for several prebiotic carbohydrates (Table 2). Although only two environments were examined, these results suggest improved understanding of factors contributing to environmental and management sources of variance may promote reliable production of more nutritious chickpeas. The absence of significant genotype × environment interaction effects observed in this study for all carbohydrates except fructose (Table 2) can likely be attributed to limited genetic variation between plant materials and similarities between the two test locations.

Year effects were the greatest source of variance for all carbohydrate concentrations (Table 2). For the majority of carbohydrates, mean concentrations in 2017 were significantly greater than in 2018 (Table 4). Monthly average temperatures and total monthly precipitation are presented in Table 5 for Pullman, WA and Genesee, ID during 2017 and 2018. Average temperatures early in the growing season (April and May) were warmer in 2018 than 2017 at Pullman and Genesee. However, average temperatures later in the growing season (July and August) were cooler in 2018 than 2017 at both locations. Both locations received more precipitation early in the growing season (April and May) in 2018 than 2017. These data suggest that the higher concentrations of many carbohydrates observed in 2017 may be the result of lower precipitation during the growing season coupled with greater heat stress later in the season (July and August) during grain filling. This may reflect the role of many of these compounds as osmoprotectants produced in response to heat and water stress.

Total RFO content in chickpea seed averaged 2.0% of dry weight, which is consistent with reports for other seeds ranging from 2 to 10% (Peterbauer and Richter, 2001). The most abundant RFO in chickpea seed was stachyose (Table 2). This is consistent with previous reports for other legume seeds, including dry bean (P. vulgaris L.) (McPhee et al., 2002) and soybean (Glycine max L.) (Kumar et al., 2010) for which stachyose was more abundant than raffinose.

A positive correlation with r > 0.80 was observed between seed concentrations of verbascose and stachyose. This likely reflects their shared RFO biosynthetic pathway in seeds, in which galactosylation of raffinose leads to production of stachyose, to which an additional galactosyl residue is transferred to produce verbascose (Peterbauer and Richter, 2001). Similarly high correlations were also observed between these two RFOs, sucrose, and sorbitol. The high correlations between sucrose, stachyose, and verbascose can also be explained by the role of sucrose as the first galactosyl residue acceptor in the RFO biosynthetic pathway. High correlations between sorbitol, stachyose, and verbascose likely reflect that, along with sucrose, SAs such as sorbitol are primary products of photosynthesis and a major source of translocated carbohydrate to seed (Slewinski and Braun, 2010).

TABLE 5 | Average monthly temperature and precipitation during growing season in Pullman<sup>#</sup>, WA and Genesee<sup>£</sup>, ID in 2017 and 2018.

<sup>#</sup> Data from Washington State University AgWeatherNet (https://weather.wsu.edu).

<sup>£</sup> Data from U.S. National Center for Climate Information (https://www.ncdc.noaa.gov).

Only minor or non-significant correlations were observed between seed carbohydrate concentrations and seed size (HSW). However, high negative correlations were observed between yield and concentrations of the RFOs verbascose and stachyose, and between yield and the SAs sorbitol and mannitol. Although RFOs primarily function to store carbon in seeds, they are also known to accumulate in response to abiotic stress factors including heat (Panikulangara et al., 2004) and drought (Downie et al., 2003). Sorbitol has been shown to accumulate in several plant species in response to various abiotic factors including osmotic (Pommerrenig et al., 2007) and drought stress (Li et al., 2012). Similarly, accumulation of mannitol has been shown to increase tolerance to drought stress in several plant species (Patonnier et al., 1999; Abebe et al., 2003). Climatic conditions that contributed to lower yields in 2017, including higher temperatures during grain filling and lower precipitation (Table 5), also likely resulted in higher seed concentrations of carbohydrates associated with drought and heat stress.

Identifying sources of genetic variation in chickpea for seed concentrations of prebiotic carbohydrates and understanding the magnitude of genotype, environment, and their interaction effects on these traits are important for accelerating progress in breeding more nutritious chickpea cultivars. In this study, non-genetic effects contributed more than genetic effects to total variation in carbohydrate concentrations, suggesting there is very limited genetic variation for these traits in the elite chickpea breeding lines and cultivars examined. However, a survey of more genetically diverse plant materials, such as a chickpea 'mini-core' collection (Upadhyaya and Ortiz, 2001), may reveal chickpea genotypes that produce exceptionally high concentrations of selected prebiotic carbohydrates and could be used to introduce desirable nutritional traits into adapted chickpea cultivars.

#### DATA AVAILABILITY STATEMENT

The datasets generated for this study are available on request to the corresponding author.

#### AUTHOR CONTRIBUTIONS

GV and DT conceived this work. GV planned and carried out field experiments including data collection, harvesting and cleaning seed samples. DT directed laboratory work to determine prebiotic carbohydrate profiles, maintained equipment for high performance anion exchange chromatography (HPAE), and analyzed data. ST and NS performed laboratory work and collected data. GV performed statistical analysis. GV and DT drafted the manuscript. All authors read and approved the manuscript.

#### FUNDING

This work was funded by a U.S. Department of Agriculture, Agricultural Research Service Pulse Crop Health Initiative competitive grant ('Improving the nutritional value of chickpeas').

#### REFERENCES

radicle protrusion is prevented. Plant Physiol. 131, 1347–1359. doi: 10.1104/pp.016386

drought stress. Plant Mol. Biol. Rep. 30, 123–130. doi: 10.1007/s11105-011-0323-4


Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Vandemark, Thavarajah, Siva and Thavarajah. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Advances in Crop Improvement and Delivery Research for Nutritional Quality and Health Benefits of Groundnut (Arachis hypogaea L.)

Chris O. Ojiewo<sup>1</sup>\*, Pasupuleti Janila<sup>2</sup>, Pooja Bhatnagar-Mathur<sup>2</sup>, Manish K. Pandey<sup>2</sup>, Haile Desmae<sup>3</sup>, Patrick Okori<sup>4</sup>, James Mwololo<sup>4</sup>, Hakeem Ajeigbe<sup>5</sup>, Esther Njuguna-Mungai<sup>1</sup>, Geoffrey Muricho<sup>1</sup>, Essegbemon Akpo<sup>1</sup>, Wanjiku N. Gichohi-Wainaina<sup>4</sup>, Murali T. Variath<sup>2</sup>, Thankappan Radhakrishnan<sup>6</sup>, Kantilal L. Dobariya<sup>7</sup>, Sandip Kumar Bera<sup>6</sup>, Arulthambi Luke Rathnakumar<sup>6</sup>, Narayana Manivannan<sup>8</sup>, Ragur Pandu Vasanthi<sup>9</sup>, Mallela Venkata Nagesh Kumar<sup>10</sup> and Rajeev K. Varshney<sup>2</sup>

<sup>1</sup> Research Program – Genetic Gains, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Nairobi, Kenya, <sup>2</sup> Research Program – Genetic Gains, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India, <sup>3</sup> Research Program – West and Central Africa, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Bamako, Mali, <sup>4</sup> Research Program – Eastern and Southern Africa, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Lilongwe, Malawi, <sup>5</sup> Research Program – West and Central Africa, International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Kano, Nigeria, <sup>6</sup> Indian Council of Agricultural Research - Directorate of Groundnut Research (ICAR-DGR), Junagadh, India, <sup>7</sup> Main Oilseeds Research Station, Junagadh Agricultural University (JAU), Junagadh, India, <sup>8</sup> National Pulses Research Center, Tamil Nadu Agricultural University (TNAU), Pudukkottai, India, <sup>9</sup> Regional Agricultural Research Station, Acharya NG Ranga Agricultural University (ANGRAU), Tirupati, India, <sup>10</sup> Department of Genetics and Plant Breeding, Professor Jayashankar Telangana State Agricultural University (PJTSAU), Hyderabad, India

#### Edited by:

Karam B. Singh, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

#### Reviewed by:

Gijs A. Kleter, Wageningen University & Research, Netherlands

Vivekanand Tiwari, Agricultural Research Organization (ARO), Israel

#### \*Correspondence:

Chris O. Ojiewo c.ojiewo@cgiar.org

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 01 June 2019 Accepted: 13 January 2020 Published: 21 February 2020

#### Citation:

Ojiewo CO, Janila P, Bhatnagar-Mathur P, Pandey MK, Desmae H, Okori P, Mwololo J, Ajeigbe H, Njuguna-Mungai E, Muricho G, Akpo E, Gichohi-Wainaina WN, Variath MT, Radhakrishnan T, Dobariya KL, Bera SK, Rathnakumar AL, Manivannan N, Vasanthi RP, Kumar MVN and Varshney RK (2020) Advances in Crop Improvement and Delivery Research for Nutritional Quality and Health Benefits of Groundnut (Arachis hypogaea L.). Front. Plant Sci. 11:29. doi: 10.3389/fpls.2020.00029

Groundnut is an important global food and oil crop that underpins agriculture-dependent livelihood strategies meeting food, nutrition, and income security. Aflatoxins pose a major challenge to increased competitiveness of groundnut, limiting access to lucrative markets and affecting populations that consume it. Other drivers of low competitiveness include allergens and limited shelf life occasioned by a low oleic acid profile in the oil. Thus, grain off-takers such as consumers, domestic and export markets, and processors need solutions to increase profitability of the grain. There are some technological solutions to these challenges, and this review paper highlights advances in crop improvement to enhance groundnut grain quality and nutrient profile for food, nutrition, and economic benefits. Significant advances have been made in setting the stage for marker-assisted allele pyramiding for different aflatoxin resistance mechanisms (in vitro seed colonization, pre-harvest aflatoxin contamination, and aflatoxin production), which, together with pre- and post-harvest management practices, will go a long way in mitigating the aflatoxin menace. A breakthrough in aflatoxin control is in sight with overexpression of antifungal plant defensins and through host-induced gene silencing in the aflatoxin biosynthetic pathway. Similarly, genomic and biochemical approaches to allergen control are making good progress, with the identification of homologs of the allergen-encoding genes and development of a monoclonal antibody-based ELISA protocol to screen for and quantify major allergens. Double mutation of the allotetraploid homeologous genes, FAD2A and FAD2B, has shown potential for achieving >75% oleic acid, as demonstrated among introgression lines. Significant advances have been made in seed systems research to bridge the gap between trait discovery, deployment, and delivery through innovative partnerships and action learning.

Keywords: aflatoxin, allergens, Arachis hypogaea, crop improvement, groundnut, oleic acid, science of delivery

#### INTRODUCTION

Groundnut is an invaluable source of protein, calories, essential fatty acids, vitamins, and minerals for human nutrition (Willett et al., 2019). Groundnut consumption is reported to be associated with several health benefits (Kris-Etherton et al., 2008; Sabate et al., 2010; Guasch-Ferré et al., 2017). Indeed, a recent Lancet-commissioned publication concludes that transformation to healthy diets by 2050 requires substantial dietary shifts, including a more than 100% increase in consumption of healthy foods such as nuts, fruits, vegetables, and legumes (Willett et al., 2019). Higher consumption of total and specific types of nuts was found to be inversely associated with total cardiovascular disease and coronary heart disease (Guasch-Ferré et al., 2017). A qualified health claim linking early groundnut introduction to a reduced risk of developing groundnut allergy has been acknowledged by the Food and Drug Administration (FDA) (FDA, 2017).

Groundnut is a rich source of dietary protein, with the ability to meet up to 46% of the recommended daily allowance, as well as of essential vitamins (especially vitamin E), energy from its oils and fats, and dietary fiber. It is also a rich source of minerals such as K, Na, Ca, Mn, Fe, and Zn, among others, and of biologically active compounds (arginine, resveratrol, phytosterols, and flavonoids). Zinc in particular is one of the limiting micronutrients among rural households in Africa, especially affecting infants and young persons (Wessells and Brown, 2012). This explains why groundnut is a valuable base for therapeutic foods. The World Health Organization of the United Nations encourages consumption of groundnut-based "ready-to-use therapeutic foods" (RUTF) for community-based treatment of severe malnutrition. For example, Plumpy'Nut®, used to treat severe acute malnutrition in children, is i) calorie-dense and high in proteins, vitamins, and minerals; ii) simple to deliver and administer without training; iii) fast acting; iv) affordable; v) culturally acceptable; vi) packed in single-serve packets; vii) in need of little preparation before use; viii) equipped with adequate shelf-life and stability; ix) storable in varied climatic conditions and temperatures; x) resistant to bacterial contamination; and xi) not causative of addiction in children (Ojiewo et al., 2015).

In Ghana, groundnut has been identified as a nutrient-dense food with high capacity to deliver nutrition and income outcomes to producers and consumers. The crop scored highly for nutrition quality, affordability, acceptability, integrity, and business/investment interest (Anim-Somuah et al., 2013). In Malawi and Tanzania, groundnut is prioritized as a crop for diversifying the economy and is included in the national investment strategy and national agricultural development blueprint (Anonymous, 2014). Although groundnut is a cheap source of dietary benefits, per capita consumption is still low even among major producers such as Malawi and Tanzania. For example, our evaluation studies show that per capita consumption in Mtwara, Southern Tanzania, a major groundnut-producing area, is low. In a sample of 224 farmers in Mtwara, only 3% of mothers reported feeding their children (6–23 months old) food containing groundnut, yet animal protein consumption in the same age group was less than 1%. In the same study, similar trends were observed in consumption patterns among women of reproductive age, for whom groundnut could serve as a vital source of iron and zinc. In Malawi, per capita consumption is 8 kg, having doubled from 4 kg in the mid-2000s after intervention by the International Crops Research Institute for the Semi-Arid Tropics (ICRISAT). In fact, groundnut production and consumption in Malawi are relatively higher than in neighboring countries, with 47.9% of households consuming groundnut food products more than four times a week (Seetha et al., 2018a).

In another study in Malawi among respondents of diverse genders and regions, more than 70% reported consuming groundnuts in different forms at least three times a week (Gama et al., 2018). Data from Nigeria also indicate the highest per capita consumption among the top 20 groundnut consumers surveyed (International Nut and Dried Fruit, 2018). These studies, however, report consumption either among a mixed group of adults or in the general population, with no stratification of consumption among nutritionally vulnerable groups such as infants, young children, and women of reproductive age. Promotion of consumption among these groups is still warranted and vital, as they bear the burden of deficiency in iron and zinc, minerals in which groundnuts are rich.

If groundnut consumption is to be increased in areas where micronutrient deficiencies remain of public health importance, crop improvement must address productivity and nutritional quality challenges. Thus, concerted efforts are needed to develop high-yielding, nutrient-dense, and market-preferred groundnut varieties.

Equally important is the need to improve the functionality of groundnut seed systems to increase access to and adoption of improved varieties. Effective and efficient seed systems (seed value chains) support delivery of, and access to, improved crop varieties on time and at an affordable price. They also support planning of demand and supply from the farm to national levels, a critical point for seed security (Sperling, 2008; McGuire and Sperling, 2016). Seed is the vehicle that delivers all the millions of base pairs of DNA that pattern into the genome of a plant expressed in its phenome. Upstream research should therefore work with delivery in mind and form strategic partnerships that create real impacts on the ground besides high-impact publications<sup>1,2</sup>. In this review, we highlight some of the advances in crop improvement efforts related to nutritional and oil quality and address key challenges of groundnut grain nutrient quality, with a focus on aflatoxin contamination and allergens. We also highlight efforts to link upstream research and delivery of nutritionally superior groundnut varieties for food, nutrition, and income security.

<sup>1</sup> https://www.nature.com/articles/d41586-018-01642-w

<sup>2</sup> https://www.nature.com/articles/d41586-019-01643-3

#### METHODS, LIMITATIONS, AND BIAS

This synthesis paper highlights major efforts, achievements, lessons learned, challenges, and gaps in the process from development to delivery of nutrient-dense and safe groundnut. For most of the work on development of low-aflatoxin, low-allergen, and high-oleic-acid groundnut, the emphasis is on work done by ICRISAT together with its network of national and international partners. A significant portion of the literature cited and data reported is published work stemming from major projects hosted by ICRISAT at various times.

ICRISAT hosted a large project, "Tropical Legumes: Improving Livelihoods for Smallholder Farmers: Enhanced Grain Legume Productivity and Production in Sub-Saharan Africa and South Asia," implemented in 2007–2010 (Phase 1), 2011–2014 (Phase 2), and 2015–2019 (TLIII). The Tropical Legumes projects put emphasis on developing, testing, and promoting improved crop cultivars to enhance legume productivity and production in the drought-prone areas of target regions and countries. Besides other areas of emphasis, the partners made concerted efforts to develop cultivars tolerant to drought and to address the major production and consumption constraints, including the aflatoxin challenge, groundnut allergens, and lipid profile, using approaches such as marker-assisted selection and genetic engineering. ICRISAT also hosted the Consultative Group on International Agricultural Research (CGIAR) Research Program on Grain Legumes (CRP-GL), which progressed into the CGIAR Research Program on Grain Legumes and Dryland Cereals (CRP-GLDC), with work packages covering crop improvement of groundnut. Results from these projects, together with some of their precursors and successors, form the bulk of the literature cited here, which explains the bias toward ICRISAT. This synthesis paper also includes limited literature on other major groundnut research as examples that can be referred to in the process of mainstreaming orphaned crops.

#### ADVANCES IN CROP IMPROVEMENT TO MITIGATE GROUNDNUT AFLATOXIN CONTAMINATION

Aflatoxin is a dangerous mycotoxin produced by the fungus Aspergillus flavus Link : Fr, from which it draws its name. Aflatoxin contamination is particularly common in starchy agricultural food products because of the ubiquitous nature of Aspergillus species, saprophytes that infect crop products, especially grain, from the field through storage, producing aflatoxins in the process (Njoroge et al., 2017; Seetha et al., 2018b). Aflatoxin-laced food products compromise both nutritional benefits and trade.

Studies suggest that aflatoxin is carcinogenic, immunosuppressive (reduction of the activation or efficacy of the immune system), hepatotoxic (liver toxicity), and teratogenic (abnormalities of physiological development) and therefore has adverse impacts on human and animal health, affecting nutrition and trade in many African and Asian countries (Amaike and Keller, 2011; Kensler et al., 2011; Monyo et al., 2012; Kamika et al., 2014; Mupunga et al., 2014; Njoroge et al., 2016; Njoroge et al., 2017; Agbetiameh et al., 2018; Norlia et al., 2018; Lien et al., 2019). Exposure to aflatoxins, particularly aflatoxin B1 (AfB1), is associated with increased risk of developing cirrhosis and liver cancer (Chu et al., 2017). In Africa in particular, children are exposed to aflatoxin contamination in utero, throughout the weaning period, and beyond (Turner et al., 2007; Khlangwiset et al., 2011; Watson et al., 2018; Seetha et al., 2018b). In a recent survey in Northern Nigeria (Ajeigbe et al., 2018), AfB1 concentrations in kuli kuli, a groundnut product widely consumed in different forms by a vast majority of Nigerians, ranged between 4.10 and 268.00 mg/kg. Indeed, 87–100% of kuli kuli consumed in Nigeria is contaminated by aflatoxin. The situation for several other groundnut-based products is not very different from that of kuli kuli. For example, between 91 and 96% of roasted groundnut sold at different locations across Nigeria is contaminated by AfB1, with concentrations ranging between 1 and 65 mg/kg.

Mitigating exposure to aflatoxins positively impacts growth of children (Seetha et al., 2018b). Given that groundnut is a common weaning food in many rural farming households, reducing exposure to aflatoxins will support achievement of Sustainable Development Goal (SDG) 2 (ending hunger and all forms of malnutrition by 2030) and the WHO goal of reducing stunting of children under 5 years by 40%. Given the evidence showing an association between aflatoxin exposure and stunting, aflatoxin contamination of nutrient-dense crops such as groundnut needs to be addressed to break this vicious link. Additionally, epidemiological studies have demonstrated a strong link between AfB1 consumption and cancer occurrence as well as liver toxicity, further compounding the evidence on the negative health effects of aflatoxin contamination.

A recent study conducted in the Democratic Republic of Congo indicated that consumer awareness of the dangers of aflatoxin contamination in groundnut, and of measures to mitigate it, is still limited (Udomkun et al., 2018). Similarly, in a study conducted in 2018 by ICRISAT, 80.7% and 70% of households surveyed in Tanzania and Malawi, respectively, indicated that they had seen green moldy grain. However, only 3.3% in Tanzania and 50% in Malawi had heard about aflatoxin contamination. The higher level of aflatoxin awareness in Malawi is mainly due to concerted efforts of the National Smallholder Farmers' Association of Malawi (NASFAM) backstopped by ICRISAT-Malawi and national groundnut program researchers (Ojiewo et al., 2018a). Generally, there seems to be a very strong disconnect between the efforts that upstream researchers are making on aflatoxin control and the intended users of research outputs. The same could be said of other aspects of crop improvement research, not only in groundnut but in many other crops. Nevertheless, as shown in this article, ICRISAT together with its partners is working to bridge this disconnect and has already achieved success by integrating genomics in breeding to develop disease-resistant (Varshney et al., 2014), high-oleate (Janila et al., 2016; Bera et al., 2018), and low-allergen lines (Pandey et al., 2019c), among others.

The use of host-plant resistance to A. flavus offers a cost-effective and environmentally sound management strategy for mitigation of the aflatoxin threat in groundnut. Aflatoxins have no phytotoxicity but high mammalian toxicity, hence the absence of any plant metabolism to mitigate their production in planta. Three major resistance mechanisms that reduce infection of grain have been identified: resistance to in vitro seed colonization (IVSC), reduced pre-harvest aflatoxin contamination (PAC), and reduced aflatoxin production (AP). These resistances can be broadly classified as resistance to pod infection (pod wall), seed invasion and colonization (seed coat), and aflatoxin production (cotyledons). While resistance to pod infection is attributed to physical barriers due to the pod-shell structure, resistance to seed invasion and colonization is correlated with density and thickness of palisade cell layers, presence of fungistatic phenolic compounds and wax layers, and absence of microscopic fissures and cavities. These resistance components are highly variable and independent, appear to be governed by different genes with no significant relationships among them, and have been the focus of breeding efforts to identify resistant genotypes (Nigam et al., 2009). Stable resistance can be achieved by accumulating favorable alleles for IVSC, PAC, and AP, in addition to deployment of pre- and post-harvest management practices (Pandey et al., 2019a). Studies to identify groundnut genotypes that experience low aflatoxin contamination have been conducted by ICRISAT and partners for several years using genebank and other germplasm, with slow and limited progress. Nevertheless, they have led to identification of material such as ICGV 88145, ICGV 89104, ICGV 91278, ICGV 91283, ICGV 91284 (Nigam et al., 2009), 73-33, ICGV 89063, ICGV 89112, J11 and 55-437 (Mayeux et al., 2003), and ICG 23. J11 and 55-437, released in West Africa, are known to accumulate low pre-harvest levels of aflatoxin (Mayeux et al., 2003). However, no aflatoxin-resistant varieties have been released yet (Arias et al., 2018; Desmae et al., 2019).

ICRISAT's groundnut breeding programs in Malawi, Mali, and India have initiated breeding for low aflatoxin contamination in groundnut, developing populations and screening lines. At ICRISAT-Malawi, development of populations using eight popular varieties in the region and three sources of aflatoxin resistance started in 2012. The eight popular varieties were CG 7, Pendo, ICGV-SM 90704, JL 24, ICGV-SM 01721, ICGV-SM 01711, ICGV-SM 99557, and ICGV-SM 99555; the three resistance sources were J11, ICGV 95494, and Ah 7223. A second set using ICG 23, ICG 6402, and ICG 1122 as donor male parents for low aflatoxin contamination has been developed and is at the F7 stage, from which 32 lines were selected in 2018 to constitute a new population using a three-way cross (Table 1). These new sources are additionally tolerant to drought and early maturing.

TABLE 1 | Populations developed involving eight female high aflatoxin contamination genotypes and three low aflatoxin male genotypes to introgress low aflatoxin accumulation in the elite eight material.

Similarly, at ICRISAT-Mali, screening of ICRISAT's groundnut mini core accessions was conducted between 2008 and 2013, resulting in identification of sources of low aflatoxin contamination, especially with respect to pre-harvest aflatoxin contamination. The accessions are ICG 13603, ICG 1415, ICG 14630, ICG 3584, ICG 5195, ICG 6703, and ICG 6888 (Waliyar et al., 2016). Some of these materials have been used in population generation, with more than 130 populations developed between 2015 and 2018. At ICRISAT-India, multiparent advanced generation inter-cross (MAGIC) populations have been developed by crossing eight genotypes possessing low contamination conditioned by at least one of the three mechanisms (Pandey et al., 2019a). These parents include ICGV 88145, ICGV 89104, U4-7-5, VRR 245, ICG 51, ICGV 12014, ICGV 91278, and 55-437. The MAGIC lines have been phenotyped for two seasons for PAC and AP, in addition to genotyping with the high-density 58K Axiom\_Arachis single nucleotide polymorphism (SNP) array (Pandey et al., 2017a). Further genetic studies using association mapping and further characterization of highly resistant lines are in progress.

Advances in genomics provide an unprecedented opportunity for improving resistance to A. flavus infection and its associated aflatoxin contamination. Such genomic tools provide an opportunity to address the high genotype-by-environment interaction during trial evaluations that has slowed genetic gain (Nigam et al., 2009). Genomic advances include sequencing of the groundnut diploid progenitors (Bertioli et al., 2016; Chen et al., 2016; Lu et al., 2018) and the cultivated tetraploid groundnut (Bertioli et al., 2019; Chen et al., 2019; Zhuang et al., 2019), which provides a scaffold for decoding the genetics of host resistance to A. flavus pathogenesis and its associated aflatoxin metabolite production. Additionally, advances in biotechnology, especially recombinant DNA approaches in groundnut that overexpress the antifungal plant defensins MsDef1 and MtDef4.2 against A. flavus pathogenesis, and host-induced gene silencing (HIGS) of the aflM and aflP genes (Sharma et al., 2018) from the aflatoxin biosynthetic pathway (see next section for details), provide promise of deployment of plant- and pathogen-derived defense systems to reduce or eliminate aflatoxin contamination in groundnut.

#### Aflatoxin Mitigation in Groundnut Using Host‐Induced Gene Silencing and Transgenic Approaches: Technology and Translation

Given that aflatoxins are not phytotoxic, it is improbable that host-pathogen co-evolution has generated natural mechanisms of resistance. However, infection by A. flavus/Aspergillus parasiticus poses a threat to the seed and its precious embryo and, therefore, to the plant's transmission of its genes to the next generation. Thus, exploiting pathogen/host interactions to minimize pre-harvest infection has received renewed attention. A second alternative is to focus on genetic engineering of pathogenicity genes to interfere with aflatoxin metabolism in the fungus. Deployment of such genes requires elucidation of aflatoxin metabolism in Aspergillus. Progress, though, has been slow owing to the limited understanding of the resistance mechanism and associated markers (Luo et al., 2009). The use of "competitive atoxigenic" fungal technology (CAFT), which deploys promiscuous atoxigenic Aspergillus strains, has been fairly successful in reducing levels of aflatoxin contamination in maize (Atehnkeng et al., 2014), but the same may not be applicable in groundnut, a subterranean legume where mold growth on the grain reduces quality. This scenario necessitated the deployment of a genetic engineering approach to develop transgenic resistance at ICRISAT, where a two-pronged strategy was used. High levels of immunity to A. flavus infection and colonization were achieved by overexpression (OE) of antifungal plant defensins from alfalfa, and through HIGS of the aflatoxin metabolism (Figure 1).

By expressing double-stranded RNA molecules of Aspergillus in the groundnut host system, the fungal toxin production pathway was interrupted, making the fungus incapable of aflatoxin production and accumulation (Bhatnagar-Mathur et al., 2015). The chimeric genes were designed for localization to extracellular spaces and the endoplasmic reticulum. The OE-Def events showed higher expression of defensins at different pod development stages and maintained steady transcript abundance (up to 70-fold) of the respective defensin until 72 h post inoculation (hpi), compounding resistance to fungal growth. Furthermore, OE-Def events showed reduced conidiophore length and conidial head width and had very low fungal load compared to the wild-type control. Similarly, HIGS vectors carried cauliflower mosaic virus (CaMV) 35S promoter-regulated hairpin-RNA (hp-RNA) cassettes comprising synthetic DNA incorporating sections of the aflP/omtA and aflM/ver-1 genes, cloned as inverted repeats around the PR10 intron and used for transformation (Bhatnagar-Mathur et al., 2015; Sharma et al., 2018).

Molecular analysis confirmed gene integration and expression in the events, and aflatoxin B1 (AfB1) was estimated using high-performance liquid chromatography (HPLC). Significant reductions in transcription of early, middle, and late pathway genes were observed in both infected OE-Def and HIGS lines. OE-Def and HIGS lines maintained reactive oxygen species (ROS) homeostasis, a critical pathogenesis mechanism (Liu et al., 2010; Jun and Zhulong, 2015), possibly through positive regulation of the transcription of SOD and CAT genes. Several events accumulated <4 ppb AfB1 compared to >2,000 ppb detected in controls, indicating very high levels of resistance to aflatoxin contamination.

Progeny from six promising transformation events, assayed for A. flavus infection and subsequent aflatoxin content, revealed high levels of consistency, exhibiting trait stability across successive generations. In fact, the stable defensin and HIGS transformation events exhibited a large reduction in aflatoxin contamination, accumulating 0.5–4 ppb of AfB1 compared to >2,000 ppb in the wild type (Bhatnagar-Mathur et al., 2015; Sharma et al., 2018). Experiments are underway to introgress these "traits" into elite backgrounds. Preliminary fungal bioassays with F2 seeds derived from eight cross combinations also demonstrated very low levels of aflatoxin (<10 ppb) compared to wild-type counterparts, providing reasonable confidence to initiate deployment in groundnut breeding pipelines.
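To put the reported figures in perspective, the short sketch below converts the AfB1 values quoted above into a percent reduction and a fold change. The function and constant names are illustrative and not taken from the cited studies. Because the source reports bounds (<4 ppb in the transgenic events, >2,000 ppb in controls), the results should be read as lower bounds, not as measured values.

```python
def percent_reduction(control_ppb: float, treated_ppb: float) -> float:
    """Percent reduction of the treated value relative to the control."""
    if control_ppb <= 0:
        raise ValueError("control_ppb must be positive")
    return 100.0 * (control_ppb - treated_ppb) / control_ppb


def fold_reduction(control_ppb: float, treated_ppb: float) -> float:
    """How many times lower the treated value is than the control."""
    if treated_ppb <= 0:
        raise ValueError("treated_ppb must be positive")
    return control_ppb / treated_ppb


# Bounds quoted in the text (assumption: used here as point values).
CONTROL_LOWER_BOUND_PPB = 2000.0  # controls accumulated more than this
EVENT_UPPER_BOUND_PPB = 4.0       # several events accumulated less than this

min_percent = percent_reduction(CONTROL_LOWER_BOUND_PPB, EVENT_UPPER_BOUND_PPB)
min_fold = fold_reduction(CONTROL_LOWER_BOUND_PPB, EVENT_UPPER_BOUND_PPB)

print(f"AfB1 reduction of at least {min_percent:.1f}% "
      f"(at least {min_fold:.0f}-fold)")
```

Under these bounds the reduction is at least 99.8%, or at least 500-fold.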

Furthermore, a new set of HIGS lines carrying four hp-RNAs is currently under development to silence multiple genes in A. flavus. These lines generate multi-target RNA interference (RNAi) signals against genes involved in the transcriptional regulation of genes required for the developmental processes of sclerotium morphogenesis and conidiation, in addition to the genes that regulate aflatoxin production. Preliminary results with groundnut seed carrying the multi-target RNAi signal in the T1 and T2 generations showed a significant decrease in fungal colonization and aflatoxin production. This suggests that RNAi-mediated down-regulation of genes vital for fungal growth and aflatoxin production would be effective in enhancing aflatoxin resistance in groundnut plants.

# HIGH OLEIC GROUNDNUT VARIETIES FOR FOOD AND NUTRITION SECURITY

The high oleic trait is an important quality parameter that determines the flavor, stability, shelf life, and nutritional quality of groundnut and groundnut products. High oleic groundnut is preferred by the food processing and edible oil industries for its extended shelf life and high quality, respectively. Groundnut oil and processed food products made using high oleic grain have a 10-fold longer shelf life than those made from regular groundnut (O'Keefe et al., 1993; Braddock et al., 1995). Oxidative rancidity is common in oils with high levels of polyunsaturated fatty acids because their carbon-carbon double bonds degrade over time, producing acids, aldehydes, ketones, and hydrocarbons (Moore and Knauft, 1989). Increased consumption of high-oleate groundnuts, as compared to diets without groundnuts, has also been linked to improved cardiac health (Guasch-Ferré et al., 2017). Replacing frying oils that have a high content of polyunsaturated fatty acids (PUFA) with high oleic groundnut oil will lead to better stability of frying oils, without the negative cardiac health impacts of trans fatty acids from hydrogenated oils or the peroxide and polyaromatic hydrocarbon formation that occurs in PUFA oils during prolonged heating (Kratz et al., 2002).

Two fatty acids, oleic acid (a monounsaturated fatty acid, MUFA) and linoleic acid (a PUFA), account for up to 80% of groundnut oil. The remaining fatty acids, including palmitic, stearic, arachidic, gadoleic, behenic, and lignoceric acids, constitute 20%, with palmitic acid, a saturated fatty acid, alone contributing 10% (Kavera et al., 2014). High oleic groundnut varieties have a mutated form of the fatty acid desaturase (FAD) gene. This gene encodes the enzyme delta-12-desaturase (oleoyl-PC desaturase), which catalyzes the addition of a second double bond to oleic acid to produce linoleic acid. If the enzyme is inactivated, oleic acid accumulates in the oil bodies, resulting in oleic acid contents of more than 80%, while the linoleic acid content remains around 2–5%. Because groundnut is an allotetraploid, there are two homeologous gene sequences (FAD2A and FAD2B), believed to originate from the genomes of the two progenitor species, Arachis duranensis and Arachis ipaensis (Bertioli et al., 2019; Chen et al., 2019; Zhuang et al., 2019). A mutation in only one of these genes leads to a small increase in oleic acid content, to over 60% (Nawade et al., 2016). However, the presence of mutant alleles of both the FAD2A and FAD2B genes is essential for achieving >75% oleic acid (Pandey et al., 2014), as has clearly been observed among introgression lines developed using allele-specific markers (Janila et al., 2016).
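The two-locus logic described above can be sketched as a small genotype classifier of the kind a marker-assisted selection workflow might use. This is a minimal illustration under stated assumptions: the two-letter genotype encoding ("W" for wild type, "m" for mutant), the function names, and the category labels are all hypothetical, and treating only double homozygous mutants as fixed for the trait is a simplification, since the source states only that mutant alleles at both genes are required for >75% oleic acid.

```python
WILD = "W"    # hypothetical code for the wild-type allele
MUTANT = "m"  # hypothetical code for the mutant allele


def mutant_dose(genotype: str) -> int:
    """Count mutant alleles (0, 1, or 2) in a two-letter genotype call."""
    if len(genotype) != 2 or any(a not in (WILD, MUTANT) for a in genotype):
        raise ValueError(f"unrecognised genotype call: {genotype!r}")
    return genotype.count(MUTANT)


def classify_plant(fad2a: str, fad2b: str) -> str:
    """Classify a plant from its FAD2A and FAD2B marker calls."""
    dose_a = mutant_dose(fad2a)
    dose_b = mutant_dose(fad2b)
    if dose_a == 2 and dose_b == 2:
        # Mutant alleles fixed at both homeologs: candidate high oleic line.
        return "fixed at both loci"
    if dose_a >= 1 and dose_b >= 1:
        # Carries mutant alleles at both loci but is still segregating.
        return "segregating at both loci"
    if dose_a + dose_b >= 1:
        # Mutant allele at one gene only: not sufficient for >75% oleic acid.
        return "mutant at one locus only"
    return "no mutant allele"


# Example calls for four hypothetical F2 plants.
for plant, (a, b) in {"P1": ("mm", "mm"), "P2": ("Wm", "mm"),
                      "P3": ("mm", "WW"), "P4": ("WW", "WW")}.items():
    print(plant, classify_plant(a, b))
```

In a backcross program, plants in the first two categories would typically be the ones carried forward, with the segregating class needing a further generation of selfing and genotyping.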

Norden et al. (1987) identified the first natural high-oleate groundnut mutant line, F435, with about 80% oleic acid and 2% linoleic acid. The first high-oleate groundnut variety, SunOleic 95R, was bred in the USA through conventional breeding (Gorbet and Knauft, 1997). Following the identification of linked allele-specific markers (Chen et al., 2010) and cleaved amplified polymorphic sequence markers for both ahFAD2 genes (ahFAD2A and ahFAD2B; Chu et al., 2009), marker-assisted backcross breeding (MABC) and marker-assisted selection (MAS) were used to improve the oleic acid content of a nematode-resistant variety, 'Tifguard', in the USA (Chu et al., 2011). Techniques such as the HybProbe SNP assay (Bernard et al., 1998) and a multiplex real-time PCR assay (Barkley et al., 2010) were also used to select heterozygous and homozygous breeding lines for both mutant alleles. Recently, high-oleic lines have been developed using MAS and MABC in Spanish and Virginia Bunch varieties in India (Janila et al., 2016; Bera et al., 2018). The use of markers in breeding considerably reduced the time and population size required in the different backcross generations. High oleic groundnuts have also been developed and released for cultivation in Australia, Brazil, Argentina, and China. Interestingly, Australia is the only country cultivating 100% high oleic groundnut. High-oleate lines are under evaluation in many countries in Africa and Asia. Compared to oils with a high proportion of saturated fatty acids, the high oleic acid in cooking oil decreases the risk of cardiovascular disease (CVD) by reducing the levels of serum low-density lipoprotein (LDL) cholesterol and maintaining the levels of high-density lipoproteins (HDL) (Rizzo et al., 1986).

At ICRISAT, in view of potential food industry needs and consumer health benefits, the high oleic breeding and testing pipelines are being advanced by breeding the high oleic trait into the background of elite/popular adapted varieties for different agro-ecologies, as well as by pyramiding multiple traits into a single cultivar. The ongoing groundnut breeding program incorporates the high oleic trait into biotic and abiotic stress resistant/tolerant cultivars of different growth habits. For biotic stresses such as late leaf spot and rust, resistant lines were developed in the past using conventional breeding methodologies, but the majority of these resistant lines are of long duration. In this context, molecular markers associated with resistance to these foliar diseases were identified (Khedikar et al., 2010; Sujay et al., 2012; Pandey et al., 2017b). Subsequently, using molecular markers and MABC approaches, the first set of foliar disease resistant lines was developed in three genetic backgrounds (Varshney et al., 2014). At present, late leaf spot (LLS) and rust resistance traits are being combined with the high oleic acid trait in the breeding programs at both ICRISAT and several national agricultural research systems (NARS) in Asia and Africa. The lines pyramided for foliar disease resistance and the high oleic acid trait are currently at different stages of evaluation. The generation interval is reduced by using glasshouse facilities for generation advancement. Together with genotyping, rapid and non-destructive phenotyping using near-infrared reflectance spectroscopy (NIRS), and early generation testing in target locations, this has enhanced the rate of genetic gain in the high oleic breeding pipeline. SNP genotyping for the FAD2B mutant allele in the F2 generation and phenotyping of kernels harvested from F2 single plants are used to make selection decisions in early generations.

Bold-seeded high oleic varieties with low oil content are preferred by the food processing industries. A high amount of linoleic acid in the oil is not good for cooking purposes, as it is vulnerable to oxidative rancidity and becomes thermodynamically unstable when heated at high temperature (Kratz et al., 2002). Oleic acid has 10-fold higher auto-oxidative stability than linoleic acid (O'Keefe et al., 1993); therefore, with a high oleic to linoleic acid ratio (O/L ratio), groundnut and its products have a longer shelf life than normal lines (Bolton and Sanders, 2002). In high oleic lines, linoleic acid is reduced and oleic acid is increased. Keeping this in perspective, two low-oil, bold-seeded parents, ICGV 06110 and ICGV 07368 [70–80 g hundred seed weight (HSW)], and two high-oil, medium-seeded parents, ICGV 06142 and ICGV 06420 (37–40 g HSW), were initially used as recurrent parents in a crossing program with SunOleic 95R as the donor parent for the high oleic trait. The first set of 64 high oleic lines developed at ICRISAT had a 100-seed mass of 30–40 g. These 64 lines were evaluated in multi-location trials conducted at five locations (Tamil Nadu Agricultural University, Coimbatore, Tamil Nadu; Acharya NG Ranga Agricultural Univ., Tirupathi, Andhra Pradesh; Junagadh Agricultural Univ., Junagadh, Gujarat; Prof. Jayashankar Telangana Agricultural Univ., Palem, Telangana; Indian Council of Agricultural Research-Directorate of Groundnut Research, Junagadh, Gujarat) representing Western, Central, and Southern India, and a subset of 16 high oleic lines was proposed for national evaluation under the All India Co-ordinated Research Project on Groundnut (AICRP-G) (Figure 2).

FIGURE 2 | Performance of 16 high oleic lines under multi-location evaluation trials conducted during the rainy season, 2016. These 16 lines were recommended for All India Co-ordinated Research Project on Groundnut (AICRP-G) testing based on their superior performance over the local check at the respective location. Figures at the top of the bars indicate the percentage increase in pod yield over the best local check. Oleic acid was measured by gas chromatography. TNAU, Tamil Nadu Agricultural University; ANGRAU, Acharya N G Ranga Agricultural University; JAU, Junagadh Agricultural University; PJTSAU, Prof Jayashankar Telangana State Agricultural University; DGR, ICAR-Directorate of Groundnut Research.
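The oleic to linoleic (O/L) ratio discussed above is simply the quotient of the two fatty acid percentages. The sketch below computes it for the composition reported for the mutant line F435 (about 80% oleic and 2% linoleic acid) and for the upper end of the 2–5% linoleic range quoted for high oleic lines. The function name is illustrative, and pairing 80% oleic acid with 5% linoleic acid is an assumption made only to show the range.

```python
def ol_ratio(oleic_pct: float, linoleic_pct: float) -> float:
    """Oleic to linoleic (O/L) ratio from fatty acid percentages of total oil."""
    if linoleic_pct <= 0:
        raise ValueError("linoleic_pct must be positive")
    return oleic_pct / linoleic_pct


# F435 composition as reported in the text: about 80% oleic, 2% linoleic.
f435_ratio = ol_ratio(80.0, 2.0)

# Assumed pairing: 80% oleic with the upper end of the 2-5% linoleic range.
high_oleic_low_end = ol_ratio(80.0, 5.0)

print(f"F435 O/L ratio: {f435_ratio:.0f}")
print(f"O/L ratio at 5% linoleic acid: {high_oleic_low_end:.0f}")
```

With these inputs the O/L ratio falls between 16 and 40, so small changes in linoleic acid content move the ratio considerably.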

The results from the study on global homogenous groundnut zones show that the similarity between African and Asian locations is high, and hence that collaboration partners should be chosen across the globe as a way to achieve higher impact of investment (https://www.semanticscholar.org/paper/Global-homogenous-groundnut-zones-%E2%80%93-a-tool-tothe-Mausch-Bantilan/1d698be7f3dc50728fbc6784b532069c73ae85d0). Through selection for large kernel size in subsequent cycles and recycling of elite lines as parents, it was possible to achieve a significant gain in the 100-seed mass, from an average of 38 g to 55 g between 2015 and 2017 (Figure 3). The size distribution of the kernels, which gives the proportion of kernels of different sizes, is another key criterion used in selection and advancement decisions in high-oleic breeding pipelines.
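As a quick worked example, the reported change in average 100-seed mass can be expressed as an absolute and a relative gain. The sketch below uses only the two averages quoted above (38 g and 55 g). The per-year figure assumes a constant gain over the two yearly intervals between 2015 and 2017, which the source does not state, and the names are illustrative.

```python
START_YEAR, END_YEAR = 2015, 2017
START_MASS_G = 38.0  # average 100-seed mass at the start, from the text
END_MASS_G = 55.0    # average 100-seed mass at the end, from the text

absolute_gain_g = END_MASS_G - START_MASS_G
relative_gain_pct = 100.0 * absolute_gain_g / START_MASS_G

# Assumption: the gain is spread evenly over the yearly intervals.
intervals = END_YEAR - START_YEAR
gain_per_year_g = absolute_gain_g / intervals

print(f"Absolute gain: {absolute_gain_g:.0f} g")
print(f"Relative gain: {relative_gain_pct:.1f}%")
print(f"Average gain per year (assumed linear): {gain_per_year_g:.1f} g")
```

This corresponds to a gain of 17 g, or roughly 45% over the starting average.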

Some of the high oleic lines developed at ICRISAT-India were evaluated in Nigeria (27 lines) and Mali (9 lines). Preliminary results show adaptability of some lines, with relatively high pod yields of up to 2.4 t/ha (Table 2). Many of the lines were tolerant to early leaf spot (Cercospora arachidicola) and late leaf spot (Cercosporidium personatum), with severity scores of 4 and 5. The lines displayed the stay-green character associated with tolerance to drought, with lines such as ICGV 15059, ICGV 15074, and ICGV 16001 having a haulm yield of more than 3.4 t/ha in Mali. Similarly, in Nigeria, lines ICGV 15060 (4.5 t/ha), ICGV 15065 (4.1 t/ha), and ICGV 15052 (4 t/ha) were the top three for haulm yield at BUK, while ICGV 15034 (3.6 t/ha), ICGV 15064 (3.7 t/ha), and ICGV 15070 (4.6 t/ha) were the top three at Minjibir. Similar evaluations of 23 lines are under way in the East and Southern Africa breeding program to identify the best candidates for further evaluation and/or use in line conversions. In East and Southern Africa, two approaches are being used to develop high oleate groundnut. In the first, high oleate donor germplasm from the groundnut minicore (Upadhyaya et al., 2010), such as ICG 1274, ICG 5221, ICG 5475, ICG 6766, ICG 6646, and ICG 6201, is being used to improve the oil quality of adapted and popular varieties in the region. For example, the variety CG7, which has over 48% oil content, is targeted for improved oil quality, while popular food and confectionery varieties such as ICGV-SM 90704 and ICGV-SM 08503, respectively, are among the other targets.

Overall, 18 parental lines have been included in the development of a new generation of high oleate groundnut. The elite and released materials targeted include ICGV-SM 90704 (Nsinjiro), CG 7, ICGV-SM 08503, ICGV-SM 06729, ICGV-SM 01731, ICGV-SM 01711, Chalimbana, ICGV-SM 03517, ICGV-SM 99557, ICGV-SM 99551, ICGV-SM 07539, ICGV-SM 07502, ICGV-SM 0552, ICGV-SM 99568, ICGV-SM 99556, and ICGV-SM 09511. A second approach involves the use of SunOleic 95R as donor material for elite-by-elite crosses. These materials are at F3–F4, depending on whether they are short-duration (Spanish and Valencia) and/or medium-duration genotypes, or long-duration (Virginia) genotypes. Marker-assisted selection will be used to support line conversion through backcrossing. The populations developed are at different stages of advancement, with the earliest target for release being 2022.

The ICRISAT-Mali groundnut program also developed 10 new high oleic populations during the 2018 main rainy season by crossing two high oleic parents (ICGV 15112, ICGV 16012) with three released varieties (Fleur 11, ICGV-IS 13825, ICG 7878) and two dominant farmers' varieties (28-206, 47-10). The F1s are planted during the 2019 off-season under irrigation. The resulting F2s will be planted during the 2019 main rainy season, when leaf samples will be sent to the high throughput phenotyping and genotyping (HTPG) platform at Intertek-Hyderabad, India, for genotyping to enable MAS of F2 plants carrying the high oleic allele.

# ADVANCES IN CROP IMPROVEMENT RESEARCH ON GROUNDNUT ALLERGENS

Groundnut allergy is one of the serious food allergies, affecting 1–2% of the world population. Australia tops the list of the most highly affected countries (Sicherer and Sampson, 2014); other highly affected countries include the USA (Sicherer et al., 2010; Nicolaou et al., 2010), Canada (Ben-Shoshan et al., 2009), Denmark and France (Morisset et al., 2002; Osterballe et al., 2005), and the United Kingdom (Grundy et al., 2002). Groundnut allergy is not only life threatening but also adversely affects the quality of life of groundnut-allergic individuals and their families. Currently, there is no vaccine to prevent groundnut allergy in sensitive individuals, no medicine to alleviate the allergic effects, and no method to reduce allergen proteins in groundnut products.

Groundnut seed contains 32 different types of storage proteins, 18 of which have allergenic properties (Pele, 2010). Ara h 1, Ara h 2, Ara h 3, and Ara h 6 are the major allergens, with the ability to cause life-threatening reactions such as anaphylaxis (Krause et al., 2010). Groundnut sequence analysis has identified several homologs of the allergen-encoding genes, viz. three for Ara h 1, one for Ara h 2, eight for Ara h 3, and two for Ara h 6 (Ratnaparkhe et al., 2014). Mining of allergen genes in the reference genome of the diploid A genome (A. duranensis, accession PI475845) (Chen et al., 2016) and indirect transcriptome studies covering a few seed development stages (Clevenger et al., 2016) provided inconclusive information on the presence of allergen genes across the entire genome. Now that the tetraploid genomes have been available since 2019 (Bertioli et al., 2019; Chen et al., 2019; Zhuang et al., 2019), comprehensive genome and functional genomics studies are required to mine allergen genes genome-wide, so that crop improvement approaches can be deployed for developing groundnut varieties with low allergen contents. Most recently, ICRISAT-India developed a monoclonal antibody-based ELISA protocol that successfully screened a diverse set of groundnut accessions for five major allergens (Ara h 1, Ara h 2, Ara h 3, Ara h 6, and Ara h 8) and identified groundnut genotypes with low allergen contents (Pandey et al., 2019b; Pandey et al., 2019c). The threshold of allergen proteins differs significantly within the allergic population; for example, a threshold of 100 µg of Ara h 1 is observed in some populations (Warner, 1999).

Recent studies by the U.S. FDA showed improved tolerance when groundnut consumption was introduced at 4–10 months of age (https://www.fda.gov/food/cfsan-constituentupdates/fda-acknowledges-qualified-health-claim-linking-earlygroundnut-introduction-and-reduced-risk). Another study has also demonstrated that beginning consumption of groundnut-containing foods in infancy (between 4 and 10 months of age) reduced the risk of developing groundnut allergy by 5 years of age by more than 80% (du Toit et al., 2018). A health claim approved by the US FDA (https://www.fda.gov/food/cfsan-constituentupdates/fda-acknowledges-qualified-health-claim-linking-earlygroundnut-introduction-and-reduced-risk) indicated that early consumption of groundnuts may be one avenue to address potential groundnut allergies. In considering a qualified health claim regarding the relationship between the consumption of foods containing ground groundnuts and a reduced risk of developing groundnut allergy, the FDA found the scientific evidence appropriate and advised implementing agencies to provide clear information on the foods to avoid misleading consumers (https://www.fda.gov/media/107357/download). Further, the FDA would monitor, and evaluate for possible enforcement action, situations where foods that bear the qualified health claim regarding reducing the risk of developing groundnut allergy contain groundnuts in only trivial amounts (https://www.fda.gov/media/107357/download). Nevertheless, if these efforts succeed in increasing tolerance among children in the coming years, groundnut lines with low allergen contents may offer an opportunity for developing therapeutic products for vaccination or tolerance. There is still a long way to go, and much more effort is needed to establish the importance of groundnuts with low allergen protein content as an alternative and effective approach to fighting groundnut allergies.

# EXPLOITING THE DIPLOID AND TETRAPLOID GROUNDNUT GENOME SEQUENCES FOR CROP IMPROVEMENT

The groundnut research community has witnessed rapid developments in this decade in the area of genomic resources, which are critical for harnessing the potential of genomics for groundnut improvement (see Pandey et al., 2012; Varshney et al., 2013; Pandey et al., 2016; Pandey et al., 2017a; Pandey et al., 2019b; Pandey et al., 2019c). The availability of a reference genome and a high-density genotyping assay are the most important milestones for understanding genome architecture, trait mapping, gene discovery, and molecular breeding (Varshney et al., 2013). The major genomic resources that have been developed in recent years include 1) reference genomes of the cultivated tetraploid (Bertioli et al., 2019; Chen et al., 2019; Zhuang et al., 2019); 2) a reference genome of the allotetraploid wild groundnut, Arachis monticola (Yin et al., 2018); 3) reference genomes of the diploid progenitors of cultivated groundnut, i.e., A. duranensis (Bertioli et al., 2016; Chen et al., 2016) and A. ipaensis (Bertioli et al., 2016; Lu et al., 2018); 4) the "Axiom\_Arachis" array, a high-density genotyping assay with >58K highly informative SNPs (Pandey et al., 2017a); 5) a gene expression atlas for the cultivated tetraploid (Clevenger et al., 2016); 6) molecular/genetic markers (Pandey et al., 2016; Pandey et al., 2017a; Vishwakarma et al., 2017; Zhao et al., 2017; Lu et al., 2019; Pandey et al., 2019b; Pandey et al., 2019c); 7) diverse genetic populations such as MAGIC and nested association mapping (NAM) populations for high-resolution genetic mapping and breeding (Pandey et al., 2017a; Pandey et al., 2017b; Pandey et al., 2017c); and 8) trait-linked diagnostic markers for use in genomics-assisted breeding (GAB) (Pandey et al., 2017c). As a result, next-generation sequencing-based trait discovery (Pandey et al., 2017b) and sequence-based breeding (Varshney et al., 2019) will enhance breeding speed and precision for greater genetic gains.

So far, the reference genomes of the diploid progenitors have been used for comparative genomics, structural and functional genomics, trait mapping, and gene and marker discovery. The reference genome for the cultivated tetraploid groundnut (cultivar Tifrunner) has now been reported by the International Groundnut Genome Initiative (IPGI, https://groundnutbase.org/groundnut\_genome), by Fujian Agriculture and Forestry University<sup>3</sup> and ICRISAT-India, and by the Crop Research Institute of the Guangdong Academy of Agricultural Sciences (GAAS), China (Chen et al., 2019). Because a single reference genome cannot represent the diversity of the crop, the complete genebank collections of groundnut should also be sequenced. In this context, ICRISAT has completed sequencing of the Groundnut Reference Set, a global diversity panel, and further comparative structural genomics and association mapping are in progress. Such efforts are likely to increase in groundnut in the coming years.

<sup>3</sup> http://groundnutgr.fafu.edu.cn

# DELIVERING ADVANCED GENETICS TO SMALLHOLDER FARMERS TO UNLOCK GROUNDNUT PRODUCTION

Once superior groundnut varieties with improved nutritional quality (high oleic acid, low allergenic properties, low aflatoxin contamination) are developed and released, sustained multi-sectoral participatory efforts of groundnut scientists, nutritionists, public health experts, socio-economists, nongovernmental organizations (NGOs), government policy makers, and civil society champions are needed to develop functional delivery models that improve the effectiveness of various production-to-consumption value chains (Ojiewo et al., 2019).

Market-oriented and/or export-led commercial production is a necessity for a sustainable legume value chain (Rubyogo et al., 2019). This is still lacking in many developing countries, where a significant proportion of groundnut is produced by small-scale farmers under rainfed conditions for subsistence (Ojiewo et al., 2018a). McGuire and Sperling (2016), in a large sample of 2,592 smallholder farmers in six countries, found that only 7% of legume seed came from the formal or semi-formal (agro-dealers, government aid, NGOs, community seed groups) sectors. Therefore, 93% came from informal sources. Of this, 64% is purchased from local markets, mostly from grain aggregators/grocery stores. Similarly, in North-eastern Nigeria, only 9% of farmers purchase seeds from seed companies, while 22% purchase from local markets/grain aggregators (Ajeigbe et al., 2018). There is, therefore, evidence that farmers do buy legume seed. However, for various reasons, they do not buy legume seed from formal outlets.

Preliminary results from a study being conducted in Uganda on Gender Integration in Seed Systems indicate that only 2% of farmers save their own seed and plant it the next season. Most smallholder legume farmers consume or sell all their produce before the planting season to meet their basic needs and are therefore compelled to purchase seed during the planting season. A few farmers with alternative sources of income manage to hold grain from their harvest for better prices during shortages just before the planting season, and this is often the grain bought by grain aggregators and later sold to other farmers. The greatest concern is the poor quality of the "seed" obtained from such market sources. Often, the seed has to be sorted, with significant sorting losses, and it suffers from poor germination, vigor, and crop establishment, as well as the potential for seed-borne diseases.

Farmers would potentially change their behavior and source seed from high quality sources if they were made aware of these losses, if access to high quality seed at reasonable prices were facilitated, and if a stable prime-price market were guaranteed. From the same study, it was noted that farmers do not understand the language of "certified seed," but are actually interested in and willing to pay for high quality groundnut seed. Their sense of quality is based on color, drawn from their knowledge of an old variety called 'Red Beauty'. They associate any red variety with this old variety and assume that all red-colored groundnut is an improved variety and that its seed is of high quality. Farmers here suggest that simpler language such as "super seed" would convey quality better than "certified seed".

Variety release and adoption figures summarized from the CGIAR DIIVA (Diffusion and Impact of Improved Varieties in Africa) project data on selected crops in sub-Saharan Africa (http://www.asti.cgiar.org/diiva) suggest that many of the improved groundnut varieties are not adopted and produced by farmers. This leaves a number of unanswered questions about the productivity and profitability/value of new crop varieties: Are the varieties superior/good enough? Do we have robust data on the superiority (productivity/profitability) of new varieties to convince the private sector to commercialize them? Are value chain actors aware of them? Is the seed system ready to respond to demand? It is important to establish product advancement criteria and a process to prioritize varieties for commercialization, backed by extensive on-farm testing systems and robust demonstration trial data, so that confident conclusions and recommendations on variety turnover can be made by public and private sector seed enterprises. This process makes research and development more business oriented by focusing on the decision-making criteria of markets and advancing a defined selection of varieties.

Besides, innovative and transformative models for accessing, multiplying, and disseminating public-bred varieties should be developed and promoted by scaling up seed enterprises. Innovative models of early generation (breeder and foundation) and certified seed production should be tested through demand-led public and private partnerships. Quality seed of improved groundnut varieties is difficult to access in many countries owing to bottlenecks in the early generation seed (EGS) value chain. This is due to a number of factors related to the perceived marginal economic value of quality seed. Some of the major factors that are important for a successful seed value chain include grain demand for varieties produced with quality seed, the national and regional policy environment, capacity and resources across the seed value chain, the organization and implementation of quality assurance mechanisms, and the quality of physical infrastructure. For example, with the increase in demand from food industries for genetically pure high oleic groundnut, private seed companies are expected to play a crucial role in responding to this demand and in ensuring a more organized, sustainable seed supply chain. Seed farmers or seed entrepreneurs will be the direct beneficiaries of the system, as their help will be needed to meet the high demand. Thus, the development and release of high oleic acid groundnuts creates a demand-pull that benefits all the groundnut value chain actors, including farmers and the food and oil industries, while providing healthy alternatives for consumers.

Seed Revolving Fund (SRF) is a model developed and successfully implemented by ICRISAT and partners in Malawi to address limited production and supply of groundnut EGS (Siambi et al., 2015). The model involves public and private partners at each stage of the seed value chain where breeder seed is produced and supplied by the public breeding institution. Farmer seed producer groups are trained in quality seed production, and contracted by the SRF to produce foundation seeds, at agreed buy-back prices. Individual larger scale farmers are contracted to multiply seed for the SRF, especially if they have irrigation facilities that secure production even in drought years. It is important for entrepreneurs to purchase breeder seed instead of the SRF providing seed and deducting from the sales. Foundation seed is then sold to local seed ventures for multiplication into certified seeds. The companies produce and sell certified seed through agrodealers. Proceeds of the sales realized through the SRF are ploughed back to cover the operational costs such as staff, inspection and certification, warehouse, seed packaging and transport, and this enables the fund to engage more entrepreneurs every year. The success of the SRF model requires strict standard operating procedures to ensure good quality seed and also avoid conflict of interest by staff. Another important requirement is that the proceeds from the sales must "revolve" to enable the unit to make further investments and carry out all the necessary operations in a timely manner. This may involve consultation with governments to set up financial management structures that provide an easier accountability process. Seed quality is assured through a strategic partnership with the government's Seed Services Unit. It is also necessary to link up with grain and commodity markets, especially processors, to ensure sustained demand for grain, which then pulls the seed.

#### CONCLUSION AND FUTURE PROSPECTS

Investment in conventional breeding to develop groundnut varieties with low to zero aflatoxin contamination has been making slow progress, especially over the last decade. Key developments include exploiting resistance to the pathogens A. flavus and A. parasiticus, focusing on partial resistance during seed colonization, pre-harvest aflatoxin contamination in the field, and reduction of aflatoxin production. Further, transgenic approaches using plant defensins and host-induced gene silencing hold promise for eliminating aflatoxin production in nuts at both pre-harvest and post-harvest stages. While genotypes with very low levels of allergen proteins have been identified, the development of genomics-assisted breeding will hasten deployment of low-allergen groundnut varieties. The current effort to use genomics-assisted breeding to develop groundnut with high oleate and low linoleic and palmitic acid content has the potential to unlock the competitiveness of groundnut, providing opportunities from farm to fork. The seed industry could benefit from the increasing demand for proprietary groundnut, such as high oleate genotypes, to develop market-oriented systems. By leveraging such systems, regular groundnut and other legumes could also benefit, further strengthening the resilience of farming communities. Taken together, given the increasing demand from food industries for high oleate groundnut with low allergenic and aflatoxin properties, an organized seed sector leveraging advances in science, the third industrial revolution underpinned by information and communication technology (ICT), and improvements in financial inclusion and policy, groundnut provides a good model crop for delivering production-to-consumption and income benefits to the millions of households who depend on the crop.

#### AUTHOR CONTRIBUTIONS

CO wrote the first draft and incorporated all inputs from the coauthors and editors. HD, PO, JM, HA, WG-W, PB-M, PJ, MP, and RKV contributed to the section on Advances in Crop Improvement to Mitigate Groundnut Aflatoxin Contamination. PB-M further contributed to the sub-section on Aflatoxin Mitigation in Groundnut Using Host-Induced Gene Silencing and Transgenic Approaches: Technology and Translation. PJ, MV, TR, KD, SB, AR, NM, RPV, and MK contributed the section on High Oleic Groundnut Varieties for Food and Nutrition Security. MP and RKV contributed the sections on Advances in Crop Improvement Research on Groundnut Allergens and Exploiting the Diploid and Tetraploid Groundnut Genome Sequences for Crop Improvement. CO, EN-M, GM, PO, and EA contributed to the section on Delivering Advanced Genetics to Smallholder Farmers to Unlock Groundnut Production.

#### FUNDING

Funding support for this study was received from the Bill and Melinda Gates Foundation (fund number OPP1114827); the National Mission on Oilseeds and Oilpalm (NMOOP), Department of Agriculture and Co-operation (DoAC), Ministry of Agriculture, Government of India; the Department of Biotechnology (DBT), Government of India; the National Agricultural Science Fund (NASF) of the Indian Council of Agricultural Research; and MARS Wrigley, USA. The work reported in this article was undertaken as a part of the CGIAR Research Program on Grain Legumes and Dryland Cereals (GLDC). ICRISAT is a member of the CGIAR.

#### REFERENCES

Agbetiameh, D., Ortega-Beltran, A., Awuah, R. T., Atehnkeng, J., Cotty, P. J., and Bandyopadhyay, R. (2018). Prevalence of aflatoxin contamination in maize and groundnut in Ghana: population structure, distribution, and toxigenicity of the causal agents. Plant Dis. 102 (4), 764–772. doi: 10.1094/PDIS-05-17-0749-RE

Ajeigbe, H. A., Inuwa, A. H., Vabi, M. B., Odoyo, P. O., and Kamara, A. Y. (2018). 2018 seed needs assessment in selected local government areas of Adamawa, Borno and Yobe States of Nigeria. Report submitted to FAO Nigeria.

Amaike, S., and Keller, N. P. (2011). Aspergillus flavus. Annu. Rev. Phytopathol. 49, 107–133. doi: 10.1146/annurev-phyto-072910-095221

Anim-Somuah, H., Henson, S., Humphrey, J., and Robinson, E. (2013). Strengthening agri-food value chains for nutrition: mapping value chains for nutrient-dense foods in Ghana. IDS Evidence Report No 2, reducing hunger and undernutrition.

Food and Agriculture Organization of the United Nations), 72 pp. License: CC BY-NC-SA 3.0 IGO. doi: 10.18356/ce824af1-en

deletions markers provided greater insights on species, genomes, and sections relationships in the genus Arachis. Front. Plant Sci. 8, 2064. doi: 10.3389/fpls.2017.02064


Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Ojiewo, Janila, Bhatnagar-Mathur, Pandey, Desmae, Okori, Mwololo, Ajeigbe, Njuguna-Mungai, Muricho, Akpo, Gichohi-Wainaina, Variath, Radhakrishnan, Dobariya, Bera, Rathnakumar, Manivannan, Vasanthi, Kumar and Varshney. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Developing Climate-Resilient Chickpea Involving Physiological and Molecular Approaches With a Focus on Temperature and Drought Stresses

Anju Rani<sup>1</sup>, Poonam Devi<sup>1</sup>, Uday Chand Jha<sup>2</sup>, Kamal Dev Sharma<sup>3</sup>, Kadambot H. M. Siddique<sup>4</sup> and Harsh Nayyar<sup>1</sup>\*

<sup>1</sup> Department of Botany, Panjab University, Chandigarh, India, <sup>2</sup> Department of Crop Improvement Division, Indian Institute of Pulses Research, Kanpur, India, <sup>3</sup> Department of Agricultural Biotechnology, Himachal Pradesh Agricultural University, Palampur, India, <sup>4</sup> The UWA Institute of Agriculture, The University of Western Australia, Perth, WA, Australia

#### Edited by:

Sergio J. Ochatt, INRA UMR1347 Agroécologie, France

#### Reviewed by:

Jens Berger, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

Wolfram Weckwerth, University of Vienna, Austria

> \*Correspondence: Harsh Nayyar

harshnayyar@hotmail.com

#### Specialty section:

This article was submitted to Plant Physiology, a section of the journal Frontiers in Plant Science

Received: 12 June 2019 Accepted: 16 December 2019 Published: 25 February 2020

#### Citation:

Rani A, Devi P, Jha UC, Sharma KD, Siddique KHM and Nayyar H (2020) Developing Climate-Resilient Chickpea Involving Physiological and Molecular Approaches With a Focus on Temperature and Drought Stresses. Front. Plant Sci. 10:1759. doi: 10.3389/fpls.2019.01759

Chickpea is one of the most economically important food legumes and a significant source of protein. It is cultivated in more than 50 countries across Asia, Africa, Europe, Australia, North America, and South America. Chickpea production is limited by various abiotic stresses (cold, heat, drought, salt, etc.). Being a winter-season crop in northern south Asia and some parts of Australia, chickpea faces low-temperature stress (0–15°C) during the reproductive stage, which causes substantial loss of flowers, and thus pods, reducing its yield potential by 30–40%. The winter-sown chickpea in the Mediterranean, however, faces cold stress at the vegetative stage. In late-sown environments, chickpea faces high-temperature stress during the reproductive and pod-filling stages, causing considerable yield losses. Both low and high temperatures reduce pollen viability, pollen germination on the stigma, and pollen tube growth, resulting in poor pod set. Chickpea also experiences drought stress at various growth stages; terminal drought, along with heat stress at flowering and seed filling, can reduce yields by 40–45%. In southern Australia and the northern regions of south Asia, lack of chilling tolerance in cultivars delays flowering and pod set, and the crop is usually exposed to terminal drought. The incidence of temperature extremes (cold and heat) and of inconsistent rainfall patterns is expected to increase in the near future owing to climate change, necessitating the development of stress-tolerant and climate-resilient chickpea cultivars with region-specific traits that perform well under drought, heat, and/or low-temperature stress. Different approaches, such as genetic variability, genomic selection, molecular markers involving quantitative trait loci (QTLs), whole genome sequencing, and transcriptomics analysis, have been exploited to improve chickpea production in extreme environments. Biotechnological tools have broadened our understanding of the genetic basis of, and plant responses to, abiotic stresses in chickpea, and have opened opportunities to develop stress-tolerant chickpea.

Keywords: chickpea, water limitation, high temperature, tolerance, genomics

# INTRODUCTION

Chickpea (Cicer arietinum L.) is the 2nd most important legume crop after common bean (Phaseolus vulgaris L.) (Gaur et al., 2008; Varshney et al., 2013b) and an economically beneficial protein-rich food legume. India is the largest chickpea-producing country, with a 75% share of global production (FAO, 2016; Maurya and Kumar, 2018; Gaur et al., 2019). Chickpea is produced in 50 countries, of which Australia, Canada, Ethiopia, India, Iran, Mexico, Myanmar, Pakistan, Turkey, and the USA are the major producers (Gaur et al., 2012; Archak et al., 2016; Dixit et al., 2019). However, the productivity of chickpea is not sufficient to fulfill the protein requirement for the increasing human population (Henchion et al., 2017; Chaturvedi et al., 2018). Chickpea production faces many challenges due to various abiotic stresses such as drought, and low and high temperatures (Ryan, 1997; Millan et al., 2006; Gaur et al., 2008; Mantri et al., 2010; Jha et al., 2014; Garg et al., 2015). Most importantly, unpredictable climate change is the major constraint for chickpea production as it increases the frequency of drought and temperature extremes, i.e., high (> 30°C) and low (< 15°C) temperatures (Gaur et al., 2013; Kadiyala et al., 2016), which reduces grain yields considerably (Kadiyala et al., 2016). Thus, high- and stable-yielding varieties of chickpea during such stress conditions need to be developed (Chaturvedi and Nadarajan, 2010; Krishnamurthy et al., 2010; Devasirvatham et al., 2015; Devasirvatham and Tan, 2018).
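As a minimal illustration of the temperature extremes quoted above, the following Python sketch labels a temperature as low (< 15°C) or high (> 30°C) stress. The function name, the "non-stress" label, and the sample temperatures are hypothetical; only the two cut-offs come from the text, and other studies cited in this review report different thresholds.

```python
# Cut-offs quoted in the text for temperature extremes in chickpea.
LOW_TEMP_C = 15.0   # low-temperature stress below this value
HIGH_TEMP_C = 30.0  # high-temperature stress above this value


def classify_temperature(temp_c: float) -> str:
    """Label a temperature (°C) relative to the two cut-offs above.

    The label "non-stress" is an illustrative assumption, not a term
    used in the source.
    """
    if temp_c < LOW_TEMP_C:
        return "low-temperature stress"
    if temp_c > HIGH_TEMP_C:
        return "high-temperature stress"
    return "non-stress"


# Hypothetical temperatures (°C) for demonstration only.
sample_temps = [8.0, 22.5, 33.0]
labels = [classify_temperature(t) for t in sample_temps]
for temp, label in zip(sample_temps, labels):
    print(f"{temp:5.1f} °C -> {label}")
```

Values exactly at 15°C or 30°C fall in the middle class here because the text gives strict inequalities.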

Drought stress is a serious threat to agriculture in the context of climate change and the ever-increasing world population (Farooq et al., 2009; Tardieu et al., 2018). Extreme drought conditions reduce crop yields through negative impacts on plant growth, physiology, and reproduction (Yordanov et al., 2000; Barnabas et al., 2008). Across the globe, drought stress reduces chickpea yield by about 45–50% (Ahmad et al., 2005; Thudi et al., 2014). Numerous studies have been conducted on the effects of drought on different chickpea traits, including early maturity, root traits, carbon isotope discrimination, shoot biomass (Kashiwagi et al., 2005; Krishnamurthy et al., 2010; Upadhyaya et al., 2012; Krishnamurthy et al., 2013b; Purushothaman et al., 2016), and morphological (Sabaghpour et al., 2006), physiological (Turner et al., 2007; Rahbarian et al., 2011), biochemical (Gunes et al., 2006; Mafakheri et al., 2010), and molecular traits (Mantri et al., 2007; Thudi et al., 2014; Garg et al., 2016). There have been various attempts to explain the advances in "omics" technology for drought challenges. These advances should help the development of stress-resilient, high-yielding, and nutritionally superior varieties of chickpea.

Winter/autumn-sown chickpea crops in northern south Asia and south Australia face low temperature (LT) stress at the reproductive (flowering/podding) stages, whereas those in the Mediterranean region, especially central Anatolia, are exposed to LT at the seedling and early vegetative stages (Berger et al., 2005; Berger et al., 2011; Berger et al., 2012). Winter-sown crops in West Asia and North Africa (WANA) or the northern regions of south Asia flower when the cold is over and temperatures rise. Podding temperatures are slightly higher than those for flowering (Berger et al., 2005), and flowers drop if temperatures remain lower than those required for podding. At flowering/podding time, the crop is also at risk of damage by Ascochyta blight disease. A temperature of 14–16°C, usually 15°C, is considered a threshold for reproduction in chickpea (Srinivasan et al., 1998; Berger et al., 2004; Clarke et al., 2004; Berger et al., 2005; Bakht et al., 2006b; Berger, 2007). A recent study by Berger et al. (2012), however, measured the mean flowering temperature to be 21°C, which is well above the earlier estimates, implying that most of the world's chickpea is susceptible to cold stress. Winter-sown chickpea is also prone to terminal drought, as delayed flowering extends the growing season into warm periods with little or no rain. In contrast, spring-sown crops in the Mediterranean, USA, and Canada are of short duration and do not face terminal drought, but their productivity is low because of the short duration (Singh et al., 1997a). In the USA, rains may extend the crop growth season so long that the crop fails to mature, especially in the Montana region (McVay et al., 2013). Because chickpea has an indeterminate growth habit, drought conditions hasten maturity by stopping growth, while late-season rains cause plants to green back up (McVay et al., 2013).

Despite being a cool-season crop, chickpea also faces high-temperature (HT) stress during reproductive development in warmer regions and in late-sown environments. HT, especially above 32°C (Kaushal et al., 2013; Devasirvatham et al., 2015), aborts floral buds, flowers, and pods, ultimately leading to reduced seed size and yield (Wang et al., 2006). HT, like LT, leads to loss of pollen viability and pollen fertility, which affects pod set (Wang et al., 2006; Kumar et al., 2013; Kaushal et al., 2016). HT-induced disruption of sucrose synthesis and its availability to the anthers, together with oxidative stress, appears to contribute to loss of pollen fertility and stigmatic function (Kaushal et al., 2013; Kumar et al., 2013; Devasirvatham et al., 2015), resulting in poor pod set. Heat stress can have a highly destructive effect on grain growth and development in chickpea (Wang et al., 2006). The grain yield of chickpea is related to its phenology, which is influenced by temperature range (Jumrani and Bhatia, 2014). High temperatures (> 35°C) during the reproductive stage are a major constraint for chickpea productivity (Siddique et al., 1999; Wang et al., 2006; Basu et al., 2009), with temperatures >30°C reducing grain weight and number (Kobraee et al., 2010). Substantial reductions in chickpea yield have been observed for even a 1°C rise in temperature beyond the threshold (Kalra et al., 2008). Yield losses can reach 100% in many chickpea genotypes with increasing temperature (Canci and Toker, 2009). High temperature severely affects podding in chickpea, possibly because impaired source-sink relations between green leaves and anther tissue lead to the mortality of pollen grains (Awasthi et al., 2014). Heat stress after flowering and grain filling reduced chickpea yield because of increased senescence and reduced grain set and grain weight per plant (Wang et al., 2006).
Post-anthesis, both grain numbers and weight decreased at high temperatures, leading to lower grain yields (Summerfield et al., 1984; Wang et al., 2006; Devasirvatham et al., 2013). Heat stress, in future, would considerably reduce the grain yields in several crops, including chickpea, in many parts of the world, and thus deserves serious attention to develop heat-tolerant cultivars. Developing new cultivars with improved adaptation to high temperature is vital for increasing worldwide chickpea production.

Winter-sown crops in all parts of the world are prone to terminal drought; however, drought is not confined to the terminal stages and may occur at any plant growth stage. Spring-sown chickpea in the WANA region and semi-arid tropics (SAT) faces drought at the vegetative as well as reproductive stages (Silim and Saxena, 1993), leading to 30 to 100% yield losses, depending on the genotype and on the severity and timing of drought (Singh, 1993; Leport et al., 1999; Canci and Toker, 2009). Chickpea can cope with drought stress through three important mechanisms: "escape," "avoidance," and "tolerance" (Levitt, 1972). Drought escape consists of completing the plant's life cycle before the onset of drought stress by hastening phenological events (Levitt, 1972; Berger et al., 2016). Drought avoidance features minimizing water loss and maximizing water use (Levitt, 1972). Usually, under central and south Indian conditions, where chickpea is grown on stored soil moisture in soils with high water-holding capacity, chickpea withstands drought stress by employing drought escape and drought avoidance mechanisms (Berger et al., 2006; Berger et al., 2016). However, this drought avoidance strategy remains ineffective under Mediterranean climates in Western Australia, where soils have low water-holding capacity (Berger et al., 2016). The sources of resistance to these stresses are available either in the cultigens (heat and drought stress) or in wild relatives (cold stress), and can be exploited to develop stress-resilient chickpea cultivars. The methodologies range from simple hybridization to marker-assisted breeding [for genes as well as quantitative trait loci (QTLs)] or the development of transgenics. QTLs for drought and temperature tolerance and, in several cases, genes within QTL regions have already been identified (Varshney et al., 2013a; Varshney et al., 2016; Devasirvatham and Tan, 2018; Kaloki et al., 2019).
The genic, genetic, physiological, and biochemical bases of stress tolerance, once explored sufficiently, are expected to form the guiding principles for the development of stress management strategies in chickpea. The objectives of sustaining chickpea productivity, or enhancing it further under changing climates, cannot be achieved until chickpea cultivars tolerant to combined stresses, such as drought and heat, and drought and cold, are developed. Various defense mechanisms regulating chickpea's adaptation during temperature and drought stress, especially the combined stresses, also need to be investigated (Upadhyaya et al., 2012; Awasthi et al., 2015; Khan et al., 2019a; Khan et al., 2019b). Here, we update the research status on drought and temperature stress in chickpea, and suggest appropriate management strategies to develop stress-tolerant genotypes.

#### Effects of Cold Stress

Chickpea (C. arietinum L.) evolved in the Mediterranean region and is sensitive to low temperature, which has adverse effects on growth and yield (Croser et al., 2003; Kaur et al., 2008a; Thakur et al., 2010; Kumar et al., 2013). About half of the productivity losses in chickpea are due to exposure to low temperature (Saxena, 1990). Chilling stress in chickpea mostly affects the northern parts of India and southern Australia, as temperatures drop below 15°C at flowering (Srinivasan et al., 1998; Clarke et al., 2004; Berger et al., 2006). The reproductive phase is critical for crop productivity (Thakur et al., 2010); chilling stress in chickpea causes flower abortion and pollen and ovule infertility, disrupts fertilization, reduces pod set, retards seed filling, and reduces seed size and ultimately crop yield (Clarke and Siddique, 2004; Nayyar et al., 2005b; Nayyar et al., 2007; Thakur et al., 2010; Kumar et al., 2011). Low temperatures can limit chickpea growth and vigor at all phenological stages but are most damaging during the reproductive stage.

#### Germination and Vegetative Growth

Chickpea is a cool-season crop that is exposed to chilling (3–8°C) or even freezing temperatures during germination, which can affect seedling establishment and reduce seedling vigor (Chen et al., 1983; Srinivasan et al., 1998; Bakht et al., 2006b). Several interacting factors (genotype, temperature, duration and time of exposure, and seed moisture content prior to imbibition) mediate seed responses to low germination temperatures. Roberts et al. (1980) and Singh et al. (2009) demonstrated that low temperature (10°C) decreased the germination rate of chickpea seeds. The recommended temperature range for chickpea germination is 5 to 35°C, with an optimum germination temperature of 20°C (Singh and Dhaliwal, 1972; Ellis et al., 1986; Auld et al., 1988; Calcagno and Gallo, 1993). Chickpea, along with many other chilling-sensitive species, is prone to "imbibitional chilling injury" (Tully et al., 1981). In the field, chilled seeds are often vulnerable to infestation by soil organisms, which reduces seedling survival. Chen et al. (1983) observed that the greatest sensitivity to cold occurs in the first 30 min of imbibition in chickpea, and that low temperature (3 to 8°C) during imbibition reduced chickpea germination by 15%. The combination of imbibition at low temperature and fast water uptake reduced germination by 65% (Tully et al., 1981; Chen et al., 1983). In Australia, chilling damage during imbibition has been implicated in the poor establishment of some chickpea genotypes in soils that are both cold and wet (Knights and Mailer, 1989). The rapidity of imbibition is also a factor, controlled principally by the thickness of the testa (Tully et al., 1981; St. John et al., 1984). Kabuli types generally have a thinner testa than desi types, resulting in more rapid imbibition of water and consequently greater levels of imbibitional damage.

Another factor affecting germination success at cold temperatures is the seed phenolic content (Auld et al., 1983; Wery, 1990), which presumably confers antifungal properties (Wery et al., 1994). Thus, the poor germination of kabuli types is partly due to their thin white testa being more susceptible to soil pathogens. Cold stress adversely affects the mobilization of food reserves from cotyledons, which decreases embryonic growth, germination, and growth of chickpea seedlings (Croser et al., 2003). Ellis et al. (1986) found genotypic differences in the rate of germination with temperature. Given the existing genetic variability, it should be possible to select genotypes that are resistant to temperature stress during germination. Some seed treatments, such as hydropriming for 12 h or osmopriming (PEG/0.5 MPa) for 24 h, have increased germination of chickpea in low-temperature soil conditions (Elkoca et al., 2007), and may be linked to cross-tolerance. Chickpea plants growing under field conditions, especially in India and Australia, are exposed to gradually decreasing temperatures and photoperiods during the early vegetative stage (Croser et al., 2003). The minimum temperature that chickpea generally seems to survive is –8°C; however, some lines can tolerate as low as –12°C post-emergence (Wery, 1990; Croser et al., 2003). Thus, there is potential to select for cold tolerance at germination and during seedling growth from the existing chickpea germplasm.

#### Reproductive Growth and Yield

The flowering phase, the crucial phase in the plant life cycle that determines the yield of chickpea, is most sensitive to cold stress (Sharma and Nayyar, 2014). Temperatures below 15°C result in the abortion of chickpea flowers leading to decline in the number of pods per plant and seeds per pod (Srinivasan et al., 1999; Berger et al., 2004; Clarke and Siddique, 2004; Nayyar et al., 2005b; Berger et al., 2006; Kaur et al., 2011; Kumar et al., 2011). The causes of flower abortion in sensitive genotypes of chickpea are fairly well understood. It is well documented that male gametophyte of chickpea is highly sensitive to cold stress and in genotypes sensitive to cold, both microsporogenesis and subsequent pollen development are inhibited at temperatures below 10°C (Sharma and Nayyar, 2014; Kiran et al., 2019). Identification of flower and anther development stages in chickpea allowed studying the impact of cold at different flower development stages (Kiran et al., 2019). Flowers of different development stages react differently to cold stress (Kiran et al., 2019) e.g., low temperatures terminate microsporogenesis in flowers at premeiotic stage of anthers and microgametogenesis in those at tetrad stage. In anthers at young microspore stage, low temperatures inhibited anther dehiscence but did not inhibit development of microspores to mature pollen stage. The pollen, however, were sterile indicating that cold at this stage affected pollen viability, in addition to anther dehiscence (Oliver et al., 2007). Exposure at mature pollen stage delayed anther dehiscence and induced partial pollen sterility (Kiran et al., 2019). 
The extent of low temperature-induced pollen sterility also depends on the age of the flower, with older flowers producing less sterile pollen than younger flowers. For example, low temperature treatment at the young microspore stage led to complete pollen sterility, whereas 23.59% of pollen were viable after treatment at the vacuolated microspore stage, 52.4% at the vacuolated pollen stage, and 65.5% at the mature pollen stage (Kiran et al., 2019). Apparently, male gametophytes of younger flowers are more prone to damage by cold stress than those of older ones. In contrast, cold-tolerant chickpea genotypes maintain functional anther and pollen development, leading to pod formation and seed set during chilling stress (Clarke and Siddique, 2004; Kumar et al., 2011). Cold stress also impairs pollen tube growth in the style and, consequently, causes fertilization failure (Clarke and Siddique, 2004; Nayyar et al., 2007).
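The stage-wise viability figures reported above can be tabulated in a short Python sketch. The variable names are hypothetical, complete sterility at the young microspore stage is recorded as 0% viable pollen, and sterility is taken simply as the complement of viability, which is an assumption of this sketch rather than a calculation given in the source.

```python
# Percentage of viable pollen after cold treatment at each anther stage,
# as quoted in the text (Kiran et al., 2019). Complete sterility at the
# young microspore stage is recorded here as 0% viable pollen.
viability_by_stage = [
    ("young microspore", 0.0),
    ("vacuolated microspore", 23.59),
    ("vacuolated pollen", 52.4),
    ("mature pollen", 65.5),
]

# Assumption of this sketch: sterility (%) = 100 - viability (%).
sterility_by_stage = [
    (stage, round(100.0 - viable, 2)) for stage, viable in viability_by_stage
]

for stage, sterile in sterility_by_stage:
    print(f"{stage:22s} {sterile:6.2f}% sterile")

# Check that viability rises with each later developmental stage.
values = [viable for _, viable in viability_by_stage]
viability_increases = all(a < b for a, b in zip(values, values[1:]))
print("Viability increases with stage:", viability_increases)
```

The monotonic check mirrors the statement that male gametophytes of younger flowers are more prone to cold damage than those of older flowers.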

Chilling stress also has an adverse effect on gynoecium to impair ovule function; Srinivasan et al. (1998) reported missing embryo sacs in some chickpea cultivars, which reduced the number of fertilized ovules in all cultivars during cold stress. Chilling stress reduces ovule viability, stigma receptivity, and pollen load on stigma (Kiran et al., 2019). While studying flower abortion due to cold stress in chickpea, it was observed that the older flowers, that have sufficient viable pollen were also aborted (Kiran et al., 2019). Very low ovule viability accompanied by very low stigma receptivity in older flowers pointed toward role of female gametophyte factors in lack of fertilization and flower abortion under low temperature stress in addition to male factors. The role of female gamete was also highlighted using pollen from cold treated flowers to pollinate plants growing at normal temperatures and vice-versa (Nayyar et al., 2005b). The low temperature (4°C) used by Kiran et al. (2019) was, however, considerably lower than the threshold of 15°C (Srinivasan et al., 1998; Clarke et al., 2004; Berger et al., 2004; Berger et al., 2005; Bakht et al., 2006b; Berger, 2007) or 21°C (Berger et al., 2012) reported for reproduction in chickpea. Further studies at temperature slightly below 15°C need to be conducted to understand behavior of flowers to threshold low temperature stress.

Ectopic persistence of tapetum in low temperature treated chickpea flowers indicates disruption of normal process of tapetum programmed cell death under low temperatures (Kiran et al., 2019). Such disruption might have imbalanced nutrition to developing microspores. It has been already documented that low temperatures during flowering cause nutritional deficiencies in the tapetum (Nayyar et al., 2005b; Sharma and Nayyar, 2014) and decrease in sugar levels in anthers and pollen grains, which may be a primary cause of flower abortion. Low temperatures disrupt the mobilization of carbohydrates from source to sink and lead to nutrient deficiencies in stylar tissues too (Nayyar et al., 2005b). Cold stress also induces the synthesis of abscisic acid (ABA) in chickpea flowers, indicating a correlation between flower abortion and high ABA concentration (Thakur et al., 2010). In chickpea exposed to low temperatures (12–15/4–6°C day/night), increased ABA concentrations caused flowers to abort (Nayyar et al., 2005a). ABA interferes with sucrose translocation to flowers (Kumar et al., 2010) probably by inhibiting sucrose transporter gene invertase as has been observed in crops like rice (Oliver et al., 2005; Sharma and Nayyar, 2016).

Chilling stress has a damaging effect on flower number, pod set, seed growth, and development in chickpea (Croser et al., 2003; Berger et al., 2004; Nayyar et al., 2005b; Thakur et al., 2010). Moreover, low temperature impairs seed filling processes, which reduces the size of chickpea seeds (Nayyar et al., 2005b; Nayyar et al., 2007; Kaur et al., 2008a). Grain yield is related to the phenology of chickpea, and a combination of low temperature-induced factors, i.e., poor plant growth, delay in flowering, flower abortion, delay in podding, pod abortion, and poor seed filling, contributes to lower yield of chickpea under cold (Berger et al., 2004). Poor pod set/filling as a result of cold stress is due to the disruption of photosynthesis and inhibition of translocation of initiating signals from leaves to the meristem, or to changes in plant architecture (Gogoi et al., 2018). Studies estimating yield losses in chickpea due to cold are scanty. Singh et al. (1993) grew cold-tolerant and cold-susceptible genotypes of chickpea both in spring (temperatures normal for the crop) and autumn (stressful temperatures as low as −10°C) in Syria and compared yield among the genotypes and seasons. A highly cold-susceptible chickpea line with a cold rating of 7.8 (1 = no visible cold damage, 9 = all plants killed) yielded 161 kg/ha during the winter (low temperature) season and 474 kg/ha during the warmer spring season (Singh et al., 1993). In comparison, a line with a cold rating of 5.2 yielded 632 kg/ha during the winter season and 251 kg/ha during the spring season (Singh et al., 1993), indicating that cold caused huge yield losses in susceptible genotypes. The spring season, because of its short duration, reduces the productivity of chickpea compared with the longer winter season, which allows more time for the crop to grow and consequently gives higher yields. Nayyar et al. (2005c) reported a 30% increase in seed yield per plant in plants treated with glycine betaine (a compatible solute that accumulates in higher amounts in cold-tolerant plants under cold stress) over the control in winter-sown chickpea grown in the low temperature-prone northern regions of India (pot-based studies). Since winter-sown chickpea yields more than spring-sown chickpea if the genotype has adequate cold tolerance, the emphasis worldwide is on developing cold-tolerant cultivars of chickpea to increase productivity of the crop. Wild relatives of chickpea in the primary gene pool (Cicer reticulatum, Cicer echinospermum) that are crossable with the cultigen and tolerant to cold can be ideal sources for introgressing cold tolerance into chickpea for the development of varieties for the winter season (Berger et al., 2012).
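The seasonal yield comparison from Singh et al. (1993) quoted above can be worked through with simple arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, and expressing the winter yield as a percentage change relative to the spring yield of the same line is a choice made here, not a calculation reported in the source.

```python
# Grain yields (kg/ha) quoted in the text from Singh et al. (1993).
# Keys are the cold ratings of the two lines
# (1 = no visible cold damage, 9 = all plants killed).
yields_kg_ha = {
    7.8: {"winter": 161, "spring": 474},  # highly cold-susceptible line
    5.2: {"winter": 632, "spring": 251},  # line with the lower cold rating
}


def winter_vs_spring_change(winter: float, spring: float) -> float:
    """Percent change in yield when grown in winter rather than spring."""
    return (winter - spring) / spring * 100.0


changes = {
    rating: round(winter_vs_spring_change(y["winter"], y["spring"]), 1)
    for rating, y in yields_kg_ha.items()
}

for rating, change in changes.items():
    print(f"cold rating {rating}: {change:+.1f}% (winter relative to spring)")
```

On these figures, the susceptible line yields about 66% less in winter than in spring, whereas the line rated 5.2 yields roughly 2.5 times its spring yield when sown in winter.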

#### Physiology

The physiological functions of plants are adversely influenced by low temperature (<20°C) (Thakur et al., 2010). Low temperatures (17.6/4.9°C; day/night for 26 days during reproductive phase) resulted in reduction in relative leaf water content, possibly due to a decline in root hydraulic conductivity, oxidative and membrane damage, and chlorophyll loss (Kumar et al., 2011). Chilling stress (13/10°C; day/night for 18 h) during germination considerably inhibited α-amylase activity, disrupted sugar metabolism, reduced leaf water status, and uptake of mineral elements (N, P, and K) that delayed seedling emergence and caused poor seedling growth in chickpea (Farooq et al., 2017). Temperature changes can impact root physiology, thus affecting ion absorption and may result in visible deficiency symptoms (Gregory, 1988). Low-temperature stress (5°C for 3 days) inhibited root growth and the capacity for water and mineral uptake to subsequently impact the nutritional influences on plant growth (Aroca et al., 2003; Heidarvand et al., 2011). Low temperatures (5/5°C for 4 days) also reduced the leaf water content because the stomata are unable to close (Lee et al., 1993; Farooq et al., 2009). Flower abortion and poor pod set in chickpea due to cold stress (12–15/4–6°C day/night during flowering stage) was attributed to decreasing levels of sucrose, glucose, and fructose in anthers and pollen in sensitive genotypes (Nayyar et al., 2005a). Endogenous proline and carbohydrates (glucose, rhamnose, and mannose) increased with cold stress (3°C for 7 days) in chickpea genotypes, and may play a role in osmoregulation and meeting the enhanced energy requirements (Saghfi and Eivazi, 2014); the cold-tolerant genotypes performed better in this regard.

#### Cellular and Physiological Mechanisms for Cold Survival

Low temperatures (0–10°C) result in rigidification of the plasma membrane, which is sensed by plant cells (Yadav, 2010), and impair the integrity of phospholipids in the plasma membrane (Badea and Basu, 2009). In cold-tolerant chickpea genotypes, the content of unsaturated fatty acids increased during low-temperature exposure (10°C for 5 days followed by 4°C for 2 days) (Shahandashti et al., 2013), which possibly contributed toward maintenance of membrane integrity during cold stress. Mitochondria are vital cell organelles and play an important role in stress tolerance mechanisms by interacting with energy-dissipating elements such as alternative oxidase (AOX) (Borecky and Vercesi, 2005; Rurek et al., 2015). In optimum conditions, plant cells rely on the cytochrome-mediated pathway of the mitochondrial electron transfer chain, which results in ATP synthesis by using the proton motive force (Dinakar et al., 2016). In unfavorable conditions, an alternative pathway is engaged in which cytochrome reductase and cytochrome oxidases are replaced by AOX to protect respiration and metabolic processes. This suggests that mitochondria have the flexibility to alter their activities and enhance AOX activity during environmental stress (Shi et al., 2013; Vanlerberghe, 2013). There are different genes for AOXs, depending on plant species; for example, AOX in chickpea is encoded by the aox3 gene in mitochondria (Karami-Moalem et al., 2018), and might be involved in cold tolerance.

Reactive oxygen species (ROS) are produced in response to cold stress in chickpea (Kumar et al., 2011) and damage vital molecules in cells, including membranes. Generally, lipid peroxidation and hydrogen peroxide concentrations are measured as markers of temperature-induced oxidative stress (Awasthi et al., 2015). A positive correlation was observed between lipid peroxidation and malondialdehyde (MDA) concentration in Cicer occidentalis (Shahandashti et al., 2013). Plant cells combat oxidative damage by activating antioxidative systems that include both non-enzymatic (e.g., tocopherols, ascorbate, proline) and enzymatic [e.g., superoxide dismutase (SOD), catalase (CAT), and ascorbate peroxidase (APX)] components (Turk et al., 2014; Zouari et al., 2016). A few studies in chickpea have identified an increase in the double bond index due to enhanced lipoxygenase (LOX) activity, suggesting that increased LOX activity plays an important role in providing cold tolerance in chickpea (Padham et al., 2007; Wasternack, 2007; Pushpalatha et al., 2011). The up-regulation of various types of antioxidants has been correlated with cold tolerance in chickpea (Nayyar and Chander, 2004).

Some plant-regulating molecules look promising for imparting stress tolerance (Bhandari et al., 2017), and have been investigated in chickpea for enhancing cold tolerance. Polyamines (PAs), with a polycationic nature at physiological pH, bind strongly to the negative charges in cellular components such as nucleic acids, proteins, and phospholipids (Bouchereau et al., 1999) and interact with membrane phospholipids to stabilize membranes under stress conditions (Roberts et al., 1986). The depletion of PAs as a result of cold stress (5 to 25°C for 4 days) has been linked to the loss of flowers and pods (Nayyar and Chander, 2004). Exogenous application of PAs reduced H2O2 levels and MDA content and increased antioxidant levels in chickpea plants subjected to cold stress (Nayyar and Chander, 2004). Hence, it may be possible to improve cold tolerance in chickpea by increasing the content of PAs using genetic manipulation or exogenous application. Besides PAs, abscisic acid (ABA) is also involved in providing stress tolerance (Trivedi et al., 2016); cold-stressed (10–12/2–4°C day/night at bud stage) chickpea plants treated exogenously with 10 µM ABA had improved pollen viability, pollen germination, flower retention, and pod set (Kumar et al., 2008). At the cellular level, ABA-treated plants showed increased activities of SOD, CAT, and APX, and higher levels of ascorbic acid, glutathione, and proline. Trehalose, a disaccharide of glucose, plays an important role as a compatible solute and stabilizes biological structures under abiotic stress (Jain and Roy, 2009), including dehydrated enzymes, proteins, and lipid membranes, and protects biological structures from damage during desiccation (Fernandez et al., 2010). It also acts as a membrane and molecule chaperone during water or cold stress (Crowe, 2007; Fernandez et al., 2010).
Seed priming with trehalose reduced the oxidative damage to biological membranes and other vital organelles during cold stress (13/10°C for 18 h) in chickpea, and improved carbon assimilation, resulting in better seedling growth (Farooq et al., 2017). Increased accumulation of total and reducing sugars (especially trehalose) may protect against chilling stress by stabilizing cell membranes, preventing protein denaturation, and scavenging free radicals (Benaroudj et al., 2001; Farooq et al., 2009).

Glycine betaine (GB), an amino acid derivative, is a cryoprotective solute that protects the activities of enzymes and proteins and stabilizes membranes and the photosynthetic apparatus under chilling (12–14/3–4°C day/night) and freezing temperatures at the bud and pod filling stages (Rhodes and Hanson, 1993; McNeil et al., 1999; Nayyar et al., 2005c). Cold stress (12–14/3–4°C day/night at bud stage) decreased the endogenous GB concentration in chickpea leaves and flowers, resulting in the loss of pods (Nayyar et al., 2005c). GB applied exogenously to chickpea plants at the bud and pod filling stages during cold stress improved flower function, pollen germination, pollen tube growth, stigma receptivity, and ovule viability, leading to floral retention, pod set, and pod retention (Nayyar et al., 2005c). Moreover, treatment with GB at the pod filling stage improved seed yield per plant and the number of seeds per 100 pods. Cold tolerance induced by GB may be related to an increase in relative leaf water content (RLWC), chlorophyll, and sucrose, and a decrease in ABA and in indicators of oxidative stress (malondialdehyde and hydrogen peroxide) (Nayyar et al., 2005b; Nayyar et al., 2005d; Nayyar et al., 2005e). Possible roles for GB in stress tolerance include stabilization of complex proteins and membranes in vivo, protection of transcriptional and translational machinery, and action as a molecular chaperone for refolding enzymes (Rhodes and Hanson, 1993).

Cold stress is lethal to most plants; despite this, temperate plants survive the winter months through acclimation processes, which suggests that plant exposure to low but not freezing temperatures confers cold tolerance (Bohn et al., 2007). A comparative study on cold-acclimated (CA) and non-acclimated (NA) chickpea plants showed an increase in the ratio of unsaturated to saturated fatty acids in CA plants (Kazemi-Shahandashti et al., 2014). Antioxidative enzymes, such as SOD, CAT, guaiacol peroxidase (GPX), and lipoxygenase (LOX), were highly active in CA plants and resulted in enhanced cold tolerance, compared to NA plants. The transcription levels of CaCAT and CaSOD genes were higher in CA plants than in NA plants. Moreover, the transcription level of the Ca-Rubisco gene was higher in CA plants than in NA plants. Thus, cold acclimation (23°C for 20 days, 10°C for 5 days, followed by −10°C for 15 min) had a positive effect on chickpea plants during long-term cold stress (Kazemi-Shahandashti et al., 2014), and may be a critical means of increasing cold tolerance.

#### Genomics and Transcriptomics in Elucidating Molecular Responses of Chickpea Under Cold

The "omics" approaches such as genomics, transcriptomics, proteomics, and metabolomics have become an integral part of scientific strategies to study the regulation of plant responses to abiotic and biotic stresses. Of genomics and transcriptomics, genomics provides knowledge of the structure of the genome, including genes, promoters, and regulatory elements, whereas transcriptomics elucidates the functional component of the genome at any stage of plant growth. Consequently, transcriptomics reveals not only changes in the expression of genes in a plant under abiotic stress but also the gene regulatory mechanisms that govern differential expression of genes. Transcriptomics also provides information on differences in gene regulation and expression between tolerant and sensitive genotypes, thereby indicating the mechanisms that lead to tolerance or susceptibility. Such detailed information can also be used to understand coordination among different regulatory pathways and may be exploited in agricultural crops to develop appropriate strategies to manage abiotic stresses under field conditions. In chickpea, global transcriptome profiling using complementary DNA-amplified fragment length polymorphism (cDNA-AFLP), differential display, or microarray techniques has been used to identify genes of potential importance for acclimatization/tolerance to cold and to elucidate the pathways regulating this process (Mantri et al., 2007; Dinari et al., 2013; Sharma and Nayyar, 2014). Using microarrays, 210 differentially expressed genes under cold were identified (Mantri et al., 2007). The cDNA-AFLP approach with 256 primer combinations revealed different transcript-derived fragments (TDFs) associated with cold in chickpea leaves (Dinari et al., 2013). Some of the TDFs showed a differential expression pattern and had putative functions associated with transport, signal transduction pathways, metabolism, and transcription factors.
Various genes are activated in chickpea during low-temperature stress, which encode transcription factors and components involved in detoxification processes and cell signaling. For example, the gene encoding phosphatidylinositol-4-kinase, a key enzyme in the influx of Ca<sup>2+</sup> into the cytoplasm, was expressed in the Jk649809 and Jk649838 chickpea genotypes (Scebba et al., 1998). The mitogen-activated protein kinase was also up-regulated in Jk649803 during cold acclimatization and might be a signal molecule for cold tolerance. It was concluded that cold tolerance in chickpea is regulated by a relatively small number of genes (Dinari et al., 2013).

Transcriptome analysis of meiotic anthers of chickpea revealed that cold-tolerance-associated genes belonged to four main categories: carbohydrate/triacylglycerol metabolism, pollen development, signal transduction, and transport (Sharma and Nayyar, 2014). All of the genes in these four categories were upregulated in cold-tolerant anthers, with the exception of one pollen development gene that was downregulated. Genes involved in microspore/pollen growth (tetrad separation, pollen expansion, increased vascular transport, fatty acid transport, pollen maturation, pollen exine formation, pollen tube growth, fertility, and pollen development) were switched on in the cold-tolerant genotype under cold stress (Sharma and Nayyar, 2014). Upregulation of genes associated with carbohydrate and triacylglycerol metabolism suggests that cold-tolerant chickpea plants produce viable pollen during chilling stress by maintaining pollen development and carbohydrate/triacylglycerol metabolic pathways (Sharma and Nayyar, 2014). Another study reported increased expression of 109 and 210 genes when chickpea was exposed to drought and cold stress, respectively (Mantri et al., 2007). Of these, 15 and 30 genes, respectively, were differentially expressed between tolerant and sensitive genotypes and coded for various regulatory and functional proteins. Significant differences were observed in stress responses within and between tolerant and susceptible genotypes, indicating multi-gene control and a complex abiotic stress response mechanism in chickpea. This study demonstrated that the leaves of cold-tolerant chickpea overexpressed serine/threonine protein kinase, while the flowers of cold-sensitive chickpea up-regulated SOD, a copper chaperone precursor involved in oxidative stress. Auxin-repressed protein (DY475078) and auxin-responsive protein IAA9 (DY396315) transcripts, which are involved in cell rescue, were induced in the flowers and leaves of both sensitive genotypes.
Two phosphate-induced proteins (DY475076 and DY475172) were induced in flowers/pods of the tolerant-1 (Sonali) chickpea genotype (Mantri et al., 2007). It is worth mentioning here that phosphorus is responsible for flower formation and seed production. Sucrose synthase (DY475105) was also induced in leaves of Sonali, which leads to the accumulation of sucrose that functions as an osmolyte and may provide cold tolerance.

To compare similarities and differences between cold-stressed anthers and gynoecium, a small subset of 25 genes that were upregulated in anthers under cold was used to study gene expression in the gynoecium (Sharma and Nayyar, 2014). While all the genes were expressed in both organs, nine had contrasting expression patterns, i.e., an increase in one organ and a decrease in the other (Sharma and Nayyar, 2014). The genes expressed under cold were also compared with those expressed under drought and salinity (Mantri et al., 2007). Some of the genes were common between the stresses while others were unique (Mantri et al., 2007; Mantri et al., 2010), which suggests that some segments of the abiotic stress-responsive machinery are shared by different abiotic stresses.

Whole genome sequencing (WGS) has also provided insights into cold-tolerance mechanisms in chickpea. The technique has been exploited to generate genomic resources for a better understanding of cold tolerance and cold susceptibility in chickpea, such as the identification of a flowering repressor gene, MtVRN2, in the confidence interval of a QTL (Mugabe et al., 2019), using the reference genome of CDC Frontier chickpea. WGS has also been used to identify mitogen-activated protein kinases (MAPKs) in chickpea and the impact of cold on their expression. Of the 19 MAPK genes detected in chickpea, 15 were induced by low temperature (4°C, chilling stress) compared to control plants (Singh et al., 2018). Similarly, 36 genes encoding the K<sup>+</sup> transport system in the chickpea genome were identified, along with their promoters with putative cold signals (Azeem et al., 2018). These studies provided vital new information about genes that might be associated with cold tolerance in chickpea and indicated that cold-tolerance mechanisms might differ between organs, e.g., leaf, anther, and gynoecium. To confirm the association of these candidate genes with cold tolerance or cold susceptibility, further studies need to be conducted using appropriate models.

There is also a study indicating that changes in methylation patterns may be associated with cold tolerance in chickpea. Prolonged cold stress in a cold-tolerant genotype increased demethylation, relative to a cold-susceptible genotype, suggesting a higher potential for activation of cold-stress-responsive genes (Rakei et al., 2016). Thus, WGS and its further exploitation have generated genomic resources and enhanced our understanding of the mechanisms governing cold tolerance/susceptibility in chickpea. These resources are ideal starting points for subsequent studies aimed at the regulation of cold tolerance in chickpea. The recent description of flower and anther development stages in chickpea (Kiran et al., 2019) is also expected to aid in the identification of molecular mechanisms for cold tolerance during different stages of anther development.

Physiological studies (see previous sections for details) point to a prominent role of carbohydrate metabolism, antioxidants, and free amino acids in cold tolerance; however, gene regulatory networks for carbohydrates, antioxidants, and free amino acids under cold stress have not been studied in detail. To understand the intricacies and reveal a complete picture of cold susceptibility or tolerance in chickpea, merging physiological and gene-regulation knowledge under cold stress is essential. There is also a need to generate information on gene regulation/expression for antioxidants, carbohydrates, and free amino acids where physiological studies have already been conducted. Since the mechanisms of cold tolerance in leaves may differ from those in flowers, which are complex organs involving microsporogenesis, microgametogenesis, megasporogenesis, pollination, fertilization, and seed development (Kiran et al., 2019), studies are also needed to understand how cold-tolerant genotypes maintain pollen and ovule viability under cold stress.

#### Genetic Variability and Breeding for Cold Tolerance

Winter-sown chickpeas face cold stress during reproductive growth, resulting in flower drop, pod drop, and poor seed set (India and Australia), and restricted vegetative growth in young plants (Mediterranean region) (Singh et al., 1989; Saxena, 1990; Chaturvedi et al., 2009; Sharma and Nayyar, 2014; Sharma and Nayyar, 2016). The cold environment differs in these chickpea cultivation areas; temperatures remain subzero (freezing) for some time during early crop growth in the Mediterranean region but usually stay above zero in the Indian and Australian regions. Consequently, the goals of cold-tolerance breeding will vary between regions, i.e., genotypes should be selected for freezing tolerance (below 0°C) during early growth in the Mediterranean region and chilling tolerance (up to 0°C) during reproductive growth in the Indian subcontinent (Chaturvedi et al., 2009). Screening scales based on plant death at subzero temperatures are well described for cold-tolerant chickpea germplasm (Singh et al., 1989 [1–9 scale]; Saccardo and Calcagno, 1990 [0–5 scale]). However, no screening scales have been devised to identify chilling tolerance during reproductive growth, which appears to be due to the complexity of processes in the reproductive phase (flowering, podding, seed set, seed development, etc.) and of the mechanisms by which cold impedes flower, anther, and pod development (Sharma and Nayyar, 2014; Kiran et al., 2019). Moreover, temperature sensitivity varies for flower, pod, and seed growth. For example, the critical temperature for seed growth is higher than that required for pod set (Srinivasan et al., 1998). Evidence is emerging that pod set is related to cumulative temperature rather than minimum temperature, as plants growing at 0°C night temperature and 20°C day temperature bore pods (Srinivasan et al., 1998). These observations need to be confirmed, as an earlier study reported that pod set only occurred at minimum night temperatures above 8°C (Saxena, 1990).

Several studies have been undertaken on freezing tolerance in the cultigens or Cicer species. Within C. arietinum, germplasm including M 450, ILC 8262, ICCV 88501, ICCV 88502, ICCV 88503, ICCV 88506, FLIP 84-70C, FLIP 84-71C, and FLIP 84-79C are tolerant to cold (Singh et al., 1990; Singh and Saxena, 1993) along with FLIP 81-293C, FLIP 82-127C, FLIP 82-128C (Wery, 1990), ILC 8262 (a germplasm line), ILC 8617 (a mutant) and FLIP 87-82C (a breeding line) (Singh et al., 1995), ICCV 88501 and ICCV 88503 (Srinivasan et al., 1998), FLIP 95-255C, FLIP 93-260C and Sel95TH1716 (Kanouni et al., 2009), and Sel96TH11404, Sel96TH11439, Sel96TH11488, Sel98TH11518, x03TH21, and FLIP 93-261C (Saeed et al., 2010). Freezing tolerance in chickpea is dominant over susceptibility and controlled by at least five sets of genes (Malhotra and Singh, 1990). Further genetic analysis revealed the presence of genic interactions (additive × additive and dominance × dominance) with duplicate epistasis and additive gene effects (Malhotra and Singh, 1991). The two types of chickpea, desi and kabuli, do not differ in their reaction to cold (Berger et al., 2012).

There is growing evidence that wild relatives of chickpea possess a higher degree of cold tolerance than the cultigens (Singh et al., 1995; Berger et al., 2012). Wild Cicer species of the primary gene pool are readily crossable with the cultigens and can be potential donors of cold tolerance. Wild species have been evaluated extensively for cold tolerance in freezing environments (young plants) and, to a limited extent, in chilling environments (at the reproductive stage). Among the wild relatives, Cicer bijugum, C. echinospermum, and Cicer judaicum were more cold-tolerant than C. arietinum during early growth (Singh et al., 1990; Malhotra, 1998) or at the reproductive stage (Berger et al., 2012). Among 59 lines from seven annual wild Cicer species, 26 lines of C. reticulatum, 10 of C. bijugum, 4 of C. echinospermum, 2 of Cicer pinnatifidum, and 1 of C. judaicum tolerated freezing (subzero conditions) during early vegetative growth (Singh et al., 1995). Among the cold-tolerant wild species, five lines of C. bijugum and four of C. reticulatum (highly tolerant) were superior to the cultigens for cold tolerance. In another study, Toker (2005) evaluated 43 accessions of eight annual wild Cicer species (C. bijugum, Cicer chorassanicum, Cicer cuneatum, C. echinospermum, C. judaicum, C. pinnatifidum, C. reticulatum, and Cicer yamashitae) for cold tolerance in young plants at subzero temperatures (freezing tolerance). C. bijugum was the best source of cold tolerance, with all six accessions under study being cold-tolerant (AWC 6: free from any damage; AWC 2 and AWC 4: highly tolerant; AWC 1, AWC 3, and AWC 5: tolerant) (Toker, 2005). Eleven of 15 accessions of C. reticulatum, 4 of 8 of C. echinospermum, and 1 of 5 of C. pinnatifidum (score 3) were cold-tolerant.

Chilling-tolerant chickpea germplasm, namely CTS 60543 (ICCV88516) and CTS11308 (ICCV88510), has been identified (Clarke and Siddique, 2004). Pollen selection [transfer of plants to cold stress (12/7°C) for 3 days immediately after pollination followed by F1 seed collection] was used to develop chilling-tolerant chickpea varieties including Rupali (WACPE 2095) and Sonali (WACPE 2075) (Clarke et al., 2004). Similar to freezing stress, accessions of C. arietinum had less chilling tolerance than wild accessions (Berger et al., 2012). Even Rupali and WACPE 2078, developed by Clarke et al. (2004), had large flower–pod intervals (>65 days) when grown at ∼10°C post-anthesis, indicating a low degree of cold tolerance (Berger et al., 2006). Among the wild species, an accession of C. echinospermum had robust chilling tolerance, whereas JM2106 of C. reticulatum was also chilling tolerant (Clarke and Siddique, 2004; Berger et al., 2012). The C. echinospermum accession not only expressed the early podding character at low temperature but also yielded five times more than the most productive chickpea cultivar. With duplications in gene bank accessions of wild species of Cicer (Croser et al., 2003), the actual number of cold-tolerant sources may be lower than that reported in the literature. Nonetheless, wild Cicer species are important sources for improving cold tolerance in chickpea.

One of the major consequences of low temperature has been hypothesized to be low sink utilization in northern regions of India, where low temperature causes flower abortion or failure to set pods (Saxena et al., 1988). To improve the harvest index, which is lowered by pod set failure in this region, chilling-tolerant lines were crossed with agronomically desirable lines (Saxena et al., 1988). Early flowering and podding in crossbred lines gave a higher harvest index (50–54%) than late-flowering lines (39–42%). Cold-tolerant wild species of Cicer, namely C. reticulatum and C. echinospermum, have also been exploited to develop high-yielding chickpea (Singh and Ocampo, 1997). Cold-tolerant and Fusarium wilt resistant accessions of C. reticulatum (ILWC 124) and C. echinospermum (ILWC 179) were crossed with a cultigen (ILC 482); one of the progenies out-yielded ILC 482 by 39%. In another study, lines derived from a cross of cultivated chickpea and C. reticulatum out-yielded the check cultivars (Singh et al., 2005). Both studies showed that wild Cicer is not only a source of tolerance for abiotic stresses and diseases but can also contribute to yield enhancement in chickpea. Both chilling tolerance during reproductive growth and yield enhancement in pedigree lines indicate that wild species of the primary gene pool have the potential to increase chickpea productivity in Australia and the Indian subcontinent (the region with the maximum area under chickpea), where cold stress coincides with the reproductive phase of the crop and productivity is low.
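Harvest index, the trait compared between early- and late-flowering lines above, is conventionally the share of total above-ground biomass that ends up as seed. The Python sketch below shows that calculation; the function name and the per-plant weights are hypothetical, chosen only so that the results fall inside the ranges quoted above, and are not data from Saxena et al. (1988).

```python
def harvest_index(seed_yield_g: float, total_biomass_g: float) -> float:
    """Return harvest index (%) = seed yield / total above-ground biomass * 100."""
    if total_biomass_g <= 0:
        raise ValueError("total biomass must be positive")
    if not 0 <= seed_yield_g <= total_biomass_g:
        raise ValueError("seed yield must lie between 0 and total biomass")
    return 100.0 * seed_yield_g / total_biomass_g


# Hypothetical per-plant weights (grams), for illustration only
early_line = harvest_index(seed_yield_g=13.0, total_biomass_g=25.0)
late_line = harvest_index(seed_yield_g=10.0, total_biomass_g=25.0)
print(f"early-flowering line: {early_line:.0f}%, late-flowering line: {late_line:.0f}%")
```

With equal biomass, a line that sets more pods before cold-induced abortion converts a larger share of its biomass into seed, which is the sink-utilization argument made above.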

#### Genomics Advancements for Developing Cold Stress Tolerance in Chickpea

Generation of adequate genomic resources such as simple sequence repeat (SSR) markers and single nucleotide polymorphisms (SNPs) is essential for gene/QTL mapping and for identifying genes in QTL intervals. Currently available bioinformatics tools allow identification of the molecular and biological functions of genes in QTL intervals based on existing scientific information, thereby allowing the selection of candidate genes governing the trait. Gene-linked markers or QTLs can also be used to track the introgression of gene(s) into elite cultivars using a technique called foreground selection, and to monitor recovery of the recurrent parent genome using background selection. Our understanding of cold tolerance in chickpea has increased considerably in the last decade, primarily due to advances in sequencing technologies that enabled large-scale decoding of genomic sequences at lower cost, leading to gene identification, insights into gene regulation, and large-scale development of DNA-based markers such as SSRs and SNPs. The development of reference genome sequences in chickpea (Jain et al., 2013; Varshney et al., 2013b; Parween et al., 2015) provided the much-needed push for advancing genomic resources in chickpea, including the development of SSR or SNP markers and the identification of candidate genes within QTL intervals. Marker development has allowed the identification of QTLs governing tolerance to abiotic stresses. Association mapping of a panel of 44 genotypes was used to search for QTLs associated with freezing tolerance; however, no QTL associated with cold tolerance could be identified (Saeed and Darvishzadeh, 2017). The lack of adequate marker density appears to explain the non-detection of QTLs linked to cold tolerance, as only 64 AFLP markers were used. Recently, a mapping population of 129 recombinant inbred lines (RILs), derived from an interspecific cross between ICC 4958 (cold-sensitive, desi type, C. arietinum) and PI 489777 (cold-tolerant wild relative, C. reticulatum Ladiz), followed by genotyping-by-sequencing, was used to identify QTLs linked to cold tolerance (Mugabe et al., 2019).
A total of 747 SNP markers, spanning 393.7 cM, were used in this study. The SNPs were more abundant than traditional markers and had considerably higher marker density, with an average of 1.8 SNPs cM<sup>−1</sup>. Freezing tolerance in PI 489777 was governed by three QTLs situated on linkage groups (LGs) 1B, 3, and 8 (Mugabe et al., 2019); CT Ca-3.1 (on LG3) and CT Ca-8.1 (on LG8) were more important and accounted for 34 and 48% of the phenotypic variance for cold tolerance, respectively. One of the parents used in the study, C. reticulatum, requires vernalization, i.e., acceleration of flowering following brief spells of cold exposure (van Oss et al., 2015), and QTLs for vernalization response were also identified using a RIL population where one of the parents was PI 489777 (Samineni et al., 2016). It is worth mentioning here that the cultigen, C. arietinum, does not respond to vernalization (Berger et al., 2005). Using 1,291 loci [SSRs, diversity array technology (DArT), cleaved amplified polymorphic sequences (CAPs), legacy markers, etc.] for QTL identification, a major vernalization response QTL was identified (Samineni et al., 2016). The QTL spanned 22 cM on LG3 and explained 47.9 to 54.9% of the phenotypic variation. Both studies, Samineni et al. (2016) and Mugabe et al. (2019), used the same cold-tolerant and vernalization-responsive parent (PI 489777) and identified the same QTL (CT Ca-3.1) linked to cold tolerance and vernalization response. This finding necessitates further research to determine the relationship between cold tolerance and the vernalization response machinery in Cicer species.
Using CDC Frontier chickpea as a reference genome, a homolog of the Medicago truncatula vernalization gene named VERNALISATION2-LIKE VEFS-box gene (MtVRN2) was mapped in the CT Ca-3.1 confidence interval (Mugabe et al., 2019). MtVRN2 is a repressor of the flowering locus T gene homolog from M. truncatula and of the transition to flowering (Jaudal et al., 2016). This example demonstrates that genome sequences can be exploited effectively to narrow down possible candidate genes in QTL regions, and that vernalization response in Cicer might be inversely related to flowering. Nonetheless, QTLs governing cold tolerance in chickpea and candidate cold-tolerance genes within these intervals remain poorly explored, as no information is available for QTLs in other cold-tolerant genotypes of C. reticulatum. Moreover, QTLs for cold tolerance within cold-tolerant genotypes of C. arietinum and another cold-tolerant annual wild relative, Cicer echinospermum, are yet to be identified. In addition, no efforts have so far been made to transfer cold-tolerance QTLs from C. reticulatum to C. arietinum.
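The marker density quoted above for the Mugabe et al. (2019) map can be cross-checked by dividing the number of markers by the map length. The Python sketch below does this with the two figures given in the text (747 SNPs, 393.7 cM); the function names are illustrative. The plain ratio comes out at roughly 1.9 SNPs per cM, close to the reported average of 1.8; the small gap may reflect rounding or a per-linkage-group averaging in the original study, which is an assumption here rather than something the source states.

```python
def marker_density(n_markers: int, map_length_cm: float) -> float:
    """Return average marker density in markers per centimorgan."""
    if n_markers <= 0 or map_length_cm <= 0:
        raise ValueError("marker count and map length must be positive")
    return n_markers / map_length_cm


def map_length_per_marker(n_markers: int, map_length_cm: float) -> float:
    """Return the average map length (cM) covered per marker."""
    return 1.0 / marker_density(n_markers, map_length_cm)


# Figures quoted in the text for the ICC 4958 x PI 489777 RIL map
density = marker_density(n_markers=747, map_length_cm=393.7)
spacing = map_length_per_marker(n_markers=747, map_length_cm=393.7)
print(f"{density:.2f} SNPs per cM, about {spacing:.2f} cM per SNP")
```

The same ratio makes the contrast with the earlier association study concrete: 64 AFLP markers spread over a genome of comparable map length would give a far sparser map, which is consistent with the non-detection of QTLs noted above.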

#### Impacts of Heat Stress

Excessive heat stress affects all aspects of chickpea growth, phenology, and development (Devasirvatham et al., 2012; Devasirvatham et al., 2013; Kaushal et al., 2013), including biomass, flowering duration, pod number, days to maturity, seed weight, and grain yield (Upadhyaya et al., 2011; Kaushal et al., 2013), as well as a wide range of developmental and physiological processes. The impacts of heat stress at different stages of plant growth and development in chickpea are described below.

#### Germination and Vegetative Growth

High temperatures affect seed germination in chickpea; genotypic variation was observed for high-temperature tolerance at seed germination, with no germination above 45°C (Singh and Dhaliwal, 1972; Ibrahim, 2011). High temperatures also reduced seedling growth (Kaushal et al., 2013) and even caused seedling death (Kaushal et al., 2011). Controlled environment studies showed significant biomass increases in both tolerant and sensitive genotypes at 35/25°C, whereas exposure to 40/30°C decreased biomass at maturity in all genotypes, more so in the sensitive genotypes (Kumar et al., 2013).

#### Reproductive Growth

Heat stress limits chickpea growth and vigor at all phenological stages, but the reproductive phase is considered more sensitive to temperature extremes than the vegetative stage (Sita et al., 2017). Heat stress during reproduction generally 1) reduces flower number, 2) increases flower abortion, 3) alters anther locule number, 4) causes pollen sterility with poor pollen germination, 5) reduces fertilization and stigma receptivity, 6) causes ovary abnormalities, 7) reduces the remobilization of photosynthates to seeds, and 8) reduces seed number, seed weight, and seed yield (Devasirvatham et al., 2012; Devasirvatham et al., 2013; Kaushal et al., 2013). Exposure of chickpea to heat stress (35/20°C) pre-anthesis reduced anther development, pollen production, and fertility by inducing physiological abnormalities (Devasirvatham et al., 2012). High temperature can induce anther and pollen structural aberrations, such as alterations in anther locule number, anther epidermis wall thickening, and pollen sterility, which are key factors reducing chickpea yield under high temperature (Devasirvatham et al., 2013). In chickpea, pollen is more sensitive to heat stress than the female gametophyte (Devasirvatham et al., 2012). High-temperature stress post-anthesis has been associated with poor pollen germination, pollen tube growth, and fertilization, and with the loss of stigma receptivity (Kaushal et al., 2013; Kumar et al., 2013), which reduce seed number, seed weight, and seed yield (Summerfield et al., 1984; Wang et al., 2006). Temperatures above 45°C are detrimental to pollen fertility and stigma function in chickpea (Devasirvatham et al., 2015).

Heat stress enhanced oxidative stress and lowered leaf photosynthesis, which reduced the soluble carbohydrate and ATP contents in the pistil and prevented nutrient transport from the style to the pollen tube, thus inhibiting pollen tube growth and ovary development (Kumar et al., 2013). Screening chickpea genotypes for heat sensitivity revealed substantial genetic variation in a high-temperature environment (Krishnamurthy et al., 2011; Devasirvatham et al., 2015). Heat-tolerant chickpea genotypes produced pods at temperatures above 35/20°C, while sensitive genotypes aborted most of their flowers (Kaushal et al., 2013). Devasirvatham et al. (2013) reported greater pod set in heat-tolerant genotypes (ICC 1205 and ICC 15614) than in heat-sensitive genotypes (ICC 4567 and ICC 10685).

#### Influence of Heat Stress on Physiology

Some vital physiological traits, including chlorophyll concentration, photosynthetic rate, and membrane stability of leaf tissue, can be used as indicators of heat sensitivity (Hasanuzzaman et al., 2013). Chickpea is relatively more sensitive than other legumes in terms of membrane stability and photosystem II function at high temperatures (50°C for 48 h) (Srinivasan et al., 1996). Heat stress (35/16°C for 10 days) induces leaf senescence in chickpea (Wang et al., 2006) by disrupting the chloroplasts and damaging chlorophyll. Heat stress (>32/20°C during reproductive stage) reduced the chlorophyll content in chickpea leaves, which caused chlorosis (Kaushal et al., 2013); this loss may have occurred due to photooxidative stress or inhibition of chlorophyll synthesis (Guo et al., 2006). Heat stress (>32/20°C during reproductive stage) caused more leaf damage in a heat-sensitive than in a heat-tolerant chickpea genotype, due to a greater reduction in leaf water status (as RLWC), a possible decline in stomatal conductance, and restricted root hydraulic conductivity (Kaushal et al., 2013). Transpiration efficiency in chickpea decreased with increasing temperature (Singh et al., 1982). The quantum yield or photosystem II (PSII) activity in chickpea was not affected at 35°C, but a noticeable reduction occurred at 46°C (during pod filling) that caused irreversible damage to photosynthetic systems (Basu et al., 2009). Similarly, Srinivasan et al. (1996) reported severe damage to PSII at 50°C for 48 h in chickpea. Temperatures above 35°C during the reproductive stage suppressed photosynthesis and electron flow and disrupted metabolic pathways, reducing grain size (Kaushal et al., 2013; Awasthi et al., 2014; Redden et al., 2014).

Heat stress alters the fluidity of plasmalemma, mitochondria, and chloroplast membranes, which can disintegrate the lipid bilayer to change the protein conformation and cause protein unfolding (Pastor et al., 2007). Heat stress also results in the production of ROS that damage photosynthetic apparatus and other components, thus hampering metabolic activity (Allakhverdiev et al., 2008; Das and Roychoudhury, 2014). Respiration is more temperature-sensitive than photosynthesis (Hatfield et al., 2011). At 45/35°C (day/night), the cellular oxidizing ability of chickpea plants reduced appreciably at vegetative stage (Kumar et al., 2013), suggesting impaired respiration and energy generation, possibly due to the inactivation of enzymes (Salvucci and Crafts-Brandner, 2004).

At high temperature (>32/20°C), sucrose synthesis decreased due to the inhibition (40–43%) of sucrose-synthesizing enzymes (sucrose synthase and sucrose phosphate synthase), which impaired sucrose metabolism in chickpea leaves during the reproductive phase (Kaushal et al., 2013). As a result, the sucrose flow to flowers in heat-sensitive genotypes decreased considerably, affecting the developmental and functional aspects of pollen grains and resulting in poor fertilization and pod set (Kaushal et al., 2013). High temperatures (32/20°C day/night) from anthesis to maturity reduced starch deposition in chickpea grains because of reduced activity of ADP-glucose pyrophosphorylase and starch synthase (Vu et al., 2001; Awasthi et al., 2014), resulting in reduced grain weight.

#### Cellular Mechanisms for Survival Under Heat

Under heat stress (>35/23°C day/night) at the time of flowering, chickpea experiences adverse effects on growth and various metabolic processes that lead to alterations in the redox state of the cell (Kaushal et al., 2011; Awasthi et al., 2015). At high temperature (37 and 42°C for 10 h), ROS generation causes oxidative damage to vital cellular components, such as membrane lipids, proteins, nucleic acids, pigments, and enzymes (Rivero et al., 2001; Suzuki and Mittler, 2006; Yin et al., 2008). The ROS that cause this oxidative damage comprise both free radicals, including hydroxyl radicals (OH˙), superoxide (O<sub>2</sub><sup>−</sup>), and alkoxyl radicals, and non-radicals such as hydrogen peroxide (H<sub>2</sub>O<sub>2</sub>) and singlet oxygen (<sup>1</sup>O<sub>2</sub>) (Suzuki and Mittler, 2006). At 40/30 and 45/35°C during the growth and germination stage, increased lipid peroxidation and hydrogen peroxide levels in the leaves of heat-sensitive chickpea genotypes caused more leaf damage than in tolerant genotypes (Kaushal et al., 2011; Kumar et al., 2012b; Kumar et al., 2013). Heat tolerance mechanisms in chickpea are potentially characterized by higher levels of antioxidants and osmolytes (Kaushal et al., 2011), which maintain membrane integrity, protect macromolecules, and sustain metabolism, leading to heat acclimatization. Under stressful conditions, plants tend to combat ROS production by inducing an antioxidant system consisting of enzymatic and non-enzymatic components (Gill et al., 2012); for example, in chickpea, the activities of SOD, catalase (CAT), and ascorbate peroxidase (APX) increased at 40/35°C during the growth and germination stage but decreased at 45/40°C (Kaushal et al., 2011). A similar pattern was observed for the non-enzymatic antioxidants ascorbate (ASC) and glutathione (GSH). The inhibition of these enzymatic and non-enzymatic antioxidants was much greater in the heat-sensitive genotypes, in which the antioxidants increased at 40/35°C but declined at 45/40°C (Kaushal et al., 2011). Exogenous application of proline (Pro), an osmolyte, significantly increased SOD, CAT, ASC, and GSH levels at 45/40°C in chickpea, relative to the plants grown without proline (Kaushal et al., 2011).

Salicylic acid (SA) plays a key role in providing tolerance against temperature stress in chickpea. Heat-stress-induced membrane damage in chickpea plants declined significantly with the application of SA, relative to the untreated control and heat-acclimatized plants (Chakraborty and Tongden, 2005). The SA treatment also significantly altered the contents of proteins and proline and induced the activities of various stress enzymes such as peroxidase (POX), ascorbate peroxidase (APOX), and catalase (CAT) (Chakraborty and Tongden, 2005). Abscisic acid (ABA) also appears to be involved in thermotolerance of chickpea; exogenous ABA application (2.5 mM) to 4-day-old seedlings significantly alleviated the effects of heat stress (45/40°C for 10 days) in chickpea (Kumar et al., 2013) by improving plant growth and reducing oxidative damage. Another study showed that exogenous nitrogen application during pre-flowering and suitable irrigation helped to mitigate the effects of heat stress (>35°C) in chickpea (Upadhyaya et al., 2011). Heat stress (38°C for 10 days) induced the accumulation of raffinose family oligosaccharides (RFOs), such as galactinol and raffinose; galactinol synthase (GolS) is a key regulatory enzyme of RFO biosynthesis. In a recent study, galactinol and raffinose content increased significantly in response to heat stress in chickpea (Salvi et al., 2017).

During heat stress, heat shock genes encode different heat shock proteins (HSPs), which accumulate and protect cells by acting as molecular chaperones (Huang and Xu, 2008). The transcription of HSP genes is controlled by heat stress transcription factors (Hsfs), which play a prominent role in thermotolerance (Kotak et al., 2007). The recent identification of 22 Hsf genes in the chickpea genome (both desi and kabuli) has provided valuable information on thermotolerance in chickpea (Chidambaranathan et al., 2018). Quantitative PCR (Q-PCR) expression analysis of Hsfs in heat-stressed (>35°C for 3 h) chickpea at two stages of development (15-day-old seedlings and during podding) revealed that CarHsfA2, A6, and B2 were up-regulated at both stages of growth and four other Hsfs (CarHsfA2, A6a, A6c, B2a) showed early transcriptional upregulation (Chidambaranathan et al., 2018). A previous study identified three distinct classes of Hsfs (A, B, and C) (Lin et al., 2014).

Various other heat-responsive proteins induced by heat stress (42/25°C for 8 days), exclusively in the heat-tolerant chickpea genotype, may play a vital role in heat tolerance (Parankusam et al., 2017). A recent study identified a set of 482 heat-responsive proteins and several metabolic proteins, including phenylalanine ammonia lyase 2-like, pectinesterase 3, cystathionine gamma-synthase, monodehydroascorbate reductase, adenosyl methionine synthase, NADH dehydrogenase subunit, cytochrome b6, inositol-3-phosphate synthase, RNA polymerase, and ATP synthase subunit alpha protein, that were strongly related to the heat response in chickpea (Parankusam et al., 2017). Understanding the differential role and expression of these proteins in chickpea genotypes will provide important insight into the mechanisms that confer thermotolerance in chickpea.

Transcription factors (TFs) play an important role in modulating cellular responses under different stress conditions by activating the transcription of target genes. WRKY TFs are a major family of transcriptional regulators in plants that influence the stress tolerance mechanism and form an integral part of cell signaling pathways (Agarwal et al., 2011; Chen et al., 2012). In chickpea, TFs for heat tolerance have been reported [CaMIPS1 and CaMIPS2 (Kaur et al., 2008b) and Ca\_02170, Ca\_16631, Ca\_23016, Ca\_09743, Ca\_25602] (Agarwal et al., 2016). Recently, a genome-wide analysis of a WRKY TF gene model revealed the presence of 78 WRKY TFs evenly distributed across eight chromosomes in chickpea (Kumar et al., 2016). Car-WRKY TF is reportedly multi-stress responsive, playing a central role in stress signal transduction pathways (Konda et al., 2018). In the chickpea genome, seven genes homologous to components of the chromatin remodeling complex (SWR1) in Arabidopsis were identified based on homology, namely PIE1 (photoperiod independent early flowering 1), ARP6 (actin-related protein), two SEF (serrated leaf and early flowering), and three H2AZs (histone 2A variant-Z, a thermosensor in plants), and were analyzed for expression under heat stress (37°C) (Chidambaranathan et al., 2016). Of the seven genes, PIE1 was up-regulated during podding but downregulated at the seedling stage. Higher tissue-specific expression of PIE1 and SEF genes was observed in root, flower, pod wall, and grain tissues than in shoots. During pod development, all three H2AZ genes might function as thermosensors, with greater downregulation within 15 min, 1 h, and 6 h of the heat stress treatment (Chidambaranathan et al., 2016).

#### Mechanisms For Improving Heat Tolerance

The damage from high-temperature stress mainly depends on the plant's defense response and the growth stage at the time of exposure (Farooq et al., 2017). Chickpea plants use adaptive strategies to avoid, escape, and tolerate heat stress (Wery et al., 1993; Toker et al., 2007). Leaves avoid the heat by changing orientation, reducing transpiration, and reflecting light (Wery et al., 1993). In heat-stressed chickpea plants, phenology was accelerated, as days to flowering and podding decreased significantly at 35/20°C (Kaushal et al., 2013), which also reduced total plant biomass. Accelerated phenology can therefore be considered an escape mechanism, although it may be detrimental to chickpea production. Early maturation is closely correlated with reduced yield losses (Jumrani et al., 2017). In chickpea, a simple and cost-effective field screening method for heat tolerance at the reproductive stage was developed based on delayed sowing (Krishnamurthy et al., 2011), which exposes the plants to high temperatures (>35°C) during the reproductive phase; accordingly, the number of filled pods per plant in the late-sown crop was identified as a selection criterion for reproductive-stage heat tolerance. Recent research has suggested that heat stress tolerance indices (mean productivity, geometric mean productivity, yield index, tolerance index (TOL), superiority measure, and stress susceptibility index) can be used to identify chickpea genotypes based on grain yield under normal and heat-stressed conditions. Based on these selection indices, RVG 203, RSG 888, GNG 469, IPC 06-11, and JAKI 9218 had moderate to high heat tolerance (Jha et al., 2018a). Using a heat tolerance index (HTI), ICC 3362, ICC 12155, and ICC 6874 were identified as heat-tolerant lines (Krishnamurthy et al., 2011). Upadhyaya et al. (2011) identified ICC 14346 as a heat-tolerant genotype among 35 early maturing germplasm lines under ideal crop management (irrigation, nitrogen application) conditions in a field screening at Patancheru (India), based on grain yield (kg ha<sup>–1</sup>). The pollen selection method and pollen viability were used to confirm the heat tolerance in ICCV 92944 (Devasirvatham et al., 2012), ICC 1205, and ICC 15614 (Devasirvatham et al., 2013). Heat-tolerant chickpea genotypes are listed in Table 1.
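The yield-based indices named above can be illustrated with a short calculation. The sketch below assumes the conventional definitions of these indices from the general stress-screening literature; the formulas are not given in this review, and the genotype names and yields are hypothetical values chosen only for illustration. Here `Yp` denotes grain yield under normal conditions and `Ys` grain yield under heat stress.

```python
from math import sqrt
from statistics import mean


def tolerance_indices(yield_normal, yield_stress):
    """Compute common stress tolerance indices per genotype.

    Both arguments map genotype name -> grain yield (same units).
    The formulas are the conventional definitions (an assumption,
    not taken from this review):
      MP  = (Yp + Ys) / 2
      GMP = sqrt(Yp * Ys)
      TOL = Yp - Ys
      YI  = Ys / mean(Ys)
      SSI = (1 - Ys / Yp) / (1 - mean(Ys) / mean(Yp))
    """
    mean_yp = mean(yield_normal.values())
    mean_ys = mean(yield_stress.values())
    stress_intensity = 1 - mean_ys / mean_yp
    indices = {}
    for genotype, yp in yield_normal.items():
        ys = yield_stress[genotype]
        indices[genotype] = {
            "MP": (yp + ys) / 2,
            "GMP": sqrt(yp * ys),
            "TOL": yp - ys,
            "YI": ys / mean_ys,
            "SSI": (1 - ys / yp) / stress_intensity,
        }
    return indices


# Hypothetical yields (kg/ha) for two made-up genotypes.
yield_normal = {"G1": 2000.0, "G2": 1800.0}
yield_stress = {"G1": 1000.0, "G2": 1500.0}
indices = tolerance_indices(yield_normal, yield_stress)
```

Under these definitions, a genotype with a low TOL and an SSI below 1 loses proportionally less yield than the trial average, whereas high MP and GMP favor genotypes that yield well in both environments.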

Various physiological traits, such as stomatal responses, membrane thermostability, chlorophyll fluorescence (CFL), and canopy temperature depression (CTD), have been associated with heat tolerance (Priya et al., 2018). Stomatal responses to heat stress are one possible mechanism for heat adaptation in chickpea; in a recent study, stomatal conductance and leaf water content (RWC) were significantly lower in heat-sensitive genotypes, relative to the unstressed plants, and significantly higher in tolerant genotypes, when grown under a heat stress (HS) environment (>32/20°C) (Kaushal et al., 2013). Therefore, it can be assumed that stomatal conductance plays an important role during heat stress. Membrane thermostability, based on electrolyte leakage from the leaves, is another important trait that has been considered a possible selection criterion for heat tolerance in chickpea, faba bean, and lentil (Ibrahim, 2011). When tissues are subjected to high temperatures, cell membranes are damaged and solutes leak out, which increases the electrical conductivity of the bathing solution. Electrolyte leakage increased under high temperature (>32/20°C) in a heat-sensitive chickpea genotype, relative to a heat-tolerant genotype (Kaushal et al., 2013; Parankusam et al., 2017). Thermal techniques have been used to measure canopy temperature; genetic variability in CTD was reported in chickpea under high temperature (32–35°C) (Devasirvatham et al., 2012), which correlated with yield. The genotypes with lower CTD (1–3°C) had lower grain yields than those with higher CTD (>4°C) (Devasirvatham et al., 2015).
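As an illustration of how membrane thermostability is usually scored from electrolyte leakage, the following sketch assumes the common convention of expressing the conductivity of the bathing solution after stress as a percentage of the total conductivity after the tissue has been killed (for example by boiling). This formula and the readings are assumptions for illustration and are not taken from the studies cited above.

```python
def electrolyte_leakage_percent(ec_initial, ec_total):
    """Electrolyte leakage (%) = initial conductivity / total conductivity * 100.

    ec_initial: conductivity of the bathing solution after stress exposure.
    ec_total:   conductivity after the tissue is killed, releasing all solutes.
    Both readings must be in the same units (e.g. microsiemens per cm).
    """
    if ec_total <= 0:
        raise ValueError("total conductivity must be positive")
    return 100.0 * ec_initial / ec_total


# Hypothetical readings for a sensitive and a tolerant genotype.
leakage_sensitive = electrolyte_leakage_percent(60.0, 120.0)
leakage_tolerant = electrolyte_leakage_percent(30.0, 120.0)
```

A higher percentage indicates greater membrane damage, so in this made-up example the first genotype would be scored as the more heat-sensitive one.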

#### Effects of Drought in Chickpea

Chickpea is predominantly grown in resource-poor, arid, and semi-arid regions under rainfed conditions. Consequently, drought stress can decrease chickpea yields by up to 50% (Sabaghpour et al., 2006). Drought stress impairs key physiological and biochemical processes ranging from photosynthesis, CO2 availability, cell growth, respiration, stomatal conductance, to other essential cellular metabolisms (Mansfield and Atkinson, 1990; Chaves, 1991; Chaves et al., 2003; Flexas et al., 2005; Chaves et al., 2009; Pinheiro and Chaves, 2011).

In subtropical (South Asia and north-eastern Australia) and Mediterranean climatic regions (such as southern Australia), chickpea faces "terminal drought" during the reproductive phase (Leport et al., 1999; Siddique et al., 1999), which can seriously impair reproductive processes, viz. anthesis and pollination, and cause malfunction of reproductive organs, especially in pollen germination, pollen viability, fertility, and pollen tube growth, and even dysfunction of the stigma and style (Leport et al., 1998; Leport et al., 1999; Pang et al., 2017). However, drought stress at the young plant stage or prior to reproduction is not uncommon. Drought at young plant stages reduces plant growth, leading to stunting and reduced biomass accumulation (Siddique et al., 1999). Water deficit during podding in chickpea increased ABA, which may impair pod set and cause pod abscission, ultimately causing significant yield losses (Pang et al., 2017). Drought stress in chickpea can also lead to the collapse of symbiotic N2 fixation processes, resulting in serious yield losses (Wery et al., 1993).

#### Genetic Variability for Capturing Drought Stress Tolerance in Chickpea

The exploitation of natural genetic variation across various crop gene pools remains central to improving drought stress tolerance in crops, including chickpea. Considerable genetic variability for drought stress tolerance in chickpea has been recorded for various morpho-physiological and grain yield-related parameters under contrasting water regimes in the field (Krishnamurthy et al., 2010; Jha et al., 2014; Pang et al., 2017). Several chickpea genotypes have been identified through simple field-based screening techniques and superior yield performance under non-stressed and water-stressed conditions (Singh et al., 1997b; Toker and Cagirgan, 1998; Canci and Toker, 2009). Likewise, stress tolerance indices, viz. the drought susceptibility index and drought tolerance index, identified significant genetic variability for various phenological and yield-related traits under water stress in a large mini-core collection of 211 accessions (Krishnamurthy et al., 2010) (Table 1).

#### TABLE 1 | List of chickpea genotypes tolerant to heat, cold, and drought stress.

Among wild species, which are an important reservoir for imparting drought tolerance, Cicer anatolicum, Cicer microphyllum, and Cicer songaricum are worth mentioning (Toker et al., 2007). Likewise, Kashiwagi et al. (2005) identified chickpea landraces in the Mediterranean, west Asian, and central Asian regions with high genetic variability for root length density that could be exploited for developing high water-use-efficient chickpea genotypes under water stress. Water use efficiency (WUE) is an important strategy for drought tolerance in crop plants, including chickpea (Condon et al., 2004; Zaman-Allah et al., 2011a; Zaman-Allah et al., 2011b), where a significant amount of genetic variability has been recorded (Pang et al., 2017). These authors identified "Neelam" as a drought-tolerant genotype based on its high WUE, as this genotype used a "conservative water use strategy" during early growth to maintain higher seed yields under water stress.

Root architecture traits are important parameters for improving crop performance under drought stress (Wasaya et al., 2018; Ye et al., 2018). Considerable progress has been made in elucidating the role of various root traits for drought stress tolerance in chickpea (Kashiwagi et al., 2006a; Kashiwagi et al., 2015). How root biomass, root length, and other root-related parameters, such as root length density (RLD), total root dry weight (RDW), and deep root dry weight (deep RDW), contribute to drought stress tolerance has been investigated in chickpea (Krishnamurthy et al., 2003; Kashiwagi et al., 2005; Gaur et al., 2008; Kashiwagi et al., 2008; Kashiwagi et al., 2015; Purushothaman et al., 2016; Chen et al., 2017). A significant amount of genetic variability for RLD in the mini-core collection and wild species of chickpea has been reported (Kashiwagi et al., 2005). Given their larger RLD, deep rooting system, and higher root biomass production, ICC 4958 and ICC 8261 genotypes are used extensively as donors for transferring important drought adaptive root traits to elite chickpea cultivars to develop drought-resilient chickpea cultivars (Saxena et al., 1993; Gaur et al., 2008). In addition, ICC 4958 remains one of the most extensively studied chickpea genotypes both in classical and modern molecular breeding programs for dissection of various traits, including drought-stress-related root traits.

Thus, these genotypes (ICC 4958 and ICC 8261) have been steadily incorporated into drought tolerance breeding programs for transferring the above-mentioned traits into elite chickpea varieties and developing mapping populations for deciphering drought-tolerant QTLs (Gaur et al., 2012). Concurrently, efforts are underway to develop multi-parent advanced generation inter-cross (MAGIC) populations by incorporating ICC 4958, JG 130, ICCV 10, JAKI 9218, JG 130, JG 16, ICCV 97105, and ICCV 00108, genotypes possessing drought and heat tolerance genomic regions/QTLs (Devasirvatham and Tan, 2018). Thus, selection from the resultant crosses could increase genetic gain in chickpea. Moreover, Chen et al. (2017) provided scope for improving drought tolerance in chickpea by investigating 30 root-related traits and three shoot-related traits in a large core collection of 270 genotypes. 13C discrimination, an important physiological selection parameter related to water stress, could also be used to enhance WUE under drought stress (Condon et al., 2002). A significant amount of genetic variability for 13C discrimination has been recorded in the chickpea reference germplasm collection (n = 280) (Upadhyaya et al., 2008; Krishnamurthy et al., 2013b).

Advancements in breeding techniques such as MAGIC have enabled the transfer of drought- and heat-tolerance traits into elite high-yielding chickpea cultivars by combining favorable alleles for drought and heat tolerance (Gaur et al., 2014; Gaur et al., 2019). Furthermore, marker-assisted recurrent selection (MARS) and marker-assisted backcrossing (MABC) have been successfully used to transfer a "QTL-hotspot" genomic region harboring important drought-tolerance-related traits from the donor parent ICC 4958 to the elite cultivar JG 11 (Varshney et al., 2016).

#### Role of Physiological Traits for Adaptation Under Drought and Heat and Increasing Future Genetic Gain in Chickpea

Direct phenotypic selection for yield and yield-related traits has led to ignoring various important physiological traits that have great potential for increasing genetic gain and significantly contributing to plant acclimation under various abiotic stresses (Reynolds and Langridge, 2016). The incorporation of "physiological traits" in crop breeding programs provides an opportunity to enhance the chances of "cumulative gene action for yield" (Cossani and Reynolds, 2012). However, the success of incorporating various physiological traits depends on how the traits are associated with grain yield, their heritability, their ease of selection response and measurement, and their nondestructive nature (Monneveux et al., 2012).

Plants withstand drought and heat stress by recruiting "escape," "tolerance," and "avoidance" mechanisms (Levitt, 1972). In this context, the major physiological traits involved in drought stress adaptation are categorized into "constitutive traits" and "acquired tolerance traits" (Sreeman et al., 2018). The notable "constitutive traits" involved in drought stress adaptation in chickpea include phenology (Kumar and Abbo, 2001), stomatal conductance (Liu et al., 2003), specific leaf area (Purushothaman et al., 2016), leaf area index (Purushothaman et al., 2016), chlorophyll content (Mafakheri et al., 2010), WUE (Kashiwagi et al., 2006b), and root traits (Krishnamurthy et al., 2003; Gaur et al., 2008; Kashiwagi et al., 2006a; Kashiwagi et al., 2015; Zaman-Allah et al., 2011b; Purushothaman et al., 2015). Likewise, canopy temperature depression (CTD) (Zaman-Allah et al., 2011a; Purushothaman et al., 2016), proline accumulation (Macar and Ekmekci, 2009; Mafakheri et al., 2010), regulation of ABA (Pang et al., 2017), and production of various antioxidant scavenging enzymes (Macar and Ekmekci, 2009) are the major "acquired tolerance" traits involved in drought stress tolerance in chickpea.

Prioritizing early phenology traits, viz. selection for early flowering and maturity, helps in the selection of genotypes exhibiting drought and heat stress tolerance in the form of an escape mechanism (Canci and Toker, 2009; Gaur et al., 2012; Hamwieh and Imtiaz, 2015). Relying on this mechanism, important drought-tolerant varieties, viz. ICCV 90629, ICCV 2, ICCC 37, ICCV 10 (Kumar and Abbo, 2001), and KAK2 (Gaur et al., 2008), and the heat-tolerant variety ICCV 92944 (Gaur et al., 2012) were developed; however, they suffered a yield penalty due to a restricted photosynthetic period, rapid growth rate, high harvest index, and short lifecycle (Kashiwagi et al., 2015; Berger et al., 2016).

#### Shoot-Related Traits Contributing to Drought Stress Tolerance

Stomatal conductance (gs) is an important shoot-related parameter affecting leaf gas and water vapor exchange under stress conditions. Drought stress negatively affects stomatal conductance and leaf turgor (Liu et al., 2003). Zaman-Allah et al. (2011a) and Pang et al. (2017) argued that genotypes with lower stomatal conductance that use less water during the vegetative stage under well-watered conditions display higher drought tolerance at the reproductive stage by using the conserved soil water under "terminal drought" stress. However, this "water sparing" strategy will only be effective for crops that grow on stored soil water (Vadez et al., 2012). Insight into the genetic inheritance of stomatal conductance, and selection for lower stomatal conductance with higher leaf transpiration efficiency under drought, could be promising for the development of drought-tolerant chickpea genotypes. Likewise, correlations between crop growth rate, transpiration, and transpiration efficiency are receiving attention in the development of drought-tolerant chickpea (Purushothaman et al., 2016).

Among the various non-destructive physiological traits, CTD, an infrared-thermometer-based parameter that acts as a surrogate trait for transpiration, expresses the difference between air temperature [Ta] and canopy temperature [Tc] (Balota et al., 2007). It has received great attention as a potential selection tool and is regularly employed for screening high-yielding drought- and heat-stress-tolerant plants (Mason and Singh, 2014). This parameter reflects plant transpiration status, which plays an important role in reducing leaf temperature under both drought and heat stress. Lower canopy temperature is indicative of higher transpiration, which enables plants to maintain their water status for growth under heat stress and water stress (Zaman-Allah et al., 2011a). In this context, a positive association of CTD with grain yield was noted under heat stress (Devasirvatham et al., 2015) and under drought stress (Purushothaman et al., 2015) in chickpea. Likewise, under drought stress, cooler canopy temperatures enhance root biomass, root depth, and ultimately grain yield (Lopes and Reynolds, 2010). Thus, further research on CTD at the genetic level could give better insight into how to use this trait to develop drought- and heat-stress-tolerant chickpea genotypes.
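Because CTD is simply the difference between air and canopy temperature, ranking genotypes by it is straightforward. The following minimal sketch uses hypothetical line names and temperature readings, which are assumptions for illustration and not measured values from the cited studies.

```python
def canopy_temperature_depression(air_temp_c, canopy_temp_c):
    """CTD = Ta - Tc, in degrees Celsius; positive when the canopy is cooler than the air."""
    return air_temp_c - canopy_temp_c


# Hypothetical (Ta, Tc) readings in degrees Celsius for two made-up lines.
readings = {
    "line_A": (35.0, 30.5),
    "line_B": (35.0, 33.0),
}

ctd = {
    name: canopy_temperature_depression(ta, tc)
    for name, (ta, tc) in readings.items()
}

# Lines with the largest CTD (coolest canopies) come first.
ranked = sorted(ctd, key=ctd.get, reverse=True)
```

In a screening context, lines at the top of such a ranking would be the candidates expected to sustain transpiration under stress, subject to confirmation against grain yield.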

#### Role of Water Use Efficiency in Drought Stress Adaptation

WUE is defined as the biomass accumulated by the plant per unit of water transpired (Bacon, 2004). An array of traits, ranging from stomatal regulation and transpiration rate to root traits, could be employed for increasing WUE. Regulation of stomatal opening remains of paramount importance, as restricting stomatal opening reduces transpiration and thereby enhances WUE (Saradadevi et al., 2017). In this context, Zaman-Allah et al. (2011a) opined that lower stomatal conductance and lower transpiration could save water to be utilized during the reproductive period under "terminal drought" stress in chickpea. However, reduced stomatal opening lowers the intake of CO2, which may decrease photosynthetic carbon accumulation (Vadez et al., 2012). This mechanism of water stress tolerance works well when chickpea is grown in soils with high water holding capacity in south and central India, which feature a warmer and shorter growing period for chickpea (Berger et al., 2006; Berger et al., 2016). In contrast, a high transpiration rate, high above- and below-ground biomass, and high seed yield are the characteristic features of chickpea grown in high-rainfall areas, viz. northern Indian conditions with low water holding capacity and later phenology (Berger et al., 2006; Berger et al., 2016). Relying on results showing a positive correlation of WUE with biomass yield under drought stress, Wright (1996) argued that an increase in WUE could enhance plant yield provided that harvest index is maintained.
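The definition of WUE, and the argument that higher WUE raises yield only if harvest index is maintained, can be made concrete with a small calculation. The sketch below assumes the widely used multiplicative framework in which grain yield is the product of water transpired, WUE, and harvest index; this framework is not stated explicitly in this review, and all numbers are hypothetical.

```python
def water_use_efficiency(biomass_g, water_transpired_kg):
    """WUE = biomass accumulated per unit of water transpired (g per kg)."""
    if water_transpired_kg <= 0:
        raise ValueError("water transpired must be positive")
    return biomass_g / water_transpired_kg


def grain_yield(water_transpired_kg, wue_g_per_kg, harvest_index):
    """Assumed framework: yield = water transpired * WUE * harvest index."""
    return water_transpired_kg * wue_g_per_kg * harvest_index


# Hypothetical plant: 12 g biomass produced from 4 kg of transpired water.
wue = water_use_efficiency(12.0, 4.0)

# Same water supply and harvest index; only WUE improves by 20%.
baseline_yield = grain_yield(4.0, wue, 0.5)
improved_yield = grain_yield(4.0, wue * 1.2, 0.5)

# If harvest index falls while WUE rises, the gain can be cancelled out.
offset_yield = grain_yield(4.0, wue * 1.2, 0.5 / 1.2)
```

In this toy example the 20% gain in WUE translates into a 20% gain in yield only while harvest index stays at 0.5; when harvest index drops in proportion, yield returns to the baseline.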

Likewise, carbon isotope discrimination (Δ13C) is a noteworthy physiological attribute for measuring transpiration efficiency/WUE of plants under drought or heat stress. Kashiwagi et al. (2006b) suggested a negative correlation between Δ13C and WUE. However, its high cost of measurement remains a barrier to measuring WUE in larger numbers of genotypes. Thus, future genetic and molecular studies targeting traits that improve WUE and optimize transpiration rate could be beneficial in developing drought-tolerant chickpea cultivars.

#### Role of Root Traits Contributing to Drought Adaptation

Root system architecture is an important parameter that directly controls plant water content, which influences crop performance under water stress (Ye et al., 2018). In addition, roots sense drought stress in dry soil and signal the production of ABA, which causes stomatal closure and thereby restricts water loss through transpiration (Saradadevi et al., 2017). The crucial role of root traits, viz. RLD, root biomass, total RDW, root diameter, root volume, and root surface area, in controlling plant water status and how they help chickpea to adapt to water stress has been investigated (Krishnamurthy et al., 2003; Gaur et al., 2008; Kashiwagi et al., 2006a; Zaman-Allah et al., 2011b; Kashiwagi et al., 2015; Purushothaman et al., 2015). Root traits mostly play a critical role in drought adaptation in chickpea by facilitating the mining of water through deep roots and minimizing transpiration under water stress (Berger et al., 2016). To elucidate the contribution of root traits to grain yield, Gaur et al. (2008) showed that higher RLD and maximum root depth (RDp) in shallow soil could assist in increasing seed yield under drought stress. Likewise, Ramamoorthy et al. (2017) reported a positive association between RLD and grain yield under drought stress in chickpea. However, the positive association of root traits with grain yield under drought stress remains inconsistent across environments (Zaman-Allah et al., 2011b), making plant breeders reluctant to use these traits in breeding programs for drought tolerance. Thus, under central and south Indian conditions where chickpea faces "terminal drought" stress, root traits based on a "drought avoidance" strategy could be a promising approach for designing drought-tolerant chickpea varieties (Kashiwagi et al., 2015). However, when chickpea is grown under "in-season rainfall" in soils with low water holding capacity under Mediterranean climates in Western Australia, this "drought avoidance" strategy remains ineffective (Berger et al., 2016).

#### Response of Biochemicals Alleviating Drought and Heat Stress

Plants, including chickpea, maintain turgor pressure and cell wall plasticity under water stress through osmotic adjustment, which involves accumulating crucial biochemical compounds, including proline, glutathione, trehalose, molecular chaperones, and various antioxidant enzymes (Macar and Ekmekci, 2009; Mafakheri et al., 2010; Kaushal et al., 2011; Berger et al., 2016; Kaur et al., 2017; Farooq et al., 2018). Among the various stress-responsive chemical compounds, proline remains a critical amino acid produced in plants in response to stress. The differential expression pattern of the proline synthesis enzyme (Δ1-pyrroline-carboxylate synthetase) and the catabolism of proline by proline dehydrogenase in response to water stress at different vegetative and reproductive stages in drought-tolerant and drought-sensitive genotypes have been investigated in chickpea (Kaur et al., 2017). The desi chickpea genotype Bakhar-2011 accumulated more proline, trehalose, and non-reducing sugars, and thus tolerated drought stress better than the desi genotype Bitall-2016, by alleviating the adverse effects of oxidative stress and maintaining better carbon assimilation (Farooq et al., 2018). Likewise, several ROS-scavenging antioxidant enzymes, such as superoxide dismutase, catalase, and glutathione peroxidase, detoxify the reactive oxygen species (ROS), viz. superoxide radicals and singlet oxygen, that accumulate under drought and heat stress, thereby protecting cells from damage and enabling chickpea to adapt to these stresses (Mafakheri et al., 2011; Kaur et al., 2017). Recently, Ullah et al. (2019) proposed that the supply of zinc-based nutrition could also assist in enhancing antioxidant activities and alleviate the detrimental effects of drought and heat stress in chickpea. These mechanisms are effective under moderate dehydrating conditions and impart partial drought tolerance (Farooq et al., 2018).

A holistic approach combining plant physiology, genomics tools, and innovative breeding techniques for designing drought- and extreme-temperature-tolerant chickpea cultivars is depicted in Figure 1.

#### Advances in Genomics for Developing Drought and Heat Stress Tolerance in Chickpea

Genomic resources such as simple sequence repeat (SSR) markers and single nucleotide polymorphisms (SNPs) are vital for mapping genes/QTLs and for identifying genes related to drought and heat tolerance within QTL intervals. In the last decade, unprecedented advancements in molecular marker development and construction of high-density linkage maps have enabled precise mapping of various traits of breeding interest at specific locations across linkage groups in chickpea (Thudi et al., 2011; Jha et al., 2018b). For drought and heat stress tolerance, only a limited number of mapping populations derived from bi-parental crosses have been used to elucidate QTLs controlling morpho-physiological, yield, and yield-related traits under drought and heat stress in chickpea (Rehman et al., 2011; Hamwieh et al., 2013; Paul et al., 2018). However, the resultant QTL intervals remained large. Additionally, precise mapping of drought stress tolerance QTLs remains challenging because the trait is controlled by various "minor effect QTLs" that are unstable across locations due to high G×E interaction (Fleury et al., 2010). High-density genotyping with a large number of SSR markers and precise phenotyping of two mapping populations segregating for various drought-related traits across multiple locations and seasons allowed Varshney et al. (2014) to identify a "QTL-hotspot" on CaLG4 harboring 13 main effect QTLs related to 12 drought-related traits, which explained up to 58% of the phenotypic variation. Subsequently, by adopting a marker-assisted backcross breeding scheme, this QTL-hotspot genomic region was introgressed from ICC4958 into JG11, an elite chickpea cultivar (Varshney et al., 2016). The resultant introgressed lines had greater root depth, RLD, and RDW (Varshney et al., 2016). However, this marker-assisted breeding scheme remains effective mainly for transferring "major effect QTLs" (Hayes et al., 2009). Further, advancements in next-generation sequencing (NGS) technology and high-resolution genotyping platforms enabled the generation of huge numbers of SSR and SNP markers; using genotyping-by-sequencing, these helped narrow the previously identified QTL-hotspot region (Varshney et al., 2014) to ~14 cM (Jaganathan et al., 2015). Furthermore, the combination of high-density bin mapping and precise phenotyping of 17 drought-related traits across multiple locations and seasons narrowed the QTL-hotspot region to ~300 Kb and subdivided it into "QTL-hotspot\_a" and "QTL-hotspot\_b" regions on CaLG4 (Kale et al., 2015). Interestingly, QTLs contributing to plant vigor and canopy conductance under water stress were also identified in this genomic region (Sivasakthi et al., 2018). Likewise, four major QTLs controlling pod and grain yield traits under heat stress were mapped on CaLG5 and CaLG6 in the ICC 15614 × ICC 4567 RIL population (Paul et al., 2018). Future cloning and functional characterization of these genomic regions could unravel the function of the underlying gene(s) and thus facilitate the design of drought- and heat-stress-tolerant chickpea genotypes.

Taking advantage of the higher resolution for mapping complex QTLs afforded by "natural evolutionary recombination events," genome-wide association studies (GWAS) have received great attention for unveiling "genotype-phenotype" associations and the underlying novel candidate gene(s) controlling various complex traits, including drought stress tolerance, across large germplasm panels in various crop plants (Zhu et al., 2008; Huang and Han, 2014; Liu and Yan, 2019). In chickpea, GWAS has been used to better understand the genetic architecture of various complex traits of breeding importance [see Jha (2018)]. To elucidate marker-trait associations (MTAs) for drought-related traits, Thudi et al. (2014) conducted GWAS in a large global collection of 300 chickpea genotypes. A total of 312 significant MTAs related to various drought- and heat-stress-related traits were identified, providing a great opportunity for targeting those genomic regions in drought and heat stress tolerance breeding (Thudi et al., 2014). Similarly, five significant MTAs for cell membrane stability and chlorophyll content related to heat stress tolerance were identified in 71 chickpea genotypes comprising historically released Indian varieties and improved breeding lines (Jha et al., 2018b). More recently, 3.65 million SNPs obtained by resequencing 429 globally collected chickpea germplasm accessions were used in GWAS to elucidate significant MTAs for drought and heat stress tolerance (Varshney et al., 2019). A total of 262 significant MTAs for various heat-stress-relevant traits were uncovered, along with several potential candidate genes, viz. TIC, REF6, aspartic protease, cc-NBS-LRR, and RGA3, contributing to heat and drought tolerance. Thus, the consistent and stable significant MTAs/genomic regions controlling pods/plant, yield, and phenological traits could potentially be incorporated into high-yielding yet drought/heat-stress-sensitive popular chickpea cultivars to improve their drought and heat stress tolerance.
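
The idea of a marker-trait association can be illustrated with a minimal sketch in Python. It is not the pipeline used in any of the studies cited above: it scans simulated markers one at a time and ranks them by the squared correlation between allele dosage and trait value. The panel size, the index of the causal marker, the effect size, and the function names `marker_r2` and `scan_markers` are all assumptions made for illustration, and the scan deliberately omits the corrections for population structure and kinship that real GWAS analyses typically apply.

```python
import random


def marker_r2(dosages, phenotypes):
    """Squared Pearson correlation between one marker's dosages and the trait."""
    n = len(phenotypes)
    mean_g = sum(dosages) / n
    mean_y = sum(phenotypes) / n
    sxy = sum((g - mean_g) * (y - mean_y) for g, y in zip(dosages, phenotypes))
    sxx = sum((g - mean_g) ** 2 for g in dosages)
    syy = sum((y - mean_y) ** 2 for y in phenotypes)
    if sxx == 0 or syy == 0:
        return 0.0  # monomorphic marker or constant trait: no association
    return (sxy * sxy) / (sxx * syy)


def scan_markers(dosage_by_marker, phenotypes):
    """Return (marker_index, r2) pairs, strongest association first."""
    scores = [(i, marker_r2(col, phenotypes))
              for i, col in enumerate(dosage_by_marker)]
    return sorted(scores, key=lambda item: item[1], reverse=True)


# Simulated toy panel (all values are illustrative assumptions, not real data).
rng = random.Random(42)
n_lines, n_markers, causal = 200, 50, 7

# Homozygous dosages (0 or 2), as expected for inbred lines of a selfing crop.
dosage_by_marker = [[rng.choice([0, 2]) for _ in range(n_lines)]
                    for _ in range(n_markers)]

# Trait = effect of the single causal marker + random environmental noise.
phenotypes = [1.5 * dosage_by_marker[causal][i] + rng.gauss(0, 1.0)
              for i in range(n_lines)]

ranked = scan_markers(dosage_by_marker, phenotypes)
top_marker, top_r2 = ranked[0]
print(f"strongest association: marker {top_marker}, r2 = {top_r2:.2f}")
```

In this toy setting the simulated causal marker ranks first because its effect is large relative to the noise. With many minor-effect loci, related individuals, or structured populations, a naive scan of this kind would produce spurious or missed associations, which is why mixed-model approaches are generally preferred.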

Unparalleled advances in cost-effective genotyping platforms have enabled the generation of large-scale SNP marker information using WGS and WGRS of globally released chickpea cultivars, breeding lines, and germplasm accessions (Varshney et al., 2013b; Thudi et al., 2016; Roorkiwal et al., 2018a; Varshney et al., 2019). This has provided opportunities for the chickpea breeding community to use genomic selection (GS) (Meuwissen et al., 2001; Jannink et al., 2010) for various complex traits including drought stress tolerance (Roorkiwal et al., 2016; Li et al., 2018; Roorkiwal et al., 2018b). To date, several conventional breeding approaches have been devoted to increasing genetic gain by selecting superior individuals in chickpea under various biotic and abiotic stresses, including drought stress. However, this process remains slow because yield and yield-related traits are governed by "small effect QTLs," have low heritability, and are influenced by G × E interactions. GS could be one of the promising approaches to minimize this problem. GS uses a "training population" with known genotypic and trait information to train statistical prediction models, which then predict the genomic estimated breeding values of unobserved individuals in a "candidate population" for complex traits from genotypic information alone (Meuwissen et al., 2001; Jannink et al., 2010). Thus, the adoption of a GS scheme could be a new avenue for capturing the "minor effect QTLs" across the whole genome and predicting increased genetic gain based on various prediction models under water stress in various crops, including chickpea (Hayes et al., 2009; Crossa et al., 2017). The large number of SNP markers generated from 132 chickpea genotypes by WGRS made it possible to conduct "SUPER GWAS" to unveil candidate genes associated with drought stress tolerance, and a subset of these SNPs was also used to assess the "prediction accuracy" of GS for important yield-related traits under drought stress (Li et al., 2018). Subsequently, Roorkiwal et al. (2018b) investigated how incorporating G × E effects into GS models affects prediction accuracy, to enable selection of superior genotypes under various target environments for enhanced genetic gains in chickpea. However, the success of GS relies on high marker density, advanced genotyping platforms, heritability of the trait, and optimization of the statistical model frameworks devised for GS (Roorkiwal et al., 2018a; Voss-Fels et al., 2019). Overall, GS has great scope for selecting superior parents for crossing programs, maximizing selection accuracy, multi-trait selection in early generations, and speeding up the breeding cycle (Hayes et al., 2009; Jia and Jannink, 2012; Crossa et al., 2014; Crossa et al., 2017; Dias et al., 2018).
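
The training-population and candidate-population logic of GS can be sketched in a few lines of Python. The example below is a minimal illustration under stated assumptions, not the models used in the cited studies: it fits ridge regression on marker effects (a formulation closely related to RR-BLUP) to simulated data and then predicts genomic estimated breeding values (GEBVs) for lines that were not used in training. The simulated genotypes, the effect sizes, the shrinkage value `lam=10.0`, and the helper names `fit_ridge`, `predict_gebv`, and `solve_linear` are all hypothetical.

```python
import random


def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ridge(markers, phenotypes, lam):
    """Estimate marker effects by ridge regression on centered data."""
    n, p = len(markers), len(markers[0])
    col_means = [sum(row[j] for row in markers) / n for j in range(p)]
    y_mean = sum(phenotypes) / n
    z = [[row[j] - col_means[j] for j in range(p)] for row in markers]
    # Normal equations with the ridge penalty on the diagonal: (Z'Z + lam I) b = Z'y
    ztz = [[sum(zi[a] * zi[b] for zi in z) + (lam if a == b else 0.0)
            for b in range(p)] for a in range(p)]
    zty = [sum(zi[a] * (y - y_mean) for zi, y in zip(z, phenotypes))
           for a in range(p)]
    return solve_linear(ztz, zty), col_means, y_mean


def predict_gebv(markers, effects, col_means, y_mean):
    """Predict breeding values from genotypes and estimated marker effects."""
    return [y_mean + sum((g - mu) * e for g, mu, e in zip(row, col_means, effects))
            for row in markers]


def pearson(x, y):
    """Pearson correlation, used here as the 'prediction accuracy'."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


# Simulated data: many small-effect markers plus noise (illustrative values only).
rng = random.Random(1)
n_lines, n_markers = 150, 40
true_effects = [rng.gauss(0, 0.3) for _ in range(n_markers)]
genotypes = [[rng.choice([0, 2]) for _ in range(n_markers)] for _ in range(n_lines)]
phenotypes = [sum(g * e for g, e in zip(row, true_effects)) + rng.gauss(0, 1.0)
              for row in genotypes]

# Training population: genotypes and phenotypes; candidates: genotypes only.
train_g, train_y = genotypes[:100], phenotypes[:100]
cand_g, cand_y = genotypes[100:], phenotypes[100:]

effects, col_means, y_mean = fit_ridge(train_g, train_y, lam=10.0)
gebv = predict_gebv(cand_g, effects, col_means, y_mean)
accuracy = pearson(gebv, cand_y)  # candidate phenotypes are used only to evaluate
print(f"prediction accuracy (r) in the candidate set: {accuracy:.2f}")
```

The candidate phenotypes are withheld from model fitting and used only to check how well the GEBVs rank the lines, which mirrors how prediction accuracy is usually assessed by cross-validation. In practice the shrinkage parameter would be estimated from the data rather than fixed, and models that account for G × E interaction would need additional terms.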

The arrival of NGS technologies in the last decade created a new dimension in genome sequencing chemistry, enabling the release of draft genome sequences of various plants of agricultural and economic importance (Michael and Jackson, 2013). The availability of draft genome sequences of kabuli (Varshney et al., 2013b), desi (Jain et al., 2013), and wild species (Parween et al., 2015) has sped up genomics research in chickpea. However, these genome sequences do not capture all the structural variations and presence–absence variations related to various traits. The falling cost of sequencing has made it possible to sequence several genotypes/lines at a reasonable cost to capture the desired genomic regions. To obtain novel insight into genomic regions controlling drought tolerance, WGRS of 100 chickpea genotypes has provided several important haplotypes that control drought stress tolerance (Thudi et al., 2016). Subsequently, Li et al. (2018) reported significant associations of SNP markers derived from WGRS of 132 chickpea lines with important drought tolerance candidate genes encoding auxin efflux carrier protein (PIN3), p-glycoprotein (PGP), and nodulin MtN21/EamA-like transporter. Recent efforts in WGRS of global chickpea germplasm coupled with GWAS have identified several genomic regions controlling drought-stress-related traits (root traits, phenological traits, harvest index, 100-seed weight, delta carbon ratio, etc.), including an important candidate gene, REF6, responsible for early phenology (Varshney et al., 2019). Further cloning and functional validation of REF6 and its transfer through marker-assisted breeding could help develop drought-tolerant chickpea cultivars based on the drought escape mechanism. Thus, translation of these genomic resources into applied breeding could expedite the design of drought-tolerant chickpea varieties.

#### Functional Genomic Resources for Drought and Heat Stress Tolerance

Functional genomics remains a powerful approach for identifying the underlying candidate gene(s) and deciphering their functional role in response to various stresses, including drought and heat stress, in plants (Langridge et al., 2006). This approach can be employed in chickpea genotypes contrasting for stress sensitivity to obtain critical information about specific genes and their roles in drought and heat tolerance. Significant progress has been made in the development of genomic resources for dissecting drought and heat stress tolerance (Varshney et al., 2014; Jaganathan et al., 2015; Kale et al., 2015; Varshney et al., 2016; Paul et al., 2018). However, knowledge of the roles of various candidate genes and their complex regulatory networks controlling drought and heat tolerance in chickpea at the functional level remains limited (Hiremath et al., 2011; Agarwal et al., 2016; Garg et al., 2016); the functional genomics information available largely pertains to drought tolerance.

Current advances in high-throughput transcriptome sequencing technologies, especially RNA sequencing (RNA-seq), have provided novel insights into the molecular basis of drought tolerance by revealing the comprehensive landscape of divergent gene expression and the associated complex regulatory networks at various developmental stages at the transcriptional level (Garg et al., 2016; Kudapa et al., 2018). Before the advent of RNA-seq, microarray-based technologies and expressed sequence tags (ESTs) were used to elucidate the preliminary function of various drought-stress-responsive genes/differentially expressed genes (DEGs) in chickpea (Mantri et al., 2007; Varshney et al., 2009; Deokar et al., 2011). Subsequently, RNA-seq-driven global transcriptome analysis uncovered a large number of water-stress-responsive DEGs (4,954) in root tissues of two contrasting parents, the drought-tolerant ICC 4958 and the drought-sensitive ICC 1882, under water stress (Garg et al., 2016). Many of the DEGs identified under drought stress were drought-responsive TF genes involved in controlling signaling by hormones ranging from abscisic acid, auxin, gibberellins, jasmonic acid, and brassinosteroid to cytokinin (Garg et al., 2016; Badhan et al., 2018). Likewise, transcriptome sequencing of root and shoot tissues of two parents contrasting for drought response, Bivanij and Hashem, recently yielded 4,572 DEGs (Mahdavi Mashaki et al., 2018). This investigation recovered a total of seventeen drought-responsive genes common to shoot and root. Importantly, in elucidating the role of candidate genes responding to drought stress, Bhattacharjee et al. (2015) reported higher upregulation of Ca\_19899 (a homeobox gene) in shoot tissue and downregulation of Ca\_00550 in both root and shoot under drought stress. Mahdavi Mashaki et al. (2018) found upregulation of three genes (in Hashem) and of Ca\_04125 (in Bivanij) that are involved in safeguarding cells against the toxicity of ROS produced under drought stress. Likewise, upregulation of Ca\_05702 (participating in flavonoid biosynthesis), CaNAC16 (Ca\_18090) (involved in water stress tolerance), and Ca\_00449 (involved in carotenoid biosynthesis and ABA production, contributing to drought stress tolerance) in shoots of Bivanij under water stress was also substantiated (Mahdavi Mashaki et al., 2018). Additionally, the participation of several TF genes, ranging from NAC, AP2/ERF, bHLH, and WRKY to MYB/MYC, in essential metabolic pathways was also deciphered in chickpea under drought stress (Badhan et al., 2018; Mahdavi Mashaki et al., 2018; Kumar et al., 2019).
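
To make the notion of up- and down-regulated DEGs concrete, the following Python sketch computes a log2 fold change between stressed and control samples and labels each gene accordingly. It is a simplified illustration under stated assumptions, not the analysis used in the cited studies: the gene names (`gene_A` to `gene_C`), the read counts, the pseudocount, and the twofold threshold are invented for the example. Published RNA-seq analyses additionally normalize for library size and test for statistical significance across replicates.

```python
import math


def log2_fold_change(stress, control, pseudo=1.0):
    """log2 ratio of mean expression under stress vs. control.

    A pseudocount avoids division by zero for genes with no reads.
    """
    mean_stress = sum(stress) / len(stress)
    mean_control = sum(control) / len(control)
    return math.log2((mean_stress + pseudo) / (mean_control + pseudo))


def classify(lfc, threshold=1.0):
    """Label a gene using a fold-change cutoff (threshold=1.0 means twofold)."""
    if lfc >= threshold:
        return "up"
    if lfc <= -threshold:
        return "down"
    return "unchanged"


# Hypothetical replicate counts per gene (not real chickpea data).
counts = {
    "gene_A": {"control": [99, 101, 100], "stress": [400, 410, 396]},
    "gene_B": {"control": [200, 210, 190], "stress": [45, 50, 52]},
    "gene_C": {"control": [80, 82, 78], "stress": [85, 90, 80]},
}

calls = {}
for gene, reps in counts.items():
    lfc = log2_fold_change(reps["stress"], reps["control"])
    calls[gene] = classify(lfc)
    print(f"{gene}: log2FC = {lfc:+.2f} -> {calls[gene]}")
```

A fold-change cutoff alone does not distinguish consistent responses from noisy ones, so it is usually combined with a significance test and multiple-testing correction before a gene is reported as differentially expressed.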

Furthermore, to determine the precise role, at the gene expression level, of the candidate genes located in the "QTL-hotspot" region pinpointed by Kale et al. (2015), RNA-seq-based global gene expression analysis was carried out and revealed differential expression of nine candidate genes under water stress (Kudapa et al., 2018). Four genes, namely E3 ubiquitin-protein ligase, LRX 2, kinase interacting (KIP1-like) family, and homocysteine S-methyltransferase, displayed induced expression under drought stress (Kudapa et al., 2018). Likewise, RNA-seq analysis of various vegetative and reproductive tissues subjected to heat stress identified several important candidate genes, viz. Ca\_25811, Ca\_23016, Ca\_09743, and Ca\_17680, contributing to heat stress tolerance (Agarwal et al., 2016).

Similarly, non-coding RNAs, including microRNAs (miRNAs) and long non-coding RNAs (lncRNAs), have received attention for their regulatory role, at the post-transcriptional level, in the expression of various genes controlling complex traits, including the drought stress response in chickpea (Khandal et al., 2017; Singh et al., 2017). A miRNA profiling study of the root apical meristem identified 284 unique miRNA sequences, of which 259 were differentially expressed under drought and salinity stress (Khandal et al., 2017). Functional validation of miRNA397 through qRT-PCR revealed that it is upregulated under drought stress and targets the LACCASE4 gene, which participates in lignin metabolism. To obtain deeper insight into the role of lncRNAs in the drought response, a new tool, "PLncPRO," was developed (Singh et al., 2017). A total of 3,714 lncRNAs involved in the drought stress response in rice and chickpea have been discovered using this tool. However, the precise role of these lncRNAs in the drought stress response in chickpea and their functional annotation need further investigation. Further, the availability of reference genome sequences and the "C. arietinum gene expression atlas (CaGEA)" (Kudapa et al., 2018), together with further refinement of transcriptome analysis, could increase our understanding of the complex drought- and temperature-stress-responsive pathways, the regulatory gene networks, and the underlying candidate gene(s) and their precise role in controlling drought and extreme temperature stress tolerance in chickpea. Moreover, transcriptome analysis could provide a great opportunity to reveal the genetic basis of the better adaptation of crop wild relatives (CWRs) and landraces, compared with their cultivated counterparts, to various abiotic stresses (Srivastava et al., 2016). However, the limited availability of cloned abiotic stress tolerance gene(s) has hampered the progress of functional genomics in chickpea (Deokar et al., 2015; Sen et al., 2017). Thus, in future, map-based cloning of abiotic stress tolerance gene(s)/QTLs could further improve our understanding of the various mechanisms and key molecular players involved in drought, heat, and cold tolerance in chickpea.

#### CONCLUSION AND FUTURE PERSPECTIVE

Current trends of unpredictable global climate change have resulted in periodic spells of drought stress and frequent episodes of extreme temperature, challenging plant growth and yield in several crops, including chickpea. Harnessing crop germplasm, including the various gene pools, remains one of the most viable options for designing climate-resilient chickpea plants. Cicer cultigens are not adequately equipped with cold tolerance; however, the wild relatives C. echinospermum and C. reticulatum, species of the primary gene pool that are crossable with the cultigen, are good sources of cold tolerance. These species can be exploited to introgress cold tolerance into the cultigen. Incorporating cold tolerance into the winter-sown crop will lead to early flowering and maturity, a strategy that would allow the crop to avoid terminal drought and the terminal high temperatures expected under global warming, especially in the winter/autumn-sown crop, and would lengthen the reproductive period, leading to enhanced productivity. Chickpea has indeterminate growth, and observations at two sites in north India (Palampur and Chandigarh) showed that a temperature increase acts as a cue to terminate flowering and podding (Sharma and Nayyar, personal observations). If the temperature remains conducive, chickpea plants continue to flower and set pods owing to their indeterminate growth habit, and this period can be extended by introgressing cold tolerance into chickpea. On the other hand, chickpea in warmer climates, especially in spring-sown regions, is expected to face higher terminal temperatures, and high-temperature-tolerant chickpea must be developed for these regions to sustain productivity under global warming. Incorporating drought tolerance into cold-tolerant as well as heat-tolerant cultivars would be desirable, as chickpea with such dual tolerance would be protected from drought damage in addition to cold or heat stress.

Unlike cold-tolerant genotypes, heat-tolerant chickpea genotypes are relatively common in C. arietinum. Under both types of temperature stress, the reproductive stage is the most sensitive and fails for similar reasons. We have worked out some cellular defense mechanisms involving osmolytes, carbohydrates, and antioxidants under both heat and cold stress environments; these showed commonalities in their expression in response to both stresses, but the picture is not yet fully clear. Physiological mechanisms under combined drought and heat, as well as combined drought and cold, are not fully understood. Further, it needs to be investigated whether heat-tolerant genotypes set pods under cold stress, by subjecting them to LT under a controlled environment and testing their reproductive function and pod set. In the case of cross tolerance, cellular defense mechanisms involving stress-related metabolites and related genes may be probed to understand the underlying mechanisms. Chickpea has its maximum acreage under rainfed and residual soil moisture conditions, and the crop invariably faces drought at the reproductive stage; this, coupled with the erratic rainfall expected under climate change scenarios, warrants the development of drought-tolerant varieties. Terminal drought usually coincides with terminal heat stress in several chickpea-growing regions; hence, the development of heat- and drought-tolerant chickpea cultivars is desirable. Incorporating various landraces and a range of the crop gene pool harboring "adaptive traits" could enhance the resilience of chickpea genotypes under extreme climates.

Considerable understanding of the physiological responses of chickpea genotypes tolerant/sensitive to cold, heat, and drought is available; however, this understanding has not been completely underpinned by genetics/genomics. Genomics and transcriptomics have increased our understanding of the genes and gene regulatory networks governing responses to cold, drought, and heat stress; this understanding is, however, incomplete, as it does not converge into well-defined pathways governing tolerance or susceptibility to these three major abiotic stresses of chickpea. In contrast to chickpea, considerably more information is available on plant responses to various abiotic stresses in Arabidopsis thaliana. To identify well-defined regulatory pathways for abiotic stress tolerance/sensitivity in chickpea, the focus should be on establishing the role of individual genes identified through transcriptomics/genomics in tolerance or sensitivity, and on gradually advancing this knowledge to elucidate the specific as well as common responses of chickpea plants to these abiotic stresses. Owing to advancements in chickpea genomics, QTLs/genes governing tolerance to the three abiotic stresses, and preliminary information on genes/gene interactions governing susceptibility/tolerance to them, are available. DNA-based markers, despite their accelerated development during the last decade, are still inadequate, and further enrichment of genomic resources for marker-assisted selection is required so that adequately dense genetic maps can be developed to map all possible traits and narrow down the QTL boundaries for quantitative traits such as cold, drought, and heat stress tolerance. For drought stress, a "QTL-hotspot" harboring QTLs for root and various other drought-related traits has been introgressed into an elite chickpea genotype (Varshney et al., 2016). However, the other minor QTLs need to be pyramided individually or in combination to develop drought- and heat-tolerant elite chickpea varieties. Chickpea breeders still rely primarily on phenotypic selection of progeny plants, while marker-assisted selection (MAS) has remained an underutilized technology, even for monogenic traits such as Fusarium wilt resistance. Similarly, gene/QTL pyramiding has not been exploited in chickpea. Clearly, marker technology in chickpea is still at the laboratory stage, waiting to be exploited commercially. Nonetheless, genomic resources such as markers linked to phenotypic traits and genes governing several traits are already known, and this knowledge is expanding rapidly; for example, sequencing and resequencing approaches have increased the repertoire of SNP markers during the last decade. This information points toward the possible exploitation of genomic selection for phenotypic traits in chickpea in the future.

Future research must aim at developing designer chickpea cultivars that can tolerate combinations of stresses, such as heat and drought or cold and drought, thereby expanding stress tolerance while retaining superior agronomic performance. Exploitation of genomics/transcriptomics/resequencing, coupled with reference genome sequences in chickpea, is expected to enhance our understanding of cold, heat, and drought stress tolerance, which in the near future will boost the development of single- or multiple-stress-tolerant, high-yielding chickpea cultivars suited to specific climatic niches. This knowledge may consequently result in the development of better and more economical stress management options based on chemical/agronomic means, apart from host resistance, enabling us to deal with unexpected climatic contingencies.

#### AUTHOR CONTRIBUTIONS

AR and KDS compiled information about cold stress, and PD and UJ about heat and drought stress. KHM and HN thoroughly edited the manuscript and gave their inputs in organizing the text.

#### ACKNOWLEDGMENTS

HN thanks the Department of Science and Technology (DST), Department of Biotechnology (DBT), University Grants Commission (UGC), CGIAR, and University of Western Australia for supporting work on cold and heat stress in chickpea. Thanks are also due to DST-PURSE grants for research facilities. HN is also thankful to Punjab Agricultural University (PAU), Ludhiana, India, and ICRISAT for providing chickpea germplasm. KS is thankful to the Department of Biotechnology (DBT), India for supporting the work on cold stress in chickpea.

### REFERENCES


chlorophyll fluorescence in soybean. Photosynth. Res. 131, 333–350. doi: 10.1007/s11120-016-0326-y


arietinum) are associated with impaired sucrose metabolism in leaves and anthers. Funct. Plant Biol. 40, 1334–1349. doi: 10.1071/FP13082


Drought Stress. J. Agron. Crop Sci. 195, 335–346. doi: 10.1111/j.1439-037X.2009.00374.x


arietinum L.) and soybean (Glycine max L.) to water deficit stress. Bot. Bull. Acad. Sin. 46, 333–338.


drought tolerance of chickpea. J. Exp. Bot. 62, 4239–4252. doi: 10.1093/jxb/err139


Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Rani, Devi, Jha, Sharma, Siddique and Nayyar. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.