# ECOLOGY AND EVOLUTION OF PLANTS UNDER DOMESTICATION IN THE NEOTROPICS

Edited by: Alejandro Casas, Ana H. Ladio and Charles R. Clement

Published in: Frontiers in Ecology and Evolution and Frontiers in Plant Science

#### Frontiers Copyright Statement

© Copyright 2007-2019 Frontiers Media SA. All rights reserved. All content included on this site, such as text, graphics, logos, button icons, images, video/audio clips, downloads, data compilations and software, is the property of or is licensed to Frontiers Media SA ("Frontiers") or its licensees and/or subcontractors. The copyright in the text of individual articles is the property of their respective authors, subject to a license granted to Frontiers.

The compilation of articles constituting this e-book, wherever published, as well as the compilation of all other content on this site, is the exclusive property of Frontiers. For the conditions for downloading and copying of e-books from Frontiers' website, please see the Terms for Website Use. If purchasing Frontiers e-books from other websites or sources, the conditions of the website concerned apply.

Images and graphics not forming part of user-contributed materials may not be downloaded or copied without permission.

Individual articles may be downloaded and reproduced in accordance with the principles of the CC-BY licence subject to any copyright or other notices. They may not be re-sold as an e-book.

As author or other contributor you grant a CC-BY licence to others to reproduce your articles, including any graphics and third-party materials supplied by you, in accordance with the Conditions for Website Use and subject to any copyright notices which you include in connection with your articles and materials.

All copyright, and all rights therein, are protected by national and international copyright laws.

The above represents a summary only. For the full conditions see the Conditions for Authors and the Conditions for Website Use.

ISSN 1664-8714

ISBN 978-2-88963-047-9

DOI 10.3389/978-2-88963-047-9

Topic Editors:

Alejandro Casas, Universidad Nacional Autónoma de México, México

Ana H. Ladio, Universidad Nacional del Comahue, Argentina

Charles R. Clement, Instituto Nacional de Pesquisas da Amazônia, Brazil

Image: "Araq papa festival at Chumbivilcas, Peru" by Alejandro Casas

The Neotropics are one of the main settings of the earliest plant domestication and of human-guided evolutionary processes, which remain active in the area. The studies in this Research Topic give a general panorama of the similarities and particularities of domestication processes across plant groups and regions. Some illustrate how domestication originated and diffused; others show how landscape domestication has operated and continues to be practiced; and others discuss the main challenges for designing policies for biosafety and conservation of plant genetic resources. The collection is an attempt to identify main topics for research on evolution under domestication, and the opportunities that the Neotropics offer for understanding how and why these processes occurred in the past and occur in the present.

Citation: Casas, A., Ladio, A. H., Clement, C. R., eds. (2019). Ecology and Evolution of Plants under Domestication in the Neotropics. Lausanne: Frontiers Media. doi: 10.3389/978-2-88963-047-9

# Table of Contents

*Editorial: Ecology and Evolution of Plants Under Domestication in the Neotropics*

Alejandro Casas, Ana H. Ladio and Charles R. Clement

# DOMESTICATION AS AN EVOLUTIONARY PROCESS


Yolanda H. Chen, Lori R. Shapiro, Betty Benrey and Angélica Cibrián-Jaramillo


# ORIGINS AND DIFFUSION OF DOMESTICATION


# LANDSCAPE DOMESTICATION

*Firewood Resource Management in Different Landscapes in NW Patagonia*

Daniela V. Morales, Soledad Molares and Ana H. Ladio


# GENETIC RESOURCES MANAGEMENT AND POLICIES


Guillermo Sánchez-de la Vega, Gabriela Castellanos-Morales, Niza Gámez, Helena S. Hernández-Rosales, Alejandra Vázquez-Lobo, Erika Aguirre-Planter, Juan P. Jaramillo-Correa, Salvador Montes-Hernández, Rafael Lira-Saade and Luis E. Eguiarte

*An Initiative for the Study and Use of Genetic Diversity of Domesticated Plants and Their Wild Relatives*

Alicia Mastretta-Yanes, Francisca Acevedo Gasman, Caroline Burgeff, Margarita Cano Ramírez, Daniel Piñero and José Sarukhán

*The Mating System of the Wild-to-Domesticated Complex of* Gossypium hirsutum *L. is Mixed*

Rebeca Velázquez-López, Ana Wegier, Valeria Alavez, Javier Pérez-López, Valeria Vázquez-Barrios, Denise Arroyo-Lambaer, Alejandro Ponce-Mendoza and William E. Kunin

# Editorial: Ecology and Evolution of Plants Under Domestication in the Neotropics

Alejandro Casas<sup>1</sup>\*, Ana H. Ladio<sup>2</sup> and Charles R. Clement<sup>3</sup>

<sup>1</sup> Instituto de Investigaciones en Ecosistemas y Sustentabilidad, Universidad Nacional Autónoma de México, Morelia, Mexico, <sup>2</sup> Instituto de Investigaciones en Biodiversidad y Medio Ambiente, Universidad Nacional del Comahue, Bariloche, Argentina, <sup>3</sup> Departamento Tecnologia e Inovação, Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil

Keywords: genetic resources, human guided evolutionary processes, human selection, landscape domestication, origins of agriculture, Tropical Americas

**Editorial on the Research Topic**

**Ecology and Evolution of Plants Under Domestication in the Neotropics**

# INTRODUCTION

The earliest studies on domestication and the origins of agriculture identified the Americas, particularly the Neotropics, as one of the settings with the earliest experiences of these processes in the world (Darwin, 1859, 1868; de Candolle, 1882; Vavilov, 1926). For a long time, Mesoamerica and the Andean region were considered the main areas of origin of agriculture, but more recently Amazonia was shown to be similarly important (Clement et al., 2015; Levis et al., 2017). How similar the processes were among these main regions, and how much they influenced each other, remains mostly unexplored. These questions are relevant to understanding how domestication processes originated and diffused throughout the whole continent, as well as how ancient and current human cultures of the region practiced, and still practice, plant domestication (Casas et al., 2017). It is also relevant to understand how peoples have practiced landscape domestication, which involves human manipulation of landscapes resulting in ecological changes at that scale and in more productive landscapes that are congenial for humans (Clement, 1999).

This Research Topic, Ecology and Evolution of Plants under Domestication in the Neotropics, showcases 20 articles that analyze 17 species in depth in several Neotropical regions, as well as dozens of species from various perspectives. The studies include evolutionary, phylogeographic, ethnobotanical, population genetic, pollination biology, and ecological approaches. All are valuable contributions that reflect the state of the art and current trends in research on domestication in the Neotropics, and that put new questions and hypotheses into perspective for further studies.

## Domestication as an Evolutionary Process

Several articles of the Research Topic review and analyze current issues on domestication as an evolutionary process and the contribution of this perspective to general evolutionary theory. In particular, some authors analyze parallel and convergent evolution associated with domestication, the shaping of phenotypes by human and non-human forces, and domestication as an on-going process that allows the study of how evolution operates. These aspects are reviewed by Pickersgill and Chen et al., and illustrated with case studies in Capsicum annuum and Phaseolus lunatus by Luna-Ruiz et al. and Cuny et al., respectively. All these studies exemplify the different nature of the human and natural forces influencing evolution, and present domestication as a study model for analyzing evolutionary processes.

#### Edited by:

B. Mohan Kumar, Kerala Agricultural University, India

#### Reviewed by:

Ruben Milla, Universidad Rey Juan Carlos, Spain

> \*Correspondence: Alejandro Casas acasas@cieco.unam.mx

#### Specialty section:

This article was submitted to Agroecology, a section of the journal Frontiers in Ecology and Evolution

> Received: 18 March 2019 Accepted: 04 June 2019 Published: 21 June 2019

#### Citation:

Casas A, Ladio AH and Clement CR (2019) Editorial: Ecology and Evolution of Plants Under Domestication in the Neotropics. Front. Ecol. Evol. 7:231. doi: 10.3389/fevo.2019.00231

Pickersgill analyzed the nature of changes in phenotypic traits involved in domestication, which are commonly produced by complex genetic networks and quantitative trait loci. Domestication has been historically analyzed as a diversification process, resulting in variants adapted to different uses, environments, and agronomic conditions. Similar diversification traits may be shared by closely related species and controlled by homologous genes (parallel evolution), or by distantly related species and controlled by non-homologous genes (convergent evolution). Understanding the nature of traits resulting from domestication and diversification is not always easy, and these traits may have different meanings according to the scale at which the processes are viewed. Crops domesticated in the Neotropics have responded to human selection for diverse purposes, changing in many different traits and providing examples of both parallel and convergent evolution, but data are still insufficient to arrive at conclusions about the relative roles of these two processes in domestication and diversification.

Considering domestication as a model for analyzing the genetics of evolution, Chen et al. studied how biotic, abiotic, and human selective forces shaped the phenotypes of domestic plants. Taking a quantitative genetics approach, the authors emphasized the effects on phenotypes of biotic interactions with microbial and insect assemblages and of abiotic factors such as nutrient availability and humidity, and called attention to the need to understand local adaptation. This approach has theoretical relevance for evolutionary studies, as well as practical implications for applying evolutionary criteria to sustainable agricultural management.

Luna-Ruiz et al. analyzed shifts in plant chemical defenses of C. annuum associated with domestication in Mexico. They assumed that comparing wild and domesticated forms can elucidate how crop domestication influences biotic and abiotic interactions, including the chemical defenses involving capsaicinoids. Capsaicin is a secondary metabolite in the chile fruit that mediates interactions with seed dispersers (birds), and with seed predators (fungi, insects, and rodents). The authors explored the evolutionary ecology of Capsicum and human-Capsicum interactions to test how domestication shifts plant chemical defenses. Their review examined the ways in which incipient domestication through "balancing selection" in wild C. annuum populations may have led to the release of selective biotic and abiotic pressures. Then they characterized cultivated material with chemotypes, morphotypes, and ecotypes found in low frequencies in the wild, probably related to different cultural uses, cropping systems, and ecogeographic regions.

Cuny et al. studied seed size of P. lunatus in relation to competition among beetle larvae. The authors explored the hypothesis that larger bean seeds decrease competition among beetle larvae, which would indicate that seed gigantism favored not only characters desired by humans but also plant fitness. The authors found a negative correlation between the initial egg number on a seed and the weight of female beetles that emerged from the much smaller wild seeds. Similarly, beetle survival was found to be negatively correlated with competition intensity only on wild seeds. The larger seed size of cultivated beans mitigates the potential negative effects of intraspecific larval competition, a factor that controls the size of populations. The results suggest that human selection for larger seeds has reduced the intensity of intraspecific larval competition in the beetle Zabrotes subfasciatus.

Pedrosa et al. evaluated the on-going domestication of Pourouma cecropiifolia populations cultivated in Western Amazonia. The authors compared fruit characteristics between wild and domesticated populations to quantify the direct effects of domestication. Also, they examined differences in vegetative characteristics and changes in seed:fruit allometric relations to explore characteristics that are not under direct human selection. Although they could not clearly discriminate the effects caused by human selection and consequences associated with environmental changes, they suggest that the allometric differences between fruits and seeds of wild and domesticated plants can be used in future studies as an additional parameter of the domestication syndrome.

## Origins and Diffusion of Domestication

Several contributions used phylogeographic and phylogenomic approaches to analyse the origin and dispersal of crops. These contributions illustrate cases of important crops from the Neotropics and provide methods and theory for further studies on this important issue. Chacón-Sánchez and Martínez-Castillo studied Lima bean (P. lunatus), whose wild populations are widely distributed from Mexico to northern Argentina. Using genome-wide SNP markers, they explored the number of domestication events and whether two Mesoamerican gene pools can be identified. They confirmed the two Mesoamerican gene pools, suggested that the differentiation of the wild Mesoamerican pools preceded domestication, and concluded that there was one main Western Mesoamerican domestication event. They also confirmed the Andean gene pool from Ecuador and Peru.

With a similar approach, Guerra-García et al. analyzed the genomics of P. coccineus, naturally distributed from northern Mexico to Panama, exploring the history of domestication events and loci associated with natural and human selection. By genotyping SNPs, the authors identified a monophyletic clade of cultivated populations, suggesting a single domestication event in the Trans-Mexican Volcanic Belt instead of the two events suggested by previous studies with SSR markers. The study identified 24 SNPs associated with domestication, mainly of flower and pod characters, another 13 loci related to crop diversification, and others associated with natural selection.

Moreira et al. analyzed the case of Crescentia cujete and C. amazonica. Previous studies by Aguirre et al. (2012) identified strong differences in gene lineages between wild and cultivated populations of C. cujete in Mexico, even in the Mayan homegardens of the Yucatán Peninsula where both types coexist, which led them to suggest that C. cujete was domesticated in Central or South America. Moreira et al. explored the relationship between Mesoamerican and Amazonian C. cujete, their morphological and genetic variation, and possible routes of dispersal. They concluded that domesticated C. cujete were introduced into the Amazon Basin and Mexico, sharing a common ancestry with a currently unknown origin. The occurrence of wild populations in Mexico and the higher diversity of cultivated C. cujete from Mesoamerica, compared to Amazonia, suggest that its origin may be in Central America, but more extensive sampling is still needed to test this hypothesis.

Clement et al. examined the origin and dispersal of domesticated Bactris gasipaes. They identified the wild relative of the domesticated landraces, and examined three hypotheses about the origin of domestication (either southwestern Amazonia, northern South America, or multiple independent events). The authors modeled the potential distribution of wild and domesticated palms, identified the origin with cpDNA sequences, and post-domestication dispersal routes with nuclear microsatellites. The phylogeographic studies confirmed southwestern Amazonia as the origin of domestication, while nuclear markers confirmed two dispersals, one along the Ucayali River with starchy fruits for fermentation, into western Amazonia, north-western South America, and Central America; the other along the Madeira River into central and then eastern Amazonia.

Chávez-Pesqueira and Núñez-Farfan reviewed the genetics of the domestication of Carica papaya, which has long been proposed as a Mesoamerican domesticate, a hypothesis supported in this study because wild populations are distributed in southern Mexico and Central America. In addition, phylogenetic studies indicate that the closest relatives of Carica are the genera Horovitzia and Jarilla, both endemic to Mesoamerica. The authors emphasize the study of wild populations from Mesoamerica and conclude by stressing the importance of more extensive sampling and genomic approaches for precisely identifying the events of domestication and routes of dispersal.

## Landscape Domestication

A group of articles focuses on processes of incipient domestication and on the association of this type of plant management with landscapes transformed for human needs. Such transformation has long been considered a process of domestication, since landscapes are socio-ecological constructions (Casas et al., 1997). This is a relatively recent approach (Clement, 1999), and several articles of this Research Topic contribute case studies that help to build theory and methodological approaches.

Morales et al. used an ethnobotanical perspective to analyse the use of plant species as fuelwood in Patagonian rural communities, exploring fuelwood gathering patterns across landscapes with different degrees of domestication. They found that less domesticated landscapes were less intensely used as sources of fuelwood and provided fewer species, which were also less intensely used and had lower use versatility, than domesticated landscapes. The latter, although less rich in species composition and made up mostly of exotic species, were more intensely used. They also found that the landscape units, including semidomesticated landscapes, have spatial continuity in response to ecological gradients and types of management. Landscapes with medium and high domestication levels provide exotic (and native) plant species, decreasing pressure on native species.

Furlan et al. examined landscape domestication by studying homegardens of the peri-urban areas of the city of Iguazú, Argentina. Using ethnobotanical and ecological approaches, they documented the different types of management of 66 perennial species and varieties, their distribution and abundance, and their influence in shaping local landscapes. In addition, the authors emphasized the importance of local knowledge for managing and maintaining native species in homegardens of the Atlantic Forest.

Cruz-García analyzed people's motivations for managing food plants, a crucial topic for understanding how management and domestication start today and how they could have started in the past. She studied plant management practiced by people in a rural village in Peruvian Amazonia, documenting practices in forests and agroforestry systems. People's interest in having access to valuable resources, the abundance of those resources, and the ease of managing them are major factors influencing people's decisions. The study is a theoretical contribution to understanding the origins of plant management and its ecological and evolutionary consequences, and is also relevant to food security programs.

Betancurt et al. documented incipient domestication processes in plants of urban parks of San Carlos de Bariloche, Argentina, in the Andino Norpatagonica Biosphere Reserve. Incipient processes of domestication are understudied, and most studies come from rural contexts; this contribution is therefore doubly important, documenting what urban people do with plants, how they manage them, and when, where and how they start processes of domestication. The authors analyzed the composition and management of woody species in parks with different environmental and socioeconomic characteristics. They hypothesized that species richness of exotic plants would be higher than that of native plants, mainly ornamental species, and that management type would vary according to the environment and socioeconomic aspects of the park. Most species were exotic; however, the authors reported native species, some with signs of incipient domestication, which form part of the local biocultural heritage and help to preserve natural ecosystems and promote appreciation of local biocultural values.

Levis et al. examined how people domesticated Amazonian forests, which have been managed by people for millennia. They documented plant management and its consequences on the structure of vegetation in 30 sites, where they recorded the following practices: (1) removal of non-useful plants, (2) protection of useful plants, (3) attraction of non-human animal dispersers, (4) transportation of useful plants, (5) selection of phenotypes, (6) fire management, (7) planting of useful plants, and (8) soil improvement. These practices allowed the authors to explain how patches of vegetation dominated by useful plants could have been formed in the past.

Following similar principles, Reis et al. analyzed management and domestication of araucaria forests in southern Brazil. These forests are currently managed to produce Ilex paraguariensis under Araucaria angustifolia. The authors focused their analysis on I. paraguariensis, A. angustifolia, and Bromelia antiacantha, and documented management practices, demographic structure and genetic diversity in farming zones and in a protected area. The three species are intentionally promoted with practices of protection, transplanting, and/or selection.

## Genetic Resources Management and Policies

Finally, the Research Topic includes some policy-related articles, since the themes analyzed are connected with the diversification, use and conservation of genetic resources. These articles address genetic improvement through plant breeding and genetic engineering, biosafety, and conservation policies, from general scope to specific cases. Although this is a controversial subject that deserves more extensive analysis, the studies and views shown in the Research Topic are examples of the spectrum of issues and approaches needed.

Hernández Terán et al. provided a meta-analysis of plant breeding and genetic engineering of rice, maize, canola, sunflower, and pumpkin, whose targeted and non-targeted traits are compared with those of their wild relatives and of organisms improved without genetic engineering. The authors concluded that genetic modification by humans can be traced phenotypically when crops are compared with their wild relatives, and that the magnitude of the phenotypic differences between crops with and without genetic engineering suggests consequences of genetic modification beyond the target traits. Cases in which the transgene, due to genetic interactions, causes unexpected phenotypes were identified in canola, sunflower, rice, and maize, but the authors could not rule out phenotypic plasticity nor the origin and specific context of domestication in different phenotypic scenarios. They conclude that further studies on phenotypic changes in human-modified crops must include as many traits as possible, including non-target traits.

Sánchez de la Vega et al. studied the genetic variation of Cucurbita argyrosperma and its wild relatives across Mexico to identify the main areas of genetic diversity and differentiation, as well as to evaluate gene flow among wild and domesticated populations. Such an approach helps to identify possible areas of domestication and to design policies for conservation.

Mastretta-Yanes et al. present the initiatives of the National Commission for Knowledge and Use of Biodiversity (CONABIO) of Mexico to study, sustainably use and conserve the genetic diversity of crops and their wild relatives. These national initiatives focus on the sources of variation available for domestication, the context in which domestication occurs, and an important challenge, environmental change. The initiative concentrates on: (1) pilot research projects on genetic diversity of Mexican cultivars and wild relatives, (2) an information system developed by CONABIO that allows data on agrobiodiversity and genetic diversity to be analyzed, archived, and made public, (3) enhancing collaboration among research groups, civil organizations, and education institutions to accelerate participatory research, and (4) implementation of public policy recommendations and participatory research congruent with the reality of smallholders' needs in Mexico.

Velázquez-López et al. conducted pollination biology studies to analyse the mating system of wild and domesticated Gossypium hirsutum. Cultivated cotton is generally considered self-pollinated, but the mating system varies throughout the distribution of the metapopulations studied, and the species should therefore be considered to have a mixed mating system rather than being primarily autogamous. This has consequences for strategies of conservation and biosecurity.

# CONCLUSIONS

The contributions suggest the importance of studying the processes of parallel and convergent evolution in domestication, as well as the role of human and natural selection in this process. New gene sequencing techniques provide abundant information about origins of domestication and subsequent dispersal, and preliminary data show the Neotropics to be a diffuse mosaic of areas of origin of its main crops, rather than the centers of origin originally proposed by Vavilov and others. These proposals are being analyzed with new archaeological and genomic approaches and most probably will change our current view about origins of domestication, dispersal of crops, and production systems. Landscape domestication is a theoretically valuable concept, as the Amazonian, Mesoamerican, Andean, and Patagonian cases illustrate. Domestication has generated important genetic resources for meeting global and local needs that are crucial in the context of global change. Policies for enhancing understanding and conservation of genetic resources and the processes shaping them are primary issues that require research and interaction of the academic sector with decision makers from household to global scales.

# AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

# ACKNOWLEDGMENTS

We wish to thank all the contributing authors for their efforts to make this issue a success. We also wish to thank the Frontiers in Ecology & Evolution editorial and support teams for their patience and advice.

# REFERENCES

Clement, C. R. (1999). 1492 and the loss of Amazonian crop genetic resources. I. The relation between domestication and human population decline. Econ. Bot. 53, 188–202. doi: 10.1007/BF02866498

Vavilov, N. I. (1926). Origin and Geography of Cultivated Plants. Cambridge: Cambridge University Press.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Casas, Ladio and Clement. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Parallel vs. Convergent Evolution in Domestication and Diversification of Crops in the Americas

### Barbara Pickersgill\*

Retired from School of Biological Sciences, The University of Reading, Reading, United Kingdom

Domestication involves changes in various traits of the phenotype in response to human selection. Diversification may accompany or follow domestication, and results in variants within the crop adapted to different uses by humans or different agronomic conditions. Similar domestication and diversification traits may be shared by closely related species (parallel evolution) or by distantly related species (convergent evolution). Many of these traits are produced by complex genetic networks or long biosynthetic pathways that are extensively conserved even in distantly related species. Similar phenotypic changes in different species may be controlled by homologous genes (parallel evolution at the genetic level) or non-homologous genes (convergent evolution at the genetic level). It has been suggested that parallel evolution may be more frequent among closely related species, or among diversification rather than domestication traits, or among traits produced by simple metabolic pathways. Crops domesticated in the Americas span a spectrum of genetic relatedness, have been domesticated for diverse purposes, and have responded to human selection by changes in many different traits, and so provide examples of both parallel and convergent evolution at various levels. However, despite the current explosion in relevant information, data are still insufficient to provide quantitative or conclusive assessments of the relative roles of these two processes in domestication and diversification.

Keywords: domestication, diversification, parallel evolution, convergent evolution, American crops

#### Edited by:

Charles Roland Clement, National Institute of Amazonian Research, Brazil

### Reviewed by:

Vigouroux Yves, Institut de recherche pour le développement (IRD), France

Alessandro Alves-Pereira, Universidade Estadual de Campinas, Brazil

#### \*Correspondence:

Barbara Pickersgill, b.pickersgill888@btinternet.com

### Specialty section:

This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Ecology and Evolution

Received: 05 September 2017; Accepted: 17 April 2018; Published: 07 May 2018

### Citation:

Pickersgill B (2018) Parallel vs. Convergent Evolution in Domestication and Diversification of Crops in the Americas. Front. Ecol. Evol. 6:56. doi: 10.3389/fevo.2018.00056

# INTRODUCTION

Domestication has been defined in various ways (e.g., Harris, 1989; Harlan, 1992; Clement, 1999; Benz, 2006; Fuller et al., 2010; Abbo et al., 2014; Larson et al., 2014), depending in part on the perspectives of the definer. However, there is a general consensus that domestication occurs in response to selection, predominantly human but also natural. This response involves increase in frequency, often to fixation, of certain traits adaptive to human needs or to the environment created by human activities. These traits constitute the so-called domestication syndrome. They may include increased size (particularly of the harvested organ), loss of dispersal mechanisms, change in plant habit (such as reduction in branching) and, in domesticates from regions with seasonal climates, loss of seed dormancy. Qualitative traits distinguishing domesticates from their wild progenitors are frequently controlled by one or a few genes, most of which have so far proved to be regulatory genes, mainly encoding transcription factors (Doebley et al., 2006; Sang, 2009; Martínez-Ainsworth and Tenaillon, 2016). Quantitative traits are usually controlled polygenically, by quantitative trait loci (QTL), but one or a few of these often have disproportionately large effects, thereby enabling rapid response to selection (Poncet et al., 2004). Genes governing traits of the domestication syndrome have been called domestication genes.

Crop evolution does not cease with domestication, but continues by improvement and diversification. Improvement has progressed by increasingly efficient methods of human-mediated selection, and by gene transfer and gene modification. Improvement often targets the same traits as those involved in domestication. Recurrent cycles of improvement may thus result in favourable alleles accumulating at large-effect QTL that influence the target trait (Poncet et al., 2004). This provides one explanation for apparently contradictory reports that, in a given crop, most domestication traits are under simple genetic control, yet many genes have been affected by domestication and subsequent improvement. An alternative, or additional, explanation is that the "top down" approaches that are often used to identify candidate domestication genes may favour detection of major genes and QTL with large effects (Morrell et al., 2012; Olsen and Wendel, 2013).

Diversification involves development of variants within a crop that are adapted to different uses by humans, or to different agricultural environments. Distinguishing between domestication and diversification is not always easy (Meyer and Purugganan, 2013) and again may depend on definition. Furthermore, the same trait may be part of the domestication syndrome in one crop, but associated with diversification in another. In Capsicum, loss of dispersal results from loss of an abscission zone at the base of the fruit and is a feature of most domesticated chile peppers. Most domesticated tomatoes still have an abscission zone in the fruit stalk, but breeders have recently selected for its loss in tomato cultivars that are harvested mechanically for canning (Mao et al., 2000). Loss of ability to disperse fruits is usually considered part of the domestication syndrome in chile pepper, but is a diversification trait in tomato.

The terms "parallel evolution" and "convergent evolution" are not always used consistently, so may be sources of confusion. They were originally employed for phenotypic characters, but their application has now been extended to the genes controlling these characters and to the nucleotide sequences of these genes. The meanings of parallel and convergent evolution at these different levels, as used in this review, are summarised in **Table 1**. In the original usage (e.g., Davis and Heywood, 1963), parallel evolution is considered to be independent development of similar phenotypic traits in taxa with a relatively recent common ancestry, while convergent evolution is development of similar traits in distinct phylogenetic lineages, i.e., in taxa that are not closely related. In this paper, genera in the same family are considered to be closely related, whereas genera in different families are considered to be distantly related. Traits associated with both domestication and diversification may be shared by distantly as well as closely related species, hence may have arisen by convergent or by parallel evolution.

TABLE 1 | Definitions of parallel and convergent evolution at different evolutionary levels.

At the level of the gene, parallel evolution may be viewed as production of similar phenotypes by orthologous genes, i.e., homologous genes that have diverged from a common ancestor, while convergent evolution occurs when similar phenotypes are produced by genes that are not homologous. At the level of nucleotide sequences, parallel evolution is the occurrence, in different populations, of genotypes with identical changes in DNA sequence in a given gene. This seems to be rare. Different changes in DNA sequences affecting the same gene are more frequent and may produce similar phenotypes. This represents convergence at the level of DNA sequence but parallelism at the level of the gene.
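The three levels of comparison defined above can be summarised as a small decision procedure. The Python sketch below is purely illustrative: the function name, argument names, and string labels are assumptions introduced here, not terminology from any cited study. It encodes only the working definitions adopted in this review (taxa in the same family count as closely related; orthologous genes indicate parallelism at the genetic level; identical sequence changes indicate parallelism at the nucleotide level).

```python
from typing import Dict, Optional


def classify_trait(
    same_family: bool,
    orthologous_genes: Optional[bool] = None,
    identical_sequence_change: Optional[bool] = None,
) -> Dict[str, str]:
    """Label a shared trait as parallel or convergent at each level.

    All argument names are hypothetical. None means the evidence
    is not yet available, which the text notes is often the case.
    """

    def label(is_parallel: Optional[bool]) -> str:
        if is_parallel is None:
            return "unknown"
        return "parallel" if is_parallel else "convergent"

    result = {
        # Phenotypic level: closely related taxa (same family) = parallel.
        "phenotype": label(same_family),
        # Genetic level: orthologous genes = parallel.
        "gene": label(orthologous_genes),
    }
    # Sequence-level comparison only makes sense within the same gene.
    if orthologous_genes is False:
        result["sequence"] = "not applicable"
    elif orthologous_genes is None:
        result["sequence"] = "unknown"
    else:
        result["sequence"] = label(identical_sequence_change)
    return result
```

For example, `classify_trait(same_family=False, orthologous_genes=True, identical_sequence_change=False)` describes a trait that is convergent at the phenotypic level, parallel at the level of the gene, and convergent at the level of DNA sequence.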

The question of whether, or to what extent, parallel evolution has occurred during domestication and diversification has been much discussed. Glémin and Bataillon (2009) and Martínez-Ainsworth and Tenaillon (2016) concluded that there is little evidence for parallelism at the genetic level and that similar traits of the domestication syndrome in different species are usually controlled by loci that are not homologous. On the other hand, Sang (2009) argued that, since most domestication genes appear to be regulatory genes affecting more than one trait (pleiotropy), the same genes would probably be targeted repeatedly during domestication, because selection would favour consistently those genes for which negative pleiotropic effects are minimal. Poncet et al. (2004) considered that orthologous loci are more likely to be involved when domesticates belong to the same family, while Lenser and Theißen (2013) suggested that traits resulting from alterations in simple metabolic pathways, such as those involved in pigment synthesis, are more likely to be controlled by orthologous genes than complex traits such as seed dormancy. Gross and Olsen (2010) argued that diversification traits are controlled by orthologous genes more often than are domestication traits. Finally, Gaut (2015) concluded that the question of the extent of parallel evolution during domestication of different crops remains open.

The study of domestication and diversification, together with the respective roles of parallel vs. convergent evolution, is relevant to further improvement of existing crops, to possible development of new crops, and to understanding origins and spread of crops in prehistory. When different pathways to increased yield, involving different genes (convergent genetic evolution), have been utilised in related crops, or in independently domesticated lineages within the same crop, plant breeders may be able to combine in one genotype alleles associated with increased yield in different lineages, thereby enabling yield to reach a new plateau. Food production may also be increased by extending agriculture into environments currently considered marginal because of low temperatures, drought or salinity. Cultivated species adapted to such marginal conditions are often undomesticated, hence unsuited to modern mechanised agriculture. It may be possible to identify in the genome of such species homologues of genes known to govern traits of the domestication syndrome in their domesticated relatives, and then to edit these genes to produce a new domesticate by parallel genetic evolution. Finally, techniques for recovering ancient DNA from archaeological specimens are steadily improving, together with the ability to amplify and sequence particular genes in these DNA fragments. This is opening the possibility of being able to study the order in which traits of the domestication syndrome became established in a given crop and hence how human selection was exercised in early stages of domestication of that crop.

The principal regions of crop domestication in the Americas are shown in **Table 2**, together with the probable regions of domestication of those species mentioned in this text. There are examples of independent domestications of the same species in two continents (e.g., common bean); independent domestications of different species in the same genus for the same purpose (e.g., the New World cottons Gossypium hirsutum and Gossypium barbadense); independent domestications of species in different genera of the same family for the same purpose (e.g., tomato, chile pepper and Physalis spp., all members of the Solanaceae domesticated for their fleshy fruits); and independent domestications of species in different families for the same purpose (e.g., maize in the Poaceae and Amaranthus spp. in the Amaranthaceae, domesticated for their starchy seeds). These crops thus provide an opportunity to evaluate the roles of parallel vs. convergent evolution in domestication and diversification of closely vs. distantly related species.

# PARALLEL VS. CONVERGENT EVOLUTION IN TRAITS OF THE DOMESTICATION SYNDROME

Increased size, reduction in natural dispersal mechanisms, loss of dormancy and reduced branching have all been investigated genetically in sufficient detail to be discussed here. Loss of toxic or bitter compounds, considered as part of the domestication syndrome by Abbo et al. (2014), will be treated later, as a diversification trait, since many New World crops are still polymorphic for presence or absence of such compounds.

# Increased Size

Increased size results from presence of more cells in a given organ and/or from larger cells. Number of cells in an organ depends on number of cells in the primordium that gave rise to that organ and/or the amount of cell division in the organ after initiation.

## Increased Number of Cells in the Primordium

The genetic control of events in the shoot meristem is complex and not yet fully understood (Somssich et al., 2016). Basically, control of cell proliferation vs. differentiation depends on interactions between WUSCHEL (WUS), a gene that encodes a transcription factor promoting cell proliferation, and CLAVATA3 (CLV3), a gene that encodes a product promoting cell differentiation. The CLV3-WUS signalling pathway is in turn regulated by many receptor-like kinases and receptor-like proteins. Somssich et al. (2016) suggested that increased size in various crops is likely to result from changes in genes involved in the CLV3-WUS pathway, since this seems to be significantly conserved between species, but as yet there are very few crops in which candidate genes that affect size have been associated with this pathway. One of these crops is tomato (Solanum lycopersicum), in which an orthologue of WUS (SlWUS) is a candidate gene for LOCULE NUMBER (van der Knaap et al., 2014), a gene that increases the size of the fruit by increasing the number of locules. In Capsicum, selection under domestication has similarly resulted in larger fruits with more locules. Barchi et al. (2009) found a QTL for locule number in Capsicum annuum that they thought might be orthologous to SlWUS, but it has yet to be shown that the gene underlying this QTL functions in the CLV3-WUS pathway, or that orthologous QTL are responsible for parallel evolution of multilocular fruit in different domesticated species of Capsicum. Although the widespread and conserved nature of the CLV3-WUS pathway and associated loci seems to offer opportunities for parallel genetic evolution in crops that have undergone parallel selection for increased size, there is so far little evidence of this.

## Increased Cell Division in the Organ Under Selection

Another QTL that influences fruit size in tomato is FW2.2. This encodes a transcription factor that controls the number of cells in the fruit by acting as a negative regulator of cell division (Cong et al., 2002; Tanksley, 2004). FW2.2 is a member of a family of genes known as CELL NUMBER REGULATOR (CNR) genes (van der Knaap et al., 2014), an ancient gene family that occurs also in animals and fungi, but appears to have expanded and radiated more in plants. In maize, the closest orthologue to FW2.2 is ZmCNR1. It seems to act in the same way as FW2.2, in that down-regulation increases organ size, while over-expression reduces size, through changes in cell number, not cell size (Guo et al., 2010). However, ZmCNR1 is not a major domestication gene in maize, whereas human selection for fw2.2 was considered by Blanca et al. (2015) to have been important in the origin of semi-domesticated cherry tomato from wild currant tomato. In avocado, a CNR gene homologous to FW2.2 affects fruit size through similar negative regulation of cell division (Dahan et al., 2010). Thus, convergent changes in fruit size in such distantly related plants as avocado, tomato and maize seem to result from parallel changes involving CNR genes.

However, apparently parallel evolution of similar phenotypes may not extend to structural details, even for genera in the same family. In the Solanaceae, human selection has produced large fruits in both tomato and Capsicum. In C. annuum, at least seven QTL affecting fruit weight are orthologues of corresponding loci in tomato, but their relative contributions differ in the two genera. In tomato, the QTL with greatest effect on fruit size is FW2.2, but its orthologue in Capsicum has only a minor effect (Paran and van der Knaap, 2007). This may be because FW2.2 acts principally on cell division in the placenta and central axis of the fruit, which together make up most of the flesh in large-fruited tomatoes, whereas in the hollow fruit of Capsicum, there is relatively much less placental tissue, so increased size results mainly from increased cell division in the fruit wall (van der Knaap et al., 2014; Wang et al., 2015).

In large-fruited tomatoes, the number of cells in the fruit wall has likewise increased, and is affected by another major QTL, FW3.2. The candidate gene for FW3.2 is an orthologue of KLUH in Arabidopsis, so was designated SlKLUH (Chakrabarti et al., 2013; van der Knaap et al., 2014). A putative orthologue in C. annuum (CaKLUH) is also associated with large fruit (Chakrabarti et al., 2013; Wang et al., 2015). Parallel evolution of large fruits in tomato and chile pepper thus involves both similarities and differences at the levels of fruit structure and its genetic control.

## Increased Cell Expansion

An initial phase of cell division in a developing organ is usually followed by a phase of cell expansion. In the Mexican green tomato (Physalis philadelphica), Wang et al. (2014) identified a locus that they named PHYSALIS ORGAN SIZE 1 (POS1). Overexpression of this locus resulted in larger organs, including fruits, whereas virus-induced silencing reduced sizes of these organs. Cell numbers were the same in large or small organs, so changes in POS1 affected cell expansion rather than cell number. Large fruits of domesticated P. philadelphica are therefore phenotypic parallels of large fruits of tomato, but the parallel does not extend to the genetic level.

A more dramatic example of increased cell expansion resulting from human selection during domestication involves lint hairs of cotton. These are single-celled structures borne on the seed coat and provide the commercial fibre. In both of the species domesticated in the Americas, G. hirsutum and G. barbadense, human selection has produced lint hairs that are much longer than those of wild cotton, because they start to elongate earlier, elongate faster and continue elongating for a longer period. Furthermore, formation of secondary cell walls is delayed. Many genes are involved in these differences, but genes governing two sets of processes have been studied in detail. The extended period of cell elongation and delayed formation of secondary cell walls correlates with production of enzymes that break down reactive oxygen species such as hydrogen peroxide. In G. hirsutum, this is achieved by up-regulation of genes encoding catalase, glutathione S-transferase and thioredoxin, while in G. barbadense, genes encoding several peroxidases are up-regulated (Chaudhary et al., 2009). Rapid elongation of lint hairs is attributed to the action of profilins. Five members of the gene family encoding profilins are expressed in lint hairs and all five are up-regulated in both G. hirsutum and G. barbadense (Bao et al., 2011). In the case of the profilins, human selection thus appears to have resulted in parallel genetic as well as phenotypic changes in two independently domesticated, but related, species, whereas in the control of antioxidant activity, independent selection of different sets of genes with similar effects has produced similar phenotypes by convergent genetic routes (Olsen and Wendel, 2013).

# Reduction of Dispersal

In crops domesticated for their fruit or seed, or propagated by seed, human selection has frequently resulted in either reduced efficiency or total loss of mechanisms for dispersal. For single-seeded fruits, or multi-seeded fruits that are eaten by animals, this often involves loss of the abscission zone that enables fruit to be removed from the parent plant. For multi-seeded fruits that are not eaten, fruits must open at maturity to release their seeds. This often involves differential lignification of cell layers in the fruit wall so that, as the fruit dries, these layers shrink to different degrees, setting up tensions that are released as the fruit opens along predetermined lines of dehiscence. In peanut, and also in potato, where underground organs are harvested, the structures by which the harvested organs are attached to the parent plant have become shorter, so that harvesting is easier but dispersal occurs over a shorter distance. Parallel reductions in efficiency of dispersal may therefore be achieved by different morphological means, with corresponding differences in genetic control.

To date, genetic control of mechanisms of dispersal has been investigated in detail in a few model species only. Recent reviews (Estornell et al., 2013; Dong and Wang, 2015) have indicated that regions of dehiscence and their associated abscission zones are controlled by regulatory networks involving many transcription factors. Additional networks involve interactions with plant hormones; for example, ethylene up-regulates certain genes encoding enzymes involved in breaking down cell walls. Change in any of the genes involved in any of these networks may lead to loss of dispersal. What appear to be similar phenotypes may therefore be produced by changes in quite different genes or sets of genes.

In Phaseolus, five species were domesticated independently, while two of these, common bean (Phaseolus vulgaris) and Lima bean (P. lunatus), were both domesticated at least twice, in Mesoamerica and in the Andean region. Indehiscent pods were selected by humans during each domestication event. In P. vulgaris, a recessive allele of a major gene, stringless (st), causes loss of fibres in the sutures of the pods (Koinange et al., 1996). This results in loss of dispersal, and also in the "stringless" phenotype, favoured in types such as snap beans, in which immature pods are eaten. Additionally St, or a gene tightly linked to St, controls lignification of inner layers of the pod wall, which affects ability of the pods to open explosively. Although St has been mapped, the nature and function of the underlying gene(s) are not yet known. Until a candidate gene is securely identified, it is not possible to determine whether phenotypic parallels in evolution of indehiscent pods in different domesticated species of Phaseolus, and/or in independently domesticated lineages within P. vulgaris and P. lunatus, are paralleled at the genetic or nucleotide levels.

Five different species of Capsicum have been domesticated in different parts of the Americas. Most domesticated peppers in all five species have lost the abscission zone at the base of the fruit that enables the fruit to be removed easily from the parent plant by animal dispersers. Loss of this abscission zone is controlled by a recessive allele, designated s, of a single major gene S, though other QTL must also be involved because the force required to detach the fruit varies. Candidate gene(s) underlying this locus have not been securely identified, though Rao and Paran (2003) suggested that, in C. annuum and C. frutescens, a gene that encodes a polygalacturonase is probably implicated, because this gene maps to the same chromosomal region as S and is active specifically in the fruit. There are no data on whether the same gene is involved in the other domesticated species of Capsicum, hence whether parallel evolution of reduced efficiency of fruit dispersal is caused by parallel evolution at the genetic level.

In tomato, an abscission zone develops in the fruit stalk of wild and most domesticated tomatoes. A mutation of a gene designated JOINTLESS suppresses development of this abscission zone (Mao et al., 2000). JOINTLESS encodes a transcription factor that interacts with transcription factors encoded by two other genes. All three transcription factors seem to regulate the same set of target genes (Estornell et al., 2013; Dong and Wang, 2015). Within the Solanaceae, there has thus been a parallel loss of abscission zones, and hence reduction of fruit dispersal, in most domesticated chile peppers and some domesticated tomatoes, but the sites of loss, the genes involved, and the mechanisms of gene action appear different in the two genera.

# Loss of Seed Dormancy

For annual crops grown on a field scale, it is desirable that seeds germinate rapidly and evenly. This helps to suppress competing weeds and produces stands of plants of similar age that will mature at about the same time, so may be harvested in a single operation. Many domesticated crops have therefore lost much of the seed dormancy characteristic of their wild progenitors. However, some degree of dormancy needs to be retained to prevent seeds from germinating before the crop is harvested. Dormancy is therefore a quantitative character (Baskin and Baskin, 2004; Graeber et al., 2012). Dormancy varies over time, with environmental conditions, within species, and even within individuals (Smýkal et al., 2014). It is controlled by nuclear genes, but there may also be maternal effects and epistatic interactions (Baskin and Baskin, 2004). Determining how dormancy is controlled genetically in any given species is therefore challenging.

There are two principal types of seed dormancy among crop species (Baskin and Baskin, 2004). Physical dormancy, known also as hard-seededness, results from impermeability of the seed coat or fruit wall, due to an unbroken cuticle and one or more layers of palisade cells with lignified walls. This impermeability hampers the uptake of water necessary for germination and also, in crops such as grain legumes, the water required if seeds are to become soft when cooked. Physiological dormancy is more widespread than physical dormancy. It may be associated with the seed coat, where compounds such as tannins or pigments may inhibit germination (Smýkal et al., 2014). However, physiological dormancy is more often associated with the seed contents (endosperm and/or embryo). Some species combine both physical and physiological dormancy; for example, sunflower has physiological dormancy, but the fruit wall also provides a physical barrier to germination (Weiss et al., 2013).

Amongst crops domesticated in the Americas, wild species of both Phaseolus and cotton have hard seeds. QTL affecting dormancy have been mapped in common bean (Koinange et al., 1996), but the underlying candidate genes and their mechanisms of action have not been identified. Even less is known about the genetics of hard-seededness in cotton. In sunflower, with its combination of physical and physiological dormancy, Weiss et al. (2013) found that the micropylar end of the two valves of the fruit wall opened earlier in domesticated than in wild sunflowers, facilitating earlier entry of air and water, hence faster germination. However, it is not known what promotes loosening of cells in this region of the fruit wall.

Abscisic acid (ABA) appears to be significant in inducing and maintaining physiological dormancy, while gibberellic acid (GA) releases dormancy and promotes germination (Finch-Savage and Leubner-Metzger, 2006; Nonogaki, 2014). The balance between ABA and GA is therefore important in control of physiological dormancy, while tissues surrounding the embryo frequently appear to play a pivotal role (Graeber et al., 2012). Mechanisms of physiological dormancy have been extensively investigated in Arabidopsis, where numerous QTL are involved, including DELAY OF GERMINATION 1 (DOG1). DOG1 is highly conserved, with homologues in both dicotyledons and monocotyledons, and is required, together with ABA, to induce physiological dormancy (Nonogaki, 2014; Née et al., 2017). Regulation of transcription of DOG1 is complex and is influenced by environmental factors. The light-activated form of the photoreceptor phytochrome B, produced by the gene PHYB, represses transcription of both DOG1 and two further genes that, like DOG1, encode transcriptional regulators (Jiang et al., 2016). The means by which DOG1 acts are not clearly established.

Finch-Savage and Leubner-Metzger (2006) suggested that some at least of the mechanisms involved in control of seed dormancy and germination are widespread and have been conserved during evolution. However, it has not yet been shown that QTL involved in maintenance or loss of dormancy in Arabidopsis have orthologues with similar functions in any of the crop species domesticated in the Americas. The closest approach is the work of Mandel et al. (2014). They searched the literature for genes known to affect domestication-related traits, including germination, then identified homologues of these genes in sunflower, sequenced these from panels of wild, primitive cultivated, and improved sunflowers, and looked for evidence of past selection as shown by reduced nucleotide diversity. They found that one gene that encodes a protein that represses germination in Arabidopsis had apparently been selected during improvement of sunflower. PHYB had also been targeted during sunflower improvement, though this might relate to effects of phytochrome on adaptation to different daylengths, rather than its effects on seed dormancy.
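The kind of comparison described above, in which reduced nucleotide diversity in cultivated relative to wild panels is taken as a signal of past selection, can be illustrated with a minimal Python sketch. It assumes aligned sequences of equal length with no gaps or missing data, and uses the average number of pairwise differences per site (π) as the diversity measure. The function names and the cultivated-to-wild ratio are illustrative assumptions, not the actual procedure of Mandel et al. (2014), whose methods are not detailed here.

```python
from itertools import combinations
from typing import List, Optional


def nucleotide_diversity(seqs: List[str]) -> float:
    """Average pairwise differences per site (pi) for aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned, non-empty, equal length")
    pairs = list(combinations(seqs, 2))
    # Count mismatched sites over every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


def diversity_ratio(cultivated: List[str], wild: List[str]) -> Optional[float]:
    """Ratio pi(cultivated) / pi(wild); None if the wild panel is invariant.

    A ratio well below 1 at a locus is consistent with, but does not
    prove, selection during domestication or improvement.
    """
    pi_wild = nucleotide_diversity(wild)
    if pi_wild == 0:
        return None
    return nucleotide_diversity(cultivated) / pi_wild
```

In practice a low ratio at one gene would need to be compared against the genome-wide distribution, because a domestication bottleneck reduces diversity at all loci, not only at those under selection.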

In various plant families, ABA inhibits weakening of the endosperm around the embryo (Graeber et al., 2012). In several members of the Solanaceae, endosperm weakening, facilitating germination, may relate to inhibition by ABA of certain enzymes that degrade callose associated with plasmodesmata and thus break adhesion between adjacent cells in the endosperm (Finch-Savage and Leubner-Metzger, 2006).

It is not possible to reach any conclusion about the respective roles of parallel vs. convergent evolution in genetic control of physiological dormancy until candidate genes underlying QTL affecting this trait are clearly identified, together with their modes of action. Given the complex control of physiological dormancy, mutations in many different genes could produce apparently similar phenotypes in which dormancy has been lost.

# Reduced Branching

In species with terminal inflorescences, domestication has often resulted in reduced branching, so that plant resources are channelled into fewer but larger inflorescences, often with larger fruits and seeds. Domesticated maize has fewer basal branches than its closest wild relative, and also has fewer ear-bearing lateral branches on the main stem. In sunflower, modern cultivars have an unbranched stem with a single massive head, unlike the numerous smaller heads, each terminating a lateral branch, of wild sunflowers. In Chenopodium, reduced lateral branching and a much larger terminal inflorescence developed independently in the Andean domesticate quinoa (Chenopodium quinoa) and the Mesoamerican domesticate huauhzontli (Chenopodium berlandieri).

Lateral branches result from outgrowth of buds in axils of leaves on the main stem. This depends on various factors, including genotype, hormonal signals, nutrients, and environmental variables, particularly light. Actions of, and interactions between, these variables are complex (Rameau et al., 2015). Auxin appears to have a key role, with other plant hormones acting both downstream and upstream of auxin. Strigolactones suppress outgrowth of axillary buds, while cytokinins promote this. In Arabidopsis, a protein that interacts with a strigolactone receptor is encoded by MORE AXILLARY GROWTH 2 (MAX2) (Rameau et al., 2015). The recessive mutant max2 has a bushy phenotype. In tomato, a recessive mutant of LATERAL SUPPRESSOR (LS) does not form side shoots, and shoot apices have greatly increased levels of auxin and gibberellin, but decreased levels of cytokinin (Schumacher et al., 1999). A key step in cytokinin biosynthesis is controlled by isopentenyl transferase enzymes, encoded by IPT genes. Transcript levels of IPT genes are in turn modified by auxin levels. Mechanisms by which genetic control of branching is achieved are not yet fully understood, but disruptions at different stages or in different pathways seem likely to produce the same phenotype of reduced branching by different genetic means.

Mandel et al. (2014) identified homologues in sunflower of genes known to affect branching in other species, then looked for evidence of selection acting on these genes during domestication or improvement of sunflower. The homologue of LS showed evidence of selection during domestication, while the homologue of MAX2 was subjected to selection during improvement, and selection at an unknown stage of either domestication or improvement affected IPT5, hence levels of cytokinin. Mandel et al. (2014) have therefore shown that genes affecting auxins, gibberellins, cytokinins, and strigolactones may all be involved in genetic control of branching in sunflower.

In maize, reduced branching seems to involve changes in a pathway that may be unique to maize. The principal gene involved is TEOSINTE BRANCHED 1 (TB1), while a second gene, GRASSY TILLERS 1 (GT1), apparently acts in the same pathway, but downstream of TB1 (Whipple et al., 2011). Both genes encode transcription factors. In rice (Oryza sativa), pea (Pisum sativum) and Arabidopsis, TB1 is a key target of strigolactone signalling, but in maize, both TB1 and GT1 appear to be no longer regulated by strigolactone signalling, suggesting that, even within the cereals, selection under domestication for reduced branching may have been achieved by different routes (Guan et al., 2012).

The limited evidence so far available therefore suggests that parallel evolution of phenotypes with reduced branching in different crops results from convergent evolution at the genetic level.

# PARALLEL VS. CONVERGENT EVOLUTION IN DIVERSIFICATION

Diversification in response to human selection affects many different features. These depend on the crop concerned, but include plant habit, colour and/or shape of the harvested organ, flavour and/or palatability, cooking properties, and adaptation to different day-lengths.

# Plant Habit

In pumpkins and squashes (Cucurbita spp.), and also in common bean, there are differences in habit within the crop. In Cucurbita, some landraces and cultivars have a trailing vine-like habit, like their wild relatives. This is advantageous in subsistence agriculture, when pumpkins and squashes are often interplanted among maize and beans. Their viny habit then provides complete ground cover, suppressing weeds and minimising soil erosion. Bush types have semi-erect stems with short internodes, can be planted more closely, and are favoured when immature fruits are harvested frequently in order to maintain production. Bush types occur in some cultivars of Cucurbita maxima and in both domesticated subspecies of Cucurbita pepo. Deficiencies in gibberellin synthesis or signalling have been associated with bush phenotypes (Zhang et al., 2015). The bush habit is dominant in the early stages of growth but incompletely dominant later (Denna and Munger, 1963). Zhang et al. (2015) identified three QTL associated with bush habit in C. maxima and suggested that a candidate gene for the QTL with greatest effect is an orthologue of a gene in Arabidopsis that encodes gibberellin 20-oxidase, which catalyses the final stages in gibberellin biosynthesis. A large deletion and two single nucleotide polymorphisms (SNPs) in the promoter region of the bush allele in C. maxima apparently reduce its expression. Zhang et al. (2015) suggested that other genes encoding gibberellin 20-oxidases might become active later in development, partially compensating for the defective gene. It has still to be conclusively demonstrated that the same gene controls bush habit in C. maxima and in the two subspecies of C. pepo, and that the candidate gene in all three cases encodes gibberellin 20-oxidase. It has also yet to be determined whether identical changes in nucleotide sequence independently inactivate this gene.

The bush habit has also become established in some landraces and cultivars of common bean, but here the terminal meristem switches to a reproductive state relatively early in development, so that growth of the main stem is determinate, whereas in bush types of Cucurbita, the main stem remains indeterminate. Bush beans can be grown without support, unlike wild and many domesticated beans, which are climbers. Bush beans flower earlier than climbing beans, so are better adapted to cooler climates. In most bush beans, habit is controlled by a single gene, fin (Koinange et al., 1996). The candidate gene underlying fin is an orthologue of TERMINAL FLOWER 1 (TFL1) in Arabidopsis, and is designated PvTFL1y (Repinski et al., 2012). Common bean was domesticated independently in Mesoamerica and the Andean region and bush types occur in both regions. Kwak et al. (2012) identified eight mutations in the nucleotide sequence of PvTFL1y that were expected to change function of the gene. Different mutations tend to be restricted to one or other region: deletion of the entire PvTFL1y sequence is confined to Andean bush beans, while a large retrotransposon-mediated insertion is likewise virtually restricted to Andean bush beans. A phylogenetic tree of nucleotide sequences from PvTFL1y had two major branches, corresponding to Mesoamerican vs. Andean accessions. This suggested to Kwak et al. (2012) that the bush habit was selected independently in each region.

The bush habit in common bean is therefore an example of parallel evolution at phenotypic and genetic levels, but independent mutations at the nucleotide level, hence multiple origins of the same trait. Convergent phenotypic similarities in bush habit in Phaseolus and Cucurbita are superficial only; morphology and genetic control are fundamentally different.

# Flavour, Palatability, and Cooking Qualities

Meyer et al. (2012) found that, in domesticated species in general, the most common changes associated with human selection involve flavour, toxicity, or colour. Loss of bitterness often results in loss of toxicity as well as increased palatability. Abbo et al. (2014) considered that, in those crops in which bitterness has been lost, this is a crucial element of the domestication syndrome. Several crops in the Americas were domesticated from wild progenitors containing toxic or unpalatable compounds, but human selection has not eliminated these compounds entirely. Instead, elaborate methods of processing have been devised. Bitter potatoes, adapted to cold temperatures and high altitudes in the Andes, contain glycoalkaloids in amounts that may be toxic to humans (Johns and Galindo Alonso, 1990). Harvested tubers are repeatedly frozen and thawed, trampled or leached to express the bitter juice, and finally dried. Tubers of sweet landraces of manioc (Manihot esculenta) contain relatively little toxic cyanogenic glycoside, mainly in the outer layers, so the tubers can be cooked and eaten safely after peeling. Bitter tubers, containing more glycoside, distributed throughout the tubers, must be shredded, squeezed, and the pulp heated to drive off any remaining hydrogen cyanide. In both potato and manioc, variation in content of bitter compounds is associated with diversification, rather than domestication.

Many different compounds confer bitterness or unpalatability: cyanogenic glycosides in Lima bean as well as manioc; toxic alkaloids in seeds of tarwi (Lupinus mutabilis); steroidal glycoalkaloids in potato; saponins in grains of quinoa (C. quinoa); cucurbitacins in species of Cucurbitaceae. These different compounds are synthesised by different pathways, so their loss or reduction in different domesticates involves changes in different genes. However, in independently domesticated species of the same genus, containing similar unpalatable compounds, human selection for parallel changes in palatability might lead to parallel decreases in these compounds, controlled by parallel genetic changes.

In Cucurbita, cucurbitacins have been lost independently from fruits of all five domesticated species. In C. pepo and C. maxima, a single gene controls the difference between bitter and nonbitter fruits. Bitterness is dominant, but it is not clear whether the same gene is involved in both species (Paris and Brown, 2005). In C. argyrosperma, non-bitterness is due to recessive alleles at two independent loci, both different from the locus responsible for non-bitterness in C. pepo (Borchers and Taylor, 1988). Presumably the three loci govern different steps in the biosynthesis of cucurbitacins, though the precise biochemical functions of the loci have still to be determined. Parallel evolution of non-bitter fruits in different species of Cucurbita thus does not necessarily involve parallel evolution at the genetic level.

During diversification in chile peppers (Capsicum spp.), people have selected for different levels of pungency. In three independently domesticated species, C. annuum, C. frutescens, and Capsicum chinense, the same gene, Pun1, controls presence vs. absence of pungency. Independent mutations in each species have resulted in loss of function of this gene. In non-pungent C. annuum, the promoter and most of the first exon have been deleted; in non-pungent C. frutescens, part of the second exon has been deleted, causing a frameshift mutation and loss of transcription; while in non-pungent C. chinense, a different small deletion has again caused a frameshift mutation, affecting transcription (Stellari et al., 2010). In these three closely-related species, parallel evolution of non-pungent fruits results from parallel changes in the same gene, but the parallels do not extend to the level of DNA sequences.

In cereals and pseudocereals, starch is the principal reserve stored in the seeds. Starch consists of a mixture of amylopectin and amylose. Starch that lacks amylose is glutinous. In various unrelated species, absence of amylose is associated with mutation in the same gene, waxy. This encodes an enzyme involved in starch synthesis. In grain amaranths, landraces with waxy starch occur in all three domesticated species, but result from different mutations: a SNP in exon 6 in Amaranthus hypochondriacus, a SNP in exon 10 in A. cruentus, and a single nucleotide insertion in exon 8 in Amaranthus caudatus. All three mutations produce a premature stop codon, hence no active enzyme (Park et al., 2010). Waxy starch also occurs in maize. This mutation has not been favoured in Mexico, where maize originated (Whitt et al., 2002), but in China human selection has produced many waxy landraces (Fan et al., 2008). Sequencing studies have shown that these are due to at least two independent mutations, both deletions but affecting different exons. Both differ from the three mutations reported for Amaranthus. Waxy starch therefore represents an example of convergent evolution in dicotyledons and monocotyledons, involving parallel changes in the same gene, though the parallels do not extend to the nucleotide level.
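The way in which either a SNP or a single-nucleotide insertion can produce a premature stop codon may be illustrated with a minimal sketch. The coding sequence below is hypothetical and is not taken from any waxy allele; the function name `first_stop_codon` is likewise an assumption introduced only for this illustration. The sketch simply reads a sequence in frame and reports the first stop codon encountered.

```python
# Stop codons of the standard genetic code, written as DNA triplets.
STOP_CODONS = {"TAA", "TAG", "TGA"}


def first_stop_codon(cds):
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(0, len(cds) - 2, 3):
        if cds[i:i + 3] in STOP_CODONS:
            return i // 3
    return None


# Hypothetical six-codon coding sequence: ATG GCT GAA CTG ACC TAA.
wild_type = "ATGGCTGAACTGACCTAA"

# SNP: G -> T at 0-based index 6 turns the third codon GAA into TAA.
snp_mutant = wild_type[:6] + "T" + wild_type[7:]

# Insertion of a single T before index 6 shifts the reading frame,
# so that the third codon now reads TGA.
insertion_mutant = wild_type[:6] + "T" + wild_type[6:]

for name, seq in [("wild type", wild_type),
                  ("SNP", snp_mutant),
                  ("insertion", insertion_mutant)]:
    print(name, first_stop_codon(seq))
```

In this toy case the wild-type sequence terminates at its sixth codon (index 5), whereas both mutants terminate at the third codon (index 2), although the underlying changes in nucleotide sequence are different.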

In sweet corn, mutants of the gene sugary 1 (su1) accumulate sucrose rather than starch in the grain. All sweet corns from the United States carry the same amino acid substitution at the same conserved position in the encoded enzyme, presumably making it unable to convert sucrose to starch. This change is not found in sweet corn from Mexico, which carries instead an insertion of a transposable element that disrupts translation of su1 (Whitt et al., 2002). There have therefore been at least two parallel origins of the sweet corn phenotype controlled by su1, but again these parallels do not extend to the level of DNA sequences.

# Colour and Shape of the Harvested Organ

Charles Darwin (1868) observed that variation in cultivated plants is greatest in that part of the plant used by man. The most striking variations are in colour and shape. Some may have been selected simply for aesthetic reasons, but different colours and shapes may also act as visible markers for invisible differences in flavour or cooking properties (Boster, 1985). For example, the Aztecs used chile peppers of different colours to flavour different dishes: yellow chile with white fish and also with axolotl, red chile with greyish-brown fish, green chile with frogs (Coe, 1994).

The principal classes of plant pigments are anthocyanins and carotenoids. Biosynthetic pathways for both have been worked out and are common to species in many different families. This affords opportunities for both parallel and convergent evolution. In tomato, the characteristic red fruit colour is due to the carotenoid lycopene. Tomatoes with yellow fruit are homozygous for mutations in the gene PHYTOENE SYNTHASE 1 (PSY1). This encodes an enzyme that catalyses formation of the first carotenoid in the pathway to lycopene. Two different psy1 mutants are known in tomato. One involves insertion of a transposable element, the other involves a short deletion (Jiang et al., 2012). Both result in loss of function of the enzyme, hence absence of red pigment. Parallel yellow phenotypes in different cultivars of tomato therefore result from parallel changes in the same gene, but changes that must have occurred independently. By contrast, red chile pepper fruits owe their colour to two pigments, capsanthin and capsorubin, produced later in the carotenoid pathway than lycopene. Yellow fruit in Capsicum results from mutation in the gene CAPSANTHIN-CAPSORUBIN SYNTHASE (CCS), which encodes the enzyme catalysing the final step in synthesis of these pigments (Lefebvre et al., 1998; Popovsky and Paran, 2000). Parallel evolution of yellow fruit in tomato and Capsicum therefore results from mutations in different genes.

Within Capsicum, at least four independent mutations in CCS are known to produce yellow fruit. In C. annuum, a deletion at the 5′ end of the gene occurs in yellow bell pepper (Lefebvre et al., 1998; Popovsky and Paran, 2000), while a Chinese pungent yellow pepper carries a SNP that produces a premature stop codon (Li et al., 2013). Both changes result in a non-functional enzyme. In C. chinense, one mutation to yellow involves a small deletion that produces a frameshift, hence premature termination of translation, while a second involves a SNP that produces a premature stop codon (Ha et al., 2007), but the SNP in C. chinense is in a different position from that reported by Li et al. (2013) in C. annuum. As in tomato, different changes in the same gene have resulted in repeated parallel evolution of yellow fruit.

PSY1 affects fruit colour in Capsicum as well as in tomato, but psy1 mutants in Capsicum have orange fruits containing reduced amounts of red pigments, rather than yellow fruits with no red pigment. The psy1 mutation in Capsicum is thought to affect a splice site, resulting in premature termination of translation and hence a truncated enzyme. Kim et al. (2010) suggested that the truncated enzyme is unable to anchor properly to the chromoplast membrane, hence has less access to its substrate, but is nevertheless able to produce reduced quantities of red pigments. Orange fruits, visually indistinguishable from those resulting from the psy1 mutation, may also result when red pigments are totally absent, so that colour is provided by carotenoids produced earlier in the carotenoid pathway. In an orange-fruited cultivar of this type, absence of red pigments is caused by deletion of a single nucleotide in the coding sequence of CCS, producing a premature stop codon (Guzman et al., 2010). The two visually indistinguishable classes of orange fruits represent parallel evolution of the same phenotype, but convergent evolution at the genetic level.

In tomatoes grown for canning, elongated fruits are preferred to round. Fruit shape is influenced by the genes fs8.1, SUN, and OVATE (Tanksley, 2004). A candidate gene for fs8.1 is SlSUN22, one of 34 members of the tomato SUN family (Huang et al., 2013). The original sun mutation was caused by a transposon-mediated duplication of a segment of chromosome 10, carrying another member of the SUN family, which was then inserted into chromosome 7 (Xiao et al., 2008). In its original location, this SUN locus is expressed only at a very low level, but in its new location, under control of a different promoter, it is expressed at a much higher level (van der Knaap et al., 2014). A QTL on chromosome 10, previously known to affect fruit shape, may correspond to the SUN locus that was duplicated and transposed. This QTL of tomato is orthologous with fs10.1, associated with round vs. elongate fruits in C. annuum (Ben Chaim et al., 2001), and maps to the same region of the Solanaceae genome as the potato gene Ro, a major QTL affecting tuber shape (van Eck et al., 1994; Borovsky and Paran, 2011). Orthologous QTL may therefore control parallel variations in shape in organs as different as tubers and fruits.

OVATE in tomato is a member of a family of genes called OVATE FAMILY PROTEINS (OFPs). These are widespread in the plant kingdom and act as novel plant growth regulators (Wang et al., 2016). The only mutation to ovate known in tomato results from a SNP that produces a premature stop codon, hence loss of function of the gene product (Tsaballa et al., 2011). In C. annuum, a mutant of the presumed orthologue, CaOvate, does not contain a premature stop codon. Instead, changed expression seems to be responsible for changed fruit shape. In round fruits, CaOvate is expressed most strongly after anthesis, whereas in elongate fruits, it is expressed most strongly before anthesis (Tsaballa et al., 2011).

Much remains to be learned about control of fruit shape in tomato and Capsicum, but evidently there are instances where phenotypic parallels in shape are controlled by parallel changes in orthologous genes, and these parallels may even extend to control of tuber shape in potato.

# Adaptation to Long Days

Most crops that originated in the Americas were domesticated in the tropics, many from wild progenitors that require days shorter than a certain critical length to induce flowering. As these crops spread to temperate latitudes, they had to adapt to growing seasons of increasingly longer days. This adaptation usually involved loss of sensitivity to daylength. Closer adaptation to specific environments, in which length of growing season is influenced by factors such as drought or cold, was achieved by selection for earlier or later flowering under any given photoperiod.

Control of time to flowering is complex. More than 90 relevant genes have been identified in Arabidopsis and their homologues in crop species are being isolated and studied. Gibberellin, vernalisation and autonomous pathways act and interact, together with, most importantly, the photoperiodic pathway. Basic features of the photoperiodic pathway appear to be conserved across a wide range of species, though there are differences in detail (Andrés and Coupland, 2012; Blümel et al., 2015). The names used here for genes and gene products are those employed in Arabidopsis thaliana, a model species for these investigations. The flower-inducing signal (FT), also known as florigen, is encoded by the gene FLOWERING LOCUS T (FT), which is activated by the transcription factor CO, produced by the gene CONSTANS. Transcription of CO is repressed by certain factors that are degraded by a light-induced interaction between the product of the gene GIGANTEA (GI) and an enzyme produced by a further gene. Light and dark affect the stability of CO after translation. Interactions with light are mediated by the photoreceptor phytochrome B, encoded by PHYTOCHROME B (PHYB). FT and CO are both expressed in leaves. FT then moves through the phloem to the shoot apical meristem, where it combines with FD, the protein product of FLOWERING LOCUS D. The FT-FD complex activates SUPPRESSION OF OVEREXPRESSION OF CONSTANS 1 (SOC1), which encodes a transcription factor that activates a series of other transcription factors. These in turn activate floral identity genes that irreversibly convert a vegetative to a floral meristem.

Given the complexity of this pathway, it is not surprising that photoperiodic sensitivity has been lost through different genetic changes in different crops. In Sea Island cotton (G. barbadense), a major locus, Gb-Ppd1, is associated with the day-neutral phenotype. The underlying gene has not yet been identified, despite examination of 110 possible candidates (Zhu and Kuraparthy, 2014). In contrast, in upland cotton (G. hirsutum), many genes appear to be involved, including three members of the CONSTANS-LIKE (COL) gene family, all three of which have apparently undergone selection during domestication. Both G. hirsutum and G. barbadense are AADD allotetraploids. Song et al. (2017) observed that the COL2 locus in the A subgenome is hypermethylated, hence repressed, in both species, but in photoperiodically insensitive accessions of both species, the D subgenome homoeologue of COL2 is less methylated, hence more strongly expressed. As Song et al. (2017) noted, there are also other QTL associated with loss of sensitivity to photoperiod in cotton. Nevertheless, the data do suggest that change in a CO-like gene, or in its regulation, is one cause of adaptation to long-day growing seasons, possibly in both species of cotton.

Sunflower (Helianthus annuus) is unusual in that genotypes adapted to short or long days, as well as genotypes insensitive to photoperiod, all occur among wild populations and among domesticated accessions of this single species (Blackman, 2013). In wild sunflowers, insensitivity to photoperiod and adaptation to long days are restricted to different parts of their extensive range, outside the region in which sunflower was probably domesticated (Blackman, 2013). Duplication of the original FT locus has produced four paralogues, HaFT1 to HaFT4, which seem to have diverged in function (Blackman, 2013). HaFT1 is expressed in the shoot apex and is a candidate gene for a major QTL that affects time to flowering. HaFT2 and HaFT4 seem to encode the florigen. They are expressed in leaves under inductive photoperiods, like FT genes in other species. In one wild population and one cultivar, which have independently developed insensitivity to photoperiod, HaFT4 is expressed under both long and short days. Blackman (2013) considered that changes in either the promoter of HaFT4 or in other genes that regulate HaFT4 are probably involved in both cases, but could not determine whether the same changes are implicated in both. In contrast, adaptation to long days apparently evolved by different means in wild and domesticated sunflowers. Long days are usually non-inductive, but in one long-day cultivar, both HaFT2 and HaFT4 are expressed in long days, indicating that there has been some change in the regulation of these genes. However, in a long-day wild sunflower, neither HaFT2 nor HaFT4 is expressed, though expression of the homologue of SOC1 has changed (Blackman, 2013). Loss of sensitivity to photoperiod in wild and domesticated sunflowers may therefore represent parallel evolution at the genetic as well as phenotypic level, but parallel evolution of phenotypes adapted to long days in wild and domesticated sunflowers represents convergent evolution at the genetic level.

In common bean, two QTL are consistently implicated in control of flowering time (Kornegay et al., 1993; Koinange et al., 1996; Gu et al., 1998). Ppd has the larger effect and determines sensitivity to photoperiod, while Hr determines degree of response to photoperiod. Kornegay et al. (1993) considered that insensitivity to long days involved the same two QTL in both Andean and Mesoamerican common bean, although these were independently domesticated, but Gu et al. (1998) found different markers associated with earliness in Mesoamerican and Andean beans. Bellucci et al. (2014) presented evidence that a DNA sequence homologous to GI was selected during domestication of Mesoamerican beans, but there is no other information on candidate genes for QTL associated with photoperiodicity or time to flowering in common bean, or their modes of action (Kwak et al., 2008).

Tropical maize has retained the short-day adaptation of its wild relative, teosinte, whereas temperate maize is insensitive to day-length, so flowers sufficiently early in the long-day growing seasons of temperate latitudes to mature its grain before the short days and cold temperatures of autumn. Although a large number of QTL affect flowering time in maize, relatively few affect sensitivity to photoperiod. One of these few is ZmCCT (Hung et al., 2012). This gene represses expression of the probable orthologue of FT, ZCN8 (Lazakis et al., 2011; Yang et al., 2013; Mascheretti et al., 2015). An insertion of a transposable element in the upstream regulatory region of ZmCCT suppresses transcription of ZmCCT under long days, thereby up-regulating expression of ZCN8, resulting in earlier flowering. The insertion is apparently absent in teosinte, but was subjected to a strong selective sweep in photoperiod-insensitive accessions of temperate maize. Yang et al. (2013) therefore concluded that the insertion occurred after domestication and was favoured by selection as maize spread into long-day growing seasons of temperate latitudes. Lazakis et al. (2011) suggested that in temperate maize, the photoperiodic pathway, in which ZCN8 is activated in short days, has been largely superseded by the autonomous pathway, in which ZCN8 is activated by the protein product of the gene indeterminate 1 (id1). Although all higher plants contain a large family of id1-related genes, there is no clear orthologue of id1 in dicotyledons.

Convergent evolution of photoperiod-insensitivity has therefore been achieved by different means in different crops, even though the pathways that regulate the transition to flowering, particularly the CO-FT pathway, are extensively conserved. CO-like genes have been major players in cotton, FT genes in sunflower, and suppression of ZmCCT, together with probable increased importance of the autonomous pathway, in maize.

Flowering is not the only process in plants that is controlled by photoperiod. In both potato (Solanum tuberosum) and Jerusalem artichoke (Helianthus tuberosus), tubers are formed in response to short days, and orthologues of FT appear to be involved (Navarro et al., 2011; Blackman, 2013). The process has been most studied in potato, since its success in temperate latitudes depends on ability to form tubers in long days. The orthologue of FT in potato is StSP6A, while the orthologue of CO is StCOL1 (Navarro et al., 2011; González-Schain et al., 2012). The product of StCOL1 activates StSP5G and the StSP5G protein in turn represses transcription of StSP6A. StCOL1 is degraded in the dark, so only a low level is present in short days. This releases repression by StSP5G of StSP6A, so tubers are formed. StCOL1 is stabilised in the light and this is dependent on phytochrome B. When PHYB is suppressed, tuber formation is no longer sensitive to photoperiod (Abelenda et al., 2016). Adaptation to long days has apparently evolved at least twice; once in Chile as potatoes spread south from the Andean region, and again in Europe after post-Conquest introduction of Andean short-day potatoes. However, it has yet to be shown what genes are involved in loss of sensitivity to daylength in each case, and whether the same genes are involved in different photoperiod-insensitive potatoes.
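The regulatory chain described for potato can be summarised as a small Boolean sketch. This is a deliberate simplification rather than a quantitative model: each gene product is treated as simply present or absent, the function name `tuber_formation` and its parameters are hypothetical, and the assumption that StCOL1 accumulates only under long days with active phytochrome B is a coarse reading of the statements above.

```python
def tuber_formation(long_day, phyb_active=True):
    """Boolean sketch of the StCOL1 / StSP5G / StSP6A chain in potato.

    Assumption: StCOL1 protein accumulates only when days are long
    and phytochrome B is active; otherwise it is degraded.
    """
    stcol1 = long_day and phyb_active  # stabilised in light, PHYB-dependent
    stsp5g = stcol1                    # StCOL1 product activates StSP5G
    stsp6a = not stsp5g                # StSP5G represses StSP6A
    return stsp6a                      # StSP6A (FT orthologue) present: tubers


for long_day in (False, True):
    for phyb_active in (True, False):
        print(long_day, phyb_active, tuber_formation(long_day, phyb_active))
```

Under these assumptions, tubers are formed in short days but not in long days when phytochrome B is active, while with PHYB suppressed the outcome no longer depends on daylength.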

# DISCUSSION AND CONCLUSIONS

Parallel vs. convergent evolution are terms that were initially used for discussion of phenotypic similarities between closely vs. distantly related species. Their definition is subjective, in that it does not specify how long ago lineages must have diverged to be considered no longer closely related. For example, waxy starch occurs in some dicotyledonous and some monocotyledonous crops. Given that all flowering plants descend ultimately from a common ancestor, this may be considered as either parallel or convergent evolution, depending on whether dicotyledons and monocotyledons are regarded as closely related, because they are all flowering plants, or distantly related, because they belong to lineages that diverged early in the evolution of flowering plants.

At the genetic level, parallel evolution implies that similar phenotypes are controlled by the same (i.e., homologous) genes, whereas convergent evolution implies that similar phenotypes are controlled by genes that are not homologous. This again raises the problem of subjectivity, this time in the definition of homology. Flowering plants have undergone many cycles of gene duplication, so many genes now occur in multi-gene families. If the original gene controlled multiple functions, these may become partitioned among different members of the family. Alternatively, or additionally, some family members may acquire new functions and/or changes in expression. Different members of the same gene family may then be favoured by selection during domestication in different crops. All this leads to differences in degrees of homology and blurs the distinction between parallel vs. convergent evolution at the genetic level. An example occurs among cereals (Meng et al., 2011). In rice, the gene Hd3a encodes the florigen. In maize, 15 members of the ZCN family are phylogenetically related to Hd3a. ZCN15 maps to the same region of the genome as Hd3a, so in that respect is its closest homologue, but ZCN15 is expressed in maize kernels, not leaves, and is not involved in control of flowering. ZCN8 is expressed in leaves and does encode a florigen, so is functionally homologous to Hd3a although located elsewhere in the genome. The control of flowering in rice and maize thus may, or may not, be considered to involve parallels at the genetic level.

Many phenotypic traits listed as components of the domestication and diversification syndromes are complex characters. When these are broken down into their components, some apparent parallels disappear. For example, increased size of fruit is a frequent response to human selection in species domesticated for their fruit. In tomato and Mexican green tomato (P. philadelphica), development of large fruits in domesticated accessions of both species initially seems to be an example of parallel phenotypic evolution in species belonging to the same family. But on closer examination, large fruits in tomato are found to result from increased numbers of cells in various parts of the fruit, together with more carpels in the fruit, while in Mexican green tomato, large fruits are due primarily to increased cell size (Wang et al., 2014). Carpel number, cell size and cell number are controlled by different genes so, not surprisingly, large fruits in tomato and Mexican green tomato are controlled principally by non-homologous genes. Dissection of this complex character, fruit size, into simpler components thus shows that the phenotypic parallel is more limited than it initially appeared.

In C. annuum, fruit size has likewise increased under human selection. This increase is controlled by numerous QTL, many of which are orthologous to those of tomato. This therefore seems a good example of parallel evolution at levels of both phenotype and gene. However, the QTL with greatest effect on fruit weight in tomato, FW2.2, has a lesser effect in C. annuum, probably because FW2.2 appears to act principally in the placenta and central axis of the fruit, which make up much of the mass of tomato fruits but disproportionately less of the hollow fruits of Capsicum (van der Knaap et al., 2014). Large fruits of both Capsicum and tomato have more cells in the fruit wall, but the QTL primarily responsible has a smaller effect in tomato than its orthologue does in C. annuum. The phenotypic and genetic parallels between large fruits in tomato and C. annuum therefore become less perfect when examined in detail. This suggests that parallelism is a matter of degree, like relationship or homology. Furthermore, in assessing possible cases of parallel evolution, it is important to compare only the comparable.

Difficulties in defining parallel and convergent evolution led Arendt and Reznick (2008) to conclude that applying these terms at the level of molecular genetics leads only to confusion. They advocated use of a single term, and suggested "convergent evolution," for evolution at both phenotypic and genetic levels. However, this term carries the baggage of past usage. I have therefore continued to distinguish parallel and convergent evolution and have applied these terms at phenotypic, genetic, and DNA sequence levels. This may sometimes seem confusing, and may also seem to be governed by the principle employed by Lewis Carroll's Humpty-Dumpty, who told Alice: "When I use a word, it means exactly what I choose it to mean – neither more nor less." Nevertheless, clarity in definition and consistency in use of existing terms seem preferable to coining new, hence unfamiliar, ones.

No examples of parallel changes in DNA sequence exist among the cases of parallel genetic evolution reviewed here. I know of only one possible example among crops domesticated in the Americas: occurrence, in semi-domesticated cherry tomato and fully domesticated modern tomatoes, of bushy plants associated with a recessive mutation in the SELF-PRUNING (SP) gene. Pnueli et al. (1998) found that mutants in both types of tomato carry the same amino acid substitution at the same position in the encoded protein. They argued that the two identical mutations arose independently. However, this example seems to be the exception, not the rule. In most cases of parallel genetic evolution, parallel phenotypes have been produced by different (convergent) changes in DNA sequence of the same gene. These changes usually result in loss of function of the gene or its product, less often in changed expression of the gene.

When phenotypic and genetic levels are considered together, any combination of parallel and convergent evolution may occur, as **Table 3** shows. There are examples of parallel phenotypes controlled by homologous (parallel) genes in closely related species; convergent phenotypes controlled by homologous genes in distantly related species; parallel phenotypes controlled by non-homologous (convergent) genes in closely related species; and convergent phenotypes controlled by non-homologous genes in distantly related species. Moreover, all these combinations are found in both domestication and diversification traits.
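The four combinations described above can be written out as a small lookup, shown here as a minimal sketch rather than a formal scheme. It assumes, for illustration only, that "closely related" can be coded as a yes/no flag (for example, same family) and that gene homology is likewise binary. The function name `classify` and the example labels are hypothetical, and the example pairs are taken from cases discussed in this review.

```python
from typing import Tuple


def classify(closely_related: bool, homologous_genes: bool) -> Tuple[str, str]:
    """Return (phenotype-level term, gene-level term) for a pair of crops.

    Assumption: phenotypic similarity is called "parallel" in closely related
    species and "convergent" in distantly related ones; genetic control is
    "parallel" when homologous genes are involved, "convergent" otherwise.
    """
    phenotype = "parallel" if closely_related else "convergent"
    gene = "parallel" if homologous_genes else "convergent"
    return phenotype, gene


# Illustrative pairs; the boolean codings are a simplification of the text.
examples = {
    "large fruit: tomato vs. C. annuum": (True, True),
    "large fruit: tomato vs. avocado": (False, True),
    "yellow fruit: tomato vs. C. annuum": (True, False),
    "daylength insensitivity: cotton vs. sunflower": (False, False),
}

for label, (related, homologous) in examples.items():
    phenotype, gene = classify(related, homologous)
    print(f"{label}: phenotype {phenotype}, gene {gene}")
```

A binary coding is a deliberate simplification: as argued above, parallelism is better regarded as a matter of degree.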

Glémin and Bataillon (2009) and Martínez-Ainsworth and Tenaillon (2016) considered that evolution of similar phenotypes during domestication generally does not result from parallel genetic changes. Some traits, particularly those that involve complex developmental pathways and/or complex networks of genes, do indeed seem to involve evolution of similar phenotypes by different genetic routes. For example, in loss of sensitivity to daylength, CO-like genes appear to be major players in cotton (Song et al., 2017), whereas genes of the FT family appear more important in sunflower (Blackman, 2013). On the other hand, development of large fruit in two distantly related dicotyledons, tomato and avocado, seems to involve genes of the CELL NUMBER REGULATOR (CNR) family in both genera (Dahan et al., 2010; van der Knaap et al., 2014).

TABLE 3 | Examples of parallel and convergent evolution at levels of phenotype and gene in some domestication and diversification traits.

#### (A) Domestication Traits

#### (B) Diversification Traits

Poncet et al. (2004) considered that parallel evolution at the genetic level, involving orthologous loci, is more likely in crops that belong to the same family. Some of the cases discussed here support this view, but others do not. For example, in the Solanaceae, several orthologous loci are implicated in evolution of large fruit in tomato and C. annuum, but parallel evolution of yellow fruit is controlled by different, non-orthologous, genes in these two crops. Available data are as yet insufficient to assess reliably the number of examples for or against the opinion of Poncet et al. (2004).

Lenser and Theißen (2013) suggested that traits resulting from simple metabolic pathways, rather than from complex developmental networks, are more likely to provide examples of parallel genetic variation. Once more, examples can be cited both for and against this view. Waxy starch results from a change in a relatively simple biosynthetic pathway, controlled by an orthologous gene in both grain amaranths and maize (Whitt et al., 2002; Park et al., 2010). On the other hand, two visually indistinguishable types of orange fruit in C. annuum result from changes in different genes in the pathway of pigment biosynthesis (Guzman et al., 2010; Kim et al., 2010). Again, there are insufficient data to quantify relative frequencies of parallel vs. convergent evolution in traits resulting from simple vs. complex pathways.

Gross and Olsen (2010) argued that diversification traits are more likely to provide examples of parallel genetic evolution than domestication traits. This seemed plausible, because diversification often involves changes in simpler characters than domestication, but the data are again inconclusive and examples cited above include some for and some against this proposition.

Technical advances, such as next-generation sequencing and various methods for studying levels and locations of gene expression, are producing an information explosion regarding details of developmental processes and their control in different plant species. Many traits of the domestication syndrome, and some associated with diversification, are now known to be controlled by complex networks of genes. These networks, or portions of them, are often conserved to considerable extents, though regulation of the genes involved may vary. This widespread conservation suggests that similar phenotypic changes, even in distantly related crop species, could result from parallel genetic changes. On the other hand, the complexity of the pathways suggests that disruption at different points (convergent evolution) could produce parallel phenotypes. Most current investigations aim to elucidate the pathways and genes involved in these networks for a few model species only. Less attention has so far been directed toward establishing how, or at what points, these pathways and networks have been changed or disrupted through natural and/or human selection, and whether, or to what extent, these pathways or networks differ in different species in the same family.

This review is therefore both premature and, given the rate at which new information is accumulating, likely to be out of date before it is published. Nevertheless, it may serve to highlight the many gaps in the information available and the many areas where additional data are needed. Both domestication and diversification evidently involve both parallel and convergent evolution, acting at any or all levels of phenotype, gene, or nucleotide sequence, with relative frequencies still to be determined. As Doebley and Stec (1991) concluded many years ago, evolution seems essentially opportunistic, making use of whatever variation is available.

# REFERENCES


# AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

# FUNDING

Frontiers have granted me a full fee waiver.




**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Pickersgill. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Back to the Origin: In Situ Studies Are Needed to Understand Selection during Crop Diversification

Yolanda H. Chen<sup>1</sup>, Lori R. Shapiro<sup>2,3</sup>, Betty Benrey<sup>4</sup> and Angélica Cibrián-Jaramillo<sup>5</sup>\*

<sup>1</sup> Department of Plant and Soil Science, University of Vermont, Burlington, VT, United States, <sup>2</sup> Department of Microbiology and Immunology, Harvard Medical School, Boston, MA, United States, <sup>3</sup> Department of Applied Ecology, North Carolina State University, Raleigh, NC, United States, <sup>4</sup> Laboratory of Evolutionary Entomology, Institute of Biology, University of Neuchâtel, Neuchâtel, Switzerland, <sup>5</sup> Laboratorio Nacional de Genómica para la Biodiversidad, Centro de Investigación y de Estudios Avanzados, Irapuato, Mexico

#### Edited by:

Alejandro Casas, Instituto de Investigaciones en Ecosistemas y Sustentabilidad, Universidad Nacional Autónoma de México, Mexico

#### Reviewed by:

Shabir Hussain Wani, Michigan State University, United States

Mukhtar Ahmed, Pir Mehr Ali Shah Arid Agriculture University, Pakistan

> \*Correspondence: Angélica Cibrián-Jaramillo angelica.cibrian@gmail.com

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Ecology and Evolution

> Received: 24 June 2017 Accepted: 25 September 2017 Published: 18 October 2017

#### Citation:

Chen YH, Shapiro LR, Benrey B and Cibrián-Jaramillo A (2017) Back to the Origin: In Situ Studies Are Needed to Understand Selection during Crop Diversification. Front. Ecol. Evol. 5:125. doi: 10.3389/fevo.2017.00125

Crop domestication has been embraced as a model system to study the genetics of plant evolution. Yet, the role of the environment, including biotic forces such as microbial and insect communities, in contributing to crop phenotypes under domestication and diversification has been poorly explored. In particular, there has been limited progress in understanding how human selection, agricultural cultivation (soil disturbance, fertilization, and irrigation), and biotic forces act as selective pressures on crop phenotypes. For example, geographically-structured pathogenic, pestiferous, and mutualistic interactions with crop plants have likely given rise to landraces that interact differently with local microbial and insect communities. In order to understand the adaptive role of crop traits, we argue that more studies should be conducted in the geographic centers of origin to test hypotheses on how abiotic, biotic, and human selective forces have shaped the phenotypes of domesticated plants during crop domestication and subsequent diversification into landraces. In these centers of origin, locally endemic species associated with wild ancestors have likely contributed to the selection on plant phenotypes. We address a range of questions that can only be studied in the geographic center of crop origin, placing emphasis on Mesoamerican polyculture systems, and highlight the significance of in situ studies for increasing the sustainability of modern agricultural systems.

Keywords: crop domestication, agroecology, evolutionary ecology, biogeography, epigenetics, human culture, insects, microbes

# INTRODUCTION

The domestication of crop plants has fundamentally altered the relationship between humans and their environment (Larson et al., 2014). While the genetics of crop domestication has been widely studied for some common plant species (Darwin, 1868; Evans, 1993; Smartt and Simmonds, 1995; Ladizinsky, 1998; Hancock, 2012), the role of ecological interactions within centers of origin in contributing to crop phenotypic diversity has been overlooked (Chen et al., 2015; Perez-Jaramillo et al., 2016). Prior to domestication, wild ancestors of crop plants evolved in association with a broad assemblage of microbes and insects, with which they engaged in a range of pathogenic, predatory, commensal, and mutualistic interactions (Chen et al., 2013; Huang et al., 2016; Perez-Jaramillo et al., 2016). These ecological interactions were almost certainly altered by domestication, and altered again when early domesticates were introduced to new locations with unique climates, distinct local biodiversity, and different cultural methods of farming.

The majority of domestication events occurred in specific geographic regions, often within the native range of their wild ancestors (Vavilov, 1926, 1951; Meyer et al., 2012). Despite this, studies rarely consider whether all interacting species are endemic in the center of origin. We found that only 1.6% of 1,532 studies comparing insect responses on wild and crop plants specifically accounted for biogeographical history (Chen et al., 2015). Geographically-explicit hypotheses are needed to understand in situ crop diversification for two reasons. First, human-mediated migration of crops to new regions within centers of origin influenced the genetic structuring of crop populations. Second, domesticated cultivars experienced novel selective pressures imposed by new environments and the cultural preferences of different indigenous peoples (**Figure 1**; Brush, 1995; Hugo et al., 2003).

In situ field studies documenting variation of ecological interactions are important to determine the extent to which landrace phenotypes respond to local adaptation and artificial selection. After initial crop domestication, early landraces were brought to new environments that were often different from the ecological conditions experienced by their wild progenitors (Hufford et al., 2012). In these new environments, landraces interacted with new species, different cropping systems, and new abiotic conditions (**Figure 1**). Also, different groups of farmers may cultivate the same crop in different polyculture systems, that is, systems in which multiple crops are cultivated simultaneously (Casas et al., 1996, 2007; Hugo et al., 2003). As a result, landraces emerge over thousands of years due to natural selection exerted by local abiotic conditions and local insect and microbial communities, as well as human selection on traits related to ease of cultivation, aesthetics, taste, and cultural preferences (Perales et al., 2005; Brush and Perales, 2007; Casas et al., 2007; Aguirre-Dugua et al., 2013).

Here, we discuss the factors that contribute to phenotypic variation in crops and landraces. We examine the role of human selection and niche construction activities on crop phenotypic plasticity, a major factor in local adaptation (Piperno, 2017). We conceptually address two questions that could deepen our understanding of crop evolution and local adaptation: (1) To what extent are crop plants locally adapted? and (2) What are the relative roles of human selection, human-mediated migration, the local abiotic environment, endemic biotic communities, and cultivation practices in the diversification of crops into landraces? We describe polyculture systems in Mesoamerica as a suitable model in which these questions could be pursued. Ultimately, whether plants can maintain historic beneficial interactions with associated species, or form new ones in their introduced regions, has important implications for the future sustainability of agriculture.

# GENETIC AND ENVIRONMENTAL CONTRIBUTIONS TO CROP PHENOTYPIC VARIATION

We focus on landraces as a natural experimental system to understand how natural and human selective forces have shaped the diversity of phenotypic traits of domesticated crop plants during local adaptation. In order to detect local adaptation, it is important to characterize the extent of variation and fitness of phenotypic traits in response to local selective pressures (Kawecki and Ebert, 2004). Variation in landrace phenotypes is affected by their genotype, local environmental variation, and plant phenotypic plasticity. Equation 1 describes how the components of phenotypic variation can be partitioned (Pigliucci, 2001):

$$V_P = V_G + V_E + V_{G \times E} \tag{1}$$

where V<sub>P</sub> denotes the total phenotypic variation found for a population or subspecies, V<sub>G</sub> is the total genetic variation, V<sub>E</sub> is the total environmental variation, and V<sub>G×E</sub> is the genotype × environment interaction. We limit our treatment of V<sub>P</sub> and V<sub>G</sub> because they have been discussed elsewhere (Olsen and Wendel, 2013; Piperno, 2017). In contrast, there has been a minimal effort to understand the contributions of V<sub>E</sub> and V<sub>G×E</sub> to crop domestication and diversification. **Table 1** provides a set of examples illustrating the range of plant adaptations to abiotic, biotic, and human selection.
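The partition in Equation 1 can be illustrated numerically. The sketch below is a minimal example, not an analysis pipeline: it assumes a balanced design in which each genotype (for example, a landrace) has one mean phenotype per environment, ignores residual error, and uses population variances so that the three components sum exactly to the total. The function name `partition_variance` and the data values are hypothetical.

```python
from statistics import fmean


def partition_variance(table):
    """Partition phenotypic variance for a balanced genotype x environment table.

    table[i][j] is the mean phenotype of genotype i in environment j.
    Returns population variances (divisor = number of cells), an assumption
    that makes V_P equal V_G + V_E + V_GxE exactly when there is no residual.
    """
    n_g, n_e = len(table), len(table[0])
    grand = fmean(v for row in table for v in row)

    # Main effects: deviation of each genotype or environment mean from the grand mean.
    g_eff = [fmean(row) - grand for row in table]
    e_eff = [fmean(table[i][j] for i in range(n_g)) - grand for j in range(n_e)]

    # Interaction: what remains of each cell after removing both main effects.
    gxe = [
        [table[i][j] - grand - g_eff[i] - e_eff[j] for j in range(n_e)]
        for i in range(n_g)
    ]

    return {
        "V_P": fmean((v - grand) ** 2 for row in table for v in row),
        "V_G": fmean(g * g for g in g_eff),
        "V_E": fmean(e * e for e in e_eff),
        "V_GxE": fmean(x * x for row in gxe for x in row),
    }


# Hypothetical trait means: two genotypes (rows) grown in two environments (columns).
components = partition_variance([[10.0, 16.0], [14.0, 16.0]])
print(components)
```

With replicated plots, a residual term would appear and the components would normally be estimated with a mixed model rather than by direct averaging.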

Total genetic variation (V<sub>G</sub>) is influenced by evolutionary forces including selection, genetic drift, inbreeding, and gene flow (Hartl and Clark, 2006), which likely differ between wild progenitor and crop populations. Domesticated crops have reduced effective population sizes and are under intense selection for a suite of traits favored by humans (Doebley et al., 2006; Moreno et al., 2006). Cultural factors can also structure landrace populations and dispersal patterns, as people in different ethnolinguistic groups are less likely to trade seeds (Orozco-Ramírez et al., 2016).

The contribution of the environment to phenotypic variation (V<sub>E</sub>) strongly differs between wild progenitors and domesticated crops grown in human-constructed agricultural niches. In natural systems, V<sub>E</sub> is affected by local abiotic conditions such as climate, precipitation, soil type, and nutrient availability. Domesticated crops are found in a range of cultivation systems, where the purpose of cultivation is to lessen the unpredictability of the local environment in order to favor uniform plant growth and yield. Cultivation has been shown to strongly affect plant-associated microbial and insect communities (Berg, 2009; Berendsen et al., 2012; Chen et al., 2013). Surprisingly, the majority of the insect and microbial species associated with a given crop in its center of origin remain undescribed (Chen et al., 2015; Coleman-Derr et al., 2016). Given that plant-biotic interactions can influence plant phenotypes (Henning et al., 2016), human niche construction activities can indirectly influence crop phenotypes by influencing the diversity and community structure of biotic assemblages.

Genotype by environment interactions (V<sub>G×E</sub>) describe phenotypic plasticity, or how genotypes respond to a range of environmental conditions. Crops respond to agricultural conditions differently from their wild ancestors in growth (**Table 1**). Also, plasticity itself may have been constrained by domestication. Wild progenitors may have retained a greater plasticity to unpredictable environments, whereas domesticated crops may have lost phenotypic plasticity. Growing plants in different environments can provide important insight on the origin of important traits associated with domestication. For instance, teosinte grown under historic Pleistocene conditions (atmospheric CO<sub>2</sub> and temperature) displayed maize-like phenotypes such as reduced tillering, uniform seed maturation, and bract-less seeds (Piperno et al., 2015). Therefore, hypotheses that account for environmental variation and the selective environment can provide important insight on how crop species adapted to divergent environmental conditions.

Crop responses to the environment can be heritable, independent of changes in the DNA sequence, which contributes another dimension to V<sub>G×E</sub>. An emerging frontier is to understand the role of transgenerational epigenetics in crop adaptation to local environmental conditions (Piperno, 2017). Epigenetic studies can reveal how the environment can shape heritable gene expression (and thus the phenotype) without changing the underlying DNA sequence (Jablonka and Raz, 2009; Laland et al., 2014). Epigenetic changes in DNA methylation can regulate gene expression patterns during stress, which can influence the ability of crop varieties to respond to local environments (Ferreira et al., 2015). Therefore, in situ studies are critically needed to test the relative roles of genetics, epigenetics, and the environment in contributing to crop diversification.

# MESOAMERICA AS A FIELD LABORATORY FOR IN SITU STUDIES

Centers of domestication offer field locations to understand how local environments and human cultural influences contributed to crop domestication and diversification into landraces. The legendary Russian botanist Nikolai Vavilov delineated eight geographic regions as "centers of domestication" where multiple crop species were domesticated (Vavilov, 1926, 1951). One such center is Mesoamerica, which is the region of origin for maize, beans, squash, peppers, avocado, vanilla, and thousands of non-commercialized plant species (Casas et al., 2007). Crop domestication in Mesoamerica has received far less attention than in other centers of origin (Casas et al., 2007). Mesoamerica is particularly well-suited for field studies on crop domestication and diversification for several reasons: (1) It hosts a dazzling array of landrace varieties that are often cultivated sympatrically with their wild ancestors, (2) It has an archeological record confirming a long history of interactions between humans and many species of crop plants, and (3) Indigenous peoples continue oral and cultural traditions associated with the cultivation of these plants (Gepts, 2004; Staller et al., 2006; Piperno et al., 2007).

TABLE 1 | Phenotypic variation (V<sub>P</sub>) in wild progenitors and landraces can be attributed to genetic variation (V<sub>G</sub>), environmental variation (V<sub>E</sub>), and variation attributed to genotype by environment interactions (V<sub>G×E</sub>).

Wild progenitors and domesticated crops display adaptive responses to abiotic, biotic, and human selection. Environmental variation (V<sub>E</sub>) can be naturally occurring or arise from human niche construction activities. Wild progenitors and domesticated crops show different responses to the habitat conditions within which they evolved (V<sub>G×E</sub>).

We focus on Mexico, the largest country within Mesoamerica, where much of the phenotypic and genotypic variation underlying local adaptation to environments has not been characterized. Many suspected centers of domestication and regions with wild relatives remain poorly explored for most crop plants. These underexplored regions that are historically relevant for the study of crop diversification include: the Balsas River Valley (Piperno et al., 2007), the Gulf Coast (Kraft et al., 2014), and the Tehuacan Valley for species other than maize (Vallejo et al., 2016). Needless to say, there are considerable opportunities to examine the selective forces that produced the extant array of landraces for native Mesoamerican crop plant species.

In Mesoamerica, traditional agroecosystems have been maintained cohesively for hundreds to thousands of years by indigenous peoples. Different cropping systems dominate in different climatic regions (**Figure 1**). In the Yucatan Peninsula, home gardens are highly diverse polyculture systems, and include crops such as avocado, annona, and papaya (Moreno-Calles et al., 2016). The inland and coastal regions are dominated by agroforestry systems paired with ornamental and woody species (Moreno-Calles et al., 2016). The oldest Mesoamerican polyculture systems continue to be maintained in Tlaxcala (Gonzalez-Jacome, 2016). In cold and dry highland environments, prehispanic terraces for water management still exist in Oaxaca and the Tehuacan Valley (Donkin, 1979). In arid environments, drought-tolerant plants including cactus are cultivated in polyculture with chili pepper and other crops (Moreno-Calles et al., 2012, 2016). Slash and burn agriculture is the most widespread form of cultivation in the tropical deciduous and temperate forests, where crops are rotated after the plant cover is burned every few years. Slash and burn systems can be quite diverse, with at least 57 tree species from 33 plant families and many more herbaceous species (Moreno-Calles et al., 2016).

One of the most dominant cropping systems in slash and burn agriculture is the milpa (**Figure 1D**), which is, at its most basic, the joint cultivation of maize, beans, and squash (Zizumbo-Villarreal and Colunga-GarcíaMarín, 2010). Maize serves as a trellis for beans that fix nitrogen, while squash suppresses weeds. Although the wild progenitors of maize, beans, and squash are native to separate regions in Mexico (Smith, 1997; Gepts, 2004; Piperno et al., 2007), these core milpa crops have been cultivated together for thousands of years. It is highly possible that associated microbes and insects have adapted to the milpa. Perhaps it should not be a surprise that the most devastating contemporary native insect pests of maize can all be collected from squash flowers (Metcalf and Lampman, 1989), and that several closely related species of leaf beetles in the genus Diabrotica damage maize, beans, and squash (Clark et al., 2001; Vidal et al., 2005; Eben and Espinosa de Los Monteros, 2013). There are many other unexplored questions on crops and biotic interactions in milpa systems such as: (1) To what extent have microbiomes, pathogens, mycorrhizae, herbivores, natural enemies, and pollinators adapted to polyculture and landrace traits? (2) Does intercropping select landraces to develop complementary use for light, nutrients, and water? and (3) How has the milpa shaped landrace phenotypes?

The present Mesoamerican landscape is also a gradient of ecological contexts where one could study whether genes underlie phenotypes that are adaptive to local abiotic and biotic conditions. Reciprocal common garden studies with maize have found that highland landraces show higher fitness and seed quality in highland conditions, while lowland landraces have higher fitness in mid-altitude locations (Mercer et al., 2008). Traits such as pigmentation, stem hair, plant height, and flowering time have been shown to be adaptive to altitude, but completely different genes underlie local adaptation to highland conditions in Mesoamerica and South America (Mercer et al., 2008). Field sampling of Mexican teosinte populations helped to clarify that maize adaptation to the Mexican highlands resulted from introgression from wild teosinte (Hufford et al., 2013). Understanding the genomic basis of local adaptation in crops relies on multiple in situ localities, where the ecological history can be reconstructed by testing for genomic regions under divergence (Pyhäjärvi et al., 2013), and the responses of candidate genes can be observed in local environments (Doust et al., 2014; Piperno, 2017). Traditional Mesoamerican agroecosystems are living biological and ethnographic systems that are suitable for studying how human-created niches in agroecosystems interact with local biotic and abiotic environments to shape landrace phenotypes. These in situ systems provide an important reference for examining how crop plants adapted as they have diversified within centers of origin.
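The logic of a reciprocal common garden can be summarized with a "local vs. foreign" contrast of the kind commonly used in local adaptation studies: at each site, the fitness of the population that originated there is compared with the fitness of populations brought in from elsewhere. The sketch below is illustrative only. The function name `local_vs_foreign`, the site labels, and the fitness values are hypothetical and are not taken from Mercer et al. (2008).

```python
def local_vs_foreign(fitness):
    """Compute a local-minus-foreign fitness contrast at each site.

    fitness[population][site] holds a fitness measure (e.g., seeds per plant).
    Assumption: populations are named after their site of origin, so the
    "local" population at a site is fitness[site][site].
    A positive contrast is consistent with local adaptation at that site.
    """
    contrasts = {}
    for site in fitness:
        local = fitness[site][site]
        foreign = [fitness[pop][site] for pop in fitness if pop != site]
        contrasts[site] = local - sum(foreign) / len(foreign)
    return contrasts


# Hypothetical seeds per plant for two landrace populations at two sites.
seeds_per_plant = {
    "highland": {"highland": 80, "lowland": 50},
    "lowland": {"highland": 40, "lowland": 90},
}

for site, contrast in local_vs_foreign(seeds_per_plant).items():
    print(f"{site} site: local minus foreign = {contrast}")
```

In practice, such contrasts would be estimated from replicated plots and tested statistically, and a pattern like the one reported for lowland maize landraces (highest fitness at mid-altitude sites) would appear as a contrast that is not maximized at the population's site of origin.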

# IMPLICATIONS OF IN SITU STUDIES FOR SUSTAINABLE AGRICULTURE

Sustainable agriculture aims to reduce reliance on pesticides and fertilizers by utilizing biodiversity to provide ecological services that provision nutrients, protect crops, and enhance yields (Altieri, 1999). For the world's most important crops, the majority of production occurs outside their centers of origin (Khoury et al., 2016). Oftentimes, crops are grown in marginal environments, where they experience low nutrient availability, excess or limited water availability, temperature extremes, or pest outbreaks (**Table 1**). Under climate change, these pressures are predicted to intensify (Hatfield et al., 2010). Perhaps because centers of origin tend to be the geographic source for the major diseases (Leppik, 1970) and insect pests of crops (Chen, 2016), they are also the source for genes for resistance (Harlan, 1976; Hijmans et al., 2003; Zhang et al., 2017), insect natural enemies (van den Bosch, 1971; van Driesche et al., 2008), and microbes (Philippot et al., 2013; Perez-Jaramillo et al., 2016) that help plants to resist pests and tolerate abiotic stress.

In situ studies can also provide insight on whether human selection for crop yield is fundamentally at odds with traits that mediate beneficial plant-biotic interactions. First, crop domestication appears to have promoted pests more frequently than beneficial species, especially for economically-important traits such as fruit size and seed size (Chen et al., 2015). However, we do not know if direct trade-offs between yield and pest resistance exist, and whether this relationship may vary with different landraces and environments within centers of origin. The diversity and community structure of microbes and insects associated with wild ancestors and landraces have been inadequately described (Chen et al., 2013; Perez-Jaramillo et al., 2016), and geographic variation in patterns of biodiversity within centers of origin remains unexplored.

Second, plant genotypes vary in their ability to form positive relationships with beneficial species (**Table 1**; Smith and Goodman, 1999; Chen and Welter, 2005; Tamiru et al., 2011). Determining the relative roles of plant genetic diversity, microbial associates, and plant gene × environment interactions in conferring resistance to biotic and abiotic stresses (Philippot et al., 2013) would help elucidate whether breeding, microbial inoculation strategies, or natural enemy introductions would better support crop production in the diverse environments where crops are grown. Finally, in situ studies provide insight on the ecological function of crop genes and metabolites within their natural environment, which are oftentimes only explored in environments far from centers of origin. For example, teosinte and some maize varieties emit the sesquiterpene (E)-β-caryophyllene, which attracts entomopathogenic nematodes (Rasmann et al., 2005) and parasitoids (Kollner et al., 2008) in Europe. However, the role of this compound in landraces is not known, especially in Mesoamerica, where a diverse assemblage of species may be adapted to respond to plant signals. In situ studies can help resolve whether landrace varieties produce the compound, whether natural enemies are attracted to it, and whether breeding for (E)-β-caryophyllene would increase natural enemy attraction and enhance yield in the diverse worldwide locations where maize is now grown.

# CONCLUSIONS

In situ ecological studies are an essential, but almost completely unexplored line of inquiry for evolutionary ecologists to understand the selective forces that contribute to local adaptation of landrace varieties. As one of the major centers of crop origin, Mesoamerica is an ideal location for in situ studies, because wild progenitors can be found growing sympatrically with domesticated landrace varieties cultivated in traditional polyculture systems. For many crops and cultivation systems, the unique combination of local abiotic, biotic, and cultural selective forces that shaped variation in crop phenotypes during domestication and diversification continue to coexist. We advocate that geographically-explicit studies will yield new insight into how selection from humans and the local environment contribute to landrace diversification and local adaptation. Such knowledge is immediately applicable toward understanding the capacity of crop plants to respond to the biotic and abiotic conditions over the vast geographic ranges where they are now grown, and to identify sources of germplasm that might have adaptive traits for crops in their introduced ranges.

# REFERENCES

# AUTHOR CONTRIBUTIONS

YC, LS, and AC conceived the Perspective. YC, AC, LS, and BB drafted the work and revised the content.

# FUNDING

We thank our funders for their support: USDA National Institute of Food and Agriculture grant no. #VT-H02301MS to YC, NSF postdoctoral fellowship DBI-1202736 to LS, Swiss National Science Foundation, project No. 31003A\_162860 to BB, and CONACyT Problemas Nacionales #247730 to AC.

# ACKNOWLEDGMENTS

We thank Nicolas Marguler for the design of **Figure 1**.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Chen, Shapiro, Benrey and Cibrián-Jaramillo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Shifts in Plant Chemical Defenses of Chile Pepper (Capsicum annuum L.) Due to Domestication in Mesoamerica

Jose de Jesus Luna-Ruiz <sup>1</sup> \*, Gary P. Nabhan<sup>2</sup> and Araceli Aguilar-Meléndez <sup>3</sup>

<sup>1</sup> Centro de Ciencias Agropecuarias, Universidad Autonoma de Aguascalientes, Aguascalientes, Mexico, <sup>2</sup> The Southwest Center, University of Arizona, Tucson, AZ, United States, <sup>3</sup> Centro de Investigaciones Tropicales, Universidad Veracruzana, Xalapa, Mexico

#### Edited by:

Alejandro Casas, Instituto de Investigaciones en Ecosistemas y Sustentabilidad, Universidad Nacional Autónoma de México, Mexico

#### Reviewed by:

Daniel Pinero, Universidad Nacional Autónoma de México, Mexico Rosa Lia Barbieri, Embrapa Clima Temperado, Brazil

\*Correspondence:

Jose de Jesus Luna-Ruiz joselunaruiz11@yahoo.com.mx

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Ecology and Evolution

> Received: 31 August 2017 Accepted: 05 April 2018 Published: 24 April 2018

#### Citation:

Luna-Ruiz JdJ, Nabhan GP and Aguilar-Meléndez A (2018) Shifts in Plant Chemical Defenses of Chile Pepper (Capsicum annuum L.) Due to Domestication in Mesoamerica. Front. Ecol. Evol. 6:48. doi: 10.3389/fevo.2018.00048

We propose that comparisons of wild and domesticated Capsicum species can serve as a model system for elucidating how crop domestication influences biotic and abiotic interactions mediated by plant chemical defenses. Perhaps no set of secondary metabolites (SMs) used for plant defenses and human health has been better studied in the wild and in milpa agro-habitats than those found in Capsicum species. However, very few scientific studies on SM variation have been conducted in both the domesticated landraces of chile peppers and in their wild relatives in the Neotropics. In particular, capsaicinoids in Capsicum fruits and on their seeds differ in the specificity of their ecological effects from broad-spectrum toxins in other members of the Solanaceae. They do so in a manner that mediates specific ecological interactions with a variety of sympatric Neotropical vertebrates, invertebrates, nurse plants and microbes. Specifically, capsaicin is a secondary metabolite (SM) in the placental tissues of the chile fruit that mediates interactions with seed dispersers such as birds, and with seed predators, ranging from fungi to insects and rodents. As with other Solanaceae, a wide range of SMs in Capsicum spp. function to ecologically mediate the effects of a variety of biotic and abiotic stresses on wild chile peppers in certain tropical and subtropical habitats. However, species in the genus Capsicum are the only ones found within any solanaceous genus that utilize capsaicinoids as their primary means of chemical defense.
We demonstrate how exploring in tandem the evolutionary ecology and the ethnobotany of human-chile interactions can generate and test novel hypotheses with regard to how the domestication process shifts plant chemical defense strategies in a variety of tropical crops. To do so, we draw upon recent advances regarding the chemical ecology of a number of wild Capsicum species found in the Neotropics. We articulate three hypotheses regarding the ways in which incipient domestication through "balancing selection" in wild Capsicum annuum populations may have led to the release of selective biotic and abiotic pressures. We then analyze which shifts under cultivation generated the emergence of Capsicum chemotypes, morphotypes and ecotypes not found in high frequencies in the wild. We hypothesize that this "competitive release" can lead to a diversification of the domesticate's investment in a greater diversity of SM potency across different cultural uses, cropping systems and ecogeographic regions. While most studies of plant domestication processes focus on morphological changes that confer greater utility or productivity in human-managed environments, we conclude that changes in the chemical ecology of a useful plant can be of paramount importance to its cultivators. The genus Capsicum can therefore provide an unprecedented opportunity to compare the roles of SMs in wild plants grown in natural Neotropical ecosystems with their domesticated relatives in the milpa agro-ecosystems of Mesoamerica. Even with the current depth of knowledge available for crop species in the genera Capsicum and Solanum, our understanding of how particular SMs affect the reproduction and survival of wild vs. domesticated solanaceous plants remains in its infancy.

Keywords: Capsicum annuum, plant domestication, secondary metabolites, plant chemical defenses, Neotropics, Mesoamerica

# INTRODUCTION

What changes occur in a Neotropical plant's chemical defenses when it is domesticated for crop production as a food, medicine, vermifuge or condiment, or for all four of these uses? There is remarkably little tested ecological theory regarding how domestication affects plant chemical defenses (Rindos, 1984; Johns, 1990; Casas et al., 2015). This may be because most phytogeographic, agroecological, and archaeobotanical studies of plant domestication have largely used morphological indicators to track the domestication process rather than identifying phytochemical indicators of changes in ecological interactions. As recently argued by Zeder (2017), ecologists need to identify tractable model systems that allow for an assessment of the core assumptions of the Extended Evolutionary Synthesis (EES).

The domestication of crop plants by human cultures provides one such case study opportunity. That is why we propose that the genus Capsicum can serve as an important model system for discerning how changes in secondary metabolites (SMs) reveal shifts in plant chemical defenses that have occurred with domestication. In the case of domesticated chile peppers, these shifts influenced both (1) antagonistic interactions with predators and abiotic pressures, and (2) facilitated (including mutualistic) interactions among chiles, their avian dispersers, nurse plants and human cultures. The integration of ethnobotanical, paleoecological, archeological, linguistic, genetic and evolutionary perspectives on chile domestication that has been in process for the last two decades (Tewksbury and Nabhan, 2001; Pickersgill, 2007; Tewksbury et al., 2008b; Aguilar-Meléndez et al., 2009; Haak et al., 2012; Kraft et al., 2014; Carrizo-García et al., 2016) has already contributed substantively to the possibilities of such an EES.

For these reasons, we have chosen to integrate ecological studies of wild Capsicum species in natural habitats of the Neotropics with ethnobotanical, agroecological and nutritional studies of domesticated Capsicum landraces in culturally-managed milpa habitats and indigenous diets of Mesoamerica. By doing so, we wish to further test hypotheses underpinning the theory that a major trend in crop evolution in the Neotropics has been a dramatic shift in the ecological functionality of plant chemical defenses (Johns, 1990). We propose that testing the following three hypotheses can help identify the most parsimonious fit with data and trends involving the evolutionary transition from wild Capsicum annuum var. glabriusculum to domesticated Capsicum annuum var. annuum:

(H1) a reduction and simplification of the potency of plant chemical defenses against seed predators, foliage herbivores and disease microbes with greater reliance on human intervention to protect the plants;

(H2) a diversification of the levels of potency and mixes of defense chemicals, given the wider range of habitats, cultural management and uses, and broader geography to which the crop chile plants are exposed;

(H3) an intensification of the potency of certain plant chemical defenses, given the need to protect the plants in agro-habitats where they occur at higher density and without as much beta diversity of neighboring plant species to slow the spread of predators, herbivores, competing weeds or diseases.

Crop plants in the Solanaceae (including Capsicum chile peppers) may be extremely useful models for looking at changes in potency, diversity or effectiveness of plant chemical defenses which occur with domestication. This is because their SMs and the ecological roles which these plant defenses play have been intensively investigated in the field and in the laboratory for well over two centuries (Johns, 1990; Eich, 2008). Nevertheless, it remains clear that we lack the detailed knowledge needed to determine how particular plant chemical defenses (e.g., specific capsaicinoids) function in repelling (or attracting) various sets of vertebrates, invertebrates and fungi which serve as seed predators, seed dispersers, fruit and foliage consumers or root parasites on various solanaceous crops. Even with the current depth of knowledge available for crop species in the genera Capsicum and Solanum, our understanding of how particular SMs affect the reproduction and survival of wild vs. domesticated solanaceous plants remains in its infancy.

Of the 97–102 genera represented by 2300–2460 distinct species documented in the Solanaceae (Hunziker, 2001; Eich, 2008), SMs (such as the ornithine-derived alkaloids which function as the primary chemical defenses of most of these species) have so far been documented in more than 61 genera (Eich, 2008). Many of the SMs commonly found in the Solanaceae—such as tropane, nicotinoid, pyridine and terpenoid alkaloids—can be toxic or at least repellant to a broad variety of insects as well as to vertebrate herbivores; some also reduce fungal or bacterial infestations of seeds, fruit or foliage. We will focus the rest of this inquiry on the ecological and ethnobotanical consequences of these chemical defenses as found in seeds and fruits of solanaceous crops, with particular focus on chile peppers (Capsicum annuum).

These broad-spectrum alkaloids function as primary chemical defenses in a number of solanaceous crop plants, and in their wild relatives as well. We have therefore placed the domestication of Capsicum species in the context of other domestication studies for the following genera: Jaltomata (xaltomatl, sogorome); Lycium (goji berry); Nicotiana (tobacco); Solanum (potato, tomato, eggplant, garden huckleberry/chichiquelite); Physalis (tomatillo/ground cherry, cape gooseberry/uchuva) (e.g., Johns, 1990; Pickersgill, 2007 among many others). While some of the same alkaloids characteristic of many species in the Solanaceae are present in extremely low concentrations in the foliage of Capsicum species, nearly all the species in this genus have taken up an altogether different strategy, capsaicinoids, for defending their seeds and fruits from biotic stresses.

Departing from the norm in the Solanaceae—where species principally use broad-spectrum and highly toxic glyco-alkaloids for defense—most Capsicum species instead employ another, unique set of SMs that are not appreciably toxic to animals. In contrast to all other genera and species in the nightshade family, both wild and domesticated chile peppers produce several of the 22 known capsaicinoids, with capsaicin, dihydrocapsaicin and nordihydrocapsaicin being the most prevalent, widely-studied and economically important ones. However, it is likely that each distinct capsaicinoid functions in varying degrees to direct avian seed dispersal or to repel and reduce damage by insects, mammals, bacteria and fungi (esp. Fusarium) (Tewksbury and Nabhan, 2001; Tewksbury et al., 2008b; Haak et al., 2012). Unfortunately, to arrive at a comprehensive EES (Zeder, 2017), we will require more detailed knowledge on the specific ecological effects of 19 of those distinctive capsaicinoids on various faunal and fungal species found in Neotropical habitats.

The ability to produce capsaicinoids is a monophyletic synapomorphic character shared by most of the 35+ wild Capsicum species. The exceptions are few, and are found in the wild Andean clade (C. ciliatum = C. rhomboideum, C. scolnikianum, C. geminifolium, C. lanceolatum, and C. dimorphum), and the Longidentatum clade (C. longidentatum) (Eich, 2008; Haak et al., 2012; Carrizo-García et al., 2016).

Pungency in all other wild chile peppers has a simple genetic basis that is expressed only in glands within the placental tissue of the fruit, where it serves to protect viable seeds from predation by granivorous mammals, or from microbial infestation. It also facilitates the directed dispersal of seeds by frugivorous birds such as thrashers, cardinals, and finches to safe sites for germination and recruitment under nurse plants, providing an unusually direct ecological link to changes in reproductive fitness that is often missing from studies of chemical ecology (Nabhan, 2004; Tewksbury et al., 2008a). Pungency is polymorphic in several wild chile species (Carrizo-García et al., 2016), and such polymorphic populations have been identified along natural environmental gradients (Haak et al., 2012; Carrizo-García et al., 2016). These polymorphisms provide unique opportunities to advance an extended evolutionary synthesis from field comparisons of wild and domesticated subspecies in the same crop species and economic genus (Hernández-Verdugo et al., 2001a; Haak et al., 2012; Chen et al., 2015).

These attributes make chile peppers excellent systems through which to investigate the evolution of adaptive constraints found under various levels of domestication.

Ironically, the very same capsaicinoids that function as chemical defenses for chile plants have long been used by Mesoamerican cultures as defenses against microbes and invertebrates challenging human health (Nabhan, 2004). Their many indigenous uses as food or medicine have likely benefited overall human health and reproductive fitness in Neotropical environments for well over six millennia (Perry and Flannery, 2007; Kraft et al., 2014); these biomedically-significant ethnobotanical uses mediated by SMs (Mostafa-Kamal et al., 2015) possibly triggered the domestication and diversification of chile peppers.

Capsaicinoids are now the most widely used SMs in the world, even though their commercial production is dominated by landraces of just five species in the genus Capsicum. Now that chile peppers have been culturally dispersed far beyond the Neotropics, each continent and its biomes favor different ecotypes of place-based landraces such as the tabasco pepper, ghost pepper, piri-piri, aji, habanero, jalapeño, and long green New Mexican chile. Today, more than a third of the world's human population consumes food products derived from 2500+ landraces, standard varieties and modern hybrids of chile peppers every day (Tewksbury et al., 2008b). In fact, we predict that if one includes the number of humans who daily ingest or topically apply chile peppers as pharmaceuticals and folk medicinals, then over half the world's population is currently consuming some form of chile peppers for nourishment, health and, ultimately, reproductive fitness.

We will focus most of our analysis on discerning historic shifts in plant chemical defenses in the most widely-used Capsicum species, C. annuum L., domesticated in the dry subtropical habitats of Mesoamerica over 6,500 years ago (Kraft et al., 2014). We posit that these shifts in SMs enhanced, or at least diversified, the mutualistic relationships among chile peppers and indigenous Mesoamerican cultures, as a result of relatively rapid selection and linguistically-traceable diffusion that intensified around 6,500 years B.P. (Brown, 2010; Kraft et al., 2014).

It appears that Homo sapiens is one of the few mammalian species which routinely overcomes a deep-seated aversion to the consumption of pungent chile peppers (Rozin and Schiller, 1980; Nabhan, 2004), perhaps because the evolutionary benefits of consuming chile fruits outweighed the costs when humans were exposed to the environmental challenges commonly found in certain Neotropical habitats.

# CROP DOMESTICATION

Domestication is the outcome of both conscious and unconscious selection processes that lead to increased co-evolutionary adaptation of plants to cultivation and utilization by humans in managed environments (Gepts, 2010). Paleolithic cultures developed tools, food preparation and plant selection techniques for detoxifying certain plant foods rich in SMs (Johns and Kubo, 1988; Johns, 1990). As such, the coevolutionary response of Mesoamerican cultures to chile peppers certainly included memes, but may also have included the selection of "non-taster" genes in humans for organoleptic tolerance of pungency and bitterness (Nabhan, 2004).

On the other hand, the suite of traits that marks the divergence from its wild ancestor(s) has been defined as the "domestication syndrome" (Harlan, 1992). A domestication syndrome may include selection for combinations of several different morphological and phytochemical traits, including seed retention (non-shattering), increased fruit and/or seed size, changes in branching and stature, changes in reproductive strategy, and, importantly, changes in SMs (Pickersgill, 2007; Gepts, 2010; Meyer et al., 2012).

Often, domestication selects against traits that formerly increased the plant's defensive or reproductive successes in natural environments (Meyer et al., 2012). However, this generalization may not completely fit for SMs such as capsaicinoids in C. annuum in the Neotropics, where a high diversity of landraces and wild populations express some degree of pungency as a natural defense against predators.

Cultural selection can therefore work in opposition to natural selection, and certain domesticated crops may exhibit reduced fitness, or, in some cases, an inability to survive outside of cultivation (Pickersgill, 2007; Gepts, 2010). The very act of moving plants from natural habitats into culturally-managed habitats such as milpas alters the mix of selection pressures, leading to increased adaptation to cultivation, and to actual physical protection from pests and predators by cultural managers, potentially at the expense of traits conferring fitness in the natural environment (Meyer et al., 2012). At the very least, selection pressures for plant chemical defenses against predators might be relaxed if human intervention against the same predators (e.g., rodents) is consistently offered to the crop variety over multiple generations.

# SECONDARY METABOLITES IN PLANTS

Plant chemicals can be divided into two major categories: primary metabolites (PMs) and secondary metabolites (SMs). PMs are substances produced by all plant cells that are directly involved in growth, development, or reproduction (sugars, proteins, amino acids, and nucleic acids). PMs function in basic anabolic and catabolic processes required for respiration, nutrient assimilation, and growth/development (Kliebenstein, 2004; Freeman and Beattie, 2008).

SMs may not be directly involved in growth or reproduction, but they are often involved with plant defense (Freeman and Beattie, 2008), particularly in the case of Capsicum species (Tewksbury et al., 2008b). SMs are considered the major mediators of ecological interactions of plants as a result of their large and diverse biological functions in nature. SMs are produced in response to certain biotic and/or abiotic stress signals or stimuli. They function in the defense against herbivores, microbes, viruses or competing plants, and also as signal compounds to attract pollinating or seed dispersing animals (Wink, 2003). Thus, SMs are very important for a plant's survival and reproductive fitness. These multiple roles of SMs have led plants to synthesize many different chemical compounds in nature during evolution (Kliebenstein, 2004).

According to their role in plant defense, SMs have been classified on the basis of their host protection and fostering of beneficial biotic interactions. According to Freeman and Beattie (2008), SMs usually belong to one of three large chemical classes: terpenoids, phenolics, and alkaloids.

Terpenoids include a series of toxic and non-toxic phytochemicals produced in different plant organs that inhibit, repel, or attract other living organisms, such as predators (plant pathogens, herbivorous invertebrates, vertebrates) and non-predators (dispersers, pollinators, pest-enemies).

Phenolics include a series of toxic and non-toxic compounds such as flavonoids, isoflavonoids, and phenolic monomers produced in different organs (roots, stems, leaves, flowers, fruits, and seeds). Phenolics and their derivatives have different functions in nature (UV-protectant, antifungal, antibiotic, insecticidal, and others).

Alkaloids are N-compounds produced and aggregated in different organs such as roots, leaves, fruits and seeds. Alkaloid-based SMs may function as bactericides, fungicides, insecticides and allelopathics. Alkaloids may have degrading and digestive effects on different tissues of predators and pathogens. Examples of this type of SM include caffeine, cocaine, morphine, nicotine, atropine, plus capsaicin and other capsaicinoids. Other N-compounds important for plant chemical defense include cyanogenic glucosides, defensins, lectins, and hydrolytic enzymes.

Therefore, SMs in chile peppers and other solanaceous plants in Neotropical habitats have evolved as defense mechanisms against microorganisms (viruses, bacteria, fungi), herbivores (molluscs, hemipteran insects, vertebrates), and competing plants. They may also function to attract pollinators and seed dispersers by virtue of the fragrances and colors they express in the plants. Regardless of the efficacy of such benefits, SMs require a great deal of plant resources and energy to be produced. Consequently, they may be synthesized and translocated after a pathogen or pest has attacked the plant and triggered their activation. Once activated, these chemical defensive compounds are usually very effective inhibitors of fungi, bacteria, nematodes, and hemipteran insect herbivores.

# CHEMICAL ECOLOGY OF WILD CAPSICUM IN NEOTROPICAL HABITATS

To address the changes in plant chemical defenses that have occurred with the domestication of Capsicum annuum, we must briefly establish the context through which wild chile peppers and other solanaceous plants deal with biotic and abiotic stresses prevalent in the Neotropics. In particular, we will focus on the biotic interactions as well as the biotic and abiotic stresses that wild chile plants may particularly respond to in dry subtropical thornscrub and tropical deciduous forest vegetation types, characteristic of the Sierra Madre Oriental and the Trans-Volcanic Belt in Mesoamerica. At least one EES-style integration has determined that these vegetation types are among the most likely Neotropical habitats where C. annuum domestication and diffusion may have occurred (Kraft et al., 2014). However, because there has been considerable change in the areas covered by these habitat types over the last 6500 years (Kraft et al., 2014), other proposed geographic areas such as the Yucatan peninsula remain viable enough as putative centers of chile pepper domestication that we do not wish to rule them out (Aguilar-Meléndez et al., 2009).

In contrast, the pungency of wild chile pepper fruit repels small mammals that function as seed predators, but directs the dispersal of seeds to safe sites under nurse trees where germination, recruitment and establishment have higher probabilities (Tewksbury and Nabhan, 2001; Carlo and Tewksbury, 2014). The seeds from these pungent wild chiles are also protected from "predation" by Fusarium fungi that might otherwise leave the infected seeds inviable (as evidence shows for C. chacoense). Thus, the directed dispersal adaptations of wild chile peppers afforded to them by the pungency of their specialized SMs, the capsaicinoids, have conferred to them a level of reproductive fitness that has incidentally allowed them to be present in abundance and accessible to human foragers in the Neotropics for millennia.

# CHANGES IN SECONDARY METABOLITE INTENSITY WITH CHILE DOMESTICATION

What are the traits that have been modified as a result of selection under cultivation that have made modern and fully domesticated varieties of chile peppers so poorly adapted to the natural Neotropical habitats? We propose that the morphological and/or phenotypic changes which occurred during cultural selection and domestication of C. annuum have been accompanied by (if not surpassed in importance by) corresponding changes in SMs that regulate ecological interactions of chile peppers with their surrounding abiotic and biotic environments. The complexity and specificity of SMs as chemical mediators of biotic interactions of both wild and domesticated C. annuum in the Neotropics are summarized in **Figure 1**.

Wild populations of chile pepper have coexisted and coevolved with many different organisms of tropical origin. **Figure 1A** focuses on two types of biotic interactions with wild Capsicum species: mutualistic and antagonistic. Every particular plant interaction is regulated by some SM produced and expressed in a particular organ, at a certain phenological stage, in response to specific biotic or abiotic signals. Chile pepper interactions have been strongly influenced by humans and cultural diversity in Mesoamerica over the last 10,000 years. The cultural diversity present in modern Mexico, and a sample of the wide morphological variation and levels of domestication that are currently found in Mexican chile peppers are shown in **Figures 1B,C**. The variation in Mexican chile peppers also applies to the chemical compounds, which may help explain the wide differences in fruit taste and flavor for different purposes and uses across Mexico.

Chen et al. (2015) indicated that among their various functions, SMs play particularly important roles in insect-plant interactions. Studies that have compared chemical defense traits in wild crop relatives and their cultivated counterparts are increasing in number, and their outcomes consistently show that domesticated plants provide a better food resource for herbivores than their more toxic wild progenitors. Several studies provide evidence of such changes in the chemical ecology and biotic interactions along a domestication gradient (Holt and Birch, 1984; Benrey et al., 1998; Rodriguez-Saona et al., 2011; Dávila-Flores et al., 2013). These widely-observed trends seem to contextualize, if not explain, shifts in the chemical defenses of C. annuum during its domestication in certain, but not all, Neotropical habitats of Mesoamerica.

To date, most studies of SMs in C. annuum in Mesoamerica have been focused on fruits of fully domesticated commercial varieties for consumption as fresh fruits (jalapeño, serrano, ancho and sweet pepper morphotypes). In addition, there are a few ecological field studies of how capsaicinoids in wild Capsicum species of arid North America and tropical South America mediate relationships with native fauna, but they do not specify which capsaicinoid(s) drive those interactions (Tewksbury and Nabhan, 2001; Tewksbury et al., 2008a; Carlo and Tewksbury, 2014; Haak et al., 2014). Most analyses have concentrated on capsaicinoids and few have included other SMs, such as phenolics and carotenoids. The literature available on SMs in chile peppers is focused on their presence both in vegetative organs and in fruits and seeds (Do Rêgo et al., 2012; Kim et al., 2014). The presence of SMs in different organs and genotypic backgrounds may help explain the existence of natural sources of genetic resistance in Capsicum to particular herbivorous pests and seed predators.

The identification of most SMs remains incomplete for wild C. annuum var. glabriusculum from the Neotropics. However, genetic resistance to Huasteco pepper virus has been documented for wild C. annuum from Northwest Mexico (Hernández-Verdugo et al., 2001b; Retes-Manjarrez, 2016). Among the known cases of genetic resistance in domesticated chile peppers is tolerance to Phytophthora capsici and root knot nematodes, first documented in the Criollo de Morelos landrace CM-334 (Pegard et al., 2005). Also, leaf phenolic extracts from domesticated chile landraces have been used to control Alternaria alternata in tomatoes.

Crop domestication can lead to a decrease in SMs associated with pest resistance, a trend corroborated by Meyer et al. (2012); they found a decline in levels of some SMs across 203 separate crop varieties, relative to levels in their wild progenitors, including C. annuum. However, other SMs, such as capsaicinoids, have dramatically increased within some natural and domesticated chile pepper landraces (e.g., Bhut Jolokia; Bosland and Baral, 2007), so that these changes are not unidirectional.

**Figure 1.** [...] mediators of ecological interactions with wild C. annuum in natural and semi-managed habitats. (Illustration designed by Frida Isabel Luna-Vallejo). (B) Map of Mexico showing indigenous territories, contrasted by colors. The symbols identify particular ecological zones where certain indigenous groups have persisted in modern times. All indigenous groups represented here have documented uses of chile peppers. (Map elaborated by Andres Lira Noriega and Araceli Aguilar-Melendez based on data from the authors, SINAREFI-SNICS-SAGARPA and SNIB/CONABIO 2016; the layer of indigenous territories was provided by Eckard Boege). (C) A representative sample of the wide array of current morphotypic diversity and levels of domestication of chile peppers across Mexico. (Photos by Ivan Montes de Oca Cacheux and Miguel Angel Sicilia Manzo/Image repository CONABIO).

Given the "original" contexts in which wild Capsicum species function and survive in the Neotropics, Table S1 proposes a set of differences that may have been triggered by "balancing selection" during the domestication process. Balancing selection operated in ways that transformed some wild polymorphic populations into fully-domesticated but still heterogeneous C. annuum landraces. We place particular emphasis on levels of SMs and other adaptations that appear to confer reproductive fitness to Capsicum populations in Neotropical habitats.

# OTHER CHANGES OCCURRING WITH DOMESTICATION OF CHILE PEPPERS

We do not wish to presume that shifts in SMs were the only changes which have occurred with the domestication of Capsicum species in Neotropical habitats. We wish to briefly mention several other traits of adaptive significance in Neotropical habitats.

# Loss of Dispersal Mechanisms

Wild chile peppers are naturally dispersed by frugivorous birds to the understory of selected nurse plants (Tewksbury and Nabhan, 2001; Carlo and Tewksbury, 2014), while domesticated chiles depend on human intervention for dispersal. Loss of natural seed dispersal often involves loss of an abscission zone from some part of the plant. Fruits of wild chile peppers separate easily from the receptacle at maturity. Fruits of domesticated peppers remain firmly attached to the plant. Mature wild chile pepper fruits are consumed and effectively dispersed by a variety of frugivorous Neotropical birds. Domesticated peppers are either too large for, or are neither attractive to nor dispersed by, most Neotropical birds. Different SMs may mediate seed dispersal in wild chiles, but carotenoids in the fruit pulp are probably the most important because their red color attracts birds. The pyrazine fragrances of chile peppers may also serve to attract certain birds.

# Loss of Seed Dormancy

Most wild chile pepper seeds have staggered seed dormancy, which allows germination and recruitment when optimal conditions occur in a more variable and uncertain environment. Domesticated chiles do not exhibit any seed dormancy (Pickersgill, 2007). Therefore, domesticated chiles would likely have poor recruitment, survival and fitness if placed in most naturally wild environments. Seed dormancy in most wild Capsicum species is mediated by SMs such as ABA, a plant regulator that inhibits seed germination (Marrush et al., 1998; Sariyildiz et al., 2005; Nambara et al., 2010), and lignin, a structurally protective and hydrophobic compound of the seed coat (Randle and Honma, 1981; Tewksbury et al., 2008b; Nambara et al., 2010).

Wild chile pepper seeds with thick lignified testas become increasingly impermeable to water on drying. This feature is disadvantageous for—if not absent from—most domesticated crop seeds, not only because these seeds germinate slowly, but also because they may require prolonged soaking to remove inhibitors from the seed coat (Randle and Honma, 1981; Pickersgill, 2007; Carlo and Tewksbury, 2014). Therefore, domesticated chile peppers generally have thinner testae than their wild progenitors.

# Changes in Organ Size and Quantity

As part of the domestication syndrome, changes in secondary metabolite content may be correlated with other physical and chemical traits, such as nutrient content, size, or biomass (Chen et al., 2015). Compared to most domesticated landraces, wild Capsicum species exhibit smaller leaves, flowers, fruits and seeds, but a larger number of these organs per plant (Pickersgill, 2007). These characteristics, such as small but numerous leaves and seeds, confer adaptability, stress reduction, survivability, and bet-hedging strategies to wild chile peppers for the production and dispersal of their seeds in Neotropical habitats (Tewksbury et al., 2008b).

# Increased Morphological Variation

According to Chen et al. (2015), morphological changes arising from domestication can disrupt plant-herbivore-natural enemy interactions; however, domesticated chile landraces now exhibit enormous inter-varietal and some intra-varietal heterogeneity in morphological traits.

This variation is especially marked in the parts of the chile pepper plant used by Mesoamerican cultures. While domesticated chile peppers vary greatly in fruit size and shape, and to a lesser extent in color, wild C. annuum var. glabriusculum populations show little morphological variation in fruit size, shape, and color. In certain coastal Neotropical habitats, chile pepper fruits are selected for particular colors and shapes, said to be the best for seasoning turtle meat, while others, of different color and shape, are known as perfume peppers because they have a fragrant aroma as well as pungency. Pickersgill (2007) and Boster (1985) suggest that such traits result from cultural "selection for perceptual distinctiveness."

In short, the different landraces of chile peppers grown and consumed across Mesoamerica display an astounding range of morphological variation in plant architecture and fruit shape, as well as in fruit color, pungency, and particular cultural uses (Bosland and Votava, 2000). All SMs in Capsicum species, including carotenoids, flavonoids, capsaicinoids, and ascorbic acid, are, to some extent, linked with these morphological traits. Boster (1985) has deftly summarized the many references documenting the pronounced differences in morphology between wild and domesticated peppers.

# Changes in Plant Habit Related to Resource Partitioning

Selection for increased harvest index (ratio of harvested to total biomass produced per plant) may result in reduced or suppressed lateral branching (Pickersgill, 2007). It may also reduce the number of inflorescences per plant and produce more synchronous fruit ripening on an individual plant and within a stand, facilitating harvesting of the stand as a whole. Fewer nodes, shorter internodes, and greater synchronization of maturation of vegetative branches and fruit ripening are also favored by a determinate habit.
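The harvest index mentioned above is a simple ratio, and a minimal sketch may make the comparison concrete. The function name and the biomass values below are hypothetical and chosen only for illustration; they are not measurements from any study cited here.

```python
def harvest_index(harvested_biomass_g: float, total_biomass_g: float) -> float:
    """Ratio of harvested biomass to total biomass produced per plant."""
    if total_biomass_g <= 0:
        raise ValueError("total biomass must be positive")
    if not 0 <= harvested_biomass_g <= total_biomass_g:
        raise ValueError("harvested biomass must lie between 0 and total biomass")
    return harvested_biomass_g / total_biomass_g


# Hypothetical plants with the same fruit mass but different vegetative mass:
# a compact plant with reduced branching versus a heavily branched plant.
compact = harvest_index(harvested_biomass_g=120.0, total_biomass_g=300.0)
branched = harvest_index(harvested_biomass_g=120.0, total_biomass_g=600.0)

print(f"compact habit:  {compact:.2f}")
print(f"branched habit: {branched:.2f}")
```

Under these assumed values, the plant that invests less biomass in lateral branches has the higher index for the same fruit mass, which is the direction of selection described in the text.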

The transition from the perennial indeterminate habit of wild chile peppers to the annualized compact habit of domesticated peppers has been triggered by selection for earliness, larger fruits, compact growth/reduced branching with reduced number of fruits per plant, and more synchronous fruit ripening. Loss of perennial plant habit may be the final/accumulated result of human selection for non-dormant seed, which probably modified fruit and seed morphologies, and SM potencies.

# Changes in Reproduction

In Capsicum species, floral phenology and pollination, as well as fruit and seed development are influenced by different SMs. For example, carotenoid and flavonoid derivatives are secondary metabolites in the flower that attract pollinators. Similarly, fruit and seed dispersal are mediated by SMs which serve to attract seed dispersers. Simultaneously, fruit and seed protection is mediated by particular SMs (capsaicinoids and phenolics) that repel predators of fruits and seeds.

Wild C. annuum is an autogamous plant with protogynous flowers (exserted stigmas), high rates of outcrossing by insect pollinators, and indeterminate growth in Neotropical Mesoamerica. Flower initiation is late, but once initiated is persistent and very prolific, with overlapping stages of flower and fruit development over the season. Fully domesticated C. annuum landraces can also be autogamous, but exhibit much lower rates of outcrossing, probably due to more synchrony in anther and stigma maturation. Most of the fully domesticated chile pepper landraces exhibit determinate growth under cultivation, with more rapid onset of flower initiation, fruit development and ripening. For such reasons, fruit and seed production of fully domesticated chile landraces would be almost impossible under natural wild environments in the Neotropics.

# Loss of Chemical or Physical Protection Against Biotic and Abiotic Stresses

Many other domesticated crops have partially or completely lost the SMs that protect their wild relatives against predators (herbivores, plant pests and pathogens), and abiotic stresses (drought, salinity, heat, frost, damaging radiation, etc.). However, this trend does not necessarily hold true for most domesticated C. annuum landraces. Capsaicinoids and other SMs are synthesized in the placental tissue of domesticated chile fruits after flowering as part of fruit development. In other words, in domesticated chiles, SMs may play a small role in chemical defense of plant tissues before fruit and seed development (Meyer et al., 2012; Fernández-Marín et al., 2014).

Protection of wild chile pepper fruits in populations against predators is mostly conferred by capsaicinoids, although flavonoids and phenolics may also play protective roles against predators. However, protection against herbivory in wild chile plants (prior to their flowering) is also facilitated by the "prey refugia" offered by the dense thorny canopies of certain nurse plants. Where they lack nurse plant protection in Mesoamerican milpas, domesticated chile peppers must rely on farmers themselves to evict (or to reduce the damage potentially wreaked by) mammalian predators and browsers (Pickersgill, 2007; Gepts, 2010; Padilha and Barbieri, 2016).

With regard to protection against abiotic stresses, wild chile pepper plants employ SMs such as flavonoids, phenolics and vitamin C for protection against drought, heat and damaging radiation. In particular, carotenoid derivatives confer protection against plant cell oxidative reactions caused by lethal radiation, such as direct sunlight and UV light (Wahyuni et al., 2013).

Fully domesticated C. annuum landraces express widely varying concentrations of capsaicinoids compared to pungency levels in wild populations. Today, domesticated chiles range widely in capsaicin content and pungency (∼5,000–300,000 SHU), whereas most (but not all) wild populations fall in the medium-to-high range (∼100,000 SHU) of pungency (Eich, 2008). The hottest chile peppers belong to C. chinense; some cultivars of this species, such as "Bhut Jolokia" and "Trinidad Scorpion," have around 1.0 million SHU (Bosland and Baral, 2007), and "Carolina Reaper," the hottest pepper in the world, exceeds 1.5 million SHU (Padilha and Barbieri, 2016). Domesticated landraces of C. annuum may also have larger but more variable amounts of other SMs, including more antioxidant capacity (Wahyuni et al., 2011).

# AGROECOLOGICAL CONTEXT OF MILPA CULTIVATION AS A SELECTIVE PRESSURE

Lack of both seed dormancy and a facultatively perennial plant habit probably enabled the shift from avian dispersal of fruits under nurse plant canopies in the wild to open cultivation of annual plants with non-dormant seeds in milpa agro-ecosystems. The loss of ecological interactions with birds and nurse plants due to intentional seed-saving and dispersal by humans must have generated incidental changes in SMs. Shifting the patterns of SMs through such selection could explain, in part, the emergence of new chemotypes, genotypes and morphotype landraces under cultivation in milpas within the Neotropics. The Mesoamerican milpa agroecosystem may have gradually replaced the nurse plants in agroforestry systems during the early domestication of C. annuum, but as it did, it likely accelerated unconscious selection away from wild chemotypes and morphotypes.

# SYNTHESIS OF COEVOLUTIONARY SHIFTS OCCURRING WITH DOMESTICATION

We suggest that incipient cultivation and "re-balancing" selection of seed germinability in polymorphic founder populations of C. annuum var. glabriusculum in Mesoamerica around 6500 BP rapidly led to changes in gene frequencies associated with other adaptive traits. Curiously, this is roughly the time period when a new meme (a chile-processing technology and associated culinary techniques) first became evident in the prehistoric cultures of south-central Mexico. This technology was called mollicaxtli in Nahuatl (now molcajete in Spanish) and consists of a round, three-legged grinding bowl and pestle for crushing dried spices, made out of fired clay or volcanic stone (Vela, 2009).

The molcajete's sudden emergence and wide diffusion suggest that domesticated chile peppers were not merely being eaten fresh, but that surplus harvests were being dried and stored between growing seasons for use as a dried spice, condiment, medicine or vermifuge. Undoubtedly, these multiple uses of small, dried chile "pods" emerged long before the selection for larger fleshier fruits, which could be used as a vegetable that was stuffed with meats, fruits or other spices. Thus, a new technology (molcajetes) and its associated culinary uses, as well as seed saving and trade beyond their ancestral habitats, may have accelerated selection for a wider range of Neotropical habitats and overall diversification of domesticated chile pepper landraces.

Most remarkably, chile pepper fruits of some cultivated landraces are many times hotter or milder than those of wild populations, suggesting that domestication has not only diversified, but shifted total pungency in both directions: to higher "heat levels" in some varieties (e.g., ghost peppers), and to lesser levels in nearly non-pungent varieties (e.g., bell peppers). There is limited evidence that the mixes of capsaicinoids found in cultivated chile varieties are also more variable than those in wild populations, but comparable sampling has been poor. Nevertheless, we see evidence for both H2 (a diversification of the levels of potency) and H3 (an intensification of potency of selected SMs) with chile pepper domestication.

In the case of milder (less pungent) chile peppers, we assume that farmers' protection of the plants compensates to some extent for lower levels of chemical defenses. Haak et al. (2012) have confirmed tradeoffs between expression of capsaicinoid pungency, and yield under water-stressed conditions. While capsaicinoids remain the most important plant chemical defenses in most domesticated chiles as they are in wild peppers, the roles of other secondary metabolites found in lower concentrations should not be dismissed.

# MESOAMERICAN HUMAN/CHILE PEPPER COEVOLUTION IN RELATION TO BENEFITS OF CHEMICAL DEFENSES

According to paleobiolinguistic reconstructions of the presumed origins and diffusion of domesticated chile peppers in Mesoamerica, the oldest reconstructed term for cultivated chiles is found in proto-Otomanguean from south-central Mexico, estimated to be in transcultural circulation by 6592 B.P. (Brown et al., 2013; Kraft et al., 2014). This evidence is supported by archeological analyses that confirm the presence of domesticated chile fruit and spice-grinding molcajetes at sites along the Sierra Madre Oriental/Trans-Volcanic by 6000 years ago, especially in seasonally dry subtropical thornscrub (Kraft et al., 2014).

Nevertheless, several lines of research agree that the origin of the domesticated C. annuum landraces may have also occurred elsewhere within the broader Mesoamerican region (Eshbaugh, 1970; Hernández-Verdugo et al., 2001a; Perry and Flannery, 2007; Pickersgill, 2007; Aguilar-Meléndez et al., 2009). In other words, the precise location or locations of domestication of C. annuum in Mesoamerica still remains unknown.

Based on linguistic analyses, Brown (2010) suggests that the earliest plant management in Mesoamerica was of grain, succulent and oil crops; they became cultivated as staples no later than 7000 years ago. The earliest cultivation of spices (including chiles) for seasoning these staples came centuries later.

In short, staples such as maize, maguey, nopal and avocado were probably cultivated to provide seasonal surpluses for storage and consumption at least a thousand years before the earliest detectable onset of chile pepper cultivation as a spice, anthelmintic medicine, vermifuge or condiment (but most likely not as a fresh green vegetable).

The pervasiveness of the use of chile peppers in treating illnesses in Mesoamerica and Aridoamerica (N Mexico and SW USA) is without peer among the other crops domesticated in these regions. This fact alone suggests that the culinary uses of Capsicum were not the only catalysts to domestication. Table S2 shows several ancient medicinal uses derived from extensive studies of indigenous farming cultures in Mesoamerica. Collectively, this information suggests that a "Mesoamerican intellectual tradition" of indigenous medicinal-culinary knowledge (López Austin, 2001; Good, 2005) may have guided the selection of SMs and other traits in chile pepper landraces. The very cultural persistence of chile plants (as well as maize, etc.) within milpas and dooryard gardens in this modern globalized world is clear evidence that ancestral cultural traditions spanning 6000–7000 years still have adaptive value today.

In addition, the milpa management traditions have been culturally maintained to keep alive what is culturally perceived as a sacred agroecosystem that maintains and regenerates everyday life, community values and collective identities among many Mesoamerican societies (Bonfil-Batalla, 2012; Good, 2015). The medicinal, ceremonial and culinary uses of chile peppers by over 60 native cultures in Mesoamerica are embedded as a small but inseparable and integral part of a broader cosmovision, one that persists up through this present moment (Alcorn, 1984; Long-Solís, 1986; López Austin, 2001; de Avila, 2008). Any true EES that attempts to use chile pepper domestication as a model system must inevitably take these cultural memes into account.

There is no reason to assume that chiles were first gathered, then cultivated, for a single use, given that tobacco, cacao and other early crops also had multiple uses. However, as staple crops grew in yields and diets became more redundant, chile peppers may have played critically-important roles in protecting grains and legumes aggregated in storage facilities from postharvest consumption by insect pests and fungi common in the Neotropics. Some of these same chemical defenses in chile peppers may have protected humans who were aggregated into increasingly dense habitations from intestinal parasites, and from body lice or fleas. Finally, the SMs in chile peppers may also have become increasingly necessary elements of the traditional diets and pharmacopeia as "nutraceuticals" that counteracted the greater redundancy in agricultural diets.

The pharmacological utility of SMs in chile peppers is not restricted to the control of fleas, lice and intestinal microbes. They have recently been demonstrated to be effective in reducing intestinal infections by aquatic helminths of the same group as the intestinal worms that cause ill health and sluggishness among one third of the world's population, especially children in tropical climes (Mostafa-Kamal et al., 2015). This is a clear example of how plant chemical defenses have proven efficacy for "defending" human health against various biotic stresses among those who consume the same plant as a food, a medicine or both (Mostafa-Kamal et al., 2015).

In Table S2, we wish to underscore the myriad medicinal uses retrieved from historical documents that persist to this day in Mesoamerican intellectual traditions. Out of 47 ailments to which chile peppers were applied, 24 of these were recorded among Maya communities. In 2000, fieldwork in Yucatecan Mayan communities documented the persistence of medicinal uses of at least seven different types of chiles (Aguilar-Meléndez and Lira-Noriega, 2018), suggesting that the diversification of chile peppers may continue to generate direct benefits to human health.

# CONCLUSIONS

In this paper, three hypotheses were evaluated and discussed:

(H1) A reduction and simplification of the potency of plant chemical defenses against seed predators, foliage herbivores and disease microbes with greater reliance on human intervention to protect the plants. This assumes that fully domesticated modern and commercial varieties of peppers under intense monoculture are more susceptible to predators (insect pests and diseases) than their wild progenitors, because they produce fewer SMs, and at lower concentrations, in fruits, seeds, and leaves.

(H2) A diversification of the levels of potency and mixes of defense chemicals, given the wider range of habitats and broader geography to which the crop plants are exposed. This assumes that different C. annuum landraces in different agroecosystems produce variable amounts and types of SMs.

(H3) An intensification of the potency of certain plant chemical defenses, given the need to protect the plants in agrohabitats where they occur at higher density and without as much beta diversity of neighboring plant species to slow the spread of predators, herbivores, competing weeds or diseases. This assumes that some domesticated landraces and modern varieties produce larger concentrations of valued SMs (capsaicinoids and carotenoids) under intense monoculture, compared to their wild progenitors.

Of these three hypotheses, we see more evidence supporting both H2 and H3, with respect to the diversification and heightening of pungency through chile pepper domestication. H2, the diversification of levels in SMs under domestication, seems to fit with the mechanism of "balancing selection," in the sense of maintaining polymorphisms in Mesoamerican chile pepper landraces. The H3 trend has mostly been evident in more recently advanced cultivars of chile peppers outside their area of Neotropical origins. The H1 trend toward a reduction in pungency and other SMs such as phenolics and carotenoids in fruits and other organs is most evident in the recently advanced "bell pepper" group of chile landraces and cultivars, which are also most popular outside of the Neotropics. There is no question that sweet bell pepper cultivars of C. annuum must rely on human protection to survive against different predators that may prey on roots, leaves, fruits, and seeds. While birds may damage bell peppers grown in temperate climates outside of the Neotropics, they are virtually ineffective in dispersing the fruit (or most seeds within the fruit) to safe sites for germination and recruitment.

We conclude that contrary to trends in other crops, domestication has not necessarily reduced potency or homogenized the levels of chemical defenses—or at least of capsaicinoids—in chile pepper fruits. It has diversified capsaicinoid potency levels among and across domesticated varieties, compared to those found in most wild chile peppers. However, scientists still lack sufficient evidence to conclude that such diversification has occurred in any other SMs involved in chile pepper plant defense.

The likely diversification of SM production and/or concentration in domesticated C. annuum is the result of differential human selection, under different environments and managed ecosystems, of different allelic combinations, including many recessive genes that are only rarely expressed in truly wild populations (Haak et al., 2014).

Higher concentrations of pungent compounds such as capsaicin may confer better adaptation and fitness to chile pepper crops under novel environments. These highly pungent varieties are now finding new pharmacological and culinary uses, but the majority of the world's human inhabitants continue to use wild or domesticated landraces of chile peppers directly, both medicinally and gastronomically, as they have for centuries.

There is plausible evidence from diverse cultures in Mexico that the SMs expressed in C. annuum fruits have been efficacious in reducing human diseases as well as infestations of internal and external parasites. This may in part explain why so many of the distinctive medicinal uses of chiles persist in nearly every Mesoamerican and Aridoamerican culture today. The nutritional and medicinal benefits of chiles may initially appear diffuse or minor to evolutionary ecologists, but their collective benefits as perceived by their "co-evolved" Mesoamerican cultivators, curanderas, cooks and consumers are impressive.

The extraordinary potency and the current intensity of gastronomic and pharmacological uses of chile peppers (Bosland and Votava, 2000) suggest that chile peppers should no longer be relegated to the status of a "minor crop" as standard economic botany references and global agricultural statistics have done in the past. By 2010, global production of domesticated Capsicum fruits had reached 1.8 million ha, with more than 29 million metric tons annually harvested (Wahyuni et al., 2013). Their production continues to expand, while their culinary as well as medicinal and pest-repellent uses continue to diversify.

We should acknowledge that the current efficacy and economic significance of chile peppers' secondary metabolites in our diets and pharmacopoeias are not merely due to the historic inventiveness of and mutualistic interactions with our own kind. They have benefited from the selective pressures exerted by fungi, hemipteran insects, nematodes and rodents, as well as the directed dispersal of chile seeds by numerous bird species in the Neotropics. As such, there remains much to be learned by further advancing analyses of chile domestication to serve as a model for extended evolutionary synthesis.

# AUTHOR CONTRIBUTIONS

JL-R: designed research and wrote the paper; GN: designed research and wrote the paper; AA-M: wrote the paper.

# FUNDING

Funding for this publication comes in part from Programa de Fortalecimiento a la Calidad Educativa (PFCE) of the Universidad Autónoma de Aguascalientes, México.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2018.00048/full#supplementary-material

# REFERENCES


their wild counterparts. BMC Plant Biol. 14:1599. doi: 10.1186/s12870-014-0385-1


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer, DP, and handling Editor declared their shared affiliation.

Copyright © 2018 Luna-Ruiz, Nabhan and Aguilar-Meléndez. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Large Seed Size of Domesticated Lima Beans Mitigates Intraspecific Competition among Seed Beetle Larvae

#### Maximilien A. C. Cuny<sup>1</sup>, Gwen J. Shlichta<sup>2</sup> and Betty Benrey<sup>1</sup>\*

<sup>1</sup> Laboratory of Evolutionary Entomology, Institute of Biology, University of Neuchâtel, Neuchâtel, Switzerland, <sup>2</sup> Department of Biology, Edmonds Community College, Lynnwood, WA, United States

The domestication of beans has selected for larger seeds in cultivated plants compared to their wild relatives. This has not only resulted in an enhanced resource for humans, but also for the insects that feed on these seeds. Seed beetles that attack wild and cultivated seeds often lay several eggs on a single seed. We hypothesized that the larger seed size of domesticated beans will mitigate the competition among the larvae that hatch from these eggs, with important implications for their growth and survival. To test this, we examined how seed size of wild and cultivated Phaseolus lunatus (lima bean) affects the performance of the Mexican bean weevil Zabrotes subfasciatus, an important pest of beans in Mexico. A negative correlation was found between the initial number of eggs on a seed and the weight of female beetles that emerged, but only for the much smaller wild seeds. Similarly, beetle survival was found to be negatively correlated with competition intensity only on wild seeds. Our results imply that by selecting for larger seeds, domestication of P. lunatus has reduced the intensity of intraspecific larval competition of Z. subfasciatus.

Keywords: plant-insect interactions, bean weevil, seed pest, intraspecific competition, Phaseolus lunatus, seed size, domestication syndrome

#### Edited by:

Alejandro Casas, Instituto de Investigaciones en Ecosistemas y Sustentabilidad, Universidad Nacional Autónoma de México, Mexico

#### Reviewed by:

Maria Pappas, Democritus University of Thrace, Greece

Paul Gepts, University of California, Davis, United States

\*Correspondence:

Betty Benrey betty.benrey@unine.ch

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Ecology and Evolution

Received: 18 July 2017

Accepted: 09 November 2017

Published: 23 November 2017

#### Citation:

Cuny MAC, Shlichta GJ and Benrey B (2017) The Large Seed Size of Domesticated Lima Beans Mitigates Intraspecific Competition among Seed Beetle Larvae. Front. Ecol. Evol. 5:145. doi: 10.3389/fevo.2017.00145

# INTRODUCTION

Increasing evidence shows that plant domestication has altered the strength and nature of their interactions with other organisms (Chen et al., 2015a; Rowen and Kaplan, 2016; Whitehead et al., 2017). Cultivated plants differ from their wild ancestors in a suite of phenotypic traits, collectively known as the domestication syndrome. These include traits related to the ease of cultivation and harvest, as well as morphological and chemical traits that ensure higher yields and enhanced nutritional value. Selection for these traits has commonly resulted in larger tissue mass or organ size, higher nutrient content and decreases in physical defenses and toxic chemical compounds (Meyer et al., 2012). These changes in cultivated plants have been shown to affect the food choices and performance of insects that attack them (Chen et al., 2015a,b, and references therein). This is particularly evident when crops occur in the native range of their wild relatives (Chen et al., this issue), as insects adapted to wild plants are suddenly faced with a more abundant and often more nutritious and less toxic resource.

Phaseolus lunatus (Lima bean), one of the five domesticated species of the genus Phaseolus, is of Andean and Mesoamerican origin. Lima beans were domesticated at least twice: one domestication event occurred in the Andean mountains of Ecuador and Northern Peru and a second event in central-western Mexico (Motta-Aldana et al., 2010). Beans went through further domestication events and adapted to a wide variety of climatic regimes and ecological conditions (Martínez-Castillo et al., 2008; Motta-Aldana et al., 2010; Serrano-Serrano et al., 2012; Chacón-Sánchez and Martínez-Castillo, 2017).

Changes resulting from domestication of the genus Phaseolus mainly involve an increase in pod and seed size, decreased shattering, reduction in levels of toxins, such as lectins, lectin-like proteins, and cyanogenic compounds (only in P. lunatus), and an overall increase in proteins and minerals (Delgado-Salinas, 1988; Smartt, 1988; Sotelo et al., 1995). Throughout their distribution range in Mesoamerica, cultivated and wild bean plants coexist in sympatry (Gepts, 1988; Piñero and Eguiarte, 1988; Martínez-Castillo et al., 2014; Silva et al., 2017), allowing for a frequent exchange of insects and pathogens between wild and cultivated forms (Leroi et al., 1990; Lindig-Cisneros et al., 1997; Alvarez et al., 2007; Zaugg et al., 2013). It is well documented that herbivorous insects that achieve pest status usually continue to exist in natural habitats alongside managed ones (Mitchell et al., 2016). Once cultivated beans are harvested and seeds are transported to storage places, they continue to be in close proximity to wild plants and are exposed to the insects that attack them (Alvarez et al., 2005, 2007). Furthermore, human-mediated migration as a result of farmers exchanging or selling seeds in local or regional markets may increase the spread of insects that originate from wild populations (Alvarez et al., 2007). This constant exchange of insects between wild and cultivated populations has important implications for pest pressures in agriculture. This is particularly true for Bruchinae beetles that infest cultivated fields in Mexico, for which it has been shown that geographic distance between cultivated and wild populations largely explains the patterns of infestation rates (Alvarez et al., 2005, 2007). Moreover, if cultivated plants offer a more reliable and nutritious resource than their wild counterparts, this can explain why seed beetles thrive in cultivated seeds.

Numerous studies have shown that seed size greatly influences the oviposition decisions of adult seed beetles, and that size can often be used as a good indicator of seed quality for the developing larvae (Janzen, 1977; Fox and Czesak, 2000; Guedes et al., 2010; Chen et al., 2015b; Oliveira et al., 2015). Indeed, for seeds in the genus Phaseolus, seed size has been found to be the best predictor of oviposition choices (Moreira et al., 2015; Hernandez-Cumplido et al., 2016). Thus, we would predict that, faced with a choice, adult females would preferentially oviposit on cultivated seeds rather than on much smaller wild seeds. We further predict that inside the cultivated seeds the larvae will be exposed to lower levels of conspecific competition, which may be an important reason for the oviposition preference.

We tested this hypothesis with the Mexican bean weevil Zabrotes subfasciatus, and wild and cultivated seeds of Lima bean, P. lunatus. Our specific goal was to test the effects of increased seed size in cultivated varieties on the interaction with the seed beetle. In controlled laboratory experiments using seeds from three cultivated varieties and three wild populations of Lima bean, we investigated the oviposition patterns of adult females and the subsequent performance of their progeny resulting from seed-size-mediated competition among beetle larvae.

# MATERIALS AND METHODS

# Seeds

For the experiments we used seeds from three cultivated varieties and three wild populations of P. lunatus (**Figure 1**). Wild seeds were collected in locations along the Pacific coast of Mexico where Lima bean grows naturally. They are located at: Hidalgo near San Jose Manialtepec ("HGO"; 15.575564, −97.151350), Experimental Campus of the Universidad del Mar ("UMAR"; 15.923366, −97.151892), and near Largartero ("INK"; 15.725127, −96.656343) (as described in Shlichta et al., 2014). We collected seeds from 10 plants per site (only six for HGO).

The following domesticated seed varieties were obtained from W. Atlee Burpee & Co (Warminster, PA, USA): Jackson Wonder, Fordhook 242 Bush Bean and Burpee's Best Pole Bean (we named them "JACK," "FORD," and "BURP," respectively). The choice of these varieties was made based on previous studies with several commercially available cultivated varieties, in which we found that beetles develop well and do not appear to discriminate with respect to their different genetic pools (Shlichta et al. unpublished data). Thus, because we wanted to have extreme variation in seed size in order to test our hypothesis, the choice was made based on this variation and not on their domestication history. These seeds represent a mixture of two and perhaps three genetic pools: "JACK" is of Mesoamerican origin and "FORD" of Andean origin (Nienhuis et al., 1995; Ernest and Kee, 2008); we do not have information regarding the genetic pool of "BURP." Although there is variation in seed size and color among these three cultivated varieties, variation in size is greater between wild and cultivated seeds (Supplementary Figure 1).

# Insects

The Mexican bean weevil Z. subfasciatus, native to Mesoamerica, attacks seeds of several wild and cultivated species in the genus Phaseolus throughout Mexico, Central and South America (Credland and Dendy, 1992; Benrey et al., 1998; Romero and Johnson, 2000). It is considered one of the most important pests in bean cultivation and storage (Birch et al., 1985; Leroi et al., 1990), not only in the Americas but also in tropical regions of Asia and Africa (Davies, 1972). Females glue their eggs on the seed coat and, upon emergence, first instar larvae bore into the seed, where they feed, develop, pupate and then emerge as adults (Benrey et al., 1998).

This beetle is particularly suited to test our hypothesis because females do not avoid seeds with previously laid eggs and may lay many eggs on a single seed, even when seed availability is not limited. Indeed, a single seed has been observed to carry up to 63 eggs laid by multiple females (Teixeira and Zucoloto, 2012), even though larval survival under these conditions is highly unlikely (Cuny, personal observation). Once larvae enter the seed, they are confined to it for their entire development until adulthood. If several larvae are inside the seed, they can experience high levels of competition for both space and food.

Zabrotes subfasciatus has been reared in our lab for several years on cultivated seeds of Phaseolus vulgaris (Vivien Paille red Kidney, obtained from MultiFood, 3238 Gals, Switzerland; see Campan and Benrey, 2006 for details on the rearing). To control for inbreeding effects, every year new field-collected individuals from Mexico are added to the colony and allowed to mix for several generations before being used in experiments. All the insects described in this experiment were <4 days old.

# Experimental Protocol

Five seeds of one of the varieties or populations were placed in a plastic Petri dish (28 × 23 × 5 mm, Semadeni AG, A4686). Ten Petri dishes were set up for each variety or population (60 in total). One male and one female beetle were introduced into each dish for 5 days, after which the number of eggs laid on each seed was counted and the seeds were individually stored in falcon tubes at 28°C. Beetles complete their development on average in 25 days. Dishes were checked daily and we recorded: larval survival (number of adults that emerged divided by the initial number of eggs laid on the seed), adult sex (determined from elytra patterns and size; Oliveira et al., 2015) and weight (to the nearest 0.01 mg with an analytical balance Mettler AE163, Switzerland). In parallel, in order to confirm the size difference between wild and cultivated lima bean seeds, 20 uninfested seeds (20 seeds per cultivated variety and per wild population) were weighed and measured using a binocular magnifier with an ocular scale.
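The survival measure defined above (adults that emerged divided by eggs initially laid on the seed) can be sketched as a small helper. This is only an illustration: the function name `larval_survival`, the seed identifiers and the counts are hypothetical and are not data from this study.

```python
def larval_survival(adults_emerged: int, eggs_laid: int) -> float:
    """Proportion of the eggs laid on one seed that yielded an emerged adult."""
    if eggs_laid <= 0:
        raise ValueError("survival is undefined for a seed without eggs")
    if not 0 <= adults_emerged <= eggs_laid:
        raise ValueError("emerged adults must lie between 0 and the egg count")
    return adults_emerged / eggs_laid


# Hypothetical per-seed counts: (seed identifier, eggs laid, adults emerged).
counts = [("wild-1", 8, 3), ("wild-2", 2, 2), ("cult-1", 10, 6)]

# Survival per seed, keyed by the (made-up) seed identifier.
survival_by_seed = {
    seed_id: larval_survival(adults, eggs) for seed_id, eggs, adults in counts
}

for seed_id, survival in survival_by_seed.items():
    print(f"{seed_id}: {survival:.2f}")
```

Expressing survival per seed, rather than per dish, keeps the measure aligned with the level at which larvae compete, since each larva is confined to a single seed.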

Finally, we conducted an experiment to evaluate the effect of seed size on female oviposition independent of other factors linked to bean domestication. Seeds from each cultivated variety and wild population were selected and divided into two groups, small and large (chosen from the available natural variation within each seed type). Two seeds of different size from the same variety or population were placed in a Petri dish (as described in the previous experiment), and one male and one female beetle were introduced. Three days later, we counted the number of eggs laid on each seed. Based on previous studies, we know that a 3-day period is sufficient for beetles to make an oviposition choice and at the same time ensures that not too many eggs are laid on a single seed (Campan and Benrey, 2006).

# Statistical Analysis

Data were analyzed using the SAS statistical package (SAS Institute, 2002)<sup>1</sup>. Assumptions of normality and homoscedasticity were tested before each test. Linear mixed models (PROC MIXED) or generalized linear mixed models (PROC GLIMMIX), followed by a post-hoc analysis (Tukey), were used to compare data on seed size, weight, the number of eggs laid on the seeds, adult sex ratio and survival. Correlations were tested using Pearson or Spearman correlation tests (PROC CORR). Seeds and Petri dishes were included as random factors in the models, and seed domestication status, as well as seed varieties and populations nested in domestication status, were included as fixed factors (to account for natural variation among the three cultivated varieties and the three wild populations). Seeds with only one egg were not included in the analysis of beetle survival. Because females are generally heavier than males, the weights of the two sexes were analyzed separately. For the experiment performed to test the relationship between seed size and number of eggs within each cultivated variety or wild population, seeds with no eggs were excluded from the analysis.
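For readers without access to SAS, the two correlation coefficients named above can be sketched in plain Python. This is a minimal illustration under stated assumptions and not the PROC CORR procedure itself: the function names and the egg and survival values are hypothetical, and the filter on seeds with a single egg only mirrors the exclusion rule described in the text.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical seeds: (eggs laid on the seed, larval survival proportion).
seeds = [(1, 1.0), (2, 1.0), (3, 0.67), (5, 0.6), (8, 0.5), (12, 0.25)]

# Mirror the exclusion rule: drop seeds that carried only one egg.
kept = [(eggs, surv) for eggs, surv in seeds if eggs > 1]
eggs = [e for e, _ in kept]
survival = [s for _, s in kept]

print("Pearson r:", round(pearson_r(eggs, survival), 3))
print("Spearman rho:", round(spearman_rho(eggs, survival), 3))
```

The rank-based coefficient is the usual choice when the normality assumption is doubtful, which is presumably why both tests are listed.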

# RESULTS

Measurements of seed size and weight confirmed that cultivated seeds are significantly (∼60%) larger and heavier than wild seeds [**Figure 2**, N = 40, F(1.74) = 172.8, p < 0.001, and N = 40, F(1.74) = 281, p < 0.001 for size and weight, respectively]. The fixed factor of population and variety nested within seed domestication status was significant for seed size and weight [F(1.74) = 6.63, p = 0.002, and F(1.74) = 59.9, p < 0.001, respectively].

Female beetles laid significantly more eggs (2-fold) on seeds from cultivated varieties than on wild seeds [**Figure 3**, Nwild = 41, Ncultivated = 51, F(1.233) = 13.32; p < 0.001]. We also found a significant effect of population and variety nested within seed domestication status on the number of eggs laid per seed [F(4.236) = 28.86, p < 0.001]. Finally, within each variety and population, the relationship between seed size and number of eggs laid was not significant (Supplementary Figure 2). This suggests that the variation in seed size within wild or cultivated seeds is not large enough to influence ovipositing females.

Larval survival (expressed as the percentage of adults that emerged per seed) was negatively correlated with the number of eggs laid on wild seeds (**Figure 4**, N = 51, r = −0.32, p = 0.023), but no significant correlation was found for survival on cultivated seeds (N = 43, r = −0.21, p = 0.17). Similarly, female weight was negatively correlated with competition intensity (expressed as the number of eggs per seed) when they developed in wild seeds (**Figure 5B**, N = 39, r = −0.48, p = 0.002), but not in cultivated seeds (**Figure 5A**, N = 27, r = −0.19, p = 0.34). However, male weight was only marginally significantly correlated with the number of eggs in wild seeds (**Figure 5B**, N = 37, r = −0.32, p = 0.056) and this correlation was also not significant in cultivated seeds (**Figure 5A**, N = 34, r = −0.018, p = 0.9). Finally, we did not find a difference in the sex ratio of beetles that emerged from domesticated or wild seeds [F(1.79) = 0.03, p = 0.868; Supplementary Figure 3], nor a significant effect among cultivated varieties or wild populations [F(4.79) = 0.74, p = 0.57].

<sup>1</sup> Statistical Analysis System (2002).

FIGURE 2 | (A) Mean size of domesticated and wild bean seeds [N = 40, F(1.74) = 172.8, p < 0.001]. (B) Mean weight of domesticated and wild bean seeds [N = 40, F(1.74) = 281, p < 0.001]. Different letters indicate a significant difference. Bars are means ± SE.

## DISCUSSION

For pulse crops, larger seed size is one of the major agronomic traits that were selected for during domestication (Evans, 1993; Fuller, 2007). Larger seeds not only result in larger yields (Kluyver et al., 2017), but have also been associated with increases in germination success and seedling competitive ability and survival (Westoby et al., 2002). However, increases in seed size also have been repeatedly shown to be correlated with an increase in the likelihood of herbivore attack (reviewed in Chen et al., 2015b). Here, we found again support for this hypothesis; female beetles laid more eggs on the larger cultivated seeds of Lima bean than on the smaller wild seeds. Further, our results support the hypothesis that larger seeds offer a better resource for the Mexican bean weevil and as a consequence mitigate the intensity and negative effects of larval competition. In addition to and despite the higher number of eggs laid on cultivated seeds, more and larger adults emerged from these seeds.

Earlier studies with Phaseolus beans and various species of Bruchinae beetles, support our findings that seed size largely explains the observed patterns of oviposition and larval performance (Paukku and Kotiaho, 2008; Moreira et al., 2015; Oliveira et al., 2015; Hernandez-Cumplido et al., 2016). In a study aimed at examining the role of cyanogenic glycosides of Lima bean seeds on beetle performance, Shlichta et al. (unpublished data) conducted an experiment similar to the one described here but allowing only one larva of Z. subfasciatus to develop in each seed. They found that in the absence of larval competition within the seed, whether seeds were wild or cultivated did not affect the survival and average weight of the emerging adults. In another study with wild Lima bean seeds, Hernandez-Cumplido et al. (2016) found that under field and laboratory conditions, beetles laid more eggs on larger seeds. Also, using seeds from different wild bean populations, Moreira et al. (2015) found that two Bruchinae species, Acanthoscelides obtectus and Z. subfasciatus, laid more eggs and had higher survival on the larger seeds of P. coccineus than on the smaller seeds of P. vulgaris.

For seed beetles, seed size can be a reliable indicator of seed quality (Fox and Czesak, 2000; Cope and Fox, 2003). For example, Cope and Fox (2003) found that when females of the seed beetle Callosobruchus maculatus were presented with seeds of varying sizes, they distributed their eggs in a manner that maximized resource availability for all offspring. C. maculatus rejects seeds that already carry eggs (Messina and Renwick, 1985). For these insects, the presence of previously laid eggs can therefore also serve as a good indicator of the quality of the seed, as it reflects the level of competition that their offspring will face inside the seed. For Z. subfasciatus this appears not always to be the case (Campan and Benrey, 2006). Although females prefer to oviposit on uninfested seeds, if they do not have a choice, they will oviposit on seeds that already have eggs (Teixeira and Zucoloto, 2012; M. Cuny, personal observation), even if the probability of larvae surviving under high egg densities is very low. For females of this species, it seems advantageous to rely on cues such as seed size that will help minimize larval competition and maximize lifetime fitness. Limited amounts of resource inside the seed for the developing larvae will not only affect the intensity of competition and subsequent survival, but also the size of the emerging adults, with important consequences for their fitness. Female fitness is dependent on their fecundity, which is directly dependent on body size (Dendy and Credland, 1991; Colegrave, 1993; Callejas, 1996), while male size, although not so directly linked to reproductive success, can affect mating success (Savalli and Fox, 1998). Earlier studies with C. maculatus found that seed size and the initial number of eggs on the seed influenced the weight of emerging adults (Credland et al., 1986; Giga and Smith, 1991; Colegrave, 1995). For Z. subfasciatus, we found that seed size mostly affects female but not male size and only on the smaller wild seeds. This result can be explained by the overall smaller size of males (on average 30% smaller and lighter than females), implying that they may be less limited by the availability of resources for development and thus not as affected by larval competition inside the seeds.

It is important to note that the cultivated and wild seeds used in this study do not only differ in their size, but also in other traits that are part of the domestication syndrome of Phaseolus beans resulting from adaptations to cultivation, harvesting practices and human preferences. These other changes in bean traits can all have an influence on beetle oviposition decisions and larval performance. Wild seeds are harder, have a thicker testa and an inconspicuous dark brown color, whereas cultivated seeds have been selected for faster germination, hence are softer and have a thinner seed coat permeable to water and there is a vast color variation among varieties. Physical features of the seeds are known to affect beetle oviposition behavior and the ability of larvae to burrow into the seed (Chavan et al., 1997; Plaza, 2001; Boeke et al., 2004). Similarly, nutritional and defense chemical compounds present in the testa and inside the seed are known to interfere with the development and affect the survival of seed beetles (Goossens et al., 2000; Moraes et al., 2000; Silva et al., 2004), and their concentrations can differ between wild and cultivated accessions (Sotelo et al., 1995; Zaugg et al., 2013). Particularly, for Z. subfasciatus, earlier studies have documented differences in its performance when reared on cultivated or wild beans (Schoonhoven et al., 1983; Benrey et al., 1998; Campan and Benrey, 2006), as well as differential performance of beetles on wild seed populations that vary in their protein or phenolic content (Moreira et al., 2015; Hernandez-Cumplido et al., 2016). These differences in physical and chemical traits between wild and cultivated seeds will undoubtedly influence the oviposition decisions and performance of the Mexican bean weevil. Yet, our results unequivocally demonstrate that the difference in seed size between cultivated and wild seeds plays a major role in the oviposition and performance differences. 
Although we cannot completely disentangle seed size from other factors associated with the domestication status of the seeds, one key finding of this study is that the larger seed size of cultivated beans, independent of their genetic pool of origin, mitigates the potential negative effects of larval intraspecific competition, a process that in nature controls the size of populations (Begon et al., 2009). This additional consequence of bean domestication implies that the presence of bean fields in areas where wild beans occur naturally provides new ecological opportunities for associated insects. The expansion to a new and more profitable resource favors individuals that exploit these novel resources that provide conditions of relaxed competition (Van Valen, 1965). Yet caution should be taken to extrapolate our results to natural situations. The transferability of these results to the field would require additional measurements on variation in seed size and insect oviposition in natural conditions.

Nonetheless, these findings have important evolutionary and applied implications. Divergent selective factors that act on the plants and insects associated with wild and cultivated bean populations can lead to specialization and in extreme cases genetic differentiation and host race formation (Alvarez et al., 2007; Laurin-Lemay et al., 2013; Kenyon et al., 2015). There is further evidence for our bruchid-bean system that shows that bean domestication has selected for different behaviors in host use, not only in seed beetles, but also in the natural enemies of these beetles (Benrey et al., 1998; Campan and Benrey, 2004; Aebi et al., 2008). Yet, strong human-mediated dispersion of cultivated beans and these associated organisms will most likely result in continuous genetic mixing and will prevent selection for divergent behaviors that could lead to genetic differentiation of insects specializing on wild or cultivated seeds (Alvarez et al., 2007; Laurin-Lemay et al., 2013).

Finally, it is important to emphasize that studies in regions where cultivated plants coexist with their wild relatives allow us to understand the interplay between natural and human-mediated selection and how they interact to shape the present-day associations between plants and insects in agricultural and natural systems (Chen et al. this issue). For our study system this is also important from an applied perspective, as beans are a major staple food in many countries of Mesoamerica as well as in other regions of the world (FAO, 2013). The development of strategies that will allow us to control pests in this important crop might be facilitated by unraveling the changes in interactions among insects and plants that resulted from plant domestication.

# AUTHOR CONTRIBUTIONS

The three authors conceived and designed the experiment and participated in the writing of the paper; MC performed the experiments and analyzed the data.

# ACKNOWLEDGMENTS

We thank Johanna Gendry for her assistance with data collection. Ted Turlings and two reviewers made useful suggestions that helped to improve the manuscript. This research was financially supported by the Swiss National Science Foundation (Project No. 31003A\_127364) awarded to BB.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2017.00145/full#supplementary-material

# REFERENCES

insects associated with domesticated plants. Evolution 61, 2986–2996. doi: 10.1111/j.1558-5646.2007.00235.x

landraces in the Americas: evidence from chloroplast and nuclear DNA polymorphisms. Crop Sci. 50, 1773–1787. doi: 10.2135/cropsci2009.12.0706

with the development of the cowpea weevil [Callosobruchus maculatus (F.) (Coleoptera: Bruchidae)]. Anais da Academia Brasileira de Ciências 76, 57–65. doi: 10.1590/S0001-37652004000100006


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Cuny, Shlichta and Benrey. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Domestication of the Amazon Tree Grape (Pourouma cecropiifolia) Under an Ecological Lens

Hermísia C. Pedrosa<sup>1</sup> \*, Charles R. Clement<sup>2</sup> and Juliana Schietti<sup>1</sup>

<sup>1</sup> Programa de Pós-Graduação em Ecologia, Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil, <sup>2</sup> Coordenação de Tecnologia e Inovação, Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil

Domestication studies traditionally focus on the differences in morphological characteristics between wild and domesticated populations that are under direct selection, the components of the domestication syndrome. Here, we consider that other aspects can be modified, because of the interdependence between plant characteristics and the forces of natural selection. We investigated the ongoing domestication of Pourouma cecropiifolia populations cultivated by the Ticuna people in Western Amazonia, using traditional and ecological approaches. We compared fruit characteristics between wild and domesticated populations to quantify the direct effects of domestication. To examine the characteristics that are not under direct selection and the correlated effects of human selection and natural selection, we investigated the differences in vegetative characteristics, changes in seed:fruit allometric relations and the relations of these characteristics with variation in environmental conditions summarized in a principal component analysis. Domestication generated great changes in fruit characteristics, as expected in fruit crops. The fruits of domesticated plants had 20× greater mass and twice as much edible pulp as wild fruits. The plant height:DBH ratio and wood density were, respectively, 42% and 22% smaller in domesticated populations, probably in response to greater luminosity and higher sand content of the cultivated landscapes. Seed:fruit allometry was modified by domestication: although domesticated plants have heavier seeds, the domesticated fruits have proportionally (46%) smaller seed mass compared to wild fruits. The high light availability and poor soils of cultivated landscapes may have contributed to seed mass reduction, while human selection promoted seed mass increase in correlation with fruit mass increase. These contrasting effects generated a proportionately smaller increase in seed mass in domesticated plants. 
In this study, it was not possible to clearly dissociate the environmental effects from the domestication effects in changes in morphological characteristics, because the environmental conditions were intensively modified by human management, showing that plant domestication is intrinsically related to landscape domestication. Our results suggest that evaluation of environmental conditions together with human selection on domesticated phenotypes provide a better understanding of the changes generated by domestication in plants.

Keywords: allometry, Amazonia, domestication syndrome, ecological perspective, environmental effects, perennial fruit crop

#### Edited by:

Alejandro Casas, Universidad Nacional Autónoma de México, Mexico

#### Reviewed by:

Luis Sampedro, Consejo Superior de Investigaciones Científicas (CSIC), Spain Petr Smýkal, Palacký University, Czechia

> \*Correspondence: Hermísia C. Pedrosa hermisia.pedrosa@gmail.com

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Plant Science

> Received: 30 September 2017 Accepted: 02 February 2018 Published: 14 March 2018

#### Citation:

Pedrosa HC, Clement CR and Schietti J (2018) The Domestication of the Amazon Tree Grape (Pourouma cecropiifolia) Under an Ecological Lens. Front. Plant Sci. 9:203. doi: 10.3389/fpls.2018.00203

# INTRODUCTION


Plant domestication resulted in populations more useful to humans and better adapted to cultivated landscapes (Harlan, 1992; Clement, 1999). Although humans have domesticated populations of many species, currently only 12 annual crops account for 75% of world food consumption (FAO, 2004). In the tropics, however, many fruit trees were domesticated to different degrees (Miller and Gross, 2011; Meyer et al., 2012), and in Amazonia, most species with domesticated populations are fruit trees (Clement, 1999). One of these species is the Amazon tree grape (Pourouma cecropiifolia, Urticaceae), cultivated principally in Western Amazonia.

Gonzalo Jiménez de Quesada, in his 1596 expedition in the Eastern Llanos of Colombia, reported the existence of plantations of P. cecropiifolia in "gardens of vegetables and fruit plants" (Patiño, 2002). He describes the trees with "large racemes and fruits as large as nuts" (Patiño, 2002), indicating that the species has been cultivated, and probably domesticated, since the pre-Columbian period. Currently, P. cecropiifolia is especially popular among the Ticuna people and large plantations are found in swiddens and fallows around their villages, in the vicinity of Tabatinga (Brazil), Letícia (Colombia), and Iquitos (Peru), where Clement (1989) observed several morphological differences in the fruits between wild and domesticated plants. In this region, the Ticuna people are still selecting individuals of P. cecropiifolia that have the largest and sweetest fruits, and removing the individuals that have smaller and tasteless fruits. Annually, in the fruiting period, the Ticuna farmers select the seeds of plants with desirable characteristics and propagate them in cultivated environments.

Because the species is dioecious (Clement, 1989), only fruiting trees can be selected, and open pollination from unselected pollen donors (and from individuals that will be removed later) slows the response to selection. Like any crop, P. cecropiifolia is cultivated in agroecosystems that are quite different from its wild niche in early and mid-successional forests, so there is unconscious human selection for adaptation to new ecological factors. Therefore, P. cecropiifolia domestication is a special case compared to annual crops that are often under high intensity selection, because it is a moderately long-lived species, whose domestication syndrome has been influenced by both human selection and changes in ecological conditions that began a long time ago. However, this is a common case in Amazonia, where perennial plants and landscape domestication occur at the same time (Clement, 1999). Considering these points, it is an open question how wild and domesticated populations of perennial plants, such as P. cecropiifolia, differ and what are the effects of human selection, environmental conditions, and their interactions on fruits, seeds, and vegetative characteristics.

Domestication studies traditionally address the differences and variation in morphological and genetic aspects between wild and domesticated plant populations (Harlan, 1992; Clement, 1999; Miller and Gross, 2011). However, a focus only on domesticated characteristics can limit the understanding of the interaction between human selection and natural selection. Because natural selective forces act on the phenotypes along with human selection, the set of characteristics that marks the divergence between domesticated and wild plants, the domestication syndrome, is probably more diverse than understood in classic domestication studies (Milla et al., 2015; Preece et al., 2017). A look at the ecological mechanisms that continue to act during the domestication process considers characteristics that are and are not under human selection and the correlations between characteristics. An ecological approach also allows the identification of relations between morphological characteristics and environmental conditions, and evaluation of the integrated effect of domestication and the environment on phenotypic plasticity of the characteristics directly or indirectly modified by human selection. Therefore, considering ecological aspects can generate a more complete and integrated understanding of the domestication process (Milla et al., 2015).

In trees, the domestication process begins with population management in their natural environment (Rindos, 1984; Smith, 2011). Subsequently, individuals with the most desirable morphological characteristics are selected and cultivated in domesticated landscapes (Clement, 1999; Smith, 2011; Levis et al., 2017). Growing conditions under cultivation and directional selection lead the domesticated plant populations to diverge morphologically and genetically from their wild progenitors (Pickersgill, 2007; Miller and Gross, 2011). The genetic variability of populations under selection is reduced due to founder events (Nei et al., 1975) caused by the selection of a few individuals and a restricted gene pool in the next generation (Miller and Gross, 2011). In contrast, the phenotypic variability of the characteristics under selection in domesticated populations may increase in comparison with wild populations (Clement, 1999; Meyer and Purugganan, 2013). In perennial fruit crops, like P. cecropiifolia, the average time expected from the beginning of selection to domestication, when the domesticated characteristics are fixed in the cultivated populations, is about 3,000 years (Meyer et al., 2012).

Changes in the morphology of aerial vegetative parts, fruits and seeds are among the most common characteristics of the plant domestication syndrome (Meyer et al., 2012). In herbaceous plants, increases in leaf and whole-plant size are observed (Milla and Morente-López, 2014). In fruit trees, fruit and seed "gigantism" is very frequent (Meyer et al., 2012). These changes in the sizes of useful parts and in the whole-plant occur due to changes in biomass allocation patterns in and among the parts under selection (Milla et al., 2015). The increase in certain plant organs or parts caused by domestication can lead indirectly to changes in size of other plant parts due to allometric or biophysical relationships. In this case, any increase in allocation to an organ should be complemented by a proportional increase to the other organ or it has a cost to the other organ (Kleyer and Minden, 2015). Allometric relationships were analyzed in five herbaceous species and it was observed that plants selected to have larger leaf areas invested less in leaf blade biomass, but invested in larger petioles and other supporting structures, leading to larger plant sizes (Milla and Matesanz, 2017). For fruits and seeds, some allometric relationships are well known; for example, plants that have fruits with larger seeds have a smaller number of seeds per fruit (Niklas, 1994). However, to our knowledge, there are no studies about the allometric relationships in fruit trees to explain the fruit and seed size changes generated by domestication. The changes in fruit and seed allometric relationships might be a signature of fruit tree domestication, and, if observed for several species, can be a parameter to identify fruit species domestication in future research.
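An allometric relationship of the kind discussed above is commonly written as a power law, y = a·x^b, which becomes a straight line on log-log axes. The sketch below fits that line by ordinary least squares. It is an assumption-laden illustration and not necessarily the procedure used in this study: the function name `fit_allometry` and the fruit and seed masses are hypothetical.

```python
from math import exp, log


def fit_allometry(x, y):
    """Fit y = a * x**b by least squares on log-transformed values.

    Returns (a, b). An exponent b < 1 means y grows less than
    proportionally with x; b = 1 means the y:x ratio is constant.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    if min(x) <= 0 or min(y) <= 0:
        raise ValueError("log-log fitting requires strictly positive values")
    log_x = [log(v) for v in x]
    log_y = [log(v) for v in y]
    n = len(log_x)
    mean_x, mean_y = sum(log_x) / n, sum(log_y) / n
    sxx = sum((u - mean_x) ** 2 for u in log_x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    slope = sum((u - mean_x) * (v - mean_y) for u, v in zip(log_x, log_y)) / sxx
    intercept = mean_y - slope * mean_x
    return exp(intercept), slope


# Hypothetical fruit masses (g) and seed masses (g) generated from an
# exact power law so that the fitted exponent is easy to recognize.
fruit_mass = [1.0, 2.0, 4.0, 8.0, 16.0]
seed_mass = [0.3 * m ** 0.5 for m in fruit_mass]

a, b = fit_allometry(fruit_mass, seed_mass)
print(f"seed mass = {a:.3f} * fruit mass ** {b:.3f}")
```

Under this formulation, a shift in the fitted exponent or coefficient between wild and domesticated populations is one way to express a change in seed:fruit allometry, for instance a seed mass that increases in absolute terms while shrinking relative to fruit mass.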

In addition to human selection, the environmental conditions where domesticated plants develop also generate selective pressures that affect their phenotypes (Rindos, 1984; Harlan, 1992). When plants are cultivated, humans modify the landscape to create environmental conditions that reinforce the characteristics of interest and that favor the harvesting process. Farmers typically change soil fertility, light availability and reduce competition through the thinning of neighboring plants (Harlan, 1992; Clement, 1999). For example, the increase in available light and in soil fertility can generate an increase in the size of fruits, contributing to human interest. This can also affect other characteristics that are not under human selection, such as characteristics of leaves and roots, wood density and plant height, which respond to soil and light conditions and can be modified indirectly by domestication due to changes in the environmental conditions caused by human management in the landscape. Knowledge of local environmental conditions will help us to evaluate whether they are favorable, unfavorable or do not interfere with human selection. This will help us to understand what is a direct result of domestication and what is a reflection of the interaction between human selection and environmental variations, which results in a phenotypic plasticity response (Schlichting and Levin, 1986; Scheiner, 1993; Gratani, 2014). In the case of P. cecropiifolia, the plants are cultivated principally in terra-firme swiddens and fallows, where the Ticuna people practice "slash and burn" agriculture that dramatically modifies ecological conditions, especially soil fertility and light intensity (Clement, 1999; Jacovak et al., 2015).

In this study, we investigated the domestication process of P. cecropiifolia populations in Western Amazonia, where cultivated populations with large fruits occur in the vicinity of wild populations in the Amazonian forest. We use the traditional approach (focused on differences and variation in morphological characteristics under selection) allied with an ecological approach (human selection effects on allometry and the relations of morphological characteristics with the environment) to answer the following questions: (i) do domesticated populations have distinct morphological characteristics in contrast to wild populations located in adjacent forests and, if so, what is the magnitude of these differences? (ii) has human selection increased the phenotypic variability of characteristics under selection in domesticated populations? (iii) has human selection altered fruit and seed allometric relationships in domesticated plants? and (iv) what is the importance of environmental conditions in explaining the variation in morphological characteristics in wild and domesticated populations?

# MATERIALS AND METHODS

# Study Area

This study was conducted in eight Ticuna indigenous communities, along approximately 400 km of the upper Solimões River in Western Brazilian Amazonia (**Figure 1** and **Table 1**). The Ticuna people are the largest indigenous group in Brazil and are distributed in three countries: Brazil, Colombia, and Peru. In Brazil, their communities are located in the state of Amazonas and are distributed along both margins of the Solimões River and its tributaries, where our sampling was performed. The upper Solimões River was chosen as the study area because it has a high concentration of cultivated P. cecropiifolia populations and is considered to be the center of domestication (Clement, 1989; Clement et al., 2010). In this region, cultivated populations occur in terra-firme areas and wild populations occur in adjacent floodplain forests.

# Pourouma cecropiifolia

Pourouma cecropiifolia is a fruit tree species that occurs in the Amazon rainforest, from Western to Central Amazonia. The species is found in wild conditions in Bolivia, Colombia, Ecuador, Venezuela, Peru, and Brazil. In Brazil, it occurs in the state of Acre and in the state of Amazonas. In areas of primary forest, it occurs mainly in terra-firme forests, but abundance in this phyto-physiognomy varies within the distribution area of the species. In the region of upper Solimões River (near the city of Tabatinga, Amazonas, Brazil), for example, it occurs mainly in floodplains, being scarce in terra-firme forests. These floodplains are relatively high floodplains, which are flooded for a period of 2–3 months, where the maximum level of flooding is around 1.5 m (personal observation in the year 2015). The occurrence of P. cecropiifolia was not recorded in low floodplains or in chavascais (almost permanently flooded areas), where there is a great abundance of pioneers, such as species of the genus Cecropia (also of the family Urticaceae) and grasses.

The species fruits annually between September and December (Lopes et al., 1999). Its main pollinators are insects of the family Apidae: Oxytrigona obscura, Trigona dallatorreana, and Trigona sp. (Falcão and Lleras, 1980). The seeds are dispersed mainly by small-sized primates, bats, and humans. The species is considered easy to propagate, fast-growing, precocious and productive. Villachica (1996) reports that, in plantations, the trees begin to bear fruit at 2 years and reach peak production between the fifth and sixth year, after which production progressively declines.

For the Ticuna people, P. cecropiifolia is an important traditional fruit and a symbolic component of their culture, and is widely consumed and cultivated in Ticuna fields and agroforests. Moreover, it is reported in Ticuna myths as a plant associated with fauna and mythical entities of the forest.

# Sampling Design and Characteristic Measurements

In each of the eight communities, we sampled 10 adult plants in a terra-firme area under cultivation (domesticated population) and 10 adult plants in a nearby forested area (wild population); the average distance between paired populations was 3 km. The paired sampling model follows Dawson et al. (2008), who studied domestication of Inga edulis in Western Amazonia. For each individual plant we measured the following morphological characteristics: (i) number of fruits per bunch (mean of five bunches), (ii) fruit length (cm), (iii) fruit diameter (cm), (iv) fruit mass (g), (v) seed mass (g), (vi) peel mass (g), (vii) pulp mass (g), obtained by difference (iv − v − vi), (viii) pulp:fruit mass ratio (vii:iv), (ix) seed:fruit mass ratio (v:iv), (x) diameter at breast height (DBH) (cm), (xi) plant height (m), used to determine the (xii) plant height:DBH ratio (m/cm), and (xiii) branch wood density (g/cm<sup>3</sup>) (correlations among characteristics are shown in **Supplementary Figure S2**). For fruit and seed metrics, we used two fruits from each of five bunches. DBH was measured at 1.30 m above ground level. Plant height was estimated using a hypsometer (Vertex Laser VL400 Ultrasonic-Laser Hypsometer III, Haglöf of Sweden). Wood density was determined as the ratio between the dry weight and the wet volume of a section of a lateral terminal branch approximately 2 cm in diameter. The wet volume of the wood samples was measured by water displacement, and the samples were then dried for 72 h at 105°C.

FIGURE 1 | Pourouma cecropiifolia distribution and abundance in Greater Amazonia, and study area. Large map: abundance in humid areas (orange) and upland terra-firme areas (green) from the Amazon Tree Diversity Network (ATDN; data provided especially for this paper by Dr. Hans ter Steege, who is responsible for the ATDN, contact: http://atdn.myspecies.info/); gray dots are ATDN sites with no P. cecropiifolia. Occurrence in unspecified ecosystems (black) and cultivated systems (yellow), without abundance data, from the Missouri Botanical Garden (MOBOT) database (data available at http://tropicos.org/Name/21300486?tab=specimens). Small map: the region of the upper Solimões River with paired cultivated (yellow) and humid area (orange) populations.

TABLE 1 | Data of Pourouma cecropiifolia collection sites: population, group (wild, domesticated), village name, city, latitude and longitude.
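The derived characteristics in the sampling design (pulp mass by difference, the two mass ratios, the plant height:DBH ratio and wood density) follow directly from the primary measurements. The following is a minimal Python sketch of those calculations. The class, function and variable names are hypothetical, and the numeric values are illustrative only, not data from this study.

```python
from dataclasses import dataclass


@dataclass
class FruitSample:
    """Primary fruit measurements (hypothetical container; masses in g)."""
    fruit_mass: float
    seed_mass: float
    peel_mass: float


def pulp_mass(s: FruitSample) -> float:
    # (vii) pulp mass obtained by difference: iv - v - vi
    return s.fruit_mass - s.seed_mass - s.peel_mass


def pulp_fruit_ratio(s: FruitSample) -> float:
    # (viii) pulp:fruit mass ratio, vii:iv
    return pulp_mass(s) / s.fruit_mass


def seed_fruit_ratio(s: FruitSample) -> float:
    # (ix) seed:fruit mass ratio, v:iv
    return s.seed_mass / s.fruit_mass


def height_dbh_ratio(height_m: float, dbh_cm: float) -> float:
    # (xii) plant height:DBH ratio, in m/cm
    return height_m / dbh_cm


def wood_density(dry_mass_g: float, wet_volume_cm3: float) -> float:
    # (xiii) dry mass divided by wet volume, in g/cm^3
    return dry_mass_g / wet_volume_cm3


# Illustrative values only, not measurements from the study.
example = FruitSample(fruit_mass=10.0, seed_mass=2.0, peel_mass=1.5)
print(pulp_mass(example), pulp_fruit_ratio(example), seed_fruit_ratio(example))
```

Keeping pulp mass as a derived quantity, rather than a stored one, keeps the three mass components consistent with the measured fruit mass.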

# Environmental Conditions

We collected 300-g soil samples in the 0–30 cm layer close to each tree sampled. The 10 individual samples from each population were dried, homogenized and mixed in the laboratory to make a composite sample that represented the soil of each population. The composite sample was analyzed to evaluate phosphorus (P), potassium (K), calcium (Ca), and magnesium (Mg) concentrations (EMBRAPA, 1997). The clay, sand and silt content were determined by granulometric analysis to characterize soil texture (EMBRAPA, 1997).

Light availability was estimated for each tree and averaged over the population. We used the Crown Illumination Index, which describes the environment luminosity inside the forest, on a scale ranging from 1, where there is diffuse incident light, up to 4, where there is direct light on the canopy (Keeling and Phillips, 2007).

# Statistical Analyses

To evaluate the morphological differences between domesticated and wild individuals, we compared the means and amplitudes of variation of 10 characteristics using kernel density graphs and performed an ANOVA between the two groups (wild and domesticated) for each characteristic, using R software (R Core Team, 2016). A principal component analysis (PCA) of the 10 morphological characteristics was also used to evaluate the differentiation between wild and domesticated individuals and to identify which characteristics are most strongly correlated with domestication, also using R. To test the multivariate differences between wild and domesticated individuals, we performed an ANOVA on the first two principal axes of the PCA. To identify and classify groups of wild and domesticated populations, we performed a cluster analysis based on Normal Mixture Modeling with the mclust package (Fraley and Raftery, 2002; Fraley et al., 2012) in R. We also identified which group has the greatest phenotypic variability by comparing the variances between the wild and domesticated populations for each characteristic.
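The two-group comparison described above can be illustrated with a short Python sketch of a one-way ANOVA F statistic and a variance comparison. This is not the authors' R code. The function names and the data are hypothetical, and the sketch stops at the F statistic because a p-value would require the F distribution from a statistics library.

```python
from statistics import mean, variance


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)          # number of groups
    n = len(all_values)      # total number of observations
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of observations around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = k - 1
    df_within = n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def more_variable_group(wild, domesticated):
    """Name the group with the larger sample variance."""
    v_wild, v_dom = variance(wild), variance(domesticated)
    if v_wild == v_dom:
        return "equal"
    return "wild" if v_wild > v_dom else "domesticated"


# Hypothetical fruit masses (g), for illustration only.
wild_fruit_mass = [1.0, 2.0, 3.0]
dom_fruit_mass = [6.0, 9.0, 12.0]

f_stat, df_b, df_w = one_way_anova_f([wild_fruit_mass, dom_fruit_mass])
print(f_stat, df_b, df_w)
print(more_variable_group(wild_fruit_mass, dom_fruit_mass))
```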

To evaluate whether domestication altered allometric patterns of fruit components, we used data from the literature on 74 non-domesticated species with drupe fruits like those of P. cecropiifolia (Ibarra-Manriquez and Oyama, 1992; Chen et al., 2010; Shiels and Drake, 2011; Bentos et al., 2012). We fitted the power-law regression model of Niklas (1994) (SM = a FM<sup>b</sup>) to the relationship between fruit mass (FM) and seed mass (SM) for these non-domesticated species (including non-domesticated P. cecropiifolia populations), and fitted the same model to domesticated P. cecropiifolia individuals (n = 80), using the qpcR package (Spiess, 2014) in R. We then compared their shape factors (b, the exponent of the power relation between variables) and scaling factors (a, the coefficient of the power relation, which corresponds to the intercept on a log-log scale). The differences between the two equations were evaluated by the overlap of the confidence intervals of the shape and scaling values. We performed a covariance analysis to test the differences in the relations between seed mass and fruit mass considering three groups: wild individuals of P. cecropiifolia, domesticated individuals of P. cecropiifolia and the other species.
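As an illustration of how the coefficient a and the exponent b of SM = a FM<sup>b</sup> can be estimated, the sketch below fits the model by ordinary least squares on log-transformed data. This is a swapped-in, simpler technique and not necessarily the procedure applied with qpcR. The function name and the synthetic data are assumptions made for the example.

```python
import math


def fit_power_law(fruit_mass, seed_mass):
    """Fit SM = a * FM**b by least squares on log10-transformed data.

    Returns (a, b), where a is the coefficient and b is the exponent.
    """
    x = [math.log10(v) for v in fruit_mass]
    y = [math.log10(v) for v in seed_mass]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b = sxy / sxx                    # slope on the log-log scale
    a = 10 ** (y_bar - b * x_bar)    # back-transformed intercept
    return a, b


# Synthetic, noise-free data generated from SM = 0.5 * FM**0.8
# (illustrative values, not data from the study).
fm = [1.0, 2.0, 5.0, 10.0, 20.0]
sm = [0.5 * v ** 0.8 for v in fm]

a_hat, b_hat = fit_power_law(fm, sm)
print(round(a_hat, 3), round(b_hat, 3))
```

On the log-log scale the power law becomes a straight line, log SM = log a + b log FM, so the exponent is the slope and the coefficient is the back-transformed intercept.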

A PCA of the environmental conditions [described above, plus the sum of bases (K + Ca + Mg), which represents a fertility index] was used to evaluate the differences between the environmental conditions of the forest and cultivated sites. To compare the multivariate differences between forest and cultivated sites, we performed an ANOVA on the first two principal axes of the PCA. We evaluated the effect of environmental conditions on the 10 morphological characteristics through simple linear regressions using all populations together to encompass all the environmental and morphological variation observed in the study. We used the PCA axis that best represented the environmental conditions to evaluate their relationships with morphological characteristics. To evaluate the individual effect of environmental variables, we performed simple linear regressions between each environmental variable and each morphological characteristic. All analyses were run in R.
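The fertility index and the simple linear regressions described above can be sketched in Python as follows. The function names, the PC1 scores and the trait values are hypothetical. The sketch assumes that K, Ca and Mg are expressed in the same units, and it omits the significance test of the slope.

```python
def sum_of_bases(k, ca, mg):
    """Fertility index: K + Ca + Mg (all in the same units)."""
    return k + ca + mg


def simple_linear_regression(x, y):
    """Return (slope, intercept, r_squared) for y regressed on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    syy = sum((yi - y_bar) ** 2 for yi in y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Hypothetical population-level values: environmental PC1 scores and
# the mean of one morphological characteristic per population.
pc1_scores = [-2.0, -1.0, 0.0, 1.0, 2.0]
trait_means = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept, r2 = simple_linear_regression(pc1_scores, trait_means)
print(slope, intercept, r2)
```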

Finally, we constructed a conceptual model to present an overview of the combined direct and indirect effects of domestication and environmental conditions on the plant phenotype.

# RESULTS

# Morphometry and Domestication Syndrome

Domestication increased the length, diameter and mass of fruits, seed mass, pulp mass and pulp:fruit mass ratio. In contrast, domestication reduced the number of fruits per bunch, seed:fruit mass ratio (seed:fruit allometry), plant height:DBH ratio and wood density (**Figure 2**, **Table 2**, and **Supplementary Table S1**). Domesticated fruits had 20× greater mass than wild fruits. About 64% of the domesticated fruit is composed of edible pulp, compared to only 34% in wild fruits (**Table 2** and **Supplementary Table S1**). On the other hand, the average values of fruits per bunch, seed:fruit mass ratio, plant height:DBH ratio and wood density were 24.8%, 45.5%, 42.1%, and 21.7% higher in wild populations, respectively (**Table 2** and **Supplementary Table S1**). We found significant differences (p < 0.01) between wild and domesticated populations for all 10 characteristics evaluated (**Supplementary Table S2**).

The first axis of the PCA with the 10 morphological characteristics explained 73.7% of the data variation, highlighting the multivariate differences between wild and domesticated plants (F = 1794, p < 0.001). The second axis explained 7.9% of the data variation and did not differentiate wild from domesticated plants (F = 0.394, p = 0.531) (**Figure 3**). The characteristics most associated with the domestication syndrome, mass, proportion and size of fruits and their components (seed and pulp), were highly and positively correlated (±90%) with axis 1. Hence, PC1 is the axis that best reflects the domestication syndrome.

TABLE 2 | Values of means and variances per plant group (domesticated and wild) of the 10 morphological characteristics evaluated in the study for Pourouma cecropiifolia.

The values in bold indicate the higher variance between the wild and domesticated groups for each characteristic.

Reinforcing the pattern found in the PCA, the clustering and classification analysis (Normal Mixture Modeling) distinguished among groups of domesticated and wild populations for seven morphological characteristics (**Figure 4**). Fruit length, fruit diameter, fruit mass, seed mass, pulp:fruit mass ratio and seed:fruit mass ratio discriminated two groups, the domesticated populations and the wild populations. Pulp mass, however, allowed discrimination of three groups, dividing the domesticated populations into two groups, including four populations in and close to Tabatinga, with higher values of pulp mass than the other four populations further east in the study area. Using the clustering and classification analyses, the number of fruits per bunch, plant height:DBH ratio and wood density did not differentiate wild populations from domesticated populations.

## Phenotypic Variability of Characteristics

Among the fruit characteristics, fruit length, fruit diameter, fruit mass, seed mass, pulp mass, and pulp:fruit mass ratio presented higher variances in domesticated populations, indicating greater phenotypic variability in these characteristics in plants under human selection (**Table 2**). Fruit mass and pulp mass presented much greater variances within the domesticated group. The greater amplitude of variation in these characteristics is also apparent in the density curves (**Figure 2**). Number of fruits per bunch, seed:fruit mass ratio and plant height:DBH ratio presented higher variances in wild populations, while wood density presented similar variances in wild and domesticated populations.

fruit length; FD, fruit diameter; FM, fruit mass; SM, seed mass; PM, pulp mass; PFR, pulp:fruit mass ratio; SFR, seed:fruit mass ratio; HD, plant height:DBH ratio; WD, wood density.

# Allometric Changes in Domesticated Plants

The general allometric model SM = 0.63 FM<sup>0.89</sup> (where SM is seed mass and FM is fruit mass), based on 74 species that have fleshy fruits with only one seed (including P. cecropiifolia wild populations), presented a higher value of the shape factor than the model adjusted to the characteristics of domesticated individuals (SM = 0.44 FM<sup>0.60</sup>). The confidence intervals of the scaling factor overlapped between the equations: in the equation for domesticated individuals the confidence interval of the scaling factor ranged from 0.35 to 0.55, and in the general equation for 74 wild species it ranged from 0.52 to 0.74. The confidence intervals of the shape factor did not overlap: in the equation for domesticated individuals it ranged from 0.50 to 0.69, and in the general equation for 74 wild species it ranged from 0.83 to 0.96. This shows that the two equations are different and that the observed values of seed mass in the P. cecropiifolia domesticated plants are lower than the values predicted by the general allometric equation. The seed:fruit mass ratio changed from approximately 0.9:1 in wild plants to 0.6:1 in domesticated plants (**Figure 5A**). In addition, the correlation between the observed seed mass and the seed mass predicted by the model is higher in wild P. cecropiifolia plants (r<sup>2</sup> = 0.87) than in domesticated plants (r<sup>2</sup> = 0.66, **Figure 5B**). In the ANCOVA, we found significant differences in the intercept and in the slopes between the groups of other species, the wild individuals of P. cecropiifolia and the domesticated individuals of P. cecropiifolia [F(2,215) = 267.14, p < 0.001]. The interaction between fruit mass and groups was also significant [F(2,215) = 228.33, p < 0.001], showing that the allometric relation between seed mass and fruit mass changes as a function of the groups.
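To make the comparison of the two equations concrete, the sketch below evaluates both fitted models at a single fruit mass and checks the overlap of the reported confidence intervals. The coefficients and intervals are those reported above. The function names and the example fruit mass of 10 g are assumptions chosen for illustration.

```python
def predicted_seed_mass(fruit_mass_g, a, b):
    """Seed mass (g) predicted by SM = a * FM**b."""
    return a * fruit_mass_g ** b


def intervals_overlap(ci_1, ci_2):
    """True if two closed intervals (low, high) share at least one point."""
    return ci_1[0] <= ci_2[1] and ci_2[0] <= ci_1[1]


# (a, b) pairs reported in the text: scaling factor a, shape factor b.
GENERAL = (0.63, 0.89)       # 74 non-domesticated species
DOMESTICATED = (0.44, 0.60)  # domesticated P. cecropiifolia

# Confidence intervals reported in the text (domesticated, general).
shape_overlap = intervals_overlap((0.50, 0.69), (0.83, 0.96))
scaling_overlap = intervals_overlap((0.35, 0.55), (0.52, 0.74))

fm = 10.0  # hypothetical fruit mass in g, chosen for illustration
sm_general = predicted_seed_mass(fm, *GENERAL)
sm_domesticated = predicted_seed_mass(fm, *DOMESTICATED)

print(f"shape factor CIs overlap: {shape_overlap}")
print(f"scaling factor CIs overlap: {scaling_overlap}")
print(f"predicted SM at FM = {fm} g: general {sm_general:.2f} g, "
      f"domesticated {sm_domesticated:.2f} g")
```

For a 10-g fruit the general equation predicts a seed of about 4.9 g, whereas the domesticated equation predicts about 1.8 g. Because the exponents differ, the gap between the two predictions widens as fruit mass increases.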

# Effects of Environmental Conditions on Characteristics

The environmental conditions where the wild and the domesticated groups occur in the study area were also very different from each other (**Supplementary Figure S1** and **Supplementary Table S3**). The first principal component explained 91.9% of the data variation and differentiated floodplain forests from cultivated sites [F(1,14) = 18.77, p < 0.001]. The second axis explained 7.3% and did not differentiate floodplain forests from cultivated sites (F = 3.128, p = 0.098). Domestication is strongly correlated with variations in light availability (CII), sum of bases, and calcium, magnesium, phosphorus, silt and sand (**Supplementary Figure S3**). Cultivated sites (terra-firme) have 28% higher light availability, poorer soils (16× lower sum of bases), and 63% sandier soils than floodplain forests (right side of **Supplementary Figure S1**); the floodplain forest sites have lower light availability and more fertile silty soils (**Supplementary Table S3**). Clay and potassium contents were only slightly altered in cultivated areas and are less correlated with domestication (**Supplementary Figure S3**).

The environmental conditions had significant effects on all the morphological characteristics (**Table 3**). The mass and dimensions of fruits (**Figures 6B–D**), seeds (**Figure 6E**), pulp (**Figure 6F**), and pulp:fruit mass ratio (**Figure 6G**) increase in environments with higher available light and poorer sandier soils (**Supplementary Figure S4**). The number of fruits per bunch (**Figure 6A**), seed:fruit mass ratio (**Figure 6H**), plant height:DBH ratio (**Figure 6I**) and wood density (**Figure 6J**) increase in environments with less available light and more fertile silty soils (**Supplementary Figure S4**). Analyzing only those characteristics less associated with domestication, we found a significant increase in wood density due to the increase in potassium content (**Supplementary Figure S4**).

# DISCUSSION

Using the traditional morphometric approach of domestication studies, we found higher mean values and greater variability in the dimensions and mass characteristics of the fruit in domesticated populations. The vegetative characteristics also varied, but to a lesser extent than the fruit characteristics. The domesticated populations showed lower values of plant height:DBH ratio and wood density than wild populations. Using an ecological approach, we found marked changes in the seed:fruit allometric relation. The domesticated fruits have a lower proportion of seed mass than the wild fruits of the same species and the fruits of 74 species of non-domesticated plants with the same fruit type (drupe). In addition, we observed that the morphological characteristics evaluated in P. cecropiifolia are influenced by variations in soil and light conditions. However, it is not easy to dissociate the environmental effect from the domestication effect, because farmers also created the cultivated landscapes.

# Morphological Characteristics in Domesticated Versus Wild Populations

The dimensions and mass of fruits and seeds are the characteristics most modified during the domestication of fruit species (Meyer et al., 2012). Domestication in fruit trees selects extreme phenotypes for size and mass of fruits, and eliminates phenotypes that differ from the preferred phenotype, reducing their frequencies in domesticated populations (Zohary, 2004). In P. cecropiifolia, humans selected for fruits with larger sizes and larger masses than those found in wild populations, in which large fruits are not favored by natural selection (McCouch, 2004). Dispersers of wild P. cecropiifolia in the forest are usually small-sized primates, such as Saguinus mystax, S. fuscicollis (Knogge and Heymann, 2003), Callimico goeldii, Saguinus labiatus (Porter, 2001) and Cebus apella (Gómez-Posada, 2012). Smaller fruits may provide an advantage over large fruits for dispersal by small-sized primates, because they can be more easily removed, transported and dispersed (Tanksley, 2004). On the other hand, in domesticated plants, the larger fruit size does not have an adverse effect, because humans guarantee the seeds' dispersal and the seedlings' establishment.

The number of fruits per bunch was smaller in domesticated populations than in wild populations. This negative correlation between number and size is common in fruit trees (Browning, 1985), due to the reallocation of photoassimilates to fruit size, which demands a decrease in the number of fruits per bunch for biomechanical reasons.

Within the domesticated populations we found geographic variation for fruit size, where a group of populations on the western side of the sample area had larger fruit than a group of populations to the east. This finding supports Clement's (1989) proposal of a larger-fruited landrace close to the triple frontier (Brazil, Colombia, and Peru). Whether this extends as far as Iquitos, Peru, where Ducke (1946) commented on the abundance and popularity of P. cecropiifolia, remains to be investigated.

The lower mean values found in plant height:DBH ratio and in wood density can be explained by changes in edaphic and light conditions, which are discussed in detail below, in the section on the environmental effects on the morphological characteristics. However, it is also possible that domestication generated a reduction in the height, diameter and wood density of the trees, due to the reallocation of photoassimilates to the harvestable product, changing the ratio between the biomass of the harvestable product (the fruits and seeds in the case of P. cecropiifolia) and the total plant biomass (Li et al., 2012). This ratio is called the 'harvest index,' and a negative correlation with plant height is common in many annual crops, such as rice (Li et al., 2012) and sorghum (Can and Yoshida, 1999) due to greater translocation of photoassimilates from the vegetative tissues to grains (Zou et al., 2003). For tree crops, this negative correlation is also expected (Cannell, 1985).

TABLE 3 | Results of the simple regression analyses between morphological characteristics of P. cecropiifolia populations (wild and domesticated, n = 16) and environmental conditions (PC1).

The values of the coefficients in bold indicate that the relationship between the morphological characteristic and the environmental conditions is significant (p < 0.05).

# Is the Variability in Characteristics Under Human Selection Higher?

In addition to the morphological differences between wild and domesticated populations that characterize the domestication syndrome, phenotypic variability is also expected to be greater in useful parts (Pickersgill, 2007). This expectation was observed in domesticated populations of banana (Li et al., 2013), peach palm (Clement, 1988) and tomato (Tanksley, 2004). Although genetic variability generally decreases in domesticated populations, phenotypic variability of selected parts may increase with domestication (McCouch, 2004; Purugganan and Fuller, 2009) due to dispersal and diversification after initial domestication (Meyer and Purugganan, 2013). During the dispersal process, the genetic material under selection is shared and disseminated among different human groups with cultural peculiarities, which may have, for example, different food preferences. This may also promote diversity in domestication syndromes (Milla et al., 2015). In the case of domesticated P. cecropiifolia populations, the Ticuna report fruits with more fibrous pulp and others whose pulp has higher water content, thus generating large variation in pulp and fruit mass. However, the Ticuna also report that they seldom select for the juicier pulp, as the fruits "explode" when they fall on the ground, a common occurrence during harvesting, and are unfit for transport or sale.

environmental conditions in wild populations (circles) and domesticated populations (triangles) of Pourouma cecropiifolia. The x-axis represents the environmental conditions (axis 1 of the PCA; Supplementary Figure S1), where higher values indicate higher sand content and light availability typical of cultivated areas on the terra-firme. Lower values indicate higher soil fertility, and silt and clay contents typical of floodplain forests.

# Alterations in the Seed Mass:Fruit Mass Allometry

Humans selected P. cecropiifolia plants to have larger fruits. In response to selection to increase fruit mass, a faster increase in pulp mass than in seed mass is expected, resulting in fruit with a higher relative proportion of pulp (Martinez et al., 2007; Chen et al., 2010). The increase in pulp mass in P. cecropiifolia is mainly due to the increase in carbohydrates (fibers, cell walls, and starch) (Lopes et al., 1999). Due to the correlation among fruit components, the increase in fruit mass also leads to an increase in seed mass. However, the seeds of domesticated P. cecropiifolia do not follow the same tendency of increase observed in wild plants (SM ∼ FM<sup>0.89</sup>; Niklas, 1994). Seeds contain proteins and oils, which are energetically more "expensive" than carbohydrates (Lopes et al., 1999). Therefore, it is likely that seed mass does not increase in the same proportion as fruit mass in domesticated plants, because the highest selective pressure is on the fruit, which responds with more carbohydrates, and not on the seed, which needs to retain a balance of proteins, carbohydrates, and oils to guarantee good germination.

# The Effects of Environmental Conditions and Their Relations With Domestication

Contrary to expectations for the variation in dimensions and mass of fruits and their components in response to edaphic conditions (Janick and Paull, 2008), we found fruits with larger masses and dimensions, and larger masses of seed and pulp in the domesticated populations, which occur in soils 16x poorer in nutrients and 63% sandier than floodplain soils, where wild populations occur. Having larger fruits with larger seeds and pulp mass in poorer and sandier soils is an indication that the domesticated phenotype is mainly a result of alterations in the genotype resulting from domestication, considering that for P. cecropiifolia the fruit characteristics are the most important for humans and are under direct human selection. The increase of fruit dimensions may have been due to the preferential selection of trees with large fruits and the elimination of trees with small fruits (Zohary, 2004). Considering that the size of the plant affects biomass allocation (Milla and Matesanz, 2017) and considering that the high light availability of the cultivated environment where the domesticated populations of P. cecropiifolia grow generates a decrease in the total size of the plant, it is possible that there is a trade-off that may have led to a lower allocation to vegetative parts and a higher allocation to reproductive organs, as is typical of changes in harvest index (Li et al., 2012).

The subtle variations in vegetative characteristics suggest that they are not under direct selection. These changes are possibly the result of changes in the environmental conditions caused by human management, an indirect effect of domestication (Harlan, 1992; Zohary, 2004) (**Figure 7**). Due to high light availability in cultivated landscapes, the plants of domesticated populations invested less in height growth, because competition with other plants for light is weaker than for individuals in forest landscapes, which have smaller canopy openings (Cannell, 1985). With the greater light availability in the cultivated environment, domesticated plants possibly grow faster than wild plants, and consequently have lower wood density (Poorter, 1999; Poorter et al., 2008). The plant height:DBH ratio in domesticated plants is also affected by the sandier soils of the cultivated areas. Sandy soils retain less water and fewer nutrients due to their lower surface areas (Tarboton, 2003), which affects water supply in dry seasons.

Under natural conditions, wild P. cecropiifolia plants produce proportionately larger seeds due to the difficult conditions of establishment in the shaded understory, as observed in other forest species (Salisbury, 1974; Westoby et al., 1996). In cultivated landscapes where the P. cecropiifolia domesticated plants grow, the low fertility of cultivated soils (Westoby et al., 1990; Hammond and Brown, 1995) and the high light availability promote reduction of seed mass (Eriksson et al., 2000; Moles et al., 2005). In contrast, human selection for larger fruits and the positive correlation between fruit mass and seed mass promote an increase in seed mass in the domesticated plants. Therefore, the combined but contrasting environmental effects and human selection lead domesticated populations to have a proportionately smaller increase in seed mass than wild populations.

By evaluating individually the variables not correlated with landscape domestication, it was possible to observe which morphological changes are effectively responding to environmental variations. The positive correlation between wood density and potassium content in the soil, and the existence of one domesticated population with wood density similar to the wood density of the wild populations, in a site with a high potassium content, shows the environmental effect on the phenotypic plasticity of the P. cecropiifolia individuals. This suggests that, although the effect of environmental conditions in cultivated landscapes can be superimposed on the domestication effect, we cannot ignore the plastic capacity of individuals in explaining the morphological variations of plants under human selection.

Due to the difficulty in dissociating the effect of environmental conditions from the effect of human selection, we suggest that reciprocal transplant experiments of domesticated plants to uncultivated landscapes and wild plants to cultivated landscapes will be needed to effectively differentiate domestication effects from environmental effects. In addition, we consider that in future studies it will be necessary to evaluate experimentally the effect of luminosity on fruit mass, seed mass and in seed:fruit allometry considering the high light availability in cultivated landscapes and because it is expected that there is a positive correlation between light intensity and fruit mass (Moles et al., 2005; Janick and Paull, 2008), but a negative correlation with seed mass (Eriksson et al., 2000; Moles et al., 2005).

In this study, the domesticated phenotype is a result of a combination of human selection and environmental conditions in the sites where the plants are cultivated. We observed strong environmental modification created by humans in cultivated landscapes that is exacerbated by the fact that wild populations occur in flooded areas, while domesticated populations occur in upland areas. These changes in environmental conditions between natural and cultivated sites, in addition to genetic selection by humans, promoted the phenotypic changes in domesticated populations. However, the forces of genetic selection through human management and of natural selection through environmental conditions are intrinsically mixed, and discriminating the magnitude of each component, and of the genotype-by-environment interaction, requires a well-designed common garden experiment (Falconer and Mackay, 1996).

# CONCLUSION


Addressing ecological aspects in plant domestication studies provides us with a more integrated understanding of the evolution of cultivated plants, because the domesticated phenotypes are the result of the combined effects of human selection and natural selection on plant populations. We quantified modifications in numerous components of the domestication syndrome of P. cecropiifolia populations in Western Amazonia. The domesticated plants presented substantial changes in the morphology of their fruits and seeds, and more subtle changes in vegetative characteristics. The combined effect of natural selection and human selection modified the expected pattern in the allometric relations between seed mass and fruit mass, due to the contrasting effects of environmental filters, which promote seed size reduction, and human selection, which promotes seed size increase. The strong correlation between domestication and environmental conditions due to changes in the landscape generated by human management made it difficult to separate environmental effects from human selection effects. Evaluating the effects of environmental conditions and of human selection and management in cultivated landscapes is important for a better understanding of the domestication syndrome. We suggest that the allometric differences between fruits and seeds of wild and domesticated plants can be used in future studies as an additional parameter of the domestication syndrome.

# DATA ACCESSIBILITY

Data for this paper have been archived in figshare: https://figshare.com/articles/data\_Pcecropiifolia\_xlsx/5306380.

# AUTHOR CONTRIBUTIONS

HP: contributed to conception and design of the study, was responsible for field collections and laboratory analyses, conducted the statistical analyses, contributed to the interpretation of the results, wrote the first version of the manuscript, and approved the final version to be published. CC: contributed to conception and design of the study and statistical analyses, participated substantially in the interpretation of the results and drafting of the manuscript, and approved the final version to be published. JS: contributed to conception and design of the study, participated substantially in the statistical analyses, interpretation of the results, and drafting of the manuscript, and approved the final version to be published.


# FUNDING

The research was funded by the Fundação de Amparo a Pesquisas do Estado do Amazonas (FAPEAM) Universal 19776.UNI472.1978.20022014 and FAPEAM/Newton Fund 062.00831/2015.

# ACKNOWLEDGMENTS

We thank the Ticuna people for their permission to work with them and for their assistance in various Indigenous Territories of the upper Solimões River and the Fundação Nacional do Índio (FUNAI) for their authorization (process 08620.046540/2015-39). We thank Dr. Hans ter Steege and the Amazon Tree Diversity Network (ATDN) for the availability of Pourouma cecropiifolia occurrence and abundance data in their permanent plots. Their contact is: http://atdn.myspecies.info/. We also thank the field assistants Cláudia Reis Mendonça and Bernardo Cruz Gabriel.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2018.00203/full#supplementary-material

FIGURE S1 | Principal components analysis of the environmental variables at each cultivated and natural site where Pourouma cecropiifolia was collected for this study.

FIGURE S2 | Correlation matrix among the morphological characteristics of Pourouma cecropiifolia.

FIGURE S3 | Correlation matrix between domestication status of Pourouma cecropiifolia and the environmental variables and among the environmental variables.

FIGURE S4 | Correlation matrix among the morphological characteristics of Pourouma cecropiifolia, among the environmental variables and between the morphological characteristics and the environmental variables.

TABLE S1 | Descriptive statistics at each site for all 10 morphological characteristics analyzed of Pourouma cecropiifolia.

TABLE S2 | Results of ANOVA between the wild and domesticated groups of Pourouma cecropiifolia for each characteristic analyzed.

TABLE S3 | Morphological characteristics of Pourouma cecropiifolia and environmental variables by population, including the values of pH, soil nutrients, soil texture and Crown Illumination Index.





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling editor is currently co-organizing a Research Topic with one of the authors, CC, and confirms the absence of any other collaboration.

Copyright © 2018 Pedrosa, Clement and Schietti. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Testing Domestication Scenarios of Lima Bean (Phaseolus lunatus L.) in Mesoamerica: Insights from Genome-Wide Genetic Markers

María I. Chacón-Sánchez <sup>1</sup> \* and Jaime Martínez-Castillo<sup>2</sup>

<sup>1</sup> Departamento de Agronomía, Facultad de Ciencias Agrarias, Universidad Nacional de Colombia, Bogotá, Colombia, <sup>2</sup> Centro de Investigación Científica de Yucatán, Yucatán, Mexico

Plant domestication can be seen as a long-term process that involves a complex interplay among demographic processes and evolutionary forces. Previous studies have suggested two domestication scenarios for Lima bean in Mesoamerica: two separate domestication events, one from gene pool MI in central-western Mexico and another one from gene pool MII in the area Guatemala-Costa Rica, or a single domestication from gene pool MI in central-western Mexico followed by post-domestication gene flow with wild populations. In this study we evaluated the genetic structure of the wild gene pool and tested these two competing domestication scenarios of Lima bean in Mesoamerica by applying an ABC approach to a set of genome-wide SNP markers. The results confirm the existence of three gene pools in wild Lima bean, two Mesoamerican gene pools (MI and MII) and the Andean gene pool (AI), and suggest the existence of another gene pool in central Colombia. The results indicate that although both domestication scenarios may be supported by genetic data, higher statistical support was given to the single domestication scenario in central-western Mexico followed by admixture with wild populations. Domestication would have involved strong founder effects reflected in loss of genetic diversity and increased LD levels in landraces. Genomic regions affected by selection were detected and these may harbor candidate genes related to domestication.

Keywords: approximate Bayesian computation, SNPs, genotyping-by-sequencing, linkage disequilibrium, founder effects, domestication bottlenecks

# INTRODUCTION

Domestication can be seen as a complex interplay among demographic processes and evolutionary forces that increase the adaptation of wild populations to human-driven environments (Purugganan and Fuller, 2009; Larson and Burger, 2013; Meyer and Purugganan, 2013; Wang et al., 2017). Several questions about domestication have been of interest to evolutionary biologists. One of these questions is the number of times a crop species was domesticated. The traditional approach to address this question has been the identification of monophyletic clusters of extant crop representatives as evidence of single domestication. However, this approach may be misleading because the extent of genetic drift and gene flow (for example among independent cultivation sites) may be effective in erasing early genetic signals of multiple domestications (Allaby et al., 2008; Olsen and Gross, 2008). A second question is the extent of the domestication bottleneck (Ladizinsky, 1985), which translates into a loss of crop genetic diversity. A third question is the geographic area

### Edited by:

Alejandro Casas, National Autonomous University of Mexico, Mexico

#### Reviewed by:

Hovirag Lancioni, University of Perugia, Italy

Baocheng Guo, Chinese Academy of Sciences, China

> \*Correspondence: María I. Chacón-Sánchez michacons@unal.edu.co

#### Specialty section:

This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Plant Science

> Received: 05 July 2017 Accepted: 24 August 2017 Published: 12 September 2017

#### Citation:

Chacón-Sánchez MI and Martínez-Castillo J (2017) Testing Domestication Scenarios of Lima Bean (Phaseolus lunatus L.) in Mesoamerica: Insights from Genome-Wide Genetic Markers. Front. Plant Sci. 8:1551. doi: 10.3389/fpls.2017.01551


where domestication took place. This is basically done by identifying the wild stocks that are most closely related to domestic populations (Salamini et al., 2002); however, profuse gene flow among domesticates and wild populations should be taken into account (Cornille et al., 2012). A fourth question is the time of domestication, namely when domestication occurred and how long it took. This question may be answered by examination of archeological remains of wild and domestic forms (Purugganan and Fuller, 2011). A final question that is key for further genetic improvement of domestic forms is how domestication traits arose, namely those traits that differentiate wild from crop populations (the domestication syndrome). To answer this question it is necessary to identify genomic regions that were affected by selection and also the genes that underlie the genetic control of those traits that arose during the process of adaptation to domestication (Doebley et al., 2006).

The questions mentioned above have traditionally been addressed with genetic markers (on the order of dozens) analyzed by indirect approaches, which mainly involve calculation of genetic distances among wild and domesticated populations and visualization of patterns by means of clustering approaches, assignment tests, etc. Hypotheses are usually proposed from the interpretation of these patterns, but in general these hypotheses are not tested (Gerbault et al., 2014). For domestication, as well as for other evolutionary processes, hypothesis testing may not be an easy task because scenarios may be too complex and datasets too large to be analyzed by many of the available methods based on the calculation of the likelihood function (Gerbault et al., 2014). This difficulty stimulated the development of other approximations such as the so-called Approximate Bayesian Computation (ABC) approach (Beaumont et al., 2002). Due to the genetic stochasticity of evolutionary processes and because evolutionary forces may affect different regions of the genome in different ways, sampling a relatively large number of loci distributed along chromosomes is important. Fortunately, the development of new sequencing and genotyping technologies allows the analysis of genome-wide genetic markers for evolutionary studies, even in non-model plant species (Elshire et al., 2011).

Lima bean is the second most important crop of the genus Phaseolus (after common bean) cultivated worldwide. The conspecific wild ancestor of Lima bean is widely distributed from Mexico to Argentina according to current germplasm and herbarium records (Debouck, 2008). Wild Lima bean is structured into three gene pools (Serrano-Serrano et al., 2010): the Mesoamerican I gene pool (MI) occurs in central-western Mexico, to the north and west of the Isthmus of Tehuantepec; the Mesoamerican II gene pool (MII) is found in Mexico to the south and east of the Isthmus of Tehuantepec, along the coastal plains of the Gulf of Mexico, in Central America, northern South America, southern Peru, Bolivia, and northern Argentina; and the Andean gene pool (AI) is distributed in a narrow geographic range in the Andes of Ecuador and northern Peru.

Lima bean landraces are classified into two major groups, the Mesoamerican and the Andean, according to their geographic origin and seed characteristics. Mesoamerican landraces have small seeds (size ranging from 30 to 78 g/100 seeds, with an average of about 45 g/100 seeds), include the types known as "Sieva" (flat or kidney-shaped small seeds) and "Potato" (globular small seeds), and were domesticated in the Mesoamerica region (Gutiérrez-Salgado et al., 1995; Motta-Aldana et al., 2010). Andean landraces have flat larger seeds known as "Big Lima" (size ranging from 58 to 122 g/100 seeds, with an average of about 87 g/100 seeds) and were domesticated in the Andes of Ecuador and northern Peru (Gutiérrez-Salgado et al., 1995; Motta-Aldana et al., 2010). Previous studies, based on genetic data from a few loci, have proposed two competing scenarios for the origin of the Mesoamerican landraces: (1) two separate domestication events, one from gene pool MI in central-western Mexico and another one from gene pool MII in the area Guatemala-Costa Rica, or (2) a single domestication from gene pool MI in central-western Mexico and post-domestication gene flow with wild populations from gene pool MII (Motta-Aldana et al., 2010). These previous studies have also shown that domestication was accompanied by strong founder effects that decreased genetic diversity of landraces in Mesoamerica and the Andes. Founder effects have so far been quantified with a handful of marker loci [the internal transcribed spacer of the ribosomal DNA (ITS), two non-coding regions of the chloroplast DNA (cpDNA) and a handful of nuclear SSR markers], which raises questions about how well these estimates represent genome-wide patterns of diversity in Lima bean. A key aspect that has not been explored in previous studies is how domestication has affected genome-wide patterns of linkage disequilibrium in landrace populations, an aspect that undoubtedly will increase our understanding of the evolution of domesticated populations.

Although much progress has been made in understanding the evolution of this species in the wild and during domestication, previous genetic data have been based on a very small set of marker loci, which has made it impossible to test the two domestication scenarios outlined above, especially because the uniparental inheritance of the cpDNA and the very poor representation of the nuclear genome are not adequate to discern between hypotheses involving gene flow. In light of the new sequencing technologies and the recent development of methods of analysis, not only may these two hypotheses be tested, but many aspects of the evolution of this crop species may now also be addressed from a genome-wide perspective. The main goal of the present study was to test which of the two domestication hypotheses better fits the data gathered from a set of genome-wide SNP markers genotyped in 270 accessions of wild (160) and domesticated (110) Lima bean, mainly from the Mesoamerican gene pool. Because genome-wide SNP markers provide a more complete picture of the genetic variation of Lima bean, we used these markers to assess from a genomic perspective the genetic structure of Lima bean, the effect of domestication on genome-wide diversity and patterns of linkage disequilibrium in landraces, and to detect genomic regions that may harbor candidate domestication genes.

# MATERIALS AND METHODS

# Plant Material

On the basis of previous studies (Motta-Aldana et al., 2010; Serrano-Serrano et al., 2010, 2012; Andueza-Noh et al., 2013, 2015; Martínez-Castillo et al., 2014), 160 wild and 110 domesticated accessions were selected from the germplasm bank of the International Center for Tropical Agriculture—CIAT and the Centro de Investigación Científica de Yucatán, CICY (see Table S1). Accessions were chosen from those analyzed in previous studies and that had a gene pool assigned according to ITS data. These accessions were complemented with other accessions in order to reflect the known range of distribution of Lima bean in the Americas. Most of the selected accessions have geographic coordinate data (256 out of 270 accessions). Because in this study we are testing domestication hypotheses for the Mesoamerican landraces, the accessions analyzed in this study are mainly (not exclusively) distributed within the potential domestication area in Mesoamerica, namely from Mexico to Costa Rica. Therefore, many of the wild accessions come from countries such as Mexico (78 accessions) and Guatemala (29 accessions) and almost half of the landraces come from Mexico (46 accessions). Also some accessions from South America, classified previously within gene pool MII and the Andean gene pool, were selected. The geographic distribution of the accessions is shown in Figure S1.

# Genotyping by Sequencing (GBS)

DNA was extracted from young leaflets with the method reported by Vega-Vela and Chacón Sánchez (2011). These DNA samples were analyzed by GBS (Elshire et al., 2011) by the Institute for Genome Diversity of Cornell University, USA. Sequence reads were processed and analyzed with the software NGSEP (Duitama et al., 2014; Perea et al., 2016). Sequence reads were de-multiplexed on the basis of their unique barcodes and aligned to the common bean (Phaseolus vulgaris L.) genome used as reference. The reference genome was obtained from Phytozome v. 9.1 (http://www.phytozome.net/) and indexed with the program Bowtie 2 (Langmead and Salzberg, 2012). After variant detection, all samples were genotyped for every variable position. SNPs were retained for further analysis if they were polymorphic, contained less than 10% missing data, had quality values higher than 40, were supported by a minimum read depth of 10, and had a minor allele frequency (MAF) higher than 5%. Also, only samples that had less than 10% missing data were retained. One filtered VCF file was built for the wild accessions (134 wild accessions and 4,593 SNPs) and a second VCF file for wild and domesticated accessions (270 accessions and 4,779 SNPs). These VCF files were annotated using the GFF of common bean and converted to other formats, as needed, by using the tools implemented in the programs NGSEP (Duitama et al., 2014), Tassel v. 5.0 (Bradbury et al., 2007), and PGDSpider v. 2.0.8.2 (Lischer and Excoffier, 2012). All .bam files generated in this study for the 270 accessions were deposited in the SRA database of GenBank under accession number SRP115055.
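The SNP retention criteria described above can be illustrated with a minimal Python sketch. This is not the NGSEP implementation: the `SnpRecord` structure, its field names, and the way read depth is summarized per site are assumptions made here for illustration only; the thresholds are the ones stated in the text.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SnpRecord:
    """Hypothetical per-site record (not an NGSEP data structure).

    genotypes holds the alternative-allele count per sample (0, 1 or 2),
    or None when the genotype call is missing.
    """
    qual: float
    min_depth: int  # assumed summary of read depth supporting the calls
    genotypes: List[Optional[int]]


def missing_fraction(genotypes):
    """Fraction of samples without a genotype call."""
    return sum(g is None for g in genotypes) / len(genotypes)


def minor_allele_frequency(genotypes):
    """MAF of a biallelic site, computed over called diploid genotypes."""
    called = [g for g in genotypes if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def passes_filters(snp, max_missing=0.10, min_qual=40, min_depth=10, min_maf=0.05):
    """Apply the thresholds stated in the text; MAF > 5% implies polymorphism."""
    return (
        missing_fraction(snp.genotypes) < max_missing
        and snp.qual > min_qual
        and snp.min_depth >= min_depth
        and minor_allele_frequency(snp.genotypes) > min_maf
    )


# Toy records with invented values, for illustration only.
good = SnpRecord(qual=60, min_depth=12, genotypes=[0, 1, 2, 0, 1, 0, 0, 2, 1, 0])
rare = SnpRecord(qual=60, min_depth=12, genotypes=[0] * 19 + [1])
sparse = SnpRecord(qual=60, min_depth=12, genotypes=[None, None, 0, 1, 2, 0, 1, 0, 0, 2])
```

In this toy example only `good` would be retained: `rare` fails the MAF threshold and `sparse` exceeds the allowed proportion of missing data.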

# Genetic Structure of Wild Lima Bean

Several tools were applied to evaluate the genetic structure of wild Lima bean. First, a Nei's standard genetic distance matrix was built among individuals with the software GenAlex v. 6.5 (Peakall and Smouse, 2012). This genetic distance matrix was then used to build an unrooted neighbor-joining (NJ) tree with the software Phylip (Felsenstein, 1993) and to carry out a principal coordinate analysis (PCoA) in GenAlex v. 6.5 in order to explore the major grouping patterns in the dataset. Second, the Bayesian clustering approach implemented in the software Structure v. 2.3.4 (Pritchard et al., 2000) was applied. The model used was admixture with correlated allele frequencies. We evaluated values of K from 2 to 6 and ran 10 independent simulations for each value of K. Each simulation consisted of a burn-in period of 100,000 steps followed by 100,000 MCMC (Markov Chain Monte Carlo) steps. The software CLUMPP (Jakobsson and Rosenberg, 2007) was used to obtain a single Q matrix for each K and the software Structure Harvester (Earl and vonHoldt, 2012) was used to obtain the optimal K according to Evanno et al. (2005). For the optimal K, each individual was assigned to the population from which it derived more than 70% of its ancestry; otherwise, the individual was classified as admixed. For each wild cluster observed, several measures of genetic variability were calculated with the software GenAlex, namely, average number of alleles per locus (NA), average effective number of alleles per locus (NE), Shannon's diversity index (I), observed heterozygosity (HO), expected heterozygosity (HE), and fixation index (F).
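The assignment rule based on the 70% ancestry threshold can be sketched as follows. The function name and the example Q-matrix rows are hypothetical; only the threshold comes from the text.

```python
def assign_cluster(q_row, threshold=0.70):
    """Assign an individual to the cluster contributing more than
    `threshold` of its ancestry; otherwise label it as admixed.

    q_row: ancestry coefficients for one individual (one row of a Q matrix).
    """
    best = max(range(len(q_row)), key=lambda k: q_row[k])
    if q_row[best] > threshold:
        return f"K{best + 1}"
    return "admixed"


# Invented ancestry coefficients for K = 5, for illustration only.
example_rows = {
    "acc_A": [0.85, 0.05, 0.05, 0.03, 0.02],
    "acc_B": [0.50, 0.40, 0.10, 0.00, 0.00],
}
labels = {name: assign_cluster(row) for name, row in example_rows.items()}
```

Here `acc_A` is assigned to `K1` and `acc_B` is labeled `admixed`, because no single cluster contributes more than 70% of its ancestry.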

# Geographic Areas and Number of Domestication Events in Lima Bean

To recover genetic clusters of wild and domesticated accessions, the same methods outlined above for wild Lima beans were used. The clustering patterns among wild and domesticated accessions may indicate how many times and where Lima beans were domesticated. Two competing domestication scenarios (scenarios 1 and 2) (see **Figure 1**) were evaluated by an approximate Bayesian computation (ABC) approach. In both scenarios, the two Mesoamerican wild populations (MI and MII) diverged at time t3. The time of divergence of wild populations is much older than the divergence of domesticated populations from their wild sources (t1 and t2). The origin of each domesticated population from its wild source population involved a domestication bottleneck (db) for a number of t generations and an effective size Ni smaller than the effective size of the source population. After the domestication bottleneck, the domesticated populations reached a larger and stable population size. All wild source populations are stable in size.

For the ABC estimation of the posterior probabilities of the two domestication scenarios, the software DIYABC 2.1.0 was used (Cornuet et al., 2014). For this approach, a new SNP matrix (2,527 SNPs) was built to contain 30 accessions from wild gene pool MI, 30 accessions from wild gene pool MII, 30 accessions from domesticated gene pool MI and 17 accessions (the only accessions available) from domesticated gene pool MII, for a total of 107 accessions. These accessions are marked with an asterisk in their ID in Table S1. In selecting these accessions, care was taken

FIGURE 1 | Domestication scenarios of Lima bean evaluated by an approximate Bayesian computation approach. In both scenarios, the two Mesoamerican wild populations (MIW and MIIW) diverged at time t3. The time of divergence of wild populations is much older than the divergence of domesticated populations (MID and MIID) from their wild sources (t1 and t2). The origin of each domesticated population from its wild source population involved a domestication bottleneck (db) for a number of t generations and an effective size Ni smaller than the effective size of the source population. After the domestication bottleneck, the domesticated populations reached a larger and stable population size. All wild source populations are stable in size. (A) Scenario 1: Mesoamerican landraces (MID and MIID) come from two independent domestication events from MIW and MIIW, respectively. (B) Scenario 2: Mesoamerican landraces (MID) come from one domestication event from MIW. MIID landraces are the product of admixture between MID and MIIW, at an admixture rate of ra. Full explanation of parameters is found in Table 1.

to not include accessions classified as admixed in the Structure analysis because this analysis assesses the recent contribution of the different populations to the accessions (Cornille et al., 2012), and for the ABC analysis we want to assess historical contributions, not recent ones. In this ABC approach, a total of 200,000 genetic datasets were simulated under the coalescent model and each scenario was considered to be equally probable. In these simulations, the values of the parameters, which can be seen in **Table 1**, were drawn from their prior distribution and were kept as generic as possible. For Lima bean, one generation corresponds to 1 year. Conditions were set as t3 > t2 and t3 > t1, according to previous studies (Serrano-Serrano et al., 2010). The duration of bottleneck (db1 and db2) was set between one and three thousand generations because previous studies suggest that domestication may involve slow rates of evolution (Purugganan and Fuller, 2011). Time of domestication (t1 and t2) was set between 2,000 and 10,000 generations (or years) according to archeological data for Lima bean (Kaplan and Lynch, 1999).

For each scenario, a pre-evaluation of the model-prior combination was done in two ways. First, a principal component analysis was done on 10,000 simulated datasets to see how the observed data are located in relation to the space of simulated data. Second, simulated summary statistics were evaluated to count how often they were overestimated or underestimated in relation to the observed data.

For each simulation, the summary statistics calculated were mean genetic diversity and mean genetic divergence measured by FST (and for scenario 2 an additional summary statistic was admixture rate, ra). The calculation of the posterior probability of each scenario is based on the distance or difference that exists between the summary statistics calculated for each simulated dataset and the observed dataset. The estimation of the posterior probability of scenarios was done by applying a direct approach and a weighted polychotomous logistic regression on 500 and 10,000 simulated datasets, respectively. In this regression the predictor variables were the differences between observed and simulated summary statistics.
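The idea behind the direct approach can be illustrated with a small Python sketch: among the simulated datasets closest to the observed one, the posterior probability of a scenario is approximated by the proportion of those datasets that were generated under it. This is a simplified illustration and not the DIYABC code; the plain Euclidean distance on unscaled statistics, the function name, and all numeric values are assumptions.

```python
import math
from collections import Counter


def direct_posterior(simulations, observed, n_closest):
    """Approximate scenario posterior probabilities by the direct approach.

    simulations: list of (scenario_label, [summary statistics]) pairs.
    observed:    summary statistics of the observed dataset.
    n_closest:   number of closest simulated datasets to keep.
    """
    # Rank simulated datasets by their distance to the observed statistics.
    ranked = sorted(simulations, key=lambda s: math.dist(s[1], observed))
    closest = ranked[:n_closest]
    counts = Counter(label for label, _ in closest)
    scenarios = sorted({label for label, _ in simulations})
    return {label: counts[label] / len(closest) for label in scenarios}


# Invented (mean diversity, mean FST) values, for illustration only.
toy_simulations = [
    (1, [0.30, 0.20]),
    (1, [0.32, 0.22]),
    (1, [0.50, 0.40]),
    (2, [0.25, 0.10]),
    (2, [0.26, 0.11]),
    (2, [0.24, 0.12]),
]
toy_observed = [0.25, 0.11]
posterior = direct_posterior(toy_simulations, toy_observed, n_closest=4)
```

With these toy values, three of the four closest simulations come from scenario 2, so the direct estimate is 0.75 for scenario 2 and 0.25 for scenario 1. In a real analysis the summary statistics would normally be put on a common scale before distances are computed.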

The best domestication scenario was selected as the one with the highest posterior probability. We evaluated the confidence in choosing the best scenario by simulating 1,000 pseudo-observed data sets (pods). Then, posterior probabilities of the scenarios were calculated and finally type I and type II errors were estimated. To assess the fit of the best scenario in relation to the observed dataset, the model checking option implemented in DIYABC was used. This option evaluates the goodness of fit of model-posterior combinations. Finally, the posterior distributions of parameters under the best scenario were calculated by using the 1% of simulated datasets that are closest to the observed dataset. The bias and precision in parameter estimation were calculated with 1,000 pods using the several dispersion measures implemented in DIYABC.
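As a rough illustration of how error rates can be estimated from pods, the sketch below assumes the usual convention in ABC model choice: the type I error is the proportion of pods simulated under the focal scenario for which another scenario is selected, and the type II error is the proportion of pods simulated under the alternative scenario for which the focal scenario is selected. The function name and the toy outcomes are hypothetical.

```python
def error_rates(pods, focal):
    """Estimate type I and type II error rates for a focal scenario.

    pods: list of (true_scenario, selected_scenario) pairs, one per
          pseudo-observed dataset.
    """
    selected_when_true = [sel for true, sel in pods if true == focal]
    selected_when_false = [sel for true, sel in pods if true != focal]
    # Type I: focal scenario is true but a different scenario is chosen.
    type_i = sum(sel != focal for sel in selected_when_true) / len(selected_when_true)
    # Type II: focal scenario is chosen although another scenario is true.
    type_ii = sum(sel == focal for sel in selected_when_false) / len(selected_when_false)
    return type_i, type_ii


# Invented outcomes for eight pods, for illustration only.
toy_pods = [(2, 2), (2, 2), (2, 1), (2, 2), (1, 1), (1, 2), (1, 1), (1, 1)]
type_i, type_ii = error_rates(toy_pods, focal=2)
```

In this toy example both error rates equal 0.25, since one of four pods in each group is misclassified.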

# Domestication Founder Effects

To investigate founder effects due to domestication on the Lima bean genome two analyses were conducted in a sample of 160 wild and 110 domesticated accessions. First, the percent reduction (%r) in expected heterozygosity among wild (HEW) and domesticated accessions (HED) was calculated as follows: %r = (HEW-HED)/(HEW). This reduction was measured in the whole sample and within the Mesoamerican and Andean gene pools. This reduction was also calculated locus by locus for the whole sample of accessions. Second, to compare the level of linkage disequilibrium (LD) by chromosome among wild and domesticated accessions, the full matrix option of the program Tassel was used. The method to detect regions with significant differences in LD among wild and domesticated accessions is described below.
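The reduction in diversity defined above can be sketched in Python as follows. The formula for the reduction is the one given in the text, which yields a proportion (multiply by 100 to express it as a percentage). The per-locus estimator HE = 1 − Σp² is the standard definition and is assumed here; the allele frequencies are invented for illustration.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: 1 minus the sum of squared
    allele frequencies (standard definition, assumed here)."""
    return 1.0 - sum(p * p for p in allele_freqs)


def reduction_in_diversity(he_wild, he_dom):
    """Reduction as defined in the text: (HEW - HED) / HEW."""
    return (he_wild - he_dom) / he_wild


# Invented allele frequencies at a biallelic SNP, for illustration only.
he_wild = expected_heterozygosity([0.5, 0.5])  # 0.5
he_dom = expected_heterozygosity([0.9, 0.1])   # about 0.18
reduction = reduction_in_diversity(he_wild, he_dom)
```

For these hypothetical frequencies the reduction is about 0.64, that is, domesticated accessions would retain roughly 36% of the wild expected heterozygosity at this locus.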

# Outlier Loci and Genomic Regions in High LD Related to Domestication

In order to identify outlier loci related to domestication, two analyses were carried out. First, loci that show significant divergence among wild and domesticated beans were detected by means of their FST values using the program Bayescan (Foll, 2012). In order to increase the number of SNP markers analyzed by Bayescan, a new SNP matrix was created with the 95 wild and domesticated accessions with the highest coverage from the sequencing process. In this new matrix, with less missing data, we could retain 7,759 high-quality SNPs. We used Bayescan to make two comparisons: (1) among all wild and all domesticated accessions and (2) among wild and domesticated accessions within the Mesoamerican and within the Andean gene pool. For Bayescan we used a burn-in period of 50,000 followed by 100,000 iterations. To declare outlier loci we used a false-discovery rate of 0.05. Second, the island model proposed by Excoffier et al. (2009) and implemented in the software Arlequin (Excoffier and Lischer, 2010) was applied to identify outlier loci based on FST statistics in a set of 100 wild and domesticated accessions from gene pool MI (2,827 SNPs). For this, we simulated 50,000 datasets with two main groups (wild and domesticated), 100 demes within each group, and a maximum expected heterozygosity of 0.5. Loci that fell outside the 99% confidence intervals of the null distribution were declared outliers.

Genomic regions with significant differences in LD between wild and domesticated beans were detected by applying the methods implemented in the software varLD (Ong and Teo, 2010). varLD aims to identify genomic regions with significant differences in LD patterns among two populations relative to the LD differences found in the rest of the genome (Teo et al., 2009). The significance of the varLD scores was obtained by resampling methods.

After identifying the outlier SNP loci based on FST, these SNPs were structurally and functionally annotated by comparing the physical position of each SNP with the GFF annotation of the P. vulgaris genome reported by Schmutz et al. (2014). This annotation allowed us to see whether the SNPs had nonsynonymous effects on coding sequences. Because the genomic regions with different LD patterns among wild and domesticated accessions may contain genes related to domestication, we counted within the LD regions the number of genes that displayed domestication signatures in P. vulgaris, according to the results reported by Schmutz et al. (2014), and then we evaluated whether or not the function of these genes was related to domestication.

# RESULTS

# SNP Detection by GBS

The GBS technique produced a total of 565,386,739 sequence reads for the 270 accessions analyzed. Of these, 242,489,038 reads were aligned at unique positions in the reference genome, 112,212,070 reads were aligned at multiple positions and 210,685,631 reads were not aligned. The percentage of reads aligned was 63%. On average, every sample produced 2,094,025 reads, of which 898,108 aligned to unique positions. The reads aligned at unique positions were further used for variant detection among the samples analyzed. This process produced an unfiltered VCF file containing 895,110 biallelic SNPs and 11,265 biallelic indels. A total of 322,052 biallelic SNPs were located in coding regions; 162,892 of them were synonymous substitutions, 157,279 were missense substitutions, and 1,881 were non-sense substitutions. After filtering, a total of 4,779 biallelic SNPs were retained, 3,439 of them located in coding regions: 2,245 were synonymous substitutions, 1,192 missense substitutions, and two non-sense substitutions. Therefore, synonymous substitutions were about twice as frequent as non-synonymous substitutions in the filtered dataset. Transitions were also more frequent than transversions (transition/transversion ratio = 1.32). Raw read number, mapped reads, and number of SNPs genotyped per accession can be seen in Table S2. The distribution of the SNPs by chromosome can be seen in Figure S2. Given a genome size of common bean of about 587 Mb (Schmutz et al., 2014), the map density would be one SNP every 123 Kb.
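The summary figures in this paragraph follow directly from the read and SNP counts. The short Python sketch below reproduces the arithmetic using only numbers stated in the text; the variable names are chosen here for illustration.

```python
# Read counts reported in the text.
total_reads = 565_386_739
unique_reads = 242_489_038
multi_reads = 112_212_070
unaligned_reads = 210_685_631
n_accessions = 270

# The three read categories should add up to the total.
assert unique_reads + multi_reads + unaligned_reads == total_reads

aligned_pct = 100 * (unique_reads + multi_reads) / total_reads  # about 62.7%
reads_per_sample = total_reads / n_accessions                   # about 2,094,025
unique_per_sample = unique_reads / n_accessions                 # about 898,108

# Filtered coding SNPs: synonymous versus non-synonymous (missense + non-sense).
syn_to_nonsyn = 2_245 / (1_192 + 2)                             # about 1.88

# Marker density, assuming a 587 Mb genome and 4,779 retained SNPs.
density_kb = 587_000_000 / 4_779 / 1_000                        # about 122.8 Kb
```

The ratio of about 1.88 corresponds to the statement that synonymous substitutions are roughly twice as frequent as non-synonymous ones, and the density of about 122.8 Kb rounds to the one SNP every 123 Kb given in the text.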

# Genetic Structure of Wild Lima Bean

Genetic structure of the wild Lima bean was explored in a set of 134 wild accessions and a total of 4,593 SNPs by means of a NJ analysis, a PCoA and assignment tests implemented in the software Structure. The results of these analyses can be seen in **Figures 2**, **3**, **4**. The results of Structure indicated that the optimum K was 5. Accessions could be assigned to four of these populations (K1 to K4) and the rest of accessions (20 in total, 15%) were classified as admixed. In general, there was a high concordance in the clustering patterns obtained with the three kinds of analyses (except for the set of 20 admixed accessions), therefore 114 wild accessions could be assigned to four different clusters or gene pools (MI, MII, AI, and AII).

Table S1 shows the PCoA and Structure clusters assigned to each accession. A short description of each cluster is given below. Discrepancies between our current classification of accessions in gene pools based on GBS data and previous studies based on ITS data can be seen in Table S3.

### Cluster 1

This cluster contains 48 accessions mainly distributed in Mexico (humid coastal plains of the Gulf of Mexico in the states of Veracruz and Tamaulipas, Chiapas, Oaxaca and the Peninsula of Yucatán), in Central America (Guatemala, Honduras and Costa Rica), northern Colombia, Ecuador (Azuay), Peru (Cajamarca and Junín), and Argentina (Salta) (**Figure 5**). This is the most widely distributed cluster and corresponds to the Mesoamerican gene pool MII of previous studies (Serrano-Serrano et al., 2010, 2012), therefore we will call this cluster MII hereafter. A total of 29 accessions in this cluster have been evaluated for seed size at CIAT. Seed size varies from 4 to 15.3 g/100 seeds, with average of 8.8 g/100 seeds, a seed size that is within the range of Mesoamerican wild Lima beans.

## Cluster 2

This cluster contains 45 accessions mainly distributed in Mexico (36 accessions) along the western and southern Pacific coastal plains and hills in the states of Sinaloa, Nayarit, Jalisco, Colima, Michoacán, Guerrero, and Oaxaca, the Gulf of Mexico dry coastal plains in the state of Tamaulipas, one accession in Yucatán, two accessions in Puebla and two in Morelos, a few accessions in Central America (Guatemala and Belize), one accession in Cuba, one in northern Colombia (Atlántico department) and three in Ecuador (Chimborazo, Imbabura and Pichincha) (**Figure 5**). This is the second most widely distributed gene pool and its geographic distribution is within the range of the Mesoamerican gene pool MI reported in previous studies, except for the three accessions from Ecuador (G26469, G26606, G26751A). We will call this cluster MI hereafter. A total of 21 accessions have seed size data reported by CIAT. Seed size ranges from 5 to 25.4 g/100 seeds, with an average of 10.9 g/100 seeds, a size that is within the range of Mesoamerican wild Lima beans.

detected by GBS. Names on the right side indicate the clusters detected. Numbers on nodes indicate bootstrap support.

## Cluster 3

This cluster contains 15 accessions from South America (six accessions from Ecuador, one from Colombia, and one from Peru), Guatemala (five accessions: G25844, G26653, G26655, G26684, G26732), one accession from Honduras (G26630), and one from Mexico (Chiapas, G26753) (**Figure 5**). All the accessions have seed size data taken by CIAT and show an average of 13.6 g/100 seeds. The geographic distribution of these accessions in Ecuador and Peru corresponds to the range of the Andean gene pool observed in previous studies, and the close genetic relationship of these Andean accessions with accessions from Central America has not been reported before. We will call this cluster Andean I (AI) hereafter.

## Cluster 4

Only six accessions, all of them from the central departments of Boyacá and Cundinamarca in Colombia, belong to this cluster. Seed size in this cluster ranges from 12.2 to 17.4 g/100 seeds, with an average of 15 g/100 seeds, a range characteristic of Andean wild Lima beans. We will call this cluster AII hereafter.

**Table 2** summarizes genetic diversity values for the four wild gene pools or clusters. Heterozygous genotypes were practically absent, or very rare, in the SNPs analyzed, as expected for an autogamous species. Among the four wild gene pools, the MII gene pool (H<sub>E</sub> = 0.138) was slightly more diverse than the MI gene pool (H<sub>E</sub> = 0.115), and gene pools AI and AII were four to five times less diverse, although this may be because the sampling scheme in this study was more focused on the Mesoamerican wild gene pool.
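To make the link between selfing and the near absence of heterozygotes concrete, the sketch below computes observed heterozygosity, expected heterozygosity, and the fixation index for a single biallelic SNP. It assumes the uncorrected Hardy-Weinberg expectation 2pq and F = 1 - H<sub>O</sub>/H<sub>E</sub>; the software used by the authors may apply a sample-size correction, and the genotype counts are hypothetical.

```python
def locus_diversity(n_ref_hom, n_het, n_alt_hom):
    """Observed/expected heterozygosity and fixation index for one SNP.

    Arguments are genotype counts (hypothetical inputs, not study data).
    """
    n = n_ref_hom + n_het + n_alt_hom
    if n == 0:
        raise ValueError("no genotyped individuals at this locus")
    p = (2 * n_ref_hom + n_het) / (2 * n)   # reference allele frequency
    q = 1.0 - p
    h_obs = n_het / n
    h_exp = 2.0 * p * q                     # Hardy-Weinberg expectation
    # F is undefined at monomorphic loci (h_exp == 0).
    f = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return h_obs, h_exp, f


# A locus with no heterozygotes among 100 individuals, as under selfing.
h_obs, h_exp, f = locus_diversity(n_ref_hom=90, n_het=0, n_alt_hom=10)
print(h_obs, round(h_exp, 3), f)
```

With no heterozygotes at a polymorphic locus, the fixation index reaches its maximum, which is the pattern expected for a predominantly autogamous species.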

# Domestication Patterns in Lima Bean

## Indirect Methods

The genetic relationships among 160 wild and 110 domesticated accessions were studied by means of genetic distances, unrooted NJ topologies (**Figure 6**), PCoA (**Figure 7**), and Bayesian approaches with the software Structure (**Figure 8**). These analyses may tell us how many times and where Lima beans were domesticated. The first axis of the PCoA explained 28.40% of the variation, the second explained 15.54%, and the third explained 3.15%, for a cumulative total of 47.09%. **Figures 6**–**8** show that domesticated accessions grouped along with wild accessions within clusters MI, MII, and AI. The Structure analysis indicated that the optimum K was three: one for wild and domesticated accessions within gene pool MI (population K2), another for wild and domesticated accessions within gene pool MII (population K1), and a third for wild and domesticated accessions within gene pool AI (population K3). The Structure analysis reported a total of 14 admixed accessions: 8 domesticated and 6 wild. Figure S3 shows the global ancestry derived from gene pools AI, MI, and MII for these 14 admixed accessions. The geographic distribution of wild and domesticated accessions within the MI, MII, and AI clusters and of the admixed or outlier accessions can be seen in **Figure 9**.

In the cluster MI, most of the domesticated accessions (74 accessions, 67%), with an average seed size of 48 g/100 seeds, grouped together with 53 wild accessions. In this cluster we found three domesticated accessions from Bolivia, Ecuador, and Peru (G25981, G26480, G25909, respectively) that show larger seed sizes (from 64 to 121 g/100 seeds) typical of Andean beans.

In the cluster MII, only 15% of the domesticated accessions (17 accessions), with an average seed size of 53 g/100 seeds, grouped together with 82 wild accessions. In this cluster we found three accessions, one from Bolivia, and two from Ecuador (G27337, G26659, G26672, respectively) that show larger seed sizes (from 75 to 115 g/100 seeds) typical of Andean beans.

In the cluster AI, 10% of the domesticated accessions (11 accessions), with an average seed size of 66 g/100 seeds, grouped together with 14 wild accessions. In this cluster we found four accessions (G26290, G26438, G25277, G25771) from Argentina, Costa Rica, El Salvador, and Mexico, respectively, that show smaller seed sizes (from 40 to 53 g/100 seeds), typical of Mesoamerican landraces. These accessions were classified as Mesoamerican in a previous study based on ITS polymorphisms (Serrano-Serrano et al., 2012). The sampling in this study was more focused on wild and domesticated accessions from Mesoamerica; for this reason, the clustering patterns of MI and MII are described in more detail below.

It can be seen in **Figure 6** that within the MI cluster, domesticated accessions and wild accessions are grouped within separate subclusters. Within the domesticated subcluster, however, 13 wild accessions were found. These wild accessions come from Mexico (four accessions in Morelos, Oaxaca and Yucatán), Guatemala (four accessions), Cuba (one weedy accession), Colombia (one accession), and Ecuador (three accessions). The inverse is also true: within the wild subcluster two domesticated accessions were found, one from Panama and another one from Peru. The significance of this is not clear but may indicate cases of introgression, at least for the wild sample in Yucatán where cases of introgression among wild populations and landraces have been documented (Martínez-Castillo et al., 2007; Dzul-Tejero et al., 2014). Also, in the PCoA plot of axis 1 vs. 3, we can see that most of the MI domesticated accessions tend to cluster together and apart from the MI wild accessions on axis 3 (see Figure S4).

## ABC Approach

We compared two domestication scenarios with an ABC approach (**Figure 1**) to evaluate the hypotheses of single or multiple domestications of landraces in Mesoamerica. Before estimating posterior probabilities of both scenarios, we evaluated model-prior combinations (see Table S4 and Figure S5). The PCA plots show that the observed data fall within the cloud of simulated data for both scenarios, indicating a good model-prior combination. Table S4 shows that for most of the summary statistics analyzed, simulated data are not significantly different from observed data.

With the direct approach, the best-supported domestication scenario was scenario 1 (although scenario 2 also received some support), whereas with the logistic approach it was scenario 2 (with no support for scenario 1) (see **Table 3** and Table S5), indicating that both scenarios may be supported by the genetic data.

The goodness of fit of the model-posterior combination was assessed by means of a PCA using the model checking option of DIYABC. Figure S6 shows that the observed data fall within the cloud of simulated data based on posterior distributions, indicating a good fit of the model-posterior combination for scenarios 1 and 2.

Parameter estimation was done on the basis of scenarios 1 and 2 separately and not by averaging because parameter values show large differences between the two scenarios (see **Table 4** and Figures S7, S8). The performance of the method for parameter estimation was assessed by means of several bias and error measures (see Table S6).

# Founder Events

**Table 2** shows the global percent reduction (%r) in genetic diversity in domesticated accessions compared to their wild ancestors (founder effect). Domesticated accessions compared to wild accessions (for the whole sample and within gene pool MI) are less diverse in terms of P, NE, I and HE. P was 25% higher in MI wild gene pool compared to MI domesticated gene pool, and 10% higher in MII wild gene pool compared to MII domesticated gene pool. P among wild and domesticated accessions within gene pool AI was very similar.

TABLE 2 | Diversity indexes for wild and domesticated accessions of Lima bean and for the four gene pools observed in this study, calculated on the basis of 4,779 SNP markers.

N, Average sample size; P, Percent of polymorphic loci; N<sub>A</sub>, Average number of alleles per locus; N<sub>E</sub>, Average effective number of alleles per locus; I, Shannon's diversity index; H<sub>O</sub>, Observed heterozygosity; H<sub>E</sub>, Expected heterozygosity; F, Fixation index; %r, Reduction in genetic diversity due to domestication (or founder effect) calculated as (H<sub>EW</sub> - H<sub>ED</sub>)/H<sub>EW</sub>, where H<sub>EW</sub> is genetic diversity in wild accessions and H<sub>ED</sub> is genetic diversity in domesticated accessions.

The founder effect when all domesticated accessions were compared to all wild accessions was around 18%. When this effect was measured within gene pools, contrasting values were obtained. While the founder effect within the MI gene pool was around 31%, there was no evidence of founder effect within gene pools MII and AI.

%r was also calculated locus-by-locus in the whole sample. About two-thirds of the loci (3,167) showed positive values of %r, ranging from 1% to 100% with a mean of 44%. Positive values of %r indicate a founder effect, namely that domesticated accessions contain less diversity than wild accessions. About one-third of the loci (1,562) showed negative values of %r, indicating that at these loci domesticated accessions contain more genetic diversity than wild accessions (on average 31% more). Lastly, 48 loci showed values of zero, with no evidence of reduction in genetic diversity.
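The locus-by-locus classification described here can be sketched in a few lines of Python, using the %r definition given in the Table 2 footnote. The function names and per-locus values are hypothetical, and the sketch assumes each locus is polymorphic in the wild sample, since %r is undefined when wild diversity is zero.

```python
def percent_reduction(he_wild, he_dom):
    """%r = (HE_wild - HE_dom) / HE_wild, expressed as a percentage."""
    if he_wild <= 0:
        raise ValueError("he_wild must be positive for %r to be defined")
    return 100.0 * (he_wild - he_dom) / he_wild


def classify_loci(pairs):
    """Count loci with positive, negative, and zero %r.

    `pairs` is an iterable of (he_wild, he_dom) tuples, one per locus.
    """
    counts = {"positive": 0, "negative": 0, "zero": 0}
    for he_wild, he_dom in pairs:
        r = percent_reduction(he_wild, he_dom)
        if r > 0:
            counts["positive"] += 1
        elif r < 0:
            counts["negative"] += 1
        else:
            counts["zero"] += 1
    return counts


# Hypothetical per-locus values, for illustration only.
toy_loci = [(0.40, 0.20), (0.10, 0.15), (0.30, 0.30), (0.25, 0.0)]
print([round(percent_reduction(w, d), 1) for w, d in toy_loci])
print(classify_loci(toy_loci))
```

A positive value marks a locus where landraces are less diverse than wild accessions, and a negative value marks the opposite.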

The fact that some loci are genetically more diverse in domesticated populations than in wild populations when the whole sample is analyzed may be a result of population structure within the domesticated gene pool. For this reason, %r was measured locus-by-locus within each gene pool (MI, MII, and AI) (see **Table 2**). Within gene pool MI, most loci (around 84.6%) showed positive values (on average a 78% reduction in genetic diversity at these loci in landraces), 15% of loci showed negative values (on average 34% more diversity at these loci in landraces), and only 0.4% of loci showed no reduction in genetic diversity. Within gene pool MII, the results were quite different: most loci (around 54.5%) showed negative values (on average 33% more diversity in landraces at these loci), 42.3% of loci showed positive values (on average a 53% reduction in genetic diversity in landraces), and 3.2% of loci showed no reduction in genetic diversity. In the Andean gene pool (AI), 79% of loci showed a reduction in genetic diversity in the landraces (an average reduction of 47%), and 21% of loci showed an increase in genetic diversity in the landraces (on average these loci contain 16% more genetic diversity).

To compare LD between wild and domesticated accessions, chromosome-wise LD values were calculated as the squared correlation coefficient r<sup>2</sup>. **Table 5** shows that in all 11 chromosomes, average values of r<sup>2</sup> are larger in domesticated accessions than in wild accessions (an increase of about 20–30%), possibly as a consequence of the reduction in genetic diversity during domestication and of selection. In the section below, we identify particular chromosome regions with large differences in LD between wild and domesticated accessions that may indicate regions under selection.
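As an illustration of the LD statistic, the sketch below computes r<sup>2</sup> between two SNPs as the squared Pearson correlation of allele-count vectors. This genotype-correlation estimator is an assumption made for illustration; this section does not state which estimator the authors' software used, and the genotypes are hypothetical.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length vectors.

    x and y hold alternate-allele counts (0, 1 or 2) per accession
    at two SNPs; accessions must be in the same order in both.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length vectors with >= 2 samples")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when either SNP is monomorphic
    return cov * cov / (var_x * var_y)


# Hypothetical genotypes for six accessions at three SNPs.
snp_a = [0, 0, 2, 2, 0, 2]
snp_b = [0, 0, 2, 2, 0, 2]   # identical to snp_a: complete LD
snp_c = [0, 2, 0, 2, 2, 0]   # weakly associated with snp_a
print(r_squared(snp_a, snp_b), r_squared(snp_a, snp_c))
```

Averaging such pairwise values over SNP pairs on a chromosome, separately for wild and domesticated accessions, gives the kind of chromosome-wise comparison summarized in Table 5.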

FIGURE 6 | NJ topology showing the genetic relationships among the 270 wild and domesticated Lima bean accessions included in this study on the basis of 4,779 SNPs detected by GBS. Names on the right side indicate the clusters detected. Numbers on nodes indicate bootstrap support. Within cluster MI, wild accessions are shown as light blue lines and domesticated accessions as dark blue lines. Within cluster MII, wild accessions are shown as light red lines and domesticated accessions as dark red lines. Within cluster AI, wild accessions are shown as bright green lines and domesticated accessions as dark green lines. Wild admixed accessions are shown as black lines and domesticated admixed accessions as gray lines.

# Outlier Loci and Genomic Regions Related to Domestication

FST outlier loci related to domestication were detected by means of a Bayesian approach implemented in Bayescan and by the hierarchical method of Excoffier et al. (2009) implemented in Arlequin (Excoffier and Lischer, 2010). Figure S9 shows the five outlier loci detected by Bayescan in chromosomes one, three, six, and nine. Table S7 shows the location and annotation of these five outlier loci. Of these five SNPs, the ones in chromosomes three and six lie within genes that, according to their GO annotation, are related to functions relevant for the domestication process (photoperiodism and seed germination, respectively). The SNP in chromosome three is interesting because it involves a missense change, and the two SNPs in chromosome nine fall within genes that were reported as Mesoamerican domestication genes in common bean by Schmutz et al. (2014). A total of 21 outlier loci were detected with the hierarchical method implemented in Arlequin (Excoffier and Lischer, 2010). Table S8 shows the location and annotation of these 21 loci; three of these SNPs involve missense changes and four of them have been reported as domestication genes in common bean (Phvul.001G024800, Phvul.002G262700, Phvul.003G111900, and Phvul.006G102400) by Schmutz et al. (2014).
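For readers unfamiliar with the statistic underlying these outlier scans, the sketch below computes a simple per-locus F<sub>ST</sub> between two groups as (H<sub>T</sub> - H<sub>S</sub>)/H<sub>T</sub> with equal group weights. This is only an illustration of what F<sub>ST</sub> measures: Bayescan and Arlequin rely on model-based and hierarchical procedures that this code does not reproduce, and the allele frequencies are hypothetical.

```python
def fst_two_groups(p_wild, p_dom):
    """Per-locus F_ST = (H_T - H_S) / H_T for two equally weighted groups.

    p_wild and p_dom are frequencies of the same allele in each group.
    """
    for p in (p_wild, p_dom):
        if not 0.0 <= p <= 1.0:
            raise ValueError("allele frequencies must lie in [0, 1]")
    # Mean within-group expected heterozygosity.
    h_s = (2 * p_wild * (1 - p_wild) + 2 * p_dom * (1 - p_dom)) / 2
    # Expected heterozygosity of the pooled (mean) allele frequency.
    p_bar = (p_wild + p_dom) / 2
    h_t = 2 * p_bar * (1 - p_bar)
    if h_t == 0:
        return float("nan")  # locus fixed for the same allele in both groups
    return (h_t - h_s) / h_t


# Hypothetical allele frequencies, for illustration only.
print(fst_two_groups(0.5, 0.5))   # same frequency in both groups
print(fst_two_groups(0.0, 1.0))   # alternative alleles fixed
print(fst_two_groups(0.2, 0.8))   # strongly differentiated locus
```

Outlier methods ask whether a locus shows more differentiation between wild and domesticated samples than expected from the genome-wide background, which is why a locus-level statistic of this kind is the starting point.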

The results obtained with the varLD method show that chromosome regions with significant differences in LD between wild and domesticated beans are located in only four chromosomes, as shown in **Figure 10**. The SNP in chromosome three detected as an outlier by Bayescan (Table S7) and most of the outlier SNPs in chromosome three detected by Arlequin (Table S8) lie within the region of high LD detected by varLD in this chromosome.

As stated in Materials and Methods, to establish the presence of candidate domestication genes within the regions detected by varLD, we counted the number of genes that displayed domestication signatures in P. vulgaris, according to the results reported by Schmutz et al. (2014), and then evaluated whether the function of these genes was related to domestication on the basis of the function reported for Arabidopsis thaliana. Schmutz et al. (2014) identified candidate domestication SNPs as those that showed a significant reduction in genetic diversity in the landraces and those showing significantly higher FST values between wild and domesticated pooled samples.

With this approach we found 150 genes in total and five of them seemed to be involved in functions related to several traits of the domestication syndrome (seed germination, growth promotion, flowering regulation, pod shattering, and photoperiod sensitivity). The results are summarized in Table S9.

# DISCUSSION

# Gene Pools and Genetic Diversity in Wild Lima Bean

The new sequencing technologies allowed us to evaluate the genetic structure of wild Lima bean on the basis of thousands of genome-wide SNP markers and to confirm, on a more solid basis, the existence of three gene pools: the Mesoamerican I (MI), the Mesoamerican II (MII), and the Andean (AI), with mostly non-overlapping geographic ranges. The results also suggest the possible existence of another gene pool in central Colombia (AII), although more samples need to be analyzed.

The different analysis tools used, namely NJ topologies, PCoA, and Structure assignment tests, agreed in describing the genetic structure of wild Lima bean as three gene pools. Inside the Mesoamerican gene pools (MI and MII) we did not find evidence for further geographical subgroups, in contrast to Andueza-Noh et al. (2015), who found, on the basis of SSR markers, sub-structuring within gene pool MI in central-western Mexico.

The possible existence of a separate cluster (AII) for wild Lima bean accessions from the departments of Cundinamarca and Boyacá in central Colombia is not a new finding. Caicedo et al. (1999) had already observed, on the basis of AFLP markers, the existence of the Mesoamerican and Andean gene pools in Lima bean and of a separate cluster in the Colombian departments of Cundinamarca and Boyacá. Wild accessions of Lima bean from Cundinamarca and Boyacá were collected by CIAT in 1992 (Toro et al., 1993) and 1993, and these wild Lima beans were reported to be morphologically similar to the Andean forms in western Ecuador and northwestern Peru in terms of flowers and seed size, but similar to the Mesoamerican Lima beans in terms of electrophoretic patterns of phaseolin (the main reserve seed protein). Therefore, one could wonder about the evolutionary significance of these wild populations in Colombia. Previous studies have established that the Andean gene pool AI is ancestral to MII and MI (Serrano-Serrano et al., 2010), and in this context gene pool AII becomes relevant for understanding the evolutionary history of wild Lima bean and its spread to Mesoamerica from the Andes; therefore, additional accessions from central Colombia should be studied in more detail.

When we compared our current GBS results with previous studies based on ITS and cpDNA data (Motta-Aldana et al., 2010; Serrano-Serrano et al., 2012), we observed a few conflicts or discrepancies in the classification of some wild accessions (12 in total, see Table S3) into the Andean vs. Mesoamerican gene pools. For example, within the wild Mesoamerican gene pool MII, the accessions from Ecuador (Azuay, G26721) and Peru (Cajamarca, G25913) displayed, in previous studies, ITS and cpDNA haplotypes typical of Andean beans. This conflict could be explained by introgression among Andean wild and domesticated Mesoamerican beans introduced to the Andes or by a phenomenon known as incomplete lineage sorting that can produce conflicts among phylogenetic trees built on the basis of different genomic regions. The first explanation is at least possible for the accession from Ecuador (Azuay) because Mesoamerican landraces could be found at the elevation where this wild accession was collected (470 meters above sea level). For the accession in Peru (Cajamarca) this explanation is less plausible because of the high altitude at which this accession was collected (1,810 m.a.s.l.). In the wild Mesoamerican gene pool MI we also found conflicts. In this gene pool, two wild accessions from Ecuador (G26469 and G26751A) showed large seed sizes (18.5 and 25.4 g, respectively) that are within the range of Andean wild Lima beans, and in a previous study (Motta-Aldana et al., 2010), based on ITS and cpDNA polymorphisms, these two accessions were classified as Andean (see Table S3). Another accession (G26606) showed a seed size (14.9 g) within the range of Mesoamerican wild beans and had been classified in previous studies as Mesoamerican based on ITS data and as Andean based on cpDNA data (Motta-Aldana et al., 2010). Therefore, these three accessions may represent cases of ancient introgression between Mesoamerican and Andean beans in Ecuador.

TABLE 3 | Statistics used to choose among the two competing domestication scenarios and obtained on the basis of an ABC approach.

<sup>a</sup>The estimation of the posterior probability of scenarios was done by applying a direct approach on 500 simulated datasets. The posterior probability is shown along with 95% confidence intervals.

<sup>b</sup>Type I error was estimated from datasets simulated under the true scenario as the proportion of datasets where the true scenario did not show the highest probability among the competing scenarios, probabilities estimated with direct approach.

<sup>c</sup>Type II errors were estimated from datasets simulated under the non-true scenario as the proportion of datasets where the true scenario showed the highest probability among the competing scenarios, probabilities estimated with direct approach.

<sup>d</sup>The estimation of the posterior probability of scenarios was done by applying a weighted polychotomous logistic regression on 10,000 simulated datasets. The posterior probability is shown along with 95% confidence intervals.

<sup>e</sup>Type I error was estimated from datasets simulated under the true scenario as the proportion of datasets where the true scenario did not show the highest probability among the competing scenarios, probabilities estimated with logistic regression.

<sup>f</sup>Type II errors were estimated from datasets simulated under the non-true scenario as the proportion of datasets where the true scenario showed the highest probability among the competing scenarios, probabilities estimated with logistic regression.

TABLE 4 | Parameter estimation of the ABC approach based on the posterior distribution of scenarios 1 and 2.

<sup>a</sup>See Table 1 for an explanation of parameters.

TABLE 5 | Average values of linkage disequilibrium measured as the correlation coefficient r<sup>2</sup> by chromosome in the sample of wild and domesticated accessions.

For the Andean wild gene pool we found an unexpected result. In previous studies (Motta-Aldana et al., 2010; Serrano-Serrano et al., 2012), all wild accessions that were classified within the Andean gene pool were distributed in Ecuador-northern Peru, a well-defined geographic area. In the present study, we found within this gene pool accessions from Ecuador and northern Peru with seed sizes ranging from 13 to 22.2 g/100 seeds, typical of the Andean wild beans, and also wild accessions from the Mesoamerican area (from Mexico, Guatemala, and Honduras). These accessions from Mesoamerica show seed sizes from 8 to 11.6 g/100 seeds, a range within the one observed in Mesoamerican wild beans. In fact, in a previous study (Serrano-Serrano et al., 2012) based on ITS polymorphisms, these accessions from Mesoamerica were classified within the gene pool MII, not within the Andean gene pool (see Table S3). Seed size and ITS data suggest a Mesoamerican origin for these accessions, but GBS data placed them closer to the Andean wild populations of Ecuador-northern Peru, a result that is difficult to explain.

Apart from these few cases of conflict, some of which may be explained by introgression, we found broad agreement between GBS and ITS data. This result is remarkable because, although ITS represents a single locus in the nuclear genome, it proved to be very informative about the genetic structure of Lima bean.

In terms of genetic diversity, GBS data show that gene pool MI (H<sub>E</sub> = 0.115) and gene pool MII (H<sub>E</sub> = 0.138) are very similar; H<sub>E</sub> for MII is only slightly higher than for MI. This pattern contrasts with previous studies based on ITS and cpDNA haplotypes where the gene pool MI was found to be more diverse than gene pool MII, and indeed this pattern was the basis for a test applied by Serrano-Serrano et al. (2010) that inferred processes of population expansions within MII that would have reduced its genetic diversity. Our results are in agreement with a previous study based on nuclear DNA SSR polymorphisms (Andueza-Noh et al., 2015) where the authors found more diversity in the wild MII gene pool (H<sub>E</sub> = 0.65) than in the wild gene pool MI (H<sub>E</sub> = 0.53). This observation is also in agreement with the statistics calculated in the ABC approach that showed that wild gene pool MI (H<sub>E</sub> = 0.1396) is less diverse than wild gene pool MII (H<sub>E</sub> = 0.2059). The higher diversity of wild gene pool MII may be related to its larger effective size (see **Table 4**, parameter N4) and larger geographic range, from southern Mexico to northern Argentina, compared to the geographic range of gene pool MI, mainly in central-western Mexico and toward the northern end of the distribution range of the species.

comparisons among wild and domesticated Lima beans on the basis of 4,779 SNPs detected by GBS. Plots show the chromosomes where regions with significant differences in LD between wild and domesticated beans were detected.

# Demography of Lima Bean Domestication and Founder Effects

In this study we first applied an indirect approach (NJ, PCoA and Structure assignment test) to produce domestication scenarios that were later tested by means of an ABC approach. The indirect approach (see **Figures 6**–**8**) showed that domesticated accessions grouped together with wild accessions into three different clusters: MI, MII, and AI. At first sight, this clustering pattern could indicate three separate domestication events: one for MI, one for MII, and another one for AI.

A comparison of **Figures 6**–**8** shows that the results of NJ, PCoA, and Structure are in general congruent in the clustering pattern of accessions (see Table S1, last two columns). The only major exception is that, while in the NJ topology the six wild Colombian accessions grouped in a well-supported cluster (cluster AII), in the Structure analysis these accessions were assigned to gene pool MII. This result highlights the need, in future studies, for a more exhaustive analysis of the wild accessions from central Colombia to better establish their genetic relationships to the other gene pools.

An interesting result is that most of the Mesoamerican landraces analyzed (68 in total) are grouped within cluster MI along with wild accessions from this gene pool (**Figure 6**). In this cluster we found three large-seeded landraces from the Andes, which in previous studies (Serrano-Serrano et al., 2012) based on ITS polymorphisms were classified as Andean; we therefore believe these are actually Andean accessions that may have introgressed with Mesoamerican landraces introduced in the Andes. Another interesting result is that within the MI cluster, the landraces form a separate subcluster, which may indicate a single origin for these accessions (one domestication event).

In contrast, only a handful of Mesoamerican landraces (14 in total) are grouped within the MII cluster along with wild accessions of this gene pool. In this MII cluster we also found three large-seeded landraces from the Andes, which in previous studies (Serrano-Serrano et al., 2012) based on ITS polymorphisms were also classified as Andean; we therefore also believe these are actually Andean accessions that may have introgressed with Mesoamerican landraces introduced in the Andes (see Table S3). In this MII cluster, the landraces do not form a single subcluster but are interspersed among the wild accessions (see **Figure 6**). This clustering pattern within MII may be compatible with a scenario of introgression among MII wild accessions and introduced domesticated accessions; however, a second domestication event accompanied by profuse gene flow cannot be ruled out.

The indirect approach therefore suggested two domestication scenarios (see **Figure 1**) that were evaluated with an ABC approach. The first scenario involves a domestication event within gene pool MI and another domestication event within gene pool MII. The domestication event within MI would have occurred in the distribution area of wild MI accessions, which are mainly distributed in central-western Mexico, making this the putative domestication area for the MI Mesoamerican landraces. The domestication of MII landraces would have occurred in the Guatemala-Costa Rica area, where the wild MII accessions are more abundantly distributed in Mesoamerica. The second domestication scenario involves only a single domestication within MI followed by admixture between MI domesticates and MII wild populations.

According to the posterior probabilities calculated with the ABC approach, both scenarios seem to be supported by the genetic data, depending on the approach applied: the direct approach gives more support to scenario 1, while the logistic approach supports only scenario 2. This conflict between the direct and logistic approaches may reflect conflicting molecular signals rather than implementation problems, because the results show a good model-prior combination and low type I and type II errors (see **Table 3**), suggesting enough statistical power to differentiate between the competing scenarios.

A single domestication event from gene pool MI, which is mainly distributed in central-western Mexico, has been supported by all previous studies based on cpDNA, ITS and SSR polymorphisms (Motta-Aldana et al., 2010; Serrano-Serrano et al., 2012; Andueza-Noh et al., 2013, 2015). In contrast, the data collected in previous studies have not confirmed a second domestication event in Lima bean from gene pool MII. On the basis of the high support obtained by scenario 2 with the logistic approach (posterior probability of 100%), we can say that our genetic data give evidence for a contribution of wild gene pool MII in shaping the current genome diversity of landraces through admixture events. Other lines of evidence observed in this study also support an admixed origin of MII domesticates. First, the lack of reduction in overall genetic diversity in MII landraces suggests a contribution of wild relatives to increased diversity levels. Second, the fact that more than half of the loci tested in MII landraces contain higher genetic diversity than wild populations is compatible with the hypothesis of an admixed origin of MII landraces. Third, the fact that MII domesticates do not form a monophyletic clade and represent only a small percentage of the domesticated accessions suggests an origin of these landraces through admixture events. Finally, in some places where MII domesticates were found, for example in the Peninsula of Yucatan, introgression between wild and domesticated populations has been well documented (Martínez-Castillo et al., 2007; Dzul-Tejero et al., 2014). With this in mind, our discussion of the parameters of the domestication process of Lima bean estimated by the ABC approach will continue mainly on the basis of scenario 2.

Two parameters of interest for the domestication process are the population size at the beginning of the process (in other words, the population size of the founders) and the duration of the population bottleneck. These parameters are of interest because they affect the current genetic diversity of the crop, and they are discussed below. For the domestication event in gene pool MI, the size of the founder population (N2b in the models) varies from 90 to 1,980 individuals (mean 498), namely a size below 2,000 individuals (see **Table 4**). This contrasts with the estimate of the current size of the wild ancestor MI (N1 in the models), which varies from 97,000 to 300,000 individuals (mean 204,000). Comparing these estimates (N1 and N2b) suggests a drastic reduction in population size (of about 99%). These results contrast with those of Mamidi et al. (2011) in common bean, where sequences of 13 loci were analyzed by coalescent simulations and the bottleneck effective size within the Mesoamerican gene pool was around 48% of the size of the wild ancestor. Both parameters (N1 and N2b) have relatively small bias and error measures, and the genetic data are informative for estimating them (Table S6 shows that the estimated means for N1 and N2b are close to the true values, whereas the values obtained without taking the genetic data into account are more biased). The duration of the bottleneck for the MI domestication (parameter db1 in the models) varied from 2,100 to 3,000 years (average of 2,870 years). However, the genetic data seem not to be very informative for this parameter, given that the estimates, bias, and errors with and without genetic data are very similar.

Taking the parameters N2b and db1 together, we can calculate the bottleneck intensity, k = N2b/db1, at around 0.17, which indicates a strong founder effect for the MI gene pool because of a small founder population and a long bottleneck duration. This would explain the 30% reduction in genetic diversity observed within this gene pool (**Table 2**). Our results are in contrast with those obtained for other crops such as soybean and maize, where the k ratio was 2 or 4–5, respectively; that is, the bottleneck population size was higher than the bottleneck duration, and therefore these crop species showed a less severe bottleneck than the one observed here in Lima bean (Tenaillon et al., 2004; Guo et al., 2010). The results of the present study are in agreement with reports in rice, where a domestication intensity of k = 0.2–0.5 was observed, with a founding population of 400–500 individuals and a long bottleneck duration (at most 3,000 years, similar to the estimate in Lima bean) (Zhu et al., 2007). It is difficult to establish why Lima bean shows a severe domestication bottleneck in gene pool MI. A selfing mating system could be an explanation because of the reduced chances of crossing with wild relatives. However, not all selfing legume species show severe bottlenecks; soybean, for example, shows a moderate bottleneck (Guo et al., 2010). Therefore, the mating system may not be key in determining the strength of the domestication bottleneck. Another explanation for a severe domestication bottleneck in Lima bean could be that domestication occurred only once in a very reduced area, mainly because high levels of cyanogenic glucosides in wild populations would limit the number of times these beans were taken into domestication, as suggested elsewhere (Debouck, 1996). However, we do not have data to test this hypothesis.
An additional explanation is that in their migration from the original domestication site to other regions within Mesoamerica, the early domesticates suffered additional bottlenecks.
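
The arithmetic behind the two summary figures discussed above (the reduction in population size and the bottleneck intensity k) can be reproduced from the posterior means reported in the text. The following is only a minimal sketch and not part of the original analysis: the function and constant names are illustrative, and the bottleneck duration is taken in years, as in the text.

```python
# Posterior means reported in the text (scenario 2, gene pool MI).
N1_WILD = 204_000      # current size of the wild ancestor MI (N1)
N2B_FOUNDERS = 498     # founder population size (N2b)
DB1_YEARS = 2_870      # bottleneck duration in years (db1)


def size_reduction(wild_size: float, founder_size: float) -> float:
    """Fraction of the wild population size lost at the bottleneck."""
    return 1.0 - founder_size / wild_size


def bottleneck_intensity(founder_size: float, duration: float) -> float:
    """k = N2b / db1; a smaller k indicates a more severe bottleneck."""
    return founder_size / duration


reduction = size_reduction(N1_WILD, N2B_FOUNDERS)
k = bottleneck_intensity(N2B_FOUNDERS, DB1_YEARS)
print(f"size reduction: {reduction:.1%}")
print(f"k = {k:.2f}")
```

With these inputs the sketch gives a reduction of about 99.8% and k of about 0.17, in line with the values quoted above.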

Another parameter of interest is domestication time. According to scenario 2, domestication within gene pool MI, presumably in central-western Mexico, would have started 7,700 years ago (parameter t2), and after an unknown time the expansion of the first domesticates to other regions would have started. Presumably, domesticated Lima beans would have migrated toward the south and east of Mexico, into the distribution area of wild gene pool MII, with which they experienced admixture events at a time estimated by parameter t1 to be 6,500 years ago. The oldest archeological remains that show the presence of domesticated Lima beans in the distribution area of gene pool MII are those from the site known as Dzibilchaltún in the Peninsula of Yucatan in Mexico, with an age of about 1,300 years before present (Kaplan, 1965). This archeological site is within the distribution range of wild populations in the Peninsula of Yucatan; therefore, the introgression events between wild and domesticated beans at this site could be ancient.

Global founder effects estimated in this study were about 18%, which means that landraces retain on average around 82% of the variation found in wild populations. This reduction is in agreement with a previous study of SSR markers in Lima bean landraces, where the authors found a global reduction of genetic diversity in landraces of about 17% (Andueza-Noh et al., 2015). This result also agrees with the study of Schmutz et al. (2014), who sequenced the genome of P. vulgaris, resequenced Mesoamerican and Andean wild and landrace accessions, and observed a reduction of about 17% in genetic diversity (measured as nucleotide diversity) in Mesoamerican landraces.

When the reduction in genetic diversity in Lima bean was calculated within gene pools, contrasting patterns were found. Within gene pool MI, the reduction in genetic diversity was about 30%, whereas no reduction was observed within gene pool MII, the latter result being compatible with the hypothesis of an admixed origin of MII landraces. These values are also largely in agreement with previous reports in Lima bean, in the sense that there was evidence of a large founder effect within gene pool MI (about 44%), while within gene pool MII it was almost negligible (only 1%) (Andueza-Noh et al., 2015).

In this study we could measure founder effects on a locus-by-locus basis and found different patterns, as shown in the Results section (see **Table 2**). In gene pool MI we see a global reduction of genetic diversity in landraces of about 30%, but when this reduction is estimated locus-by-locus, a larger loss in genetic diversity (on average 78%) is seen in 85% of the loci analyzed. Interestingly, 15% of loci within gene pool MI showed increased diversity in landraces in comparison to their wild counterparts. These results suggest that most loci are losing genetic diversity, possibly due to the demographic effects of domestication and also to selection (see below). The loci where an increase in diversity was observed in landraces may represent regions in the genome that contain genes that were useful for landrace diversification during the adaptation process, or genes that underwent introgression with other landraces or wild populations (Burger et al., 2008).

When the global reduction in genetic diversity was measured within gene pool MII, no loss in genetic diversity was observed. However, when this measure was taken locus-by-locus, about 45% of loci showed a reduction in diversity (on average a reduction of 53%) and 55% of loci showed an increase in diversity among landraces (an increase of around 33%). The same was observed within gene pool AI, where no reduction in genetic diversity among landraces was observed at a global scale, but when measured locus-by-locus an average reduction of 47% was observed in 79% of loci, as expected for a domestication event. Clearly, the MII landraces show a locus-by-locus pattern that is different from the one observed in MI and AI landraces, because in MII landraces most loci showed an increase in genetic diversity. This result might be compatible with an admixed origin for these landraces.
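
The contrast between a global measure and a locus-by-locus measure can be illustrated with a small calculation. The sketch below is hypothetical: it assumes that diversity is summarized per locus by a single value (for example, expected heterozygosity) and that the global value is the mean across loci, which may differ from the estimators used in this study. The per-locus numbers are invented for illustration.

```python
from statistics import mean


def diversity_change(h_wild: float, h_landrace: float) -> float:
    """Relative change in diversity from wild to landrace.

    Negative values are losses (founder effect); positive values are gains.
    """
    return (h_landrace - h_wild) / h_wild


def summarize(loci: list[tuple[float, float]]) -> dict[str, float]:
    """Summarize global and locus-by-locus changes for (wild, landrace) pairs."""
    changes = [diversity_change(w, l) for w, l in loci]
    losses = [c for c in changes if c < 0]
    gains = [c for c in changes if c > 0]
    global_wild = mean(w for w, _ in loci)
    global_land = mean(l for _, l in loci)
    return {
        "global_change": diversity_change(global_wild, global_land),
        "share_loci_losing": len(losses) / len(changes),
        "mean_loss": mean(losses) if losses else 0.0,
        "mean_gain": mean(gains) if gains else 0.0,
    }


# Hypothetical per-locus diversity values (wild, landrace); not study data.
example_loci = [(0.40, 0.20), (0.30, 0.15), (0.20, 0.30), (0.10, 0.35)]
summary = summarize(example_loci)
print(summary)
```

In this toy example the global change is zero even though half of the loci lose half of their diversity, which is the kind of pattern that a global average can hide.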

It is well-known that founder effects result in increasing levels of linkage disequilibrium (r<sup>2</sup>) in the genome. In this study, wild accessions showed pairwise r<sup>2</sup> values, averaged per chromosome, ranging from 0.12 to 0.14, while landraces showed pairwise r<sup>2</sup> values, averaged per chromosome, between 0.15 and 0.19, a global increase in r<sup>2</sup> of about 20–30%. Similar results in pairwise LD levels were observed in common bean in a previous study (Rossi et al., 2009), where average r<sup>2</sup> was 0.08 for wild accessions and 0.18 for domesticated accessions, an increase of 55%. The increase in LD levels may be a result of the reduction in genetic diversity during the domestication process as a consequence of the domestication bottleneck, which was strong in gene pool MI, and also a consequence of selective forces acting during adaptation. Below we discuss possible genomic regions affected by selection.
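
For readers unfamiliar with the r<sup>2</sup> statistic, the sketch below shows one common way to approximate pairwise LD from unphased SNP data: the squared Pearson correlation between genotype dosages (0, 1 or 2 copies of one allele). This is an illustrative assumption and not necessarily the estimator used in this study; the function name and example vectors are hypothetical.

```python
def genotype_r2(snp_a: list[int], snp_b: list[int]) -> float:
    """Squared Pearson correlation between two vectors of genotype dosages."""
    if len(snp_a) != len(snp_b) or len(snp_a) < 2:
        raise ValueError("need two equally long vectors with at least 2 samples")
    n = len(snp_a)
    mean_a = sum(snp_a) / n
    mean_b = sum(snp_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(snp_a, snp_b))
    var_a = sum((a - mean_a) ** 2 for a in snp_a)
    var_b = sum((b - mean_b) ** 2 for b in snp_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("monomorphic SNP: r2 is undefined")
    return cov * cov / (var_a * var_b)


# Hypothetical dosages for six accessions (not study data).
linked = genotype_r2([0, 0, 1, 2, 2, 1], [0, 0, 1, 2, 2, 1])    # identical SNPs
unlinked = genotype_r2([0, 1, 2, 0, 1, 2], [0, 1, 2, 2, 1, 0])  # uncorrelated SNPs
print(linked, unlinked)
```

Averaging such pairwise values within each chromosome, separately for wild accessions and landraces, would give per-chromosome summaries comparable in spirit to those reported above.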

# Outlier Loci and Genomic Regions Related to Domestication

We applied two complementary approaches to detect loci related to domestication as FST outliers. With these approaches, a total of 26 FST outlier SNPs were detected (see Tables S7, S8). Interestingly, all FST outlier SNPs detected in chromosome three co-localized within or near the LD region detected by varLD in this chromosome. Of these 26 SNPs, five involved missense changes in five genes. One of these changes is in the gene Phvul.001G146000, whose Arabidopsis ortholog is the Growing Slowly (GRS1) gene, a gene that has functions in RNA editing and plant development (mutants show slow growth and sterility) (Xie et al., 2016). A second missense change was found in the gene Phvul.003G176700, whose Arabidopsis ortholog is the Histone deacetylase 15 (HDA15) gene, a negative regulator of expression involved, with other genes, in repression of chlorophyll biosynthesis, photosynthesis, photomorphogenesis, and seed germination in the dark (Liu et al., 2013; Gu et al., 2017). A third missense change was found in the gene Phvul.002G165600, an ortholog of the Albina 1 (ALB1 or CHLD) gene in Arabidopsis, which is also involved in chlorophyll biosynthesis (Papenbrock et al., 1997). Another interesting missense SNP was located in the gene Phvul.004G099700, involved in defense response to fungus according to its GO annotation.

Given that selection is expected to increase LD levels beyond the genome-wide LD increase caused by the domestication bottleneck, we expected to be able to detect those regions by means of inter-population (wild vs. domesticated) LD comparisons with the varLD approach. By applying this approach, we identified significant regions in chromosomes two, three, seven, and nine (see **Figure 10**). The small number of regions detected with the LD approach might be due to the fact that, because of the confounding effects of demographic events on selection signals, background levels of LD had to be taken into account before detecting regions undergoing selection, which renders the varLD tests very conservative. Complementary strategies for the detection of regions containing domestication genes that were not used in this study are genome-wide association (GWAS) approaches and QTL linkage mapping. In this regard, it is interesting to see that in a GWAS carried out in common bean to detect genes related to change in seed size during domestication (Schmutz et al., 2014), many of the candidate genes detected co-localize with the LD regions detected in the present study in chromosomes three (25.0–40.0 Mb) and seven (9–10.5 Mb). The authors found that in common bean the region in chromosome seven that contained seed size candidate genes also showed extensive LD (Schmutz et al., 2014), as we also observed in the present study. Within the chromosome regions in high LD observed in this study, we could locate 150 of the domestication candidate loci identified in common bean (Schmutz et al., 2014) (see Table S9). Although these 150 candidate loci constitute a good starting point, they need to be cross-validated with complementary approaches such as QTL mapping by linkage analyses, GWAS, candidate association studies, selection tests of DNA sequences and differential gene expression, among others.

Among the 150 genes, we found five candidate loci involved in seed germination, organ size, pod dehiscence, and flowering time, all functions related to the domestication syndrome. Within the LD region in chromosome 2 we found two candidate genes: Phvul.002G033500 (start: 3,391,469, end: 3,393,850) and Phvul.002G041800 (start: 3,983,255, end: 3,986,921). The best A. thaliana hit for the gene Phvul.002G033500 is AT5G66460 (endo-β-mannanase gene 7, MAN7), a gene encoding an endo-β-mannanase, a hydrolytic enzyme that degrades the mannan polymer, the main constituent of the cell wall in the endosperm of seeds, and therefore plays a crucial role in seed germination (Iglesias-Fernández et al., 2013). Increased seed germination was a key trait selected during crop domestication that allowed the adaptation of plants to cultivated fields; therefore, this is a good candidate gene. The best A. thaliana hit for gene Phvul.002G041800 is AT3G13960 (growth regulating factor 5, GRF5), a gene that encodes a transcription factor that promotes cell proliferation during leaf development, promotes leaf longevity, stimulates chloroplast division with a correlated increase in chlorophyll content and photosynthetic rate, and increases tolerance to growth in nitrogen-depleted soil (Horiguchi et al., 2005; Vercruyssen et al., 2015). It has been shown that overexpression of A. thaliana GRF5 along with AN3 (the product of the gene Angustifolia 3, another transcription factor) increases leaf size (Horiguchi et al., 2005). It is well-known that one of the changes in plant domestication has been the increase in size of some organs such as flowers, fruits, and leaves. Therefore, the relevance of genes that promote growth of different organs during domestication seems plausible.

Within the LD region in chromosome 7 we found the gene Phvul.007G096500 (start: 10,156,828; end: 10,169,477), whose best A. thaliana hit was the gene AT5G04240 (the early flowering gene 6, elf 6), which is involved in the regulation of flowering. Flowering time is an important trait that allows adaptation and spread of early domesticates into regions with diverse photoperiod regimes. Lima bean is an excellent example for this kind of adaptation given the wide latitudinal range that this species explores in the wild and under cultivation. In Arabidopsis there are two pathways that control flowering according to environmental stimuli: the photoperiod pathway and the vernalization pathway, that respond to day length and temperature, respectively (Mouradov et al., 2002). The elf 6 gene encodes a nuclear protein with jumonji and zinc finger domains that plays a role as an upstream repressor in the photoperiod pathway (Noh et al., 2004). Arabidopsis elf 6 mutants display early flowering under long and short days, therefore we consider this gene as a good candidate gene for further study.

In chromosome 9 we observed that the gene Phvul.009G203400 (start: 30,080,598, end: 30,089,684) falls very close to the region in high LD in this chromosome. The best A. thaliana hit for this gene is AT5G60910 (FRUITFULL or FUL), a MADS-box gene that is expressed in the cell layers of the valve tissues of the silique in Arabidopsis (Gu et al., 1998) and controls the transcription of other MADS-box genes such as SHP1/2 (SHATTERPROOF 1 and 2), which are required for fruit development and dehiscence or pod shattering (Liljegren et al., 2000), a key trait for crop domestication.

Also in chromosome 9, the gene Phvul.009G117500 (start: 17,541,857, end: 17,548,419) falls very close to the high LD region in this chromosome. The best A. thaliana hit for this gene is AT5G17690 (TERMINAL FLOWER 2, TFL2), a gene that controls flowering time and photoperiod sensitivity and regulates the expression of other flowering time genes and floral organ identity genes (Larsson et al., 1998; Kotake et al., 2003). As indicated above, control of flowering time is key for the adaptation of early domesticates to other regions with different photoperiod regimes, and therefore this gene is also a good candidate.

# CONCLUSIONS

In summary, the GBS approach proved very useful for discovering SNP markers for evolutionary studies in wild and domesticated Lima beans. The SNP markers and the clustering and Bayesian approaches applied allowed us to confirm the existence of three gene pools in wild Lima beans, the Mesoamerican one (MI), the Mesoamerican two (MII), and the Andean one (AI), with mainly non-overlapping geographic ranges, and also suggest the existence of another Andean gene pool (AII) in central Colombia, although additional information is needed. The ABC approach was very useful for testing competing domestication scenarios for Lima bean Mesoamerican landraces. The scenario best supported by the logistic regression approach was a single domestication event within gene pool MI for all Mesoamerican landraces, possibly in central-western Mexico, with subsequent admixture between landraces and wild populations within the distribution range of gene pool MII that gave rise to MII landraces. Locus-by-locus analyses of genetic diversity showed that domestication founder effects were strong within gene pools MI and AI, as expected for domestication events, but less drastic for gene pool MII, which is compatible with an admixed origin of MII landraces. After accounting for the background increase in LD levels due to the domestication bottleneck, we were able to detect genomic regions with significant differences in LD between wild and domesticated accessions that may represent regions affected by selection processes and that may harbor domestication genes. A search for domestication candidate genes within these LD regions, on the basis of candidate genes reported for common bean, resulted in a list of 150 genes, among them genes related to seed germination, organ size, pod dehiscence, and flowering time.
Follow-up studies should include analysis of additional samples from central Colombia in order to gather more evidence about the possible existence of a separate wild gene pool and complementary approaches to map genes related to domestication.

# AUTHOR CONTRIBUTIONS

MC conceived the idea for the research project, carried out laboratory techniques for extraction of DNA to acquire GBS data, analyzed GBS data and produced the first draft of the manuscript. JM collected and provided plant germplasm of Lima bean from Mexico, carried out laboratory techniques for extraction of DNA to acquire GBS data and critically revised the manuscript.

# FUNDING

The present study was funded by Colciencias, Colombia, under contract number FP44842-009-2015 and project code 1101-658-42502, by Fundación para la Promoción de la Investigación y la Tecnología del Banco de la República de Colombia under project code 3404 and by CONACYT-México under project code CB-2014-240984.

# ACKNOWLEDGMENTS

Thanks are due to the Genetic Resources Unit of CIAT, especially to Dr. Daniel Debouck, for providing the seed material used in this study. One of the authors (MC) is very grateful to Dr. Jorge Duitama for the training received in bioinformatic analyses of GBS data and to Dr. Paul Gepts and Dr. Julin Maloof for the training received in genomic analyses. Thanks are due to Colciencias, Colombia, for providing funding (project code 1101-658-42502) to MC to carry out research on Lima bean. Thanks are also due to CONACYT-México (project code CB-2014-240984) for providing support to JM to carry out collecting trips for Lima bean populations in Mexico. Thanks are also due to Matilde M. Ortiz from CICY for providing technical laboratory support.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2017.01551/full#supplementary-material


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Chacón-Sánchez and Martínez-Castillo. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Domestication Genomics of the Open-Pollinated Scarlet Runner Bean (Phaseolus coccineus L.)

Azalea Guerra-García1,2 \*, Marco Suárez-Atilano<sup>3</sup> , Alicia Mastretta-Yanes<sup>4</sup> , Alfonso Delgado-Salinas<sup>5</sup> and Daniel Piñero<sup>2</sup>

<sup>1</sup> Posgrado en Ciencias Biológicas, Universidad Nacional Autónoma de México, Ciudad de México, Mexico, <sup>2</sup> Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Ciudad de México, Mexico, <sup>3</sup> Departamento de Ecología de la Biodiversidad, Instituto de Ecología, Universidad Nacional Autónoma de México, Ciudad de México, Mexico, <sup>4</sup> CONACYT-CONABIO, Comisión Nacional para el Conocimiento y Uso de la Biodiversidad, Ciudad de México, Mexico, <sup>5</sup> Departamento de Botánica, Instituto de Biología, Universidad Nacional Autónoma de México, Ciudad de México, Mexico

The runner bean is a legume species from Mesoamerica closely related to common bean (Phaseolus vulgaris). It is a perennial species, but it is usually cultivated in small-scale agriculture as an annual crop for its dry seeds and edible immature pods. Unlike the common bean, P. coccineus has received little attention from a genetic standpoint. In this work we aim to (1) provide information about the domestication history and domestication events of P. coccineus; (2) examine the distribution and level of genetic diversity in wild and cultivated Mexican populations of this species; and (3) identify candidate loci under natural and artificial selection. For this, we generated genotyping by sequencing data (42,548 SNPs) from 242 individuals of P. coccineus and the domesticated forms of the closely related species P. vulgaris (20) and P. dumosus (35). Eight genetic clusters were detected, of which half correspond to wild populations and the rest to domesticated plants. The cultivated populations form a monophyletic clade, suggesting that only one domestication event occurred in Mexico, and that it took place around populations of the Trans-Mexican Volcanic Belt. No difference between wild and domesticated levels of genetic diversity was detected and effective population sizes are relatively high, supporting a weak genetic bottleneck during domestication. Most populations presented an excess of heterozygotes, probably due to inbreeding depression. One population of P. coccineus subsp. striatus had the greatest excess and seems to be genetically isolated despite being geographically close to other wild populations. Contrasting with previous studies, we did not find evidence of recent gene flow between wild and cultivated populations. Based on outlier detection methods, we identified 24 domestication-related SNPs, 13 related to cultivar diversification and eight under natural selection. Few of these SNPs fell within annotated loci, but the annotated domestication-related SNPs are highly expressed in flowers and pods. Our results contribute to the understanding of the domestication history of P. coccineus, and highlight how the genetic signatures of domestication can be substantially different between closely related species.

#### Edited by:

Alejandro Casas, Universidad Nacional Autónoma de México, Mexico

#### Reviewed by:

Peter J. Prentis, Queensland University of Technology, Australia

Gonzalo Gajardo, University of Los Lagos, Chile

\*Correspondence: Azalea Guerra-García azalea.guerra@iecologia.unam.mx

#### Specialty section:

This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Plant Science

Received: 31 July 2017 Accepted: 18 October 2017 Published: 15 November 2017

#### Citation:

Guerra-García A, Suárez-Atilano M, Mastretta-Yanes A, Delgado-Salinas A and Piñero D (2017) Domestication Genomics of the Open-Pollinated Scarlet Runner Bean (Phaseolus coccineus L.). Front. Plant Sci. 8:1891. doi: 10.3389/fpls.2017.01891

Keywords: domestication, genotyping by sequencing, Phaseolus coccineus, adaptative variation, population genomics

# INTRODUCTION


The scarlet runner bean (Phaseolus coccineus L.) is one of the five Phaseolus species that were domesticated in Mesoamerica, and it is the third-most economically important, after P. vulgaris L. and P. lunatus L. The domestication process of this species continues today both in the Americas and in Europe, where it was introduced by the Spaniards. One of its main characteristics is its ability to tolerate cooler climates than other Phaseolus species, and it is still an important food source for smallholders and indigenous groups in Mexico (Salinas, 1988). Despite the cultural value, economic importance, and agronomic potential of P. coccineus, little is known about its domestication history and the genetic variability of its wild and cultivated forms.

Wild P. coccineus are perennial climbing plants, occurring mostly at mid-high elevations (1,000–3,000 m.a.s.l.), from northern Mexico (Chihuahua) to Panama (Salinas, 1988). The species has 11 pairs of chromosomes and an estimated genome size of 660 Mb (Plant DNA C-values database). In contrast to the autogamous common bean, the scarlet runner bean is an open-pollinated species. The high morphological diversity of this species has been classified under two subspecies (Freytag and Debouck, 2002): P. coccineus subsp. coccineus (mostly with red flowers), including 11 wild varieties and the domesticated form, and P. coccineus subsp. striatus (purple or mauve flowers), comprising eight wild varieties. No genetic evidence supports these subspecies and varieties, but given the environmentally and culturally heterogeneous landscape where P. coccineus occurs, the species is expected to be genetically structured.

As a cultivated species, P. coccineus is currently grown in Mexico, Guatemala, Honduras and Costa Rica, and to a lesser degree in South America. In Europe, it is mostly cultivated in the United Kingdom, Netherlands, Italy, and Spain (Rodiño et al., 2006). In Mexico, the scarlet runner bean is cultivated both as a self-sufficiency crop by smallholder farmers (<5 ha) and commercially for urban areas. Besides its native cultivars, in Mexico there is one breeding line (Blanco Tlaxcala) developed using a multi linear method (Vargas-Vázquez et al., 2012). Feral populations are common, but it is unknown whether they originated from hybridization between wild and domesticated populations or escaped from cultivation. Wild, feral and domesticated distributions overlap in Mesoamerica, suggesting that there are plenty of opportunities for gene flow to occur, which makes the domestication history of P. coccineus difficult to disentangle without high resolution genetic markers.

The domestication history of the scarlet runner bean has been explored previously with low resolution molecular markers, and multiple domestication events were suggested. Specifically, chloroplast and nuclear SSRs of P. coccineus accessions including European domesticated populations, Mesoamerican landraces and wild samples from Mexico, Guatemala, and Honduras (Angioi et al., 2009; Spataro et al., 2011; Rodriguez et al., 2013) suggest that P. coccineus domestication took place in the Guatemala-Honduras area, or that alternatively another domestication event occurred in Mexico followed by extensive hybridization with the cultivated populations from Guatemala and Honduras. However, few Mesoamerican samples were included in these studies, and they focused on European domesticated populations. Phylogenetic analyses including more samples from the wide distribution of P. coccineus could bring clues about the number of domestication events that took place in this species. For example, if cultivars are grouped in one monophyletic clade, it would suggest one domestication event.

Another interesting feature of P. coccineus domestication history is that similar levels of genetic variation have been reported in wild and cultivated populations (Escalante et al., 1994; Spataro et al., 2011; Rodriguez et al., 2013). This contradicts the population genetics models that predict a reduction in genetic diversity and increased divergence between wild and domesticated forms due to demographic factors and selection at target loci (Meyer and Purugganan, 2013). Such reductions have been described in crops like sunflower (∼30%; Renaut and Rieseberg, 2015); soybean (∼30%; Li et al., 2013); maize (∼17%; Hufford et al., 2012); and in cultivated Agave species (from 21 to 66%; Eguiarte et al., 2013). Also, in the Mesoamerican common bean a ∼20% reduction in genetic variation has been reported (Schmutz et al., 2014). However, the amount of genetic diversity that is lost during domestication depends on several factors, including the severity and the number of bottlenecks, the strength of selection and human management (Gepts, 2014). To properly assess the impact of domestication on the genetic diversity of P. coccineus, genomic data comparing wild and cultivated populations are necessary.

The use of genomic tools also allows diversity and differentiation patterns to be characterized across genomes. Regions or variants that depart from neutral predictions are probably influenced by selective pressures and are tagged as candidates. Applying this approach to crop species and their wild relatives makes it possible to distinguish loci affected during domestication, whereas comparisons between landraces and/or improved cultivars measure the effect of subsequent selection (Tang et al., 2010; Gepts, 2014). Furthermore, hypotheses about phenotypic convergence in crops can be tested, that is, whether the same genes or genomic regions were affected during the domestication process of different species.

Here, we aim to address these knowledge gaps by using genomic data to (1) provide information about the domestication history of P. coccineus and its current evolutionary dynamics in Mexico, in particular to analyze the occurrence of a single or multiple domestication events in Mexico; (2) examine the extent of the domestication bottleneck in this species by comparing the levels of genetic diversity and geographic patterns of the wild, feral and domesticated Mexican populations; and (3) identify candidate loci under natural and artificial selection in the P. coccineus genome.

# MATERIALS AND METHODS

# Plant Material and SNP Genotyping

Phaseolus coccineus individuals from 10 wild, three feral and 11 cultivated Mexican populations and one cultivar from Spain were analyzed, as well as plants from the breeding line Blanco Tlaxcala. Taxonomic and wild/feral/domesticated categories were assigned based on morphology and habitat observations. Only one of the wild populations that were sampled corresponds to subsp. striatus; the rest belong to subsp. coccineus. A population was classified as feral if it was growing out of cultivation and presented intermediate traits between wild and domesticated forms. The Mexican samples cover the species distribution and main cultivation areas at the national level. As outgroups, samples from the closely related species P. vulgaris (three wild and one cultivated) and P. dumosus (seven cultivated) were included (Supplementary Table S1). For the three species, the sample size of each population varied from three to 16 individuals.

Sampling was performed during September–December of 2014 and 2015. In the case of the wild populations, tissue from young leaves was collected and stored in silica until processed. Seeds from cultivars were collected and germinated at the Instituto de Ecología, UNAM. DNA was extracted using the DNeasy Plant Mini Kit (Qiagen). DNA samples were genotyped at the Institute for Genomic Diversity at Cornell University (Services | Institute of Biotechnology, 2017). Sequencing libraries were constructed using the enzymes PstI and BfaI following the genotyping by sequencing (GBS) protocol of Elshire et al. (2011). A total of 326 samples were processed in four plates of 96 samples each, multiplexed and sequenced on four lanes of Illumina HiSeq 2500 (100 bp, single-end reads).

Reads were aligned to the P. vulgaris reference genome v1.0 (Phytozome; DOE-JGI and USDA-NIFA, http://phytozome.jgi.doe.gov/; Phytozome, 2017) using bwa v0.7.8-r455 (Li and Durbin, 2009). Demultiplexing, initial quality control, assembly and SNP discovery were carried out with the TASSEL pipeline v3.0.174 (Glaubitz et al., 2014). Assembly and SNP discovery were performed independently for two data sets, one containing samples from P. vulgaris, P. dumosus, and P. coccineus (VDC group), which are the domesticated species of the Vulgaris clade (Delgado-Salinas et al., 2006), and the other including only P. coccineus samples. SNPs were filtered in VCFtools 0.1.15 (Danecek et al., 2011) using the following parameters for the two data sets: (1) VDC group: maximum missingness threshold 20% per individual; minimum mean depth 10X; minor allele frequency (MAF) 0.01; minimum allele count 90%; and only SNPs mapped in chromosomes. (2) Phaseolus coccineus: maximum missingness threshold 30% per individual; minimum mean depth 5X; MAF 0.02; minimum allele count 80%; and only SNPs mapped in chromosomes.
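
To make two of these filters concrete, the sketch below applies a per-individual missingness threshold and a MAF threshold to a toy genotype matrix, using the values listed for the P. coccineus data set. It is an illustration in plain Python and not the VCFtools implementation; the depth and allele count filters are not reproduced, and all function names and genotype values are hypothetical.

```python
from typing import Optional

Genotype = Optional[int]  # 0, 1 or 2 copies of the alternate allele; None = missing


def keep_individuals(matrix: list[list[Genotype]], max_missing: float) -> list[int]:
    """Indices of individuals (rows) whose missing fraction is <= max_missing."""
    kept = []
    for i, row in enumerate(matrix):
        missing = sum(g is None for g in row) / len(row)
        if missing <= max_missing:
            kept.append(i)
    return kept


def minor_allele_frequency(column: list[Genotype]) -> float:
    """MAF of one SNP computed from non-missing diploid genotypes."""
    called = [g for g in column if g is not None]
    if not called:
        return 0.0
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1.0 - alt_freq)


def keep_snps(matrix: list[list[Genotype]], rows: list[int], min_maf: float) -> list[int]:
    """Indices of SNPs (columns) with MAF >= min_maf among the kept rows."""
    n_snps = len(matrix[0])
    return [
        j for j in range(n_snps)
        if minor_allele_frequency([matrix[i][j] for i in rows]) >= min_maf
    ]


# Toy matrix: 4 individuals x 3 SNPs (hypothetical values, not study data).
toy = [
    [0, 1, 0],
    [1, 2, 0],
    [None, None, 0],   # two of three genotypes missing: removed at the 30% threshold
    [2, 1, 0],
]
rows = keep_individuals(toy, max_missing=0.30)
snps = keep_snps(toy, rows, min_maf=0.02)
print(rows, snps)
```

In this toy case the third individual is dropped for excess missing data and the third SNP is dropped because it is monomorphic among the remaining individuals.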

Filtered SNP data, species occurrence data and scripts used for the analyses are available at Dryad Repository under the identifier doi: 10.5061/dryad.q343c.

# Inferring Population Structure and Phylogenetic Relationships

We inferred the population structure of P. coccineus because different genetic clusters are expected given the isolation and the environmental and cultural heterogeneity under which this species occurs. For this, we used Admixture v1.3 (Alexander et al., 2009). Values of K ranging from one to 20 were tested, and the value with the lowest cross-validation error was chosen. Then, we examined the phylogenetic relationships among the genetic groups, both cultivated and wild, and whether each cluster forms a monophyletic clade. This phylogenetic analysis was also used as a preliminary approach to identify the plausible number of domestication events for the Mexican cultivated P. coccineus (see below for other analyses). Specifically, we examined whether the cultivated samples were recovered as a monophyletic group. For the phylogenetic analysis, wild and cultivated samples of P. coccineus, P. vulgaris, and P. dumosus were analyzed under three schemes:

First, a Maximum-Likelihood (ML) approach was carried out with the FastTree software (Price et al., 2009). For this, a mix of Nearest-Neighbor Interchange and Subtree-Prune and Regraft moves (NNI+SPR) was used for topology and branch-length optimization, and the General Time Reversible model with a single rate per site (GTR+CAT) was used as the nucleotide substitution model. Because FastTree only considers SNPs that are fixed within individuals (i.e., homozygous) but polymorphic among individuals, only 82% of the total VDC subset (41,223 SNPs) was considered in this analysis. Second, a phylogenetic network based on the Neighbor-Net algorithm and patristic distances with GTR+I+G correction was estimated with the SplitsTree software (Huson and Bryant, 2006). Lastly, we employed a Bayesian multispecies coalescent model (Rannala and Yang, 2003) to estimate the phylogenetic relationships among well-supported clades within P. coccineus only. We used the program SNAPP 1.3.0 (Bryant et al., 2012), included in the package BEAST 2.4.5 (Bouckaert et al., 2014), to infer species trees directly from biallelic genetic data. We used the eight main genetic clusters (see section Results) inferred by Admixture as a priori designated species, and the Wild-TMVB cluster was partitioned in two, taking into account the ML topology of that cluster. Because SNAPP does not incorporate missing data, we selected a subset of our taxonomic sampling that maximized the number of SNPs available. The final analysis retained a total of 600 SNPs in linkage equilibrium, without any missing data and with a minimum of five individuals from each cluster of the designated species. We used SNAPP's default settings and ran the analysis for 1,000,000 generations, sampling every 1,000 generations. We evaluated the convergence of our runs (i.e., low variation in -lnL scores, ESS > 100) by examining log files with the program Tracer 1.5 (Drummond and Rambaut, 2007). We analyzed the tree files with SNAPP-TreeSetAnalyser 2.4.5 to identify species trees contained in the 95% highest posterior density (HPD) set, using 10% of topologies as burn-in. The resulting tree files (cloudgrams) were visualized using DensiTree (Bouckaert, 2010).

# Population Genetics Statistics

To evaluate the existence and degree of the domestication bottleneck in P. coccineus, we estimated genetic diversity and differentiation indices for the genetic groups inferred by the Admixture analysis (see section Results). Specifically, we used the Hierfstat package (Goudet, 2005) in R (R Core Team, 2017) to estimate per-site heterozygosity and FIS, as well as pairwise FST among groups, performing a bootstrap (1,000) to obtain confidence intervals. To test the hypothesis that n<sub>i</sub> = n<sub>j</sub> (where n<sub>i</sub> is the number of loci at which H<sub>Ei</sub> > H<sub>Ej</sub>, and n<sub>j</sub> is the number of loci at which H<sub>Ej</sub> > H<sub>Ei</sub>, for clusters i and j), we used pairwise χ<sup>2</sup> tests with Bonferroni correction to avoid false positive results (Sokal and Rohlf, 1995). Also, we estimated the heterozygosity and FIS at the sampling location (P. coccineus dataset) and at the species level (VDC dataset), applying the same test.
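The pairwise comparison described above can be sketched in Python as follows. This is an illustrative reading of the test, not the authors' script: the function names, the input layout (one list of per-locus H<sub>E</sub> values per cluster, in the same locus order) and the choice of a one-degree-of-freedom goodness-of-fit test against equal counts are assumptions.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple


def chi2_equal_counts(n_i: int, n_j: int) -> float:
    """P-value of a 1-df chi-square test of the null hypothesis n_i = n_j."""
    total = n_i + n_j
    if total == 0:
        return 1.0  # no informative loci, nothing to test
    expected = total / 2
    stat = ((n_i - expected) ** 2 + (n_j - expected) ** 2) / expected
    # Survival function of a chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(stat / 2))


def pairwise_tests(he: Dict[str, List[float]],
                   alpha: float = 0.05) -> Dict[Tuple[str, str], tuple]:
    """Compare every pair of clusters; `he` maps cluster name -> per-locus H_E."""
    pairs = list(combinations(sorted(he), 2))
    threshold = alpha / len(pairs)  # Bonferroni correction for multiple pairs
    results = {}
    for a, b in pairs:
        n_a = sum(x > y for x, y in zip(he[a], he[b]))  # loci where H_E(a) > H_E(b)
        n_b = sum(y > x for x, y in zip(he[a], he[b]))  # loci where H_E(b) > H_E(a)
        p = chi2_equal_counts(n_a, n_b)
        results[(a, b)] = (n_a, n_b, p, p < threshold)
    return results
```

Loci with equal H<sub>E</sub> in both clusters contribute to neither count in this sketch; how ties were handled in the original analysis is not stated.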

# Multiple vs. Single Domestication Events Test

To test the hypothesis of a single domestication event in Mexico suggested by our phylogenetic analyses (see section Results), we applied the Approximate Bayesian Computation (ABC; Beaumont et al., 2002) method implemented in DIYABC 2.04 (Cornuet et al., 2014). Preliminary tests included comparisons among three scenarios with 3 × 10<sup>6</sup> simulated datasets (1 × 10<sup>6</sup> per scenario) in which the position of the Wild-Sierra Madre Occidental (Wild-SMOCC) clade was evaluated (see section Results, Supplementary Figure S1). Our final estimation included 4 × 10<sup>6</sup> simulated datasets (2 × 10<sup>6</sup> per scenario), with the Wild-SMOCC population fixed as the sister clade of the Wild-Trans-Mexican Volcanic Belt (Wild-TMVB) populations (see section Results). The number of domestication events was tested as follows: multiple events (Scenario 1, Supplementary Figure S2) vs. a single one (Scenario 2, Supplementary Figure S2). The DIYABC approach was also applied to estimate the time at which domestication occurred, as well as other demographic parameters such as effective population size (Ne). A subsample from the SNAPP dataset (279 SNPs) and the scheme of eight clusters were used to set populations in DIYABC (**Figure 2B**). Priors were set as follows: log-uniform distributions across all parameters, Ne ranging from 100 to 100,000 individuals, mutation rate set to 10<sup>−8</sup>–10<sup>−6</sup> across SNPs, and divergence times among populations set to 10–100,000 generations ago (**Table 1**).

We compared the fit of the single vs. multiple domestication event scenarios by estimating their posterior probabilities: with the reference tables obtained from each scenario, we ranked the simulated datasets in order of increasing distance to the observed data, considering direct and logistic approaches (Beaumont et al., 2002; Cornuet et al., 2014). Distance between datasets was based on summary statistics estimated from the empirical and simulated sets. We performed a pre-evaluation step using a principal components analysis (PCA) to ensure that at least one scenario would produce simulated datasets close enough to the empirical data. The PCA was based on a set of 5,000 simulated datasets generated from the parameters' prior distributions (Supplementary Figure S3).
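The direct approach to scenario comparison can be illustrated with a short Python sketch: simulated datasets are ranked by their distance to the observed summary statistics, and the posterior probability of each scenario is approximated by its share among the closest simulations. This is a toy version under stated assumptions, not DIYABC's implementation: the function name, the plain (unnormalized) Euclidean distance and the input layout are hypothetical, and the logistic approach is not covered.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def direct_posterior(observed: Sequence[float],
                     simulations: List[Tuple[str, Sequence[float]]],
                     n_closest: int) -> Dict[str, float]:
    """Approximate scenario posteriors from the n_closest simulated datasets.

    `simulations` holds (scenario_label, summary_statistics) pairs.
    """
    # Rank simulations by increasing distance to the observed summary statistics.
    ranked = sorted(simulations, key=lambda s: math.dist(observed, s[1]))
    counts = Counter(label for label, _ in ranked[:n_closest])
    labels = sorted({label for label, _ in simulations})
    return {label: counts[label] / n_closest for label in labels}
```

With two scenarios labeled, for example, `"single"` and `"multiple"`, the returned dictionary gives the fraction of the retained simulations that came from each scenario.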

# Identifying Candidate Loci

We used the wild and cultivated samples of P. coccineus to identify candidate loci related to domestication, to cultivar diversification, and to natural selection. Before the candidate SNP analysis, an additional filter based on linkage disequilibrium (LD) was applied. To determine the threshold distance beyond which there is no LD, we estimated the inter-variant allele correlations (r<sup>2</sup>) using PLINK 1.9 (Chang et al., 2015). To distinguish LD due to physical distance (bp), r<sup>2</sup> was estimated for SNPs located on the same and on different chromosomes. The distance threshold was set at 3,000 bp, so that SNPs closer than this distance were removed.
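The two steps of this LD filter can be sketched in Python as follows. The first function computes r<sup>2</sup> as the squared correlation of allele counts between two SNPs; the second thins SNPs on one chromosome by physical distance. Both are illustrative assumptions rather than the PLINK code path used in the study: in particular, the greedy left-to-right thinning rule and the function names are hypothetical, since the text does not state which SNP of a close pair was removed.

```python
from typing import List, Optional, Sequence


def r_squared(x: Sequence[Optional[int]], y: Sequence[Optional[int]]) -> float:
    """Squared Pearson correlation of allele counts (0/1/2) at two SNPs.

    Individuals with a missing call (None) at either SNP are skipped.
    """
    pairs = [(a, b) for a, b in zip(x, y) if a is not None and b is not None]
    n = len(pairs)
    if n == 0:
        return 0.0
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    var_x = sum((a - mean_x) ** 2 for a, _ in pairs)
    var_y = sum((b - mean_y) ** 2 for _, b in pairs)
    if var_x == 0 or var_y == 0:
        return 0.0  # a monomorphic SNP carries no LD information
    return cov * cov / (var_x * var_y)


def thin_by_distance(positions: List[int], min_gap: int = 3000) -> List[int]:
    """Greedy thinning on one chromosome (positions sorted in bp).

    A SNP is kept only if it lies at least min_gap bp from the last kept SNP.
    """
    kept: List[int] = []
    for pos in positions:
        if not kept or pos - kept[-1] >= min_gap:
            kept.append(pos)
    return kept
```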

This LD-filtered dataset was analyzed with two different approaches for outlier detection: the R package pcadapt (Luu et al., 2017) and BayeScan 2.1 (Foll and Gaggiotti, 2008). Only loci identified by both the pcadapt and BayeScan methods were considered as candidate loci. Pcadapt detects candidate SNPs assuming that these are outliers with respect to how they are related to population structure. In contrast to population-based approaches, pcadapt does not require grouping individuals into populations and handles admixed individuals (Luu et al., 2017). BayeScan instead uses differences in allele frequencies among predefined populations, in this case the genetic clusters previously established by Admixture.

For both methods, three separate analyses were performed to detect signatures of different types of selective pressure. First, to detect candidate domestication loci, wild and cultivated samples of P. coccineus were included, and feral individuals were removed. In this case, for the pcadapt analysis, only the first principal component was assessed because it explains the difference between wild and cultivated populations (see section Results). Also, an additional SNP filter was applied, with the MAF threshold adjusted to retain only SNPs present in at least five individuals; for this dataset, that is MAF = 0.023. Second, to identify loci related to diversification in the context of domestication, only cultivated samples were analyzed. In the pcadapt analyses, the first six components were assessed because they explain the genetic structure of populations, and the MAF threshold was set to 0.038 to exclude alleles present in fewer than five individuals. Note that, in this case, diversification refers to the phase that follows initial domestication and involves the spread and adaptation to different agro-ecological and socio-cultural environments (Meyer and Purugganan, 2013). Lastly, to detect natural selection signatures, we focused both methods on wild samples. Again, for the pcadapt analyses the first six components were assessed and the MAF threshold was set to 0.055 to exclude SNPs present in fewer than five individuals. In all cases, no additional filter was applied for BayeScan.
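The three MAF thresholds quoted above are consistent with dividing five by the number of individuals in each subset. The short Python check below makes that arithmetic explicit. The rule `5 / N` is inferred from the reported values rather than stated by the authors, and the sample counts (91 wild and 131 cultivated individuals) are those given for the P. coccineus dataset in the Results; the function name is hypothetical.

```python
def maf_threshold(n_individuals: int, min_carriers: int = 5) -> float:
    """Assumed rule: threshold = min_carriers / number of individuals analyzed."""
    return min_carriers / n_individuals


# Sample counts taken from the P. coccineus dataset (91 wild, 131 cultivated).
subsets = [
    ("domestication (wild + cultivated)", 91 + 131),
    ("diversification (cultivated only)", 131),
    ("natural selection (wild only)", 91),
]

for label, n in subsets:
    print(f"{label}: N = {n}, MAF threshold = {maf_threshold(n):.3f}")
# domestication (wild + cultivated): N = 222, MAF threshold = 0.023
# diversification (cultivated only): N = 131, MAF threshold = 0.038
# natural selection (wild only): N = 91, MAF threshold = 0.055
```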

The false discovery rate thresholds applied in pcadapt and BayeScan were 0.005 and 0.05, respectively. To compare how genetic variance is explained by the candidate SNPs and by the LD-filtered data set, PCAs were performed using the SNPRelate package (Zheng et al., 2012).

Using Phytozome's JBrowse, the putative function and tissue of expression of these loci were examined by looking for the annotation of the selected SNPs in the P. vulgaris genome v2.1 (DOE-JGI and USDA-NIFA<sup>1</sup>). For each annotated locus we looked for homologous proteins with the highest similarity in other plants, and examined whether the homologous genes in Glycine max (soybean) were among the domestication-related loci associated with flowering time and seed size in this species (Zhou et al., 2015).

<sup>1</sup> http://phytozome.jgi.doe.gov/

# RESULTS

# Sampling and SNP Genotyping

A total of 296 individuals representing four ecoregions of Mexico (as defined in Instituto Nacional de Estadística, Geografía e Informática (INEGI), Comisión Nacional para el Conocimiento y Uso de la Biodiversidad (CONABIO), and Instituto Nacional de Ecología (INE), 2008) were sampled and successfully genotyped (**Figure 1**). After assembly and SNP discovery, the VDC group dataset contains 241 individuals of P. coccineus, 20 of P. vulgaris and 35 of P. dumosus, 50,273 SNPs, 2.24% mean missing data per individual, and a mean depth per site of 58.63. The P. coccineus dataset includes 242 individuals (91 wild; 20 feral; 131 cultivated), 42,548 SNPs, 3.97% mean missing data per individual, and a mean depth per site of 50.41.

# Inferring Population Structure and Phylogenetic Relationships

The K value with the lowest cross-validation error in the Admixture analysis was eight (Supplementary Figure S4). Half of the genetic groups correspond to the cultivars from the Trans-Mexican Volcanic Belt (Cult-TMVB), Sierra Madre del Sur and Chiapas Highlands (Cult-SUR-CH), Sierra Madre Occidental (Cult-SMOCC) and Oaxaca Valley (Cult-OV). The other half belong to wild populations from the Trans-Mexican Volcanic Belt (Wild-TMVB), Sierra Madre del Sur and Chiapas Highlands (Wild-SUR-CH), Sierra Madre Occidental (Wild-SMOCC) and the subsp. striatus population, located in the TMVB (Wild-striatus; **Figure 2**). The genetic clusters seem to be related to geographic distances (**Figure 1**), except for the Wild-striatus population, which is geographically close to populations of P. coccineus subsp. coccineus but seems genetically isolated. Samples from the Spanish population (**Figure 2B**, triangle) were assigned to the Cult-TMVB genetic group, but unlike the other individuals of this cluster, samples from Spain do not present mixed ancestry. Samples of the breeding line Blanco Tlaxcala (**Figure 2B**, circle) grouped with landraces from the Cult-SMOCC cluster.

The phylogenetic hypotheses constructed with FastTree and SplitsTree (**Figures 2A,C**) are consistent with the Admixture genetic groups (**Figure 2B**). Nevertheless, both analyses recovered the Wild-TMVB group as paraphyletic. The ML topology revealed a finer-scale structure, identifying three paraphyletic clades within this genetic cluster, with the Wild-striatus cluster as a nested clade differentiated from the rest of the Wild-TMVB group (**Figure 2**). Remarkably, the domesticated populations form a statistically well-supported monophyletic clade, suggesting a single domestication event for the Mexican populations. Nevertheless, these phylogenetic hypotheses do not allow us to identify the gene pool from which domestication took place, although the Wild-SUR-CH genetic cluster can be ruled out.

The ML and Neighbor-Net topologies in which P. dumosus and P. vulgaris were included positioned P. dumosus as the sister group of P. coccineus (**Figure 2A**). However, the SplitsTree method indicated a basal reticulate pattern among P. dumosus, P. coccineus, and P. vulgaris (**Figure 2C**), suggesting ancestral but not recent gene flow. Furthermore, there is no evidence of recent gene flow between wild and cultivated groups, only within genetic clusters (**Figure 2C**).

In the SNAPP cloudgram (**Figures 3B,C**), 53 distinct topologies make up the 95% HPD set, indicating a different divergence pattern in which Wild-TMVB populations are the closest clade to the domesticated group. Nevertheless, the complex assignment of individuals within Wild-TMVB and Wild-striatus is reflected in an unresolved pattern within the cloudgram, as well as in low nodal support values in the consensus topology (**Figure 3C**). Despite these inconsistencies between the ML and Neighbor-Net topologies and the SNAPP topology, all hypotheses favor a single domestication event.

Regarding the ABC-based computations, the model comparisons in preliminary trials indicated that scenarios in which the Wild-SMOCC population is paraphyletic to Wild-TMVB yielded a higher probability in both the direct and logistic approaches (Supplementary Figures S1, S2). A final test indicated that the most likely scenario was a single domestication event, with the Wild-TMVB group being the closest to the domesticated clade (**Figure 4**; Scenario 2, direct P = 0.786, logistic P = 1.0), which is congruent with the results of the SNAPP phylogenetic analyses. Evaluation of the posterior predictions via PCA indicated that parameter values and summary statistics from the simulated datasets based on Scenario 1 closely matched the empirical data (Supplementary Figure S3).

# Wild and Domesticated Population Genetics Statistics

High levels of genetic diversity were found in wild and cultivated populations (**Figure 5**). At the genetic cluster level, the Wild-TMVB group presented the highest diversity and the Cult-OV group the lowest. No clear pattern in the amount of diversity was observed between wild and cultivated clusters. There were cultivated groups with high genetic variance (Cult-SUR-CH and Cult-TMVB), and wild clusters that presented lower diversity than cultivated populations (Wild-SMOCC). At the location level (Supplementary Table S2), the samples from Spain (H<sub>E</sub> = 0.134) and Oaxaca Valley (H<sub>E</sub> = 0.148) presented the lowest diversity, and the highest was found in the wild population located in Tlalpan, Mexico City (H<sub>E</sub> = 0.208). Regarding species, P. coccineus showed the highest diversity and P. dumosus the lowest.

Notably, H<sub>O</sub> was greater than H<sub>E</sub> in all the genetic groups except the Wild-SUR-CH cluster, resulting in negative values of FIS. Among the groups with an excess of observed heterozygosity, Wild-striatus had the lowest inbreeding coefficient (**Figure 5**). In contrast, at the species level P. vulgaris showed a deficit of heterozygotes and thus a high FIS. The inbreeding coefficient is positive when estimated across all P. coccineus samples. This is caused by the Wahlund effect, which is the reduction of heterozygosity due to subpopulation structure. Regarding the pairwise differentiation index, FST values ranged from 0.022 (Cult-TMVB vs. Cult-SMOCC) to 0.178 (Cult-OV vs. Wild-striatus; **Figure 6**). As expected, pairwise FST values are greater between wild genetic groups than between cultivated genetic clusters (**Figure 6**).
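The sign of FIS and the Wahlund effect mentioned above can be illustrated with a small numerical sketch in Python. It uses the standard definition FIS = 1 − H<sub>O</sub>/H<sub>E</sub> and a toy case of equally sized subpopulations that are each in Hardy-Weinberg equilibrium; the allele frequencies and function names are hypothetical and are not taken from the study's data.

```python
from typing import Sequence


def f_is(h_obs: float, h_exp: float) -> float:
    """Inbreeding coefficient: negative when observed heterozygosity exceeds expected."""
    return 1 - h_obs / h_exp


def pooled_f_is(freqs: Sequence[float]) -> float:
    """F_IS obtained when equally sized subpopulations are pooled into one sample.

    `freqs` holds the frequency p of one allele in each subpopulation; every
    subpopulation is assumed to be in Hardy-Weinberg equilibrium on its own.
    """
    # Observed heterozygosity of the pooled sample: mean of the within-group 2pq.
    h_obs = sum(2 * p * (1 - p) for p in freqs) / len(freqs)
    # Expected heterozygosity from the pooled allele frequency.
    p_bar = sum(freqs) / len(freqs)
    h_exp = 2 * p_bar * (1 - p_bar)
    return f_is(h_obs, h_exp)
```

For two subpopulations with p = 0.2 and p = 0.8, the pooled sample has H<sub>O</sub> = 0.32 and H<sub>E</sub> = 0.50, giving FIS = 0.36 even though neither subpopulation is inbred; with identical frequencies the pooled FIS is zero.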

Cultivated populations of P. coccineus show smaller effective population sizes than wild populations. In some cases, as in Cult-TMVB and Cult-SMOCC, Ne was one order of magnitude smaller than in the rest of the populations. In contrast, the genetic cluster Wild-SUR-CH had the largest Ne (**Table 1**). The most recent split, between the Cult-SMOCC and Cult-TMVB clusters, was estimated at 3.9 × 10<sup>3</sup> generations ago. The oldest split, between Wild-SUR-CH and the rest of the P. coccineus clade, was dated to 4.95 × 10<sup>5</sup> generations ago. The split that separates wild and domesticated samples was dated to about 2.1 × 10<sup>4</sup> generations ago (**Figure 4**). Since P. coccineus is usually treated as an annual when cultivated, this corresponds to about 21,000 years. In wild, perennial plants, one generation could last more than a year.

FIGURE 4 | Best-fitted domestication scenario of P. coccineus obtained with DIYABC. Split times in generations (tn) indicate the average posterior value estimated after Bayesian computations (95% CI).

# Identifying Candidate Loci

Before LD filtering, the mean r<sup>2</sup> value among SNPs located on the same chromosome and separated by a maximum distance of 10,000 bp was 0.151. After eliminating SNPs closer than 3,000 bp, the mean r<sup>2</sup> was 0.063 (Supplementary Figure S5). For SNPs on different chromosomes, the mean r<sup>2</sup> was 0.022. This residual LD is not due to physical proximity but rather to factors such as population structure. Interestingly, the pattern of LD decay differed between genetic groups, with the fastest decay and lowest r<sup>2</sup> in cultivated and wild populations from the TMVB. Meanwhile, Wild-striatus, Wild-SUR-CH and Cult-OV had the slowest LD decay and highest r<sup>2</sup> values (Supplementary Figure S5). After filtering, the data set for candidate loci contained 11,693 SNPs distributed across the 11 chromosomes. In the central region of most of the chromosomes, there is a reduction in SNP density, probably due to centromeres (Supplementary Figure S6).

Using the pcadapt package, 47 SNPs were identified as candidate domestication loci, 342 as involved in cultivar diversification, and 1,030 as potentially under natural selection. Despite the large number of candidate SNPs identified, few are shared among selection types (Supplementary Figure S7). In the BayeScan analyses, 469 candidate SNPs for domestication were identified, 16 related to cultivar diversification, and 12 associated with natural selection. None of these SNPs were shared among the three BayeScan analyses.

Twenty-four SNPs related to domestication, 13 to cultivar diversification and eight to natural selection were detected by both approaches and considered as candidate loci for further analyses (Supplementary Table S3). The genetic variance explained by the candidate SNPs differed dramatically from that explained by the 11,693 SNPs used previously (**Figure 7**). Notably, the genetic and geographic structure of wild and cultivated groups can be recovered with these few candidate SNPs (**Figures 7B,C**), and a clear separation of wild and domesticated populations is observed (**Figure 7A**).

Four of the candidate domestication SNPs, one of the candidate SNPs under natural selection and none of the candidate SNPs for cultivar diversification were found to be annotated in the P. vulgaris genome (Supplementary Table S3). Three of the annotated candidate domestication loci (Phvul.001G232200, Phvul.007G256000, Phvul.009G156400) are highly expressed in flowers, flower buds or young pods, and the remaining locus (Phvul.002G145600) is highly expressed in green mature pods. All these loci have their highest-similarity homologs in the G. max genome v2.0 (Schmutz et al., 2010), but none of these correspond to the domestication-related loci previously identified by Zhou et al. (2015). The annotated candidate locus for natural selection (Phvul.003G197500) is highly expressed in roots and stem and corresponds to a calmodulin binding protein-like gene, which also has a homolog in G. max.

# DISCUSSION

# A Single Domestication Event for Mexican P. coccineus in the TMVB

Spataro et al. (2011) and Rodriguez et al. (2013), using SSR data, suggested two domestication events of P. coccineus, one in Mexico and the other in Guatemala-Honduras. The genomic data generated in this work indicate a single domestication event for the cultivated populations from Mexico (**Figures 2**, **3**). This includes Chiapas populations (Cult-SUR-CH), which are geographically and culturally closer to Guatemala than to Central and Northern Mexico. However, no samples from Guatemala and Honduras were included; therefore, a second domestication event in this area cannot be ruled out with the present data. Nonetheless, based on the results from the SNAPP and DIYABC analyses, we were able to identify Wild-TMVB as the gene pool from which domestication started in Mexico (**Figure 3**).

distinguish cultivar diversification-related loci. (C) Analysis of wild samples to detect signatures of natural selection.

The most recent divergence time, which corresponds to the separation between the cultivated groups of SMOCC and TMVB, was dated to 3,950 generations ago (**Figure 4**, t1). Assuming one generation per year in cultivated populations, this represents 3,950 years. However, divergence between the cultivated and wild clades was dated to 21,000 generations ago (**Figure 4**, t5). This date predates any known plant domestication event and seems unlikely. Several evolutionary processes may affect these estimates. Processes like selection, population subdivision and incomplete lineage sorting may result in an overestimation of divergence times because they increase the time to coalescence, that is, the time it takes for two sequences to find their common ancestor (Albrechtsen et al., 2010; Angelis and Dos Reis, 2015). In P. coccineus, the selection exerted by humans during domestication and the high population structure in wild and domesticated groups have probably resulted in overestimated divergence times.

Also, it has to be considered that wild populations are perennial and thus generation times may be longer than a year.

The genetic findings suggest that P. coccineus domestication likely started from TMVB material, pinpointing the domestication of this species to a particular region within the large Mexican territory where it is cultivated nowadays. Other sources of information could be incorporated to confirm this, using our findings as a geographic reference. If confirmed, identifying the TMVB as the area where domestication started for this species is interesting and important from an evolutionary, cultural and conservation perspective. The TMVB is the most recent mountainous region of Mexico and a biodiversity hotspot, and it has a complex bio- and phylogeographic history characterized by a sky-island dynamic during the last 2 Myr (Mastretta-Yanes et al., 2015). Culturally, it became prominent during the Mexica Empire, and it has been the most populated part of Mexico since shortly before the Spanish conquest (Bataillon, 1972). This has led to other important cases of domestication in this region. For instance, this central region was where the introgression of Zea mays ssp. parviglumis and Z. mays ssp. mexicana occurred during the domestication of maize (van Heerwaarden et al., 2011). However, human occupation of this area is also a concern for conservation, because the growth of urbanization and high-input agriculture threatens both P. coccineus landraces and wild populations (CONABIO and IUCN, 2016).

Besides genetic data, a Mexican domestication origin of P. coccineus is also supported by the several names that this bean has among different cultures. For instance, it is called tekómari in Chihuahua (Tarahumara indigenous language); tasukhu in Hidalgo and Puebla (Otomi); ayocote in central states of Mexico (Nahuatl); shaushana or xaxana in Veracruz (Totonaco); ma-mája (Mazateco) in Oaxaca; and botil or shbotil chenec in Chiapas (Tzeltal) (Salinas, 1988). Associated with these groups, there is also considerable traditional knowledge regarding the cultivation and use of P. coccineus (e.g., Monroy and Quezada-Martínez, 2010).

# Historic and Recent Gene Flow among Wild, Feral and Domesticated Populations

The individuals identified as feral clustered in the domesticated clade (**Figure 2A**), suggesting that they are escaped cultivars. This questions the hypothesis of a hybrid origin between wild and cultivated populations (Salinas, 1988) and contrasts with previous studies of feral P. vulgaris populations in Mexico (Papa and Gepts, 2003), where weedy populations appear to be genetically intermediate between domesticated and wild populations rather than cultivar escapees. Interestingly, the three collected feral populations belonged to the same genetic cluster (Cult-SUR-CH) and presented high levels of mixed ancestry, of which only a small proportion corresponds to wild clusters (**Figure 2B**). Since little evidence of gene flow was found in SplitsTree (**Figure 2C**), the mixed ancestry is probably due to shared polymorphisms or ancestral gene flow, rather than recent introgression events.

The breeding line Blanco Tlaxcala grouped with SMOCC landraces. Breeding practices have probably acted on specific regions rather than on the whole genome. The individuals of this breeding line did not present mixed ancestry, even though Blanco Tlaxcala was developed using a multi linear method (Vargas-Vázquez et al., 2012). This suggests that all lines used to generate Blanco Tlaxcala belonged to the same genetic cluster (Cult-SMOCC), and that they were subjected to several rounds of strong selection, decreasing genetic variation.

Contrary to what was reported by Spataro et al. (2011) and Rodriguez et al. (2013), samples from Spain clustered within the TMVB landraces, indicating that this European population originated from the introduction of individuals of the Cult-TMVB group into Spain. Nevertheless, because just one European population was analyzed, no general pattern can yet be inferred. Notably, Spanish samples did not present mixed ancestry, whereas the rest of the individuals of this genetic group did (**Figure 2B**). The genetic bottleneck that gave rise to European populations, together with their isolation from wild relatives and American landraces, has probably decreased the amount of shared ancestral polymorphism between cultivars from the TMVB and Spain.

It has been suggested that hybridization and introgression have played a major role in P. coccineus evolution, both in cultivated and wild populations (Escalante et al., 1994; Angioi et al., 2009; Spataro et al., 2011; Rodriguez et al., 2013). Our results showed mixed ancestry in both wild and cultivated clusters. However, little evidence of introgression and hybridization was detected, and mixed ancestry can also be due to shared ancestral polymorphisms. Nevertheless, wild and cultivated populations frequently coexist; therefore, hybridization cannot be ruled out, and a formal test considering the number and size of introgressed regions and the direction of gene flow is still needed.

# Phaseolus coccineus Is Highly Diverse and Structured

Phaseolus coccineus wild populations are divided into four genetic clusters that show considerable population differentiation. Similar levels of differentiation have been observed in several other highland species, which has been related to the high environmental variability and the complex geologic and climatic history of Mexico (Mastretta-Yanes et al., 2015). The extent of this differentiation in crop wild relatives has mostly been assessed with low-resolution neutral markers (Bellon et al., 2009; Piñero et al., 2009), so it still needs to be further explored with genomic data. However, the present study and analyses in teosinte (van Heerwaarden et al., 2011; Aguirre-Liguori et al., 2017) highlight the high diversity contained in the gene pools of crop wild relatives from Mexico.

Besides the diversity contained in wild relatives, one of the most important determinants in crop evolution is the level of genetic diversity contained in the domesticated populations, especially with reference to the wild ancestral gene pool. Genetic diversity reduction has been widely described in crop domestication (Hufford et al., 2013; Li et al., 2013; Schmutz et al., 2014; Renaut and Rieseberg, 2015). This reduction of genetic diversity is caused by genetic drift resulting from population bottlenecks, and by artificial selection (Gepts, 2014). This phenomenon was also described in P. vulgaris (Schmutz et al., 2014), but in P. coccineus no clear pattern of genetic reduction was found between the wild and cultivated genetic groups (**Figure 5**). The Wild-TMVB cluster presented the highest genetic variation, followed by the Cult-TMVB and Cult-SUR-CH groups. In contrast, the Cult-OV and Wild-SMOCC clusters showed the lowest H<sub>E</sub>. Effective population sizes were greater in wild than in cultivated genetic clusters, which is expected given the genetic bottlenecks associated with the domestication process. Nevertheless, Ne estimates for domesticated groups are in the order of 10<sup>3</sup>–10<sup>4</sup>. Taken together, these results suggest that the genetic bottleneck during domestication was not severe. Other factors that may favor the maintenance of genetic diversity in P. coccineus are its high outcrossing rate (Escalante et al., 1994) and the fact that the genetic cluster from which domestication started (Wild-TMVB) presents the highest diversity. Little evidence of recent gene flow was detected, but early gene flow could also have contributed to the genetic diversity of cultivars.

Analyzing the genetic variance at the location level, Spanish samples presented the lowest diversity (Supplementary Table S2), which may be due to the recent demographic bottleneck that occurred during its introduction to Europe. Nevertheless, Oaxaca Valley also showed low genetic variation (Supplementary Table S2) and the ancestry analysis (**Figure 2B**) suggests that it has been genetically isolated from the other genetic clusters.

Regarding the inbreeding coefficient, the wild and cultivated genetic clusters presented negative FIS values, indicating an excess of heterozygotes, except in the Wild-SUR-CH group. A possible explanation for this pattern is inbreeding depression, whose effect on progeny has been studied in cultivars from Spain, where selfing was found to affect germination, survival rate and seed weight (González et al., 2014). Also, a negative correlation was found between outcrossing rate and seed abortion in wild populations studied by Escalante et al. (1994). In the case of domesticated populations, the bottlenecks that they suffered during domestication may promote the accumulation of deleterious alleles and the increase of inbreeding depression, resulting in lower values of the inbreeding coefficient (Morrell et al., 2011). Contrary to expectation, in P. coccineus the population with the lowest FIS was a wild cluster (Wild-striatus). This population was previously studied by Búrquez and Sarukhán (1984), who found evidence of self-incompatibility, which is congruent with our results. A possible explanation for this pattern is the accumulation of deleterious alleles in the Wild-striatus cluster. Notably, no mixed ancestry was detected in this genetic group, indicating that it is genetically isolated from other populations despite being geographically close to other wild and cultivated TMVB populations. It is necessary to evaluate other populations of P. coccineus subsp. striatus to know whether this is a common pattern and to explore its ecological and genetic causes and consequences.

# Adaptive Variation in Wild and Domesticated Populations

Mexico is an environmentally and culturally heterogeneous country, which favored crop genetic diversity. The distribution of Phaseolus, both cultivated and wild, involves an interaction with a wide range of different cultures, and isolated populations are exposed to diverse environmental conditions. For example, compared to P. vulgaris, P. coccineus grows in more humid environments, at cooler temperatures and at higher altitudes. Nevertheless, there are few studies that aim to elucidate the genetic basis of adaptation, especially for the wild populations of Phaseolus crop species (Bitocchi et al., 2017). Our outlier analyses listed some candidate SNPs that could be under artificial selection during the domestication and diversification stages, and others that could be under natural selection. Although most of these outliers are still not annotated, they could serve as a basis for identifying population differentiation in adaptive variation, which is a necessary step for the conservation of genetic resources and crop wild relatives (Maxted et al., 2012). Our study is based on GBS data, so the P. coccineus genome is not fully saturated, and there are likely loci under selection that we did not sample. Nevertheless, this set of outliers is a first approximation for identifying candidate loci for domestication and natural selection in runner bean.

The fact that no loci overlapped between domestication, diversification and natural selection categories shows that different selective processes were detected. This is to be expected because, in general, loci under natural selection and artificial selection related to domestication and diversification are expected to differ across the genome (Meyer and Purugganan, 2013).

The loci involved in domestication are expected to be especially related to the phenotypic changes of the domestication syndrome (Koinange et al., 1996), that is, modifications in morphological and physiological traits like seed dispersal, seed dormancy, gigantism, increased harvest index and flowering time (Hammer, 1984). Most of the domestication-related loci identified here are still of unknown function, but the four that are annotated are highly expressed in flowers or pods (Supplementary Table S3). This is interesting because in soybean, another legume, several domestication-related loci associated with flowering time have been identified (Zhou et al., 2015). However, no overlap was found between those loci and the ones identified here.

# CONCLUSION

The SNPs generated in this work provided high resolution data to understand the domestication of P. coccineus. Results suggest one domestication event for Mexico, which started from the wild genetic pool from TMVB. Furthermore, wild and domesticated populations are highly diverse and presented high values of Ne, suggesting that the demographic bottleneck due to domestication was not severe. These genomic analyses highlight how the genetic signatures of domestication can be substantially different even between species of the same genus domesticated in the same geographic area. Common bean and scarlet runner bean are closely related species; nevertheless, their reproductive strategies and domestication histories seem to be different: P. vulgaris tends toward selfing, which theoretically facilitates the domestication process, and it also suffered a severe domestication bottleneck. In contrast, P. coccineus is an open-pollinated species that presents high levels of genetic diversity and population structure, and its domestication did not result in a strong demographic bottleneck.

Our findings also show that both wild and domesticated populations of P. coccineus are highly structured. Most of the genetic clusters presented an excess of heterozygotes, showing evidence of inbreeding depression. Interestingly, the population identified as P. coccineus subsp. striatus shows the greatest excess of heterozygotes and seems to be genetically isolated from other wild and cultivated populations. In contrast with previous studies, our data show that gene flow within and between wild and cultivated populations is not a common process. Fully testing this requires further research.

The levels of diversity and population differentiation found here support the view that the runner bean is a potential source of variability for several traits for plant breeding (Schwember et al., 2017). The data presented here highlight that a better characterization of P. coccineus wild and cultivated forms still requires more sampling, especially of Central American populations. Complete and annotated genomes of Phaseolus and other legume crops will not only facilitate comparative genomics, but will also give a better knowledge of the evolution and domestication of this group of plants, which has been independently domesticated by several human groups across its distribution.

# AUTHOR CONTRIBUTIONS

AG-G, DP, and AD-S designed the study. AG-G performed the molecular procedures. AG-G, AM-Y, and MS-A conducted the analyses. All authors revised the results and wrote the manuscript.

# FUNDING

This work was supported by Consejo Nacional de Ciencia y Tecnología through the Ph.D. scholarship number 440709 to AG-G and CONACYT Grant 247730 to DP.

# ACKNOWLEDGMENTS

We thank Idalia Rojas, Myriam Campos, Erick García, Verónica González, Alfredo Villarruel, Nancy Gálvez, and Rocío González for fieldwork assistance, Tania Garrido for laboratory technical assistance and Ernesto Campos Murillo for bioinformatic assistance in executing analyses in a cluster environment. We acknowledge funding from the CONACYT grant number 247730 and IEUNAM to DP. Statistical analyses were carried out on CONABIO's computing cluster, which was partially funded by Secretaría de Medio Ambiente y Recursos Naturales (SEMARNAT) through the grant "Contribución de la Biodiversidad para el Cambio Climático" to CONABIO. This work constitutes a partial fulfillment of the Posgrado en Ciencias Biológicas at the Universidad Nacional Autónoma de México (UNAM) for AG-G. Finally, we thank all the farmers who shared their seeds and knowledge with us.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2017.01891/full#supplementary-material






**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling Editor declared a shared affiliation and past co-authorship, though no other collaboration, with the authors and states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2017 Guerra-García, Suárez-Atilano, Mastretta-Yanes, Delgado-Salinas and Piñero. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Diversity of Treegourd (*Crescentia cujete*) Suggests Introduction and Prehistoric Dispersal Routes into Amazonia

Priscila A. Moreira<sup>1</sup>\*, Xitlali Aguirre-Dugua<sup>2</sup>, Cédric Mariac<sup>3</sup>, Leila Zekraoui<sup>3</sup>, Marie Couderc<sup>3</sup>, Doriane P. Rodrigues<sup>4</sup>, Alejandro Casas<sup>2</sup>, Charles R. Clement<sup>5</sup> and Yves Vigouroux<sup>3</sup>

<sup>1</sup> Post–Graduate Program in Botany, Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil, <sup>2</sup> Centro de Investigaciones en Ecosistemas, Universidad Nacional Autónoma de México, Morelia, Mexico, <sup>3</sup> Institut de Recherche pour le Développement, Université de Montpellier, UMR DIADE, Montpellier, France, <sup>4</sup> Laboratório de Evolução Aplicada, Universidade Federal do Amazonas, Manaus, Brazil, <sup>5</sup> Coordenação de Tecnologia e Inovação, Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil

#### *Edited by:*

B. Mohan Kumar, Nalanda University, India

#### *Reviewed by:*

Milton Kanashiro, Embrapa Amazonia Oriental (Embrapa Eastern Amazon), Brazil; Shabir Hussain Wani, Michigan State University, United States; K. S. Rao, University of Delhi, India

> *\*Correspondence:* Priscila A. Moreira pri.ambrosio@hotmail.com

#### *Specialty section:*

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Ecology and Evolution

> *Received:* 30 June 2017 *Accepted:* 14 November 2017 *Published:* 29 November 2017

#### *Citation:*

Moreira PA, Aguirre-Dugua X, Mariac C, Zekraoui L, Couderc M, Rodrigues DP, Casas A, Clement CR and Vigouroux Y (2017) Diversity of Treegourd (Crescentia cujete) Suggests Introduction and Prehistoric Dispersal Routes into Amazonia. Front. Ecol. Evol. 5:150. doi: 10.3389/fevo.2017.00150

The use and dispersal of domesticated plants may reflect patterns of early human diffusion of technologies and lifestyles. Treegourd (Crescentia cujete) has fruits with ancient utilitarian and symbolic value in the Neotropics. We assessed diversity based on chloroplast (SNP) and nuclear (SSR) markers and on fruit shapes of cultivated treegourds and wild relatives across Amazonia and Mesoamerica in order to discuss hypotheses of dispersal routes and diversification of fruits along its distribution. The haplotype network showed three distinct groups: Crescentia amazonica, wild Mesoamerican C. cujete, and cultivated C. cujete from Brazilian Amazonia and Mexico. Mexico and Brazil shared two haplotypes, with slightly different distributions in Amazonia. The most divergent haplotype is well-represented in Eastern Amazonia. Nuclear differentiation between Mesoamerican wild and cultivated C. cujete is relatively low (FST = 0.35), compared with that between Mesoamerican wild and Amazonian cultivated C. cujete (FST = 0.45–0.61). Differentiation is also higher between wild C. amazonica and cultivated C. cujete (FST = 0.57), but modest within cultivated C. cujete from Amazonia and Mexico (FST = 0.04), with higher genetic similarity in northwestern Amazonia. Mexico and Amazonia showed similar chloroplast nucleotide diversity (4.66 × 10<sup>−2</sup> and 5.31 × 10<sup>−2</sup>, respectively), although sample sizes are very different. Except in Northwestern and Eastern Amazonia, we found ample genetic homogeneity of cultivated C. cujete across Amazonia, but highest morphological diversity in the Northwest, with fruit shapes that are absent in Mexico. We conclude that treegourds introduced into the Amazon Basin and Mexico share a common ancestry with a currently unknown origin. The patterns of genetic diversity across Amazonia allow two hypotheses of the routes of introduction: a northwestern introduction into the Negro and Solimões Rivers, and an eastern introduction from the coastal Guianas into the Amazonas River. The dispersal into Amazonia followed previously proposed routes of human and plant migrations. The contrasting fruit shape diversity suggests different utilitarian demands and cultural preferences for treegourd fruits between Mexico and Amazonia.

Keywords: Bignoniaceae, calabash, cuia, domestication, ethnobotany, historical ecology, phylogeography

# INTRODUCTION

The use and dispersal of domesticated plants may reflect the patterns of diffusion of human technologies and lifestyles since prehistoric times (Bellwood, 2005; Blench, 2012). Humans greatly expand plants' geographical distributions, which ultimately exerts different ecological and cultural pressures on the evolutionary pathways of plants (Rindos, 1984; Sodero Martins, 2005; Leclerc and Coppens d'Eeckenbrugge, 2012; Meyer and Purugganan, 2013). Various centers of domestication have been proposed in the Americas (Meyer et al., 2012) from where people exchanged plants (Heiser, 1965; Schultes, 1984; Colunga-GarcíaMarín and Zizumbo-Villarreal, 2004). Amazonia is one of them (Clement, 1999), and also encompasses great linguistic diversity (Blench, 2012), ceramic styles (Barreto et al., 2016), and landscape management strategies (Eriksen and Danielsen, 2014), whose geography and chronology are being disentangled (Mayle and Iriarte, 2014; Clement et al., 2015; Neves, 2016; Levis et al., 2017). The Amazonian routes of dispersal of plants and people have been associated with rivers and riparian environments (Schultes, 1984; Godoy et al., 1999; Guix, 2009). However, few studies have demonstrated the genetic signatures of the plants' geographical dispersal mediated by humans in Amazonia (Clement et al., 2010; Shepard and Ramirez, 2011; Thomas et al., 2012; Freitas and Bustamante, 2013), even though they are persistent markers of the long-term use and management of resources (Hanotte et al., 2002; Parker et al., 2010; Armstrong et al., 2017).

Treegourd (Crescentia cujete) is a good case study, since its trees produce fruits with ancient utilitarian and symbolic value widely dispersed across the Neotropics (Gentry, 1980; Arango-Ulloa et al., 2009; Meulenberg, 2011; Aguirre-Dugua et al., 2013; Medeiros and Albuquerque, 2014; Moreira et al., 2017). It is currently one of the most common species in homegardens of the floodplains and adjacent communities of Amazonia (Santos, 1982; Lima and Saragoussi, 2000). Its fruits have different shapes and sizes and are used as bowls, vessels or bottles for drinking or transporting water, as bags for provisions, as utensils for cooking and eating, for bailing water from canoes, and in the construction of fish traps and the manufacture of body ornaments and musical instruments (Steward, 1948; Patiño, 1967; Morton, 1968; Price, 1982; Bennett, 1992; Heiser, 1993; Meulenberg, 2011). Medicinal uses are also similar across its distribution (Morton, 1968; Duke, 2009), and include neutralization of snake venom and treatment of intestinal parasites (Otero et al., 2000; Volpato et al., 2009; Ramos, 2015; Paulo, 2016).

While the great phenotypic variability of cultivated treegourd is a distinctive feature among Crescentia species (Gentry, 1980), its wild populations from Mexican savannahs in the Yucatan Peninsula have smaller, elongated fruits with thinner exocarps (Aguirre-Dugua et al., 2012). The indehiscent and thicker exocarp of cultivated treegourd fruits makes the spontaneous dispersal of seeds impossible (Aguirre-Dugua et al., 2012). Its oldest remains found to date come from a Peruvian archaeological site dating to 5,000–3,800 years BP (Solis, 2006). This pattern contrasts with that of the bottle gourd (Lagenaria siceraria), which is collected from a vine and is one of the ancient crops similarly used for technological purposes in the Americas (Heiser, 1993). Bottle gourd has been managed at least since the Late Pleistocene (Kistler et al., 2014) and was found in the Colombian Amazon by 8,000 BP (Piperno, 2011).

The wild progenitor of the cultivated Crescentia cujete remains elusive (Gentry, 1980; Arango-Ulloa et al., 2009; Aguirre-Dugua et al., 2012; Moreira et al., 2017). Gentry (1980) pointed out that C. cujete was certainly native to Mesoamerica, where putative wild populations are found in savannahs and semi-evergreen forests of southern Mexico and northern Central America (**Figure 1**). However, northern South America cannot be ruled out as part of the original distribution area of wild C. cujete, given the occurrence of apparently spontaneous C. cujete in grazed savannahs of Andean and Caribbean regions of Colombia (Arango-Ulloa et al., 2009). Historical anthropogenic fire management in savannahs (Pinter et al., 2011) may have been advantageous for its early dispersal (Bass, 2004) in these regions. Recently, the wild species native to Amazonian and Orinocan floodplains (Crescentia amazonica) was ruled out as the wild progenitor of cultivated C. cujete (Ducke, 1946; Moreira et al., 2017). Likewise, the wild C. cujete populations found in southeastern Mexico are not the wild progenitor either (Aguirre-Dugua et al., under revision).

In this study, we infer treegourd dispersal and diversification across two pivotal regions of the Neotropics: Amazonia and Mesoamerica. We (1) identify genetic relationships among Mesoamerican and Amazonian cultivated C. cujete; (2) infer routes of introduction into and dispersal within the Amazon Basin; and (3) identify centers of morphological and genetic diversity. We discuss whether this genetic/morphological diversity is linked to (1) introgression with local wild parents, (2) ecological diversification, or (3) cultural diversification, since all three of them are possible along the dispersal routes.

# MATERIALS AND METHODS

# Sampling

We performed molecular analyses using full chloroplast (SNPs) and nuclear (SSR) markers. We also analyzed fruit morphology along the major rivers of Brazilian Amazonia and in parts of Mesoamerica (Supplementary Table S1). We used a previously published genetic and morphological dataset (Moreira et al., 2017) of cultivated C. cujete (N = 372) distributed in 122 localities along the five major rivers of the Brazilian Amazon basin, as well as wild Brazilian treegourds (C. amazonica) (N = 20) distributed in three of the rivers mentioned (**Figure 1**).

From Mexico, we add new genetic data of cultivated C. cujete from the Yucatan Peninsula, Oaxaca and Chiapas, wild samples from the Yucatan savannahs and a putative wild sample from Costa Rica (**Figure 1**). We also integrate morphological data from Mesoamerican samples (N = 188), part of which (N = 124) was published previously (Aguirre-Dugua et al., 2013). All Mesoamerican wild samples were identified as C. cujete Linnaeus 1753. In order to depict the putative geographical distribution of wild C. cujete, we searched for individuals of C. cujete described as spontaneous in savannahs on herbarium descriptions found in GBIF (Global Biodiversity Information Facility) (**Figure 1**).

This research followed the International Society for Ethnobiology's code of ethics (International Society of Ethnobiology, 2006) and was approved by the Committee for Ethics in Research with Human Beings of the National Research Institute for Amazonia (CEP INPA, proc. no. 408.611, 2013). Collection in Brazil was authorized by the Brazilian System for Authorization and Information in Biodiversity, Chico Mendes Institute for Biodiversity Conservation, proc. no. 25052–1, 2012, and transportation by the Brazilian Institute for the Environment and Renewable Natural Resources, proc. no. 14BR015576/DF, 2014. Collection in Mexico and Costa Rica was authorized by proc. no. SGPA/DGGFS/712/3691/10.

# Genetic Analysis

We used a previously described protocol for genotyping nuclear microsatellites and the detection of single nucleotide polymorphisms along the entire sequence of the maternally inherited chloroplast genome (Moreira et al., 2016, 2017). In total, 250 samples were genotyped for eight nuclear microsatellites (SSR): 234 from Brazilian Amazonia (215 cultivated C. cujete and 19 wild C. amazonica), and 16 from Mesoamerica (7 cultivated C. cujete from Mexico, 8 wild C. cujete from Mexico, 1 wild C. cujete from Costa Rica). Data from the chloroplast genome were obtained from a total of 215 samples: 191 C. cujete and 16 C. amazonica from Amazonia, 5 cultivated C. cujete from Mexico, 2 wild C. cujete from Mexico, 1 wild C. cujete from Costa Rica. Among the total sample (N = 250), 80% were genotyped and sequenced for both kinds of markers.

The nuclear SSR dataset was used to assess population structure with a Bayesian approach (Structure 2.3, Pritchard et al., 2000). We applied the admixture model in order to identify ancestral population proportions for each individual and their probable populations of origin. Using total sampling and assuming independent allele frequencies in each population, which reduces the risk of overestimating the number of clusters (Pritchard et al., 2000), we assessed the number of clusters K varying from 1 to 20, with 100,000 burn-in, 100,000 iterations, and five different runs for each K value. To attempt to identify different genetic pools within the cultivated cluster, we performed an additional analysis on a subset including only cultivated C. cujete samples whose membership probability was higher than 0.6 in the cultivated cluster (N = 200). Using the admixture model, we experimented with two allele frequency assumptions (Pritchard et al., 2000): the independent model as default; and the correlated model (assuming lambda = 1), since it is likely that cultivated populations share ancestry due to migration and vegetative propagation. The ΔK statistic of Evanno et al. (2005) was used to guide our choice of the most likely number of groups. Additionally, we performed a Principal Components Analysis (PCA) with the stats R package (R Core Team, 2015) in order to uncover additional genetic structure in our data (Jombart et al., 2009). The PCA was non-centered, but scaled in order to compensate for differences in polymorphism and missing data among the loci analyzed. The spatial interpolation of the clusters obtained in Structure was analyzed using the kriging method in the fields R package (Nychka et al., 2015). Based on geostatistics and maximum likelihood, the krig function estimates the covariance in a grid (we used the scale parameter theta = 50) and infers the fitted surface between geographical coordinates and genetic relationship among samples (Nychka et al., 2015). Nuclear genetic diversity of C. cujete [allelic richness (Ar), private alleles (Ap), observed heterozygosity (Ho), expected heterozygosity (Hs)] was estimated for the five Amazonian rivers considered and the Mexican samples using the hierfstat (Goudet, 2005) and poppr (Kamvar et al., 2014) R packages. Pairwise FST between regions were estimated and statistically evaluated using 1,000 bootstraps (Nei, 1987). A neighbor-joining dendrogram of regions was constructed based on Nei's distance and 1,000 bootstraps (Saitou and Nei, 1987). The inbreeding coefficient FIS for each region was estimated and its significance evaluated (considering a Bonferroni corrected p-value of 0.006) using the pegas R package (Paradis, 2010).
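The ΔK criterion of Evanno et al. (2005) used above to choose the number of clusters can be illustrated with a short sketch. It is not the code used in this study: the function name `delta_k`, the input layout, and the log-likelihood values are hypothetical, and the sketch uses one common way of computing the statistic (the absolute second-order difference of the mean ln P(D) across replicate runs, divided by the standard deviation of ln P(D) at that K).

```python
from statistics import mean, stdev

def delta_k(lnp_by_k):
    """Return {K: deltaK} for every K that has both neighbours K-1 and K+1.

    lnp_by_k maps each K to the list of ln P(D) values from replicate runs.
    Assumes at least two runs per K and non-identical values at each K
    (otherwise the standard deviation is undefined or zero).
    """
    result = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # deltaK is undefined at the ends of the tested range
        # absolute second-order rate of change of the mean log-likelihood
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(lnp_by_k[k]) + mean(lnp_by_k[k - 1])
        )
        result[k] = second_diff / stdev(lnp_by_k[k])
    return result

# Hypothetical ln P(D) values for three replicate runs per K (not study data).
runs = {
    1: [-5000.0, -5002.0, -4998.0],
    2: [-4500.0, -4501.0, -4499.0],
    3: [-4480.0, -4490.0, -4470.0],
    4: [-4475.0, -4495.0, -4465.0],
}
dk = delta_k(runs)
best_k = max(dk, key=dk.get)  # K with the highest deltaK
print(best_k, dk)
```

In this toy input the likelihood rises sharply from K = 1 to K = 2 and then plateaus with noisier runs, so ΔK peaks at K = 2. Note that ΔK cannot be evaluated for the smallest and largest K tested.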

For the identification of chloroplast SNPs, we used a bioinformatic pipeline previously validated for the sequencing of the entire chloroplast genome (Scarcelli et al., 2016). Briefly, SAMTOOLS 0.1.7 with option -B (Li et al., 2009) was used to generate an mpileup file. VARSCAN 2.3.7 (Koboldt et al., 2012) was used to call SNPs from this mpileup file. The variant call format (VCF) file generated was filtered following Scarcelli et al. (2016) and resulted in a total of 334 cpSNPs detected in our dataset. The final VCF file was exported as a fasta file using VCFtools 1.14 (Danecek et al., 2011) and haplotypes were identified with DNAsp 5.10.1 (Librado and Rozas, 2009). A haplotype network was constructed in POPART 1.7 (Leigh and Bryant, 2015) using the median-joining algorithm (Bandelt et al., 1999) and samples with up to 6.5% missing data. The geographical distribution of the shared haplotypes of C. cujete samples was plotted using GenGIS 2.5 (Parks et al., 2009). The chloroplast diversity of C. cujete [total number of polymorphic sites (S), number of haplotypes (h), and nucleotide diversity (π)] was estimated according to Nei (1987) using DNAsp 5.10.1. The presence of singleton samples and their contribution of unique alleles were identified with VCFtools 1.14. Pairwise FST among the Amazonian rivers and Mexico were estimated using the distance method of Tajima and Nei (1984), and their significance was evaluated with 1,000 permutations at a significance level of 0.05 using Arlequin 3.5 (Excoffier and Lischer, 2010).
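As an illustration of the nucleotide diversity (π) statistic mentioned above, the sketch below computes it as the mean proportion of differing sites over all pairs of aligned sequences. This is a simplified stand-in and not the DNAsp implementation: the function name `nucleotide_diversity` and the toy sequences are hypothetical, and missing data and gaps are not treated specially.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Mean per-site proportion of differences over all sequence pairs."""
    if len(seqs) < 2:
        raise ValueError("at least two sequences are required")
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(seqs, 2))
    # count mismatching sites in each pair, summed over all pairs
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)

# Toy alignment of three hypothetical haplotypes (not study data).
toy = ["ACGT", "ACGA", "ACGT"]
pi = nucleotide_diversity(toy)
print(pi)
```

In the toy alignment, two of the three pairs differ at one of four sites, so π is 2/12.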

# Morphological Analysis

Fruit shapes of cultivated C. cujete were registered in 286 individuals and fruit diameter was measured in 175 individuals in the Amazon Basin. For Mesoamerican samples, we analyzed 117 cultivated individuals from Mexico, among which 64 were from nine localities in the Yucatan Peninsula (Aguirre-Dugua et al., 2013) and 53 were from 19 localities representing the Gulf of Mexico coast, Tehuacan Valley, and Pacific Ocean coast from the states of Michoacan, Oaxaca, and Chiapas (**Figure 1**).

The shape of the mature fruits of each individual was classified visually into nine categories: spherical, flattened, oblong, cuneate, elongated, globular, rounded-drop-shaped, oblong-drop-shaped, and kidney-shaped. All of these categories, except spherical, followed the classification created for Colombian fruits (Arango-Ulloa et al., 2009). The spherical fruit was added as a new category, since it is a remarkable shape found in Mexico, which has a higher index of roundness than flattened fruits (Aguirre-Dugua et al., 2012, 2013). For Brazilian samples, the flattened type was sub-divided in order to discriminate these perfectly spherical fruits from flattened ones based on visual comparison of photographs. The Shannon index was adapted to estimate fruit shape diversity using $H' = -\sum_{i} p_i \log p_i$, from Pielou (1975), where $p_i$ is the relative frequency of each fruit shape. The Shannon index was calculated for each Amazonian river, and for Amazonia and Mexico.
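The Shannon index described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `shannon_index` and the shape tallies are hypothetical, and the natural logarithm is assumed because the base is not specified in the text.

```python
import math
from collections import Counter

def shannon_index(shapes):
    """Return H' = -sum(p_i * ln p_i) over the observed shape categories."""
    counts = Counter(shapes)
    total = sum(counts.values())
    # only observed categories enter the sum, so p_i is never zero
    return -sum((n / total) * math.log(n / total) for n in counts.values())

# Hypothetical tallies of fruit shapes for two regions (not study data).
region_a = ["spherical"] * 5 + ["oblong"] * 5
region_b = ["spherical"] * 10
h_a = shannon_index(region_a)
h_b = shannon_index(region_b)
print(h_a, h_b)
```

With two equally frequent shapes the index equals ln 2, and with a single shape it is zero, so higher values indicate that individuals are spread more evenly over more shape categories.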

# RESULTS

# Geographic Patterns of Nuclear Diversity

The ΔK of Evanno et al. (2005) suggested that two clusters are the most likely structure in the dataset (K = 2, **Figure 2A**, Supplementary Figure S1). At K = 2, a clear distinction between wild and cultivated samples was observed (clusters shown in blue and red in **Figure 2A**, respectively), regardless of their geographical origin. Mexican cultivated C. cujete samples showed an admixed pattern (membership probability to wild cluster from 0.16 to 0.87), as did some of the cultivated C. cujete from the Amazon Basin (membership probability to wild cluster from 0.01 to 0.98). The wild admixture within cultivated C. cujete in the Amazon Basin had higher proportions along the Amazonas River, decreasing values along the Solimões, Madeira, and Negro rivers, and was absent along the Branco River (**Figure 2A**). The wild Costa Rican sample displayed a membership probability of 0.25 to the cultivated cluster, a larger proportion than the membership shown by the Mexican wild samples (0.01–0.02). In the Principal Component Analysis (PCA), the first two principal components explained 16.7% of the total variance found in the dataset (**Figure 2B**). Principal component one separated wild from cultivated samples, while principal component two separated the Brazilian wild C. amazonica from the Mesoamerican wild C. cujete samples. The wild sample from Costa Rica was intermediate between wild and cultivated Mexican samples, which agrees with its ancestry pattern observed in the clustering analysis performed by Structure. One Brazilian sample from the Amazonas River was relatively closer to the Costa Rican sample (**Figure 2B**). To assess to what extent the intermediate ancestry of cultivated Mexican samples between wild Mesoamerican and Brazilian cultivated samples (**Figure 2B**) was associated with hybridization or divergence, we performed a Structure analysis among only Mesoamerican samples. This analysis clearly differentiates two groups of wild and cultivated Mesoamerican C. cujete (Supplementary Figure S2). However, we still observed the Costa Rican sample as having intermediate ancestry among these Mesoamerican samples (Supplementary Figure S2). Consequently, the intermediate ancestry detected in cultivated Mesoamerican samples may reflect divergence rather than hybridization. The differentiation between wild C. cujete and cultivated samples was also evident in the neighbor-joining dendrogram (**Figure 2C**). The level of differentiation between Mesoamerican wild and cultivated C. cujete was relatively low (FST = 0.35, 95% CI = 0.13–0.60). The differentiation between Mesoamerican wild and Amazonian cultivated C. cujete samples was lowest with the Negro River (FST = 0.45, 95% CI = 0.27–0.65), followed by the Amazonas (FST = 0.50, 95% CI = 0.37–0.64), the Solimões (FST = 0.52, 95% CI = 0.40–0.68), Madeira (FST = 0.57, 95% CI = 0.39–0.81), and Branco (FST = 0.61, 95% CI = 0.45–0.82). The wild C. amazonica samples showed high differentiation compared with cultivated C. cujete samples (FST = 0.57, 95% CI = 0.37–0.64).

FIGURE 2 | Genetic structure of 250 cultivated and wild treegourds (Crescentia cujete; C. amazonica) from Brazilian Amazonia and Mesoamerica based on 8 nSSR. Wild C. cujete were from the Yucatan Peninsula in Mexico and the Pacific coast of Costa Rica. (A) Structure plots at K = 2. The y-axis shows the proportion of assignment to the cluster and each vertical bar represents a single plant. The geographic locations and river basins are separated by white vertical columns; in Mexico, the first group is cultivated and the second is wild. (B) Principal components analysis (PCA) of nuclear genetic structure. The solid symbols represent the two species in Brazil, while the gray refers to wild and cultivated C. cujete in Mesoamerica. Numbers in parentheses show the percentage of the allelic variation explained by each axis. (C) Neighbor-joining tree of the geographic relationships between wild and cultivated samples based on Nei's genetic distance, with support from 1,000 bootstraps indicated on the nodes.

We performed another Structure analysis with the cultivated samples, using only plants whose membership probability was higher than 0.6 in the cultivated cluster (**Figure 2A**). The two allele frequency models showed similar patterns, with better defined clusters using the correlated model (Supplementary Figures S3, S4). Again, the ΔK of Evanno et al. (2005) suggested that two clusters are the most likely structure (K = 2, Supplementary Figure S3); these distinguished Mexican from Brazilian samples, with considerable admixture widely distributed in the Amazon Basin (**Figure 3A**). Evanno et al.'s ΔK suggested decreasing likelihood of structure up to four clusters (K = 3 and K = 4), although the fourth cluster did not show a pattern that was clearly different from K = 3. At K = 3, Mexican and Brazilian samples showed strong admixture (green and yellow clusters in **Figure 3A**). The green cluster membership was found in Mexico, but was higher along the Negro River and upper sections of the Branco River, with decreasing membership along the Solimões, Amazonas, and Madeira rivers (**Figure 3A**). In contrast, the third (yellow) cluster, also found in Mexico, was predominant along the Amazonas and Madeira rivers, scattered along the Solimões, but also high in the middle Negro River (**Figure 3A**). The neighbor-joining tree differentiated two groups within Amazonia that are both genetically different from Mexico (**Figure 3B**). However, the differentiation between Amazonia and Mexico is modest (FST = 0.04, 95% CI = 0.006–0.08). Spatial interpolation of the Structure clusters highlights that, despite the admixture between Mexico and Amazonia (**Figure 3A**), genetic similarity is higher between Mexican samples and northwestern Amazonia (**Figure 3C**). The spatial interpolation also reveals the wide genetic homogeneity of cultivated C. cujete across Amazonia, except for the genetic differentiation in the Northwest and East, which is free from local wild-admixture effect in this data set (**Figure 3C**). The Northwestern and Eastern regions are relatively similar (**Figure 3C**), which agrees with the distribution of the Eastern yellow cluster up to the middle Negro River (**Figure 3A**). As expected, the Structure clusters in Amazonia without the Mexican samples show a similar spatial interpolation pattern (Supplementary Figure S5).

# Geographic Patterns of Chloroplast Diversity

The haplotype network showed three distinct groups: C. amazonica, wild Mesoamerican C. cujete, and cultivated C. cujete from Brazil and Mexico (**Figure 4A**). The wild Mexican C. cujete lineage is more distant from cultivated C. cujete (55 substitutions + 12 substitutions) than is wild C. amazonica (39 + 12 substitutions). In the cultivated C. cujete group, five common haplotypes were identified, among which four are very close to each other (1 and 2 substitutions) at the core of the cultivated haplogroup (H1, H2, H3, H4). Haplotype H5 is differentiated by at least four substitutions from the core of the network. Divergent cultivated C. cujete samples from the Amazon basin were arranged in the extreme branches of the C. cujete group in the haplotype network (**Figure 4A**); the highest number of substitutions (36 and 75) was comparable to the differentiation between the wild and cultivated groups.

The most common haplotype in the Amazon Basin (H1) was widely dispersed, but not found in Mexico. Mexico and Brazil shared haplotypes H2 and H3, which, although different by only one substitution, showed slightly different distributions in the Amazon Basin (**Figure 4B**). Haplotype H2, the most common in Mexico, is restricted to the western half of Brazilian Amazonia, with higher frequency in the Northwest. Haplotype H3 is unevenly distributed in the Amazon Basin, but absent in the Northwest. Haplotype H4 is widely distributed, whereas haplotype H5, the most divergent haplotype (**Figure 4A**), is less abundant and found at low frequencies along the middle Negro River, but is well-represented in Eastern Amazonia. The distribution of the most divergent rare haplotypes (H6, H10, H11) agrees with the geographical distribution of haplotype H5. The other rare haplotypes (H7, H8, H9) were sparsely distributed along the Solimões and Madeira rivers, except for haplotype H12, shared between the Madeira and Branco rivers, and haplotype H13, restricted to the upper sections of the Negro and Solimões rivers (**Figure 4B**). None of the Amazonian rivers were significantly divergent from Mexico (**Table 1**), most likely because of the small sample size from Mexico. Within the Amazon Basin, the Amazonas River is the most differentiated from all other rivers (**Table 1**).

FIGURE 3 | Nuclear genetic differentiation between cultivated C. cujete samples (N = 200) from Mexico and Brazilian Amazonia using 8 nSSR. Only samples with high cultivated membership (>0.6) from Figure 2 were included. (A) Structure analysis based on the correlated allele frequency model. Plots show the two likely groupings (K = 2 and K = 3). The y-axis shows the proportion of assignment to the cluster and each vertical bar represents a single plant. Samples were ordered by their geographical location along the main rivers/country: the Negro, Solimões, and Amazonas Rivers are ordered west to east; the Branco River and Mexico are ordered north to south; the Madeira River is ordered south to north. (B) Neighbor-joining tree of the geographic relationships based on Nei's genetic distance, with support from 1,000 bootstraps indicated on the nodes. (C) Spatial interpolation of the Structure clusters (Q) at K = 2 indicated above (Figure 3A). The colored bar on the right indicates the probability of assignment to the green cluster (Figure 3A) between samples (white dots). Despite the admixture between Mexico and Amazonia (Figure 3A), genetic similarity is higher between Mexican samples and northwestern Amazonia. Within Amazonia, cultivated C. cujete is genetically homogeneous, except for the differentiation in the Northwest and in the East, which agrees with K = 3 (Figure 3A).

# Genetic Diversity in Cultivated *C. cujete*

Based on 8 nSSR of cultivated samples, there were 31 alleles in Mexico and 55 in the Amazon Basin (**Table 2**), although the sample sizes of the two regions are very different. The number of private alleles among cultivated samples showed that seven alleles were only found in cultivated Mexican samples and 31 alleles in cultivated Amazonian samples (**Table 2**), among which six are also found in wild Mesoamerican samples. Among Amazonian samples, the Amazonas River concentrated private alleles (5) not found in local wild C. amazonica. The Negro, Solimões and Madeira rivers had fewer private alleles, while none was found in the Branco River (**Table 2**). Mexico presented the highest expected heterozygosity (Hs). In the Amazon Basin, heterozygosity was highest along the Negro River, followed by the Solimões, Amazonas, Madeira rivers, and was lowest along the Branco River (**Table 2**). Mexico presented significant inbreeding, while in the Amazon Basin inbreeding was significant along the Branco and Madeira rivers (**Table 2**).
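The diversity statistics reported above can be made concrete with a minimal sketch for a single microsatellite locus. It assumes diploid genotypes coded as pairs of allele sizes, uses the uncorrected estimator 1 − Σp² for expected heterozygosity (the software and any sample-size correction used in the study are not specified here), and takes FIS as 1 − Ho/Hs. The function names and the toy genotypes are hypothetical.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Return (Ho, Hs, FIS) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid plant.
    Ho  = proportion of heterozygous plants.
    Hs  = 1 - sum(p_i ** 2), the uncorrected expected heterozygosity.
    FIS = 1 - Ho / Hs (None when the locus is monomorphic).
    """
    n = len(genotypes)
    ho = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = sum(counts.values())
    hs = 1.0 - sum((c / total) ** 2 for c in counts.values())
    fis = 1.0 - ho / hs if hs > 0 else None
    return ho, hs, fis


def private_alleles(alleles_by_region):
    """Return, per region, the alleles observed in that region only."""
    private = {}
    for region, alleles in alleles_by_region.items():
        others = set()
        for other_region, other_alleles in alleles_by_region.items():
            if other_region != region:
                others |= set(other_alleles)
        private[region] = set(alleles) - others
    return private


# Toy data (allele sizes in base pairs), for illustration only.
toy_genotypes = [(100, 100), (100, 102), (102, 102), (100, 102)]
ho, hs, fis = locus_diversity(toy_genotypes)
toy_private = private_alleles({"Mexico": {100, 102, 104}, "Amazon": {100, 106}})
```

A positive FIS (Ho below Hs) is the signature of inbreeding reported for Mexico and for the Branco and Madeira rivers. Private-allele counts depend strongly on sample size, which is why rarefied allele counts (Ar) are reported alongside them in Table 2.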

Among the 334 SNPs found in chloroplast sequences, 206 were found in cultivated C. cujete. Mexico and the Amazon Basin showed similar nucleotide diversity (π), 3.78 × 10<sup>−2</sup> and 3.83 × 10<sup>−2</sup>, respectively, although sample sizes are very different and the Amazon Basin harbors highly divergent samples (**Figure 4A**). Among cultivated C. cujete, 15 samples produced 119 unique SNP alleles, of which 66% were from only two samples collected along the Amazonas River, which thus produced an extremely high nucleotide diversity estimate for this river (π = 9.31 × 10<sup>−2</sup>). When these 15 singleton samples were discarded, there were 93 SNPs and nucleotide diversity in Mexico was still similar to the Amazon Basin (**Table 2**). The highest nucleotide diversity was still along the Amazonas River, with decreasing values along the Solimões, Madeira, Negro, and Branco rivers (**Table 2**).
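To show why a few highly divergent samples can inflate π, the following sketch computes nucleotide diversity as the mean number of pairwise differences per site. It assumes equal-length aligned sequences with no gaps or missing data; how the study handled missing data and which sites entered the denominator are not stated here, so this is an illustration rather than a reproduction. The function name and the toy sequences are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(sequences):
    """Mean pairwise differences per site for equal-length aligned sequences."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(sequences, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / len(pairs) / length


# Toy alignment of four sites, for illustration only.
toy_seqs = ["AAAA", "AAAT", "AATT"]
pi_all = nucleotide_diversity(toy_seqs)

# Dropping the most divergent sequence lowers the estimate.
pi_trimmed = nucleotide_diversity(toy_seqs[:2])
```

Because every sequence takes part in n − 1 comparisons, one sample carrying many unique substitutions raises the average for its whole group, consistent with the high Amazonas River estimate driven by two samples.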

# Morphological Diversity of Cultivated *C. cujete*

We identified a total of eight fruit shapes in the Amazon Basin and five in Mexico (**Figure 5A**). Fruit shapes shared among these regions were spherical, flattened, oblong, elongated, and cuneate, with higher frequencies of spherical, flattened, and oblong shapes in both regions. Three types (globular, rounded-drop, and oblong-drop) were only recorded in the Amazon Basin. The kidney-shaped fruit found in Colombia was not found in Mexico or Brazilian Amazonia. The absence of drop-shaped fruits in Mexico, which are types clearly distinguished from the others, indicates higher morphological diversity along Amazonian rivers than in Mexico. The Solimões River harbors all eight fruit shapes described (**Figure 5B**). The spherical shape, the most frequent in Mexico, is relatively rare in the Amazon Basin, with a higher frequency along the Amazonas River (**Figure 5B**). The fruit types absent in Mexico were rare in the Amazon Basin as well, except the rounded-drop shape. This fruit type showed relatively high frequency along the Negro River, more than the more common flattened and oblong shapes (**Figure 5B**). The fruit shape diversity index was highest along the Negro River, with decreasing values along the Solimões and Amazonas rivers, followed by Mexico, and lowest along the Madeira and Branco rivers (**Table 2**). The fruit shape diversity index was not correlated with any of the genetic estimators (p > 0.05). The fruit diameters showed the lowest average along the Negro River and in Mexico, and the highest along the Madeira River (**Table 2**). Mexico and the Negro River also showed the extremes of size variation, with Mexico least variable and the Negro most variable (**Table 2**).
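The fruit shape diversity index (H'shape in Table 2) is a Shannon index computed over shape categories. A minimal sketch follows, assuming the natural logarithm and one shape label per sampled tree; the counts are invented for illustration and the function name is hypothetical.

```python
import math
from collections import Counter


def shannon_index(labels):
    """Shannon diversity H' = -sum(p_i * ln p_i) over category frequencies."""
    counts = Counter(labels)
    n = sum(counts.values())
    return -sum((c / n) * math.log(c / n) for c in counts.values())


# Invented samples: one region dominated by a single shape, one more even.
region_a = ["spherical"] * 8 + ["flattened"] * 1 + ["oblong"] * 1
region_b = ["spherical"] * 5 + ["rounded-drop"] * 5

h_a = shannon_index(region_a)
h_b = shannon_index(region_b)
```

The index rises with both the number of shapes and the evenness of their frequencies, so a region with fewer shapes at balanced frequencies can score higher than a region with more shapes dominated by one type. This is worth keeping in mind when comparing the index with the simple counts of shapes per river.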

# DISCUSSION

Cultivated C. cujete is quite similar from Mexico to Brazil, suggesting a common genetic origin. But these cultivated types are strongly differentiated from wild types, both from Mexico and Amazonia, suggesting these wild populations are not the direct ancestors of cultivated C. cujete. The geographical origin of the domestication of this species is still uncertain. However, the high diversity of cultivated C. cujete from Mexico, compared to Amazonia, suggests that its origin may be in Central America. Diversity analyses allowed discussion of the different routes of introduction into Amazonia and subsequent dispersal. More than one route may have been used: a northwestern introduction into the Negro and Solimões Rivers; and an eastern introduction from the coastal Guianas into the Amazonas River. Finally, fruit shape diversity suggests distinct selection pressures across the crop's distribution.

TABLE 1 | Pairwise FST distance matrix between cultivated C. cujete chloroplast sequences (N = 181) based on 93 SNPs from Mexico and five Amazonian rivers.

The FST values are below the diagonal; in italics above the diagonal is the significance evaluated using 1,000 bootstraps at p ≤ 0.05. Significant FST values are indicated in bold. Samples with singletons (see text) were not included.

# Relationships among Mesoamerican and Amazonian Treegourd Populations

The wild samples from Mexico (taxonomically identified as C. cujete) and the Amazon Basin (identified as C. amazonica) were strongly differentiated from the cultivated samples, given their FST values based on nuclear SSR and number of substitutions in the chloroplast genome. The high number of substitutions in the chloroplast sequences between these wild taxa suggests ancient divergence. The differentiation between wild and cultivated in Mexico (Aguirre-Dugua et al., 2012; under revision) and between wild and cultivated in Amazonia (Moreira et al., 2017) was already noted. These results suggest that neither of these wild relatives is the direct ancestor of cultivated C. cujete, although Mexican wild samples are clearly identified morphologically as C. cujete based on Gentry's (1980) description.

The Costa Rican sample showed an intermediate admixed nuclear pattern, but high chloroplast differentiation from the cultivated samples (**Figures 2**, **4A**). Consequently, it could be a wild individual pollinated by cultivated C. cujete. However, because ancestry could also reflect divergence, increased sampling in Central America is of interest. Although our results rule out the possibility that cultivated C. cujete was derived from the wild samples from the Yucatan Peninsula, we cannot rule out an origin somewhere between Central America and northern South America, where other potentially wild C. cujete populations occur in savannahs (**Figure 1**). Nevertheless, our results provide evidence that introduction of domesticated C. cujete in Mexico and Amazonia originated from the same source, given the Mexican relationship with Amazonian samples (**Figure 3A**, yellow and green clusters) and occurrence of wild Mesoamerican alleles in cultivated Amazonian C. cujete samples.

TABLE 2 | Genetic diversity of cultivated Crescentia cujete in Mexico and along major rivers of the Brazilian Amazonia, based on 8 nuclear SSR, 93 chloroplast SNPs and eight fruit shapes.


N, number of samples; At, total number of alleles; Ar, rarefied allele counts; Ap, number of private alleles; Ho, observed heterozygosity; Hs, expected gene diversity; mean FIS (\* significant at p < 0.05 at least at 50% of loci); S, number of polymorphic sites; h, number of haplotypes; π, nucleotide diversity; H'shape, Shannon index of fruit shape diversity estimated for each region; and D, fruit diameter (average ± SD).

( # ) samples with singletons were not included (N = 15).

# Hypotheses of Treegourd Introduction into Amazonia


The patterns of treegourd genetic diversity across the Amazon Basin allow two, not mutually exclusive, hypotheses of introduction: a Northwestern route and an Eastern route. A Northwestern route into the upper Negro River is supported by the relatively high levels of heterozygosity and fruit shape diversity (**Table 2**), higher proportions of Mexican ancestry (**Figure 3A**, green cluster) and higher frequency of the most common haplotype in Mexico (**Figure 4B**, haplotype H2). This route into the Negro River is possible from the Orinoco River, given the fluvial connections via the Cassiquiare canal. This route was part of an extensive social trading network (Hornborg, 2005), based at least in part on the Arawak network (Eriksen and Danielsen, 2014). This route has also been suggested for various crop dispersals (Schultes, 1984), such as cocona (Solanum sessiliflorum), whose populations were domesticated in the upper Orinoco River (Volpato et al., 2004) and which was widely cultivated in Northwestern Amazonia (Schultes, 1957). Similarly, people from the upper Negro River reported intentional collection of treegourd propagules from the Cassiquiare, where treegourd is considered a spontaneous tree in the floodplains, while along the Negro River cultivation demands more effort (P.A.M., personal observation).

A possible Western route into the upper Solimões River is partially supported by heterozygosity and fruit diversity (**Table 2**); the presence of all fruit shapes described enhances the possibility (**Figure 5B**). Moderately high nucleotide diversity combined with the highest number of haplotypes is the strongest evidence (**Table 2**), especially because hybridization with wild populations was not reported (Moreira et al., 2017), suggesting that this is C. cujete diversity. This route might reflect introduction from the Pacific coast and crossing of the Andes mountains via the Napo and Putumayo rivers (Schultes, 1984), as might be the case for cacao (Theobroma cacao) (Thomas et al., 2012) and peach palm (Bactris gasipaes) (Rodrigues et al., 2005), based on molecular evidence. However, it is also possible that this is a continuation of the Negro River route across interfluvial areas, as suggested by the distribution of the abundant haplotype H2 and the rare haplotype H13 (**Figure 4B**).

The Eastern route into the Amazonas River is supported by high heterozygosity and fruit diversity (**Table 2**), with high Mexican ancestry not found in Western Amazonia (**Figure 3A**, yellow cluster). The highest levels of nucleotide diversity (**Table 2**) and the particular distribution of haplotypes not found in Western Amazonia (**Figure 4B**, haplotype H5), which include one of the Mexican haplotypes (**Figure 4B**, H3), agree with the nuclear pattern. This route is linked to the coastal Guianas, an ancient area of exchange of Amazonian crops with Mesoamerica (Schultes, 1984). Molecular data on early maize (Zea mays) introduction into South America support dispersal from Mesoamerica through the Caribbean, spreading along the lowlands of the northeastern coast of South America to finally reach the Amazon Basin through river systems (Freitas et al., 2003; Bedoya et al., 2017), although the oldest archaeological remains of maize are western (Bush et al., 2016). This route also agrees with pineapple dispersal from the Guianas, where it was domesticated, into Mexico (Coppens D'Eeckenbrugge and Duval, 2009).

The extremely high chloroplast nucleotide diversity along the Amazonas River, almost twice that along the Solimões River (**Table 2**), is an unexpected result. Such high diversity was also observed with nuclear markers, given the relatively higher number of exclusive cultivated alleles along the Amazonas River (**Table 2**), which might not be related to local hybridization, since they were not found in C. amazonica (Moreira et al., 2017). While nuclear information is limited by the small number of loci analyzed, the chloroplast pattern is robust, and the two are in agreement. Therefore, we do not rule out that diversity along the Amazonas River might have been promoted by interspecific hybridization between Mesoamerica and northern South America, where most diversity of Crescentia species is found (Gentry, 1980), and that hybrid samples might have been introduced into Amazonia. Another, complementary process that also deserves future investigation is the role of seed cultivation to deal with the high flooding described along the Amazonas River (Moreira et al., 2017), since seeds might show diversity not found among cuttings, the usual means of propagation (Arango-Ulloa et al., 2009; Aguirre-Dugua et al., 2012; Moreira et al., 2017). This hypothesis follows that of manioc (Manihot esculenta), where propagation by cuttings is the usual practice, but seed propagation is important to maintain diversity (Peroni and Sodero Martins, 2000; Elias et al., 2001; Duputié et al., 2009; McKey et al., 2010).

# Hypothesis of Fruit Dispersal and Diversification

Domesticated varieties often present greater fruit shape diversity than their wild relatives, as observed in bottle gourd (L. siceraria), whose fruits have similar technological uses (Heiser, 1993; Morimoto et al., 2005). Across its distribution, the pattern of treegourd fruit shape diversity (**Figure 5**) suggests different cultural preferences affecting diversification. The highest shape diversity was found along the Negro and Solimões rivers (**Figure 5B**, **Table 2**). Similar high diversity was also observed in the Orinoco and Caribbean regions of Colombia (Arango-Ulloa et al., 2009), suggesting northwestern South America is an area of treegourd diversification. This pattern of diversity agrees with Amazonian ethnographies that underscore the cultural value of morphotype diversity cultivated for its own sake, such as in manioc (Rival and McKey, 2008) and pequi (Caryocar brasiliense) (Smith and Fausto, 2016). Nevertheless, the greater local frequency of the spherical type in Mexico and rounded-drop shape along the Negro River (**Figure 5**) suggests distinct selection pressures, as also described for popcorn in Peru (Grobman et al., 2012) and the differential selection of bitter and sweet manioc between Amazonia and the Atlantic Forest in Brazil (Emperaire and Peroni, 2007). Modern Maya people in Mexico and Guatemala have a long history of strong selection of spherical fruits of C. cujete for bowls (jícaras) to use with traditional beverages in rituals and also daily life situations (Ventura, 1996; Aguirre-Dugua et al., 2012, 2013). In Amazonia, the spherical and drop-shaped fruits of C. cujete have different symbolic importance and are recognized with distinct names by Tukano Oriental speakers (Pieter van der Veld, pers. communication), a linguistic family found in Northwestern Amazonia. The spherical fruit is called wahatowê, and is used as bowls to prepare ipadu powder (Erythroxylum coca var. ipadu) in rituals. 
In contrast, the rounded-drop, called ñahsãwaha, is common in daily life as a spoon and cup for collective food consumption (xibé, a meal of water and manioc flour, and açaí, the juice from Euterpe precatoria). Local people along the upper Negro River reported that the spherical type was also used as an ashtray by healers (pajé) in blessing rituals with tobacco smoke. Ethnographies also reported different treegourd fruits for each type of use, such as cuia-de-tapioca and cuia-de-ipadu (Ribeiro, 1995), although shape differences were not mentioned. In Northwestern South America, these bowls are cultural markers for the traditional use of coca introduced from the Andean foothills (Plowman, 1984). Interestingly, the spherical fruit shape selected in Mexico was the same as the one used in special rituals in the Negro River Basin. This suggests that the wide dispersal of plants between South America and Mesoamerica in pre-Columbian times was motivated not primarily by food consumption, as would be expected for agrarian societies, but mainly by recreational and religious purposes (Neves, 2016). Indeed, archaeological remains of C. cujete in Central America and the Antilles were found in ritualistic contexts, such as offerings in funerary rituals (Beaubien, 1993; Conrad et al., 2001). This hypothesis of recreational and religious exchanges is also supported by the ancient dispersal of maize (Zea spp.) for beer preparation and tobacco (Nicotiana spp.) for magic and therapeutic uses, both widely exchanged between these continents (Heiser, 1965; Smalley and Blake, 2003), possibly as sacred gifts (Norton, 2008).

The relatively high morphological diversity found along the Solimões and Amazonas rivers, where most of the rare fruit shapes were found (**Figure 5B**), suggests different demands for fruit shapes since prehistoric times, as expected among plants with technological uses (Blench, 2012). The upper Solimões River and middle Amazonas River were ancient centers of treegourd handicraft, which was regarded by both Europeans and Native Amazonians as one of the best expressions of their arts and an important article of trade (Rodrigues-Ferreira, 1933; Métraux, 1948). During the colonial period, villages along the Amazonas River produced 5,000–6,000 bowls a year that were exchanged for food (Rodrigues-Ferreira, 1933). This handicraft tradition continues today, especially for the production of bowls for tacacá (a kind of soup), which are made with the rounded (spherical and flattened) fruits (Moreira et al., 2017).

Although there is similarly high biological and cultural complexity in Mesoamerica and Amazonia (Blench, 2012; Clement et al., 2015; Casas et al., 2017), these two plant domestication centers contrast in terms of the morphological diversity of cultivated C. cujete fruits. Curiously, although Mexico's prehistory is especially rich in complex societies, such as the Maya (Willey, 1956), morphological fruit diversity is lower and particular fruit shapes are absent, which also reinforces the idea of different cultural selection pressures between these regions. It follows that, although the introduction of the cultivated germplasm into both Mexico and Amazonia should lead to a bottleneck (i.e., through founder effect), it might be less severe in Amazonia due to a more diverse array of usages. Moreover, although the spread of a phenotype during dispersal might also be influenced by wild introgression/hybridization (Meyer and Purugganan, 2013), this effect was remarkable only on treegourd fruit size and not on shape diversity in Amazonia (Moreira et al., 2017). Within Mexico, elongated and smaller shapes that grow spontaneously in homegardens, possibly resulting from gene flow with wild populations, are not appreciated in the Yucatan Peninsula (Aguirre-Dugua et al., 2012), but are selected on the Pacific Coast as spoons (X.A.D, personal observation), although at low frequencies (**Figure 5A**). Therefore, cultural selection influences the bottleneck during introduction and, afterwards, the management of hybridization with local wild congeners. Whereas the distribution of shape diversity reflects different cultural preferences, size is more influenced by local wild introgression effects.

# CONCLUSIONS

We demonstrated with molecular evidence that C. cujete introduced into the Amazon Basin and Mexico shares a common ancestry with a currently unknown origin. The dispersal followed previously proposed routes of human and plant migrations into Amazonia. The patterns of genetic diversity across Amazonia allow two, not mutually exclusive, hypotheses of the routes of introduction: a Northwestern introduction into the Negro and Solimões rivers, and an Eastern introduction from the coastal Guianas into the Amazonas River. The fruit shape diversity reveals different ancient utilitarian demands for the fruits. Mesoamerica and Amazonia have contrasting fruit morphological diversity, which suggests different cultural preferences along treegourd's dispersal routes. More comparative studies of its different uses, with a broader genetic and phenotypic distribution, would be useful to better understand the dispersal and diversification of C. cujete in the Americas.

# REFERENCES


# AUTHOR CONTRIBUTIONS

PM, XA-D, CC, YV, and AC conceived the study. PM and XA-D carried out the field collections and interviews. PM, LZ, MC, CM, and DR performed the molecular work. PM, XA-D, CM, and YV performed the analysis. PM, XA-D, CC, and YV wrote the manuscript.

# ACKNOWLEDGMENTS

This research was supported by the Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq-473422/2012-3), the Fundação de Apoio à Pesquisa do Estado do Amazonas (FAPEAM 062.03.137/2012), the Agence Nationale de la Recherche (ANR-13-BVS7-0017), and the ARCAD project funded by the Agropolis Fondation. PM thanks the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior for a scholarship (CAPES-99999.010075/2014-03). We thank the Instituto de Desenvolvimento Agrário do Amazonas for logistical support and farmer families for their support, kindness and consent for this research.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2017.00150/full#supplementary-material


Guix, J. C. (2009). Amazonian forests need indians and caboclos. Orsis 24, 33–40.


Heiser, C. B. J. (1965). Cultivated plants and cultural diffusion in nuclear America. Am. Anthropol. 67, 930–949. doi: 10.1525/aa.1965.67.4.02a00040


da Amazônia - Rumo a Uma Nova Síntese, eds C. Barreto, H. P. Lima, and C. J. Betancourt (Belém: IPHAN; Ministério da Cultura), 32–39.


Pielou, E. C. (1975). Ecological Diversity. New York, NY: Wiley.


Rodrigues-Ferreira, A. (1933). Memoria sobre as cuyas. Rev. Nac. Educ. 1, 58–63.

Saitou, N., and Nei, M. (1987). The neighbor-joining method: a new method for reconstructing phylogenetic trees. Mol. Biol. Evol. 4, 406–425.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Moreira, Aguirre-Dugua, Mariac, Zekraoui, Couderc, Rodrigues, Casas, Clement and Vigouroux. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Origin and Dispersal of Domesticated Peach Palm

Charles R. Clement<sup>1,2</sup>\*, Michelly de Cristo-Araújo<sup>1,2</sup>, Geo Coppens d'Eeckenbrugge<sup>3</sup>, Vanessa Maciel dos Reis<sup>2</sup>, Romain Lehnebach<sup>4,5</sup> and Doriane Picanço-Rodrigues<sup>2</sup>

<sup>1</sup> Department of Technology and Innovation, Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil, <sup>2</sup> Laboratório de Evolução Aplicada, Instituto de Ciências Biológicas, Universidade Federal do Amazonas, Manaus, Brazil, <sup>3</sup> Centre de Coopération Internationale en Recherche Agronomique Pour le Développement, UMR AGAP, Montpellier, France, <sup>4</sup> Université Montpellier II, UMR AMAP (Botanique et Bio-Informatique de L'architecture des Plantes), Montpellier, France, <sup>5</sup> Centre de Coopération Internationale en Recherche Agronomique Pour le Développement, UMR AMAP, Kourou, French Guiana

#### Edited by:

Eike Luedeling, Consultative Group on International Agricultural Research, United States

#### Reviewed by:

Paul Gepts, University of California, Davis, United States Rachel Meyer, University of California, Los Angeles, United States

> \*Correspondence: Charles R. Clement cclement@inpa.gov.br

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Ecology and Evolution

> Received: 18 May 2017 Accepted: 13 November 2017 Published: 29 November 2017

#### Citation:

Clement CR, Cristo-Araújo M, Coppens d'Eeckenbrugge G, Reis VM, Lehnebach R and Picanço-Rodrigues D (2017) Origin and Dispersal of Domesticated Peach Palm. Front. Ecol. Evol. 5:148. doi: 10.3389/fevo.2017.00148

Peach palm (Bactris gasipaes Kunth) is a Neotropical palm domesticated by Native Americans. Its domestication resulted in a set of landraces (var. gasipaes), some with very starchy fruit used for fermentation, others with an equilibrium of starch and oil used as snacks. Which of the three wild types (var. chichagui) was involved and where the domestication process began are unclear, with three hypotheses under discussion: an origin in southwestern Amazonia; or in northwestern South America; or multiple origins. We reevaluate one of the wild types, defining it as the incipient domesticate, and then evaluate these hypotheses using the Brazilian peach palm Core Collection and selected herbaria samples to: (1) model the potential distributions of wild and domesticated populations; (2) identify the probable origin of domestication with a phylogeographic analysis of chloroplast DNA sequences; and (3) determine the dispersal routes after domestication using spatial analysis of genetic diversity based on 17 nuclear microsatellite loci. The two very small-fruited wild types have distinct distributions in the northern Andes region and across southern Amazonia, both under moderately humid climates, while the incipient domesticate, partly sympatric with the southern wild type, is also found along the Equatorial Andes, in a more humid climatic envelope, more similar to that of the domesticated landraces. Two distribution models for Last Glacial Maximum conditions (CCSM4, MIROC) also suggest distinct distributions for the two wild populations. The chloroplast DNA phylogeographic network confirms the area of sympatry of the incipient domesticate and the southern wild type in southwestern Amazonia as the origin of domestication.
The spatial patterns of genetic diversity confirm the proposal of two dispersals, one along the Ucayali River, into western Amazonia, northwestern South America and finally Central America; the other along the Madeira River into central and then eastern Amazonia. The first dispersal resulted in very starchy fruit for fermentation, while the second may have been later and resulted in snack fruits. Further explorations of southwestern Amazonia are essential for more precise identification of the earliest events, both with new archeological methods and genetic analyses with larger samples.

Keywords: Bactris gasipaes, chloroplast phylogeography, ecological niche models, landrace biogeography, microsatellite markers

# INTRODUCTION

The peach palm (Bactris gasipaes Kunth, Palmae) is a Neotropical palm with populations domesticated by Native Americans (Clement, 1988), and presents impressive morphological diversity in its wild and cultivated populations, since these occur in different environments and exhibit different degrees of domestication (Mora-Urpí et al., 1997). At the time of European conquest, peach palm was an important food crop and the basis of a fermented drink, both of which featured in community festivals from western Amazonia to southern Central America (Mora-Urpí et al., 1997; Patiño, 2002). It was less important in the rest of humid-lowland northern South America (Patiño, 1963, 2002).

The origin of domesticated peach palm from wild populations remained a matter of speculation for more than a century, until the systematic analysis of Bactris presented by Henderson (2000). Since then several hypotheses have proposed a single origin (Morcote-Rios and Bernal, 2001; Rodrigues et al., 2005; Cristo-Araújo et al., 2013; Galluzzi et al., 2015) or multiple origins (Mora-Urpí, 1999; Hernández-Ugalde et al., 2011). We will identify a problem in the systematic analysis that has influenced many of these hypotheses and is essential to understanding the origin of domesticated peach palm, then model the ecological niches of the wild populations, and expand our genetic analyses of the origin of the domesticated populations (Cristo-Araújo et al., 2013) to understand their dispersal.

Henderson (2000) reduced the previously recognized nine species and three varieties in Martius' genus Guilielma into synonymy with B. gasipaes, and proposed two varieties: chichagui (H. Karsten) A.J. Henderson, including wild populations with small fruits (1.2–2.3 × 1.1–1.8 cm); and gasipaes, including domesticated populations of peach palm with large fruits (3.5–6.5 × 3–4.5(−6) cm) (p. 71). This revision allowed phylogenetic hypotheses about the origin of var. gasipaes and the subsequent dispersal of its cultivated populations and landraces (Cristo-Araújo et al., 2013). However, there is a disjunction in fruit sizes between var. chichagui and var. gasipaes that should not exist if var. gasipaes was domesticated from var. chichagui. This may be due to lack of herbarium samples that fill the gap, or to attributions of synonymy during Henderson's revision, or both.

Within var. chichagui, Henderson (2000) proposed the existence of three wild morphotypes, without attributing synonymy of previously accepted species to morphotype. Here we expand on Ferreira (1999) and Clement et al. (2009b), and propose that type 1 is synonymous with Guilielma mattogrossensis Barbosa Rodrigues and G. microcarpa Huber, type 2 with G. macana Martius and Bactris caribaea H. Karsten, and type 3 with B. speciosa Martius var. chichagui H. Karsten, hence the varietal name in Henderson's combination. Two other previously accepted species also have small to very small fruit: G. insignis Martius and Martinezia ciliata Ruiz & Pavon. Both were attributed to var. gasipaes by Henderson (p. 71); however, both are more likely to be synonymous with var. chichagui type 3 (Clement et al., 2009b).

When mapping the distribution of wild types 1 and 3 in southern Amazonia, Clement et al. (2009b) identified sympatry in southwestern Amazonia, while type 2 is isolated in northern South America (**Figure 1**). Hernández-Ugalde et al. (2011) describe the geological history of northern South America, and how this contributed to isolating wild types 1 and 2 in their current distributions. Type 3 is the most variable of Henderson's wild types, with fruits that range from 2 to 10 g, rarely 15 g, whereas types 1 and 2 both have fruits that range from 0.5 to 2 g. Sympatry of types 1 and 3 was also noted by Huber (1904), who suggested that hybridization between small-fruited G. insignis and his very small-fruited G. microcarpa could explain the origin of cultivated peach palm in southwestern Amazonia. Observe that Huber apparently considered G. insignis fruits to be smaller than typical cultivated var. gasipaes, hence our suggestion that it should be synonymous with var. chichagui type 3, contrary to Henderson. Sympatry has additional significance: gene flow inhibits population divergence (Futuyma, 2005; p. 216) and this suggests that one type is not valid as a wild type.

These observations allow us to define var. chichagui type 3 as the incipient domesticate, following Clement et al. (2009a), i.e., it represents the beginning of domestication of peach palm from type 1 (Cristo-Araújo et al., 2013; Galluzzi et al., 2015). Saldías-Paz (1993) observed exactly the expected type of variation in fruit size and human propagation in lowland Bolivia, where very small fruits similar to type 1, which he equated to G. microcarpa, were observed in open forests, small fruits similar to type 3, which he equated to G. insignis, were observed in anthropogenic forests and were tolerated when they appeared in swiddens, and only var. gasipaes microcarpa-type fruits were intentionally propagated.

This redefinition of type 3 as the incipient domesticate explains the variability in fruit size, from the 2 g of type 1 to the minimum 15 g of a microcarpa population of var. gasipaes, as well as the disjunct distribution of type 3 (**Figure 1**), since dispersal by humans is effective for crossing barriers such as the Andes (see Supplementary Material 1.1 for more about type 3). This proposal is also consistent with a hypothesis of Graefe et al. (2012) that some "natural populations are in reality feral populations, i.e., material from cultivated populations that have gone wild," even though they doubted its validity because of the advanced degree of domestication of most var. gasipaes populations. Type 3 has generally been mistaken for a wild type precisely because it survives in little-disturbed ecosystems, i.e., it can become feral quite successfully.

Domestication is a co-evolutionary process in which human selection, both conscious and unconscious, interacts with natural selection and results in changes in the population's genotypes and phenotypes that make them more useful to humans and better adapted to human intervention in the landscape (Rindos, 1984; Clement, 1999). Consequently, different populations may present different modifications due to selection, which Clement (1999) organized along a continuum from incipiently domesticated to semi-domesticated to domesticated. At the origin of domestication, incipient domesticates exhibit small differences from the local wild populations and hybridize freely with them, inhibiting changes due to human selection (Miller and Gross, 2011), just as observed by Saldías-Paz (1993). As the incipiently domesticated populations become more useful, their numbers increase in anthropogenic landscapes,

FIGURE 1 | Distribution of wild populations (var. chichagui, types 1, 2, and 3; following Clement et al., 2009b) and domesticated landraces (var. gasipaes; following Clement et al., 2010) of peach palm (Bactris gasipaes) once represented in the Peach palm Active Germplasm Bank at the Instituto Nacional de Pesquisas da Amazônia, Manaus, Amazonas, Brazil. Core Collection (CC) samples within the Peach palm Active Germplasm Bank (Cristo-Araújo et al., 2015) are identified with blue dots (wild samples) and red dots (domesticated samples). Landrace distributions are in differentially textured areas and numbered: microcarpas (1) Pará and (2) Juruá; mesocarpas (3) Pampa Hermosa, (4) Tigre, (5) Pastaza, (6) Inirida, (7) Cauca and (8) Utilis; macrocarpas (9) Putumayo and (10) Vaupés. See Hernández-Ugalde et al. (2011) for other landraces and populations.

permitting greater responses to selection. When humans disperse domesticates beyond the distribution of wild populations, response to selection will be freed from frequent introgression with wild types (Clement et al., 2009a; Miller and Gross, 2011). The distribution of peach palm's landrace complex (**Figure 1**) demonstrates the expected trends, with the macrocarpa landraces exhibiting dramatic changes in fruit size and little sympatry (Putumayo) or no sympatry (Vaupés) with wild populations.

Before Henderson's revision, some of the domesticated populations of peach palm had been grouped into landraces (Mora-Urpí et al., 1997) and their distribution was mapped (**Figure 1**). This landrace classification was based on fruit size, as it reflects the degree of change due to human selection during domestication (Mora-Urpí, 1984; Clement, 1988; Meyer et al., 2012). Microcarpa landraces have small fruits (<25 g), mesocarpa landraces have intermediate sized fruits (25–70 g), and macrocarpa landraces have large fruits (>70 g) (Mora-Urpí et al., 1997).
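The fruit-size thresholds above can be written as a small classification rule. The following is a minimal sketch only: the function name and the rejection of non-positive masses are illustrative assumptions, and only the thresholds (<25 g, 25–70 g, >70 g) come from the classification of Mora-Urpí et al. (1997) cited above.

```python
def classify_landrace_size(mean_fruit_mass_g: float) -> str:
    """Return the landrace size class for a mean fruit mass in grams.

    Thresholds follow the text: microcarpa < 25 g, mesocarpa 25-70 g,
    macrocarpa > 70 g. The function name is hypothetical.
    """
    if mean_fruit_mass_g <= 0:
        # Assumption: a non-positive mass is treated as invalid input.
        raise ValueError("fruit mass must be positive")
    if mean_fruit_mass_g < 25:
        return "microcarpa"
    if mean_fruit_mass_g <= 70:
        return "mesocarpa"
    return "macrocarpa"
```

Such a rule classifies populations by mean fruit mass only; it does not by itself distinguish a microcarpa landrace of var. gasipaes from the incipient domesticate (var. chichagui type 3), whose fruits are also small.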

One result of the domestication process that is very important for our discussion is ecological adaptation. Fully domesticated populations have reduced ecological adaptation in their original ecosystems, having lost defensive mechanisms, reproductive success, competitive ability, etc., generally due to natural selection for adaptation to human-created agroecosystems (Harlan, 1992; Clement, 1999; Purugganan and Fuller, 2009). By definition the incipient domesticate has not lost ecological adaptation (Clement, 1999), partly because the domestication process is only starting and partly as a consequence of hybridization with local wild plants (Miller and Gross, 2011). Var. chichagui type 3's adaptation to advanced secondary succession in anthropogenic landscapes, as observed by Saldías-Paz (1993) in Bolivia, and its survival in naturally open forests, explain why so many botanists have considered it to be wild. This also explains why hypotheses about the origin of domestication of var. gasipaes from var. chichagui type 3 outside of the distribution of var. chichagui type 1 are problematic, e.g., Mora-Urpí (1999), Morcote-Rios and Bernal (2001), and Hernández-Ugalde et al. (2011), and why secondary domestication events (Galluzzi et al., 2015) outside of the distribution of var. chichagui type 1 are also problematic.

Identifying the origin of domestication and tracing subsequent dispersal routes of cultivated plants is a multidisciplinary task, involving botanical, biogeographical, historical, archeological, linguistic and genetic evidence. In the case of peach palm, there are some historical references and numerous indigenous names (Patiño, 1963, 2002), but little archeological information (Morcote-Rios and Bernal, 2001). However, there is abundant botanical and biogeographic information synthesized by Henderson (2000), and an increasing abundance of genetic information. Early molecular genetic studies found deep divergence between populations in southwestern to eastern Amazonia (Pará landrace, **Figure 1**), and those in western Amazonia, northern South America and Central America (the other landraces, **Figure 1**) (Rojas-Vargas et al., 1999; Rodrigues et al., 2005; Hernández-Ugalde et al., 2011). Genetic introgression between adjacent cultivated and supposedly wild populations (in fact, populations of var. chichagui type 3) was reported (Couvreur et al., 2006). Hernández-Ugalde et al. (2011) interpreted these relationships among cultivated populations and adjacent wild populations in at least three regions as independent domestications. Clement et al. (2010) reviewed the molecular evidence and kept the most parsimonious hypothesis: a single domestication event in southwestern Amazonia with two dispersals. They reasoned that because nuclear DNA markers are inherited from both parents and undergo recombination, these markers are not ideal for identifying origins. Analysis with chloroplast DNA avoids the problems of meiotic recombination and biparental inheritance, and is more suitable for phylogeographic analysis (Avise, 2004). The first analysis with a chloroplast sequence (Cristo-Araújo et al., 2013) strongly suggests a single domestication event in southwestern Amazonia. We will expand on this analysis here.

Ecological Niche Modeling (ENM), which allows approximating both current and past distributions of species, has been used in a number of cases to test biogeographical hypotheses with cultivated species. Galluzzi et al. (2015) modeled wild and domesticated peach palm distributions, without a clear understanding of the implications of var. chichagui type 3, the incipient domesticate. As has been shown in cotton (Gossypium hirsutum L.), the distribution range of cultivated plants is considerably wider than that of their wild ancestor, because their climatic envelope is essentially delimited by the fundamental niche of the species (defined by abiotic constraints), as farmers, through common agricultural practices, control most biotic components of the ecological niche, particularly competition and parasitism (Coppens d'Eeckenbrugge and Lacape, 2014). In contrast, the distribution of wild populations is delimited by the species' realized niche, as competition and predation fully constrain the species' distribution. Finally, the distribution of feral populations is intermediate between that of wild and cultivated populations, as the impact of biotic factors depends on the degree of landscape anthropization. Thus, feral cotton's distribution is very similar to that of cultivated landraces; however, they tend to disappear in a few generations after fields are replaced by secondary vegetation.

Referring explicitly to the niche concept underlying the distribution models of wild, feral and cultivated plants allows evaluating the status of particular populations. In the particular case of peach palm, we can examine the ecological niche similarities for the three var. chichagui morphotypes. Under our working hypothesis that type 3 populations are incipiently domesticated forms, we expect that many of them are found outside of the niches of the two other morphotypes. Furthermore, once the wild status of the two morphotypes is assessed, we can evaluate hypotheses about the species' distribution at the time when humans began to interact with native Amazonian plants at the end of the last glacial period.

This study aimed to evaluate hypotheses about the origin of domesticated peach palm using: (1) ecological niche models to identify the potential distributions of wild and domesticated populations, to compare these with known distributions, to assess the status of var. chichagui type 3 in light of types 1 and 2, and then project models of these distributions on climatic models for the last glacial maximum; (2) phylogeographic analysis of chloroplast DNA sequences to determine the relationship among wild and domesticated populations, as well as the probable origin of domestication; and (3) phylogenetic analysis and spatial distribution of genetic diversity of peach palm, based on nuclear microsatellite loci, to determine the location of areas with greater genetic diversity and likely dispersal routes after domestication.

# RESULTS

# Distributions of Wild and Domesticated Peach Palm

The geographic distribution of our sample for modeling (**Figure 2A**) is similar to that shown in **Figure 1**. The two wild morphotypes of var. chichagui are clearly separated by an Equatorial band, with type 1 in southern Amazonia, and type 2 along the Caribbean coastal region of Colombia and western Venezuela, the Andean foothills of the Colombian and Venezuelan Orinoquia, and the Andean valleys in Colombia, reaching elevations above 1,000 m, south to the Quindío region in the Cauca Valley. Type 3 is sympatric with type 1 in southwestern Amazonia and it is also found on both sides of the Andes of Ecuador, as well as in the Cauca Valley in Colombia, where it appears marginally sympatric with type 2. The main differences between **Figures 1** and **2A** concern the presence of type 2 further south in Colombia, confirmed by Rodrigo Bernal, Universidad Nacional de Colombia. Our sample shows limited sympatry between var. chichagui and var. gasipaes. The few cases where var. chichagui is found in close proximity to var. gasipaes (Cauca department in Colombia, parts of Ecuador, Peru, and the upper Solimões River in Brazil) involve type 3. Interestingly, samples of var. gasipaes in secondary vegetation (magenta crosses) are more common in regions where type 3 is found.

The Principal Component Analysis (PCA) characterizing the climatic envelopes of the different taxa (**Table 1**, **Figure 3**) did not use bioclimatic variables 1 (mean annual temperature), 5 (maximal temperature of warmest month), 12 (annual

FIGURE 2 | Distribution of wild and cultivated Bactris gasipaes samples used in our models (A), and modeled distributions, based on current Worldclim conditions, of wild peach palm, var. chichagui type 1 (B), type 2 (C), a combination of types 1 and 2 (D), the incipient domesticate var. chichagui type 3 (E), and var. gasipaes (F). Colors indicate climate suitability according to logistic thresholds (dark green below 10% training omission, light green above this 10% threshold, yellow above 33% threshold, orange above 67% threshold). Symbols: red squares, var. chichagui type 1; red triangles, var. chichagui type 2; magenta circles, var. chichagui type 3; blue crosses, cultivated var. gasipaes; magenta crosses, feral var. gasipaes.

precipitation), and 16 (precipitation of wettest quarter), as they contributed little to the different Ecological Niche Models and/or can be deduced directly from other variables. The first axis is related to decreasing seasonality and increasing precipitation. The second axis is positively correlated with temperatures. Representatives of var. chichagui types 1 and 2 found in open tropical forests are concentrated in the upper left of the principal plane (warm tropical climate with more pronounced seasonality). A similar trend is observed for var. gasipaes; however, its climatic space is wider as it can be grown under more humid equatorial conditions, which is fully consistent with its common presence along the Equator. The climatic envelope of var. chichagui type 3 is intermediate between that of type 1 (its putative wild ancestor), and that of domesticated peach palm (var. gasipaes) (Supplementary Material 1.2; Figure S2), which is consistent with our working hypothesis that this type is the incipient domesticate. This is further supported by the fact that observations of feral var. gasipaes in secondary or disturbed vegetation occupy the same climatic space.

TABLE 1 | Principal component analysis of bioclimatic variables for Bactris gasipaes and contributions of the bioclimatic variables to the first two components.


Values in bold face contribute significantly to the principal component.

gasipaes. Projection of observations of var. chichagui types 1, 2, and 3, cultivated var. gasipaes, and feral or escaped specimens found in disturbed or secondary vegetation in the principal plane. Axis legends identify the main relations of principal components with original variables and percentage of explained variance.

### Distributions of var. Chichagui Types 1 and 2

The limited overlap between the climatic envelopes of var. chichagui types 1 and 2 raises the question of their ecological differentiation. If any, it could be related to the topographical contrasts between their home regions, as suggested by their relative distribution in the principal plane: in southwestern Amazonia, conditions for type 1 get both warmer and wetter close to the Equator, while type 2, in the northern Andes, finds wetter conditions at higher elevations, i.e., under cooler conditions. To assess whether such sources of climate variation may have resulted in significant ecoclimatic adaptation, we modeled the distribution of both wild types separately. Extrapolating the type 1 distribution to tropical South America (**Figure 2B**) allowed us to predict the main features of type 2's distribution, despite the topographic differences and low number of type 1 observations. The reciprocal extrapolation from the even smaller sample of type 2 data (**Figure 2C**), which does not discard the equatorial region east of Ecuador and along the Amazon river, is less convincing; however, it is still consistent with most observations of type 1. Interestingly, both models predict presence in the inter-Andean valleys of Colombia, the Caribbean coast, and the Andean foothills to the Orinoco basin. On the other hand, both **Figures 2B,C** lack specificity and indicate suitable regions where wild peach palm is absent, for example on the Guiana shield and in southeastern Brazil. Finally, the best distribution map results from the combination of types 1 and 2 (**Figure 2D**). This model is much more specific, with an excellent correspondence between observations and potential distribution. The current distribution of type 1 is well represented and explained with a considerable extension of suitable climates in the southern Amazon basin and to southeastern Brazil. 
The current distribution of type 2 appears less massive, but is equally well explained by the network of large Andean valleys and foothills in Colombia and Venezuela, as well as parts of their Caribbean coastal regions and around Lake Maracaibo in Venezuela. A third important and well separated suitable area exists in eastern Venezuela and Roraima; however, no wild peach palm has been reported there during the last 100 years of exploration.

This wild peach palm distribution model is highly consistent with our views on type 3 as the incipient domesticate, introduced by man and feralized in Ecuador, on both sides of the Andes. In some cases, as in southern Colombia and western Ecuador, feralization was favored by suitable climates, associated with more open vegetation (and less competition). In other cases, as on the eastern side of the Ecuadorian Andes, feralization was possible mostly in anthropogenic landscapes, as indicated by the observations of fully domesticated peach palms in the neighborhood, and their remnants in secondary vegetation. This is likely to be the case for Panamanian representatives of type 3 also, including the Azuero population mentioned by Hernández-Ugalde et al. (2011), as they appear in areas that are climatically unsuitable for wild peach palm.

The distribution model for the var. chichagui types 1 and 2 combination was projected for the Last Glacial Maximum climate, as predicted by the CCSM4, MIROC-ESM, and MPI-ESM-P climate models (**Figure 4**). The first two models project similar distributions of var. chichagui types 1 and 2, which remain separated during glacial periods, especially in western Amazonia. The third model is not consistent with the modern distribution of var. chichagui types 1 and 2, which casts doubts on its validity. Thus, these LGM simulations clearly suggest that populations of types 1 and 2 have long been separated, which helps explain the strong genetic differentiation (Hernández-Ugalde et al., 2011). They also

FIGURE 4 | Maxent-generated potential distribution models based on Last Glacial Maximum (A) CCSM4, (B) MIROC-ESM, and (C) MPI-ESM-P estimated climatic conditions for B. gasipaes var. chichagui types 1 and 2. Color code as in Figures 2B–D.

suggest that there was suitable habitat for var. chichagui type 1 in southwestern Amazonia when humans arrived in the late Pleistocene.

### Distributions of var. Chichagui Type 3 and var. Gasipaes

The number of field collections and herbarium samples that could be attributed to var. chichagui type 3 is quite small (n = 29). The distribution model (**Figure 2E**) is consequently less reliable. Nonetheless, both the PCA (**Figure 3**) and the ecological niche model indicate that type 3's climatic envelope is slightly more humid than that of type 1, based on the expansion from southwestern Amazonia northwards.

The modeled distribution of var. gasipaes (**Figure 2F**) represents reasonably well what is expected for cultivated peach palm from the literature and anecdotes. The highest probabilities are observed across central and western Amazonia, which is where a significant amount of collecting occurred in the late twentieth century (see also Figure S1) and where peach palm was most important at the time of European conquest (Patiño, 1963, 2002) (Figure S3). This area has much higher precipitation than the area where var. chichagui type 1 is distributed, even in the western part of its distribution. Also, var. gasipaes' niche encompasses var. chichagui type 3's niche, whereas the niche of wild types 1 and 2 (**Figure 2**) only encompasses type 3's niche in the southwestern Amazon, where types 1 and 3 are sympatric. Hence, peach palm's fundamental niche is much ampler than the realized niches of var. chichagui types 1 and 2, and type 3 shows the beginnings of this change.

# The Origin of Domestication Identified with a Chloroplast DNA Sequence

Only two of the 12 hypervariable sequences identified by Shaw et al. (2007) were variable in our study: psbJ-petA and psaI-accD. The psbJ-petA sequence had 1,040 base pairs, of which 26 were variable. The psaI-accD sequence had 622 base pairs, but was only useful for discriminating B. simplicifrons from B. gasipaes/B. riparia and was not used for further analysis. Seventy-six of the 126 plants analyzed presented a 13 base-pair inversion in psbJ-petA at the same position (**Figure 5**), which distinguishes eastern and western Amazonian landraces and populations of peach palm. The population of var. chichagui type 1 near Rio Branco, Acre, Brazil, was polymorphic for this inversion, which

has implications for the origin of domestication and subsequent dispersal of var. gasipaes. Neither var. chichagui nor Bactris riparia was discriminated from B. gasipaes var. gasipaes with the information in this sequence.

In this set of 126 plants, 12 haplotypes were identified and organized into a network with maximum parsimony analysis and the Median Joining algorithm (**Figure 6A**). Three haplotypes were very common, with one very common in southwestern to eastern Amazonia, and two very common in western Amazonia to Central America, with one mutational difference between the two common western haplotypes. Nine less common haplotypes were specific to a landrace (Juruá) or shared by a landrace (Putumayo), a population (upper Madeira River) or a species (B. simplicifrons). The eastern and western groups of haplotypes were differentiated by the inversion (**Figure 5**), with interesting exceptions. As mentioned, var. chichagui type 1 was polymorphic for the inversion. So was the Putumayo landrace, which has the largest distribution of the western landraces and extends eastwards along the Solimões River to contact with the Pará landrace in Central Amazonia (**Figure 1**). Hence, this polymorphism in the Putumayo landrace is probably due to introgression, unlike the polymorphism in var. chichagui type 1 in Acre. Because of small sample sizes and the conserved nature of the chloroplast sequences, estimates of chloroplast diversity were not very informative (**Table 2**), although var. chichagui type 1

TABLE 2 | Genetic diversity parameters estimated for the two chloroplast sequences for Bactris gasipaes var. chichagui types 1 and 3, and seven landraces and two non-designated populations of var. gasipaes in the Core Collection of the Peach palm Active Germplasm Bank, Manaus, Brazil, and two Bactris species used as outgroups.


n, number of plants analyzed; h, number of haplotypes; #Subs, number of nucleotide substitutions; #InDels, number of insertions and deletions; S, number of polymorphic sites; Hd, haplotypic diversity; π, nucleotide diversity.

had approximately the same haplotypic diversity as var. gasipaes, principally because var. chichagui type 1 and the Putumayo landrace are polymorphic for the psbJ-petA inversion.
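For readers unfamiliar with the two diversity parameters reported in Table 2, the sketch below shows how haplotypic diversity (Hd) and nucleotide diversity (π) are commonly computed. It is an illustration under stated assumptions, not the procedure used here: it assumes Nei's sample-size-corrected estimator for Hd and mean pairwise differences per site for π, treats the sequences as aligned and of equal length, ignores indels and the inversion, and uses hypothetical function names and toy input.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(haplotypes):
    """Hd = n/(n-1) * (1 - sum(p_i^2)), with p_i the haplotype frequencies."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled plants are needed")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - sum_sq)


def nucleotide_diversity(aligned_seqs):
    """Mean number of pairwise differences per site (pi).

    Assumption: sequences are aligned, equal in length, and every
    mismatch counts as one difference (no special gap handling).
    """
    if len(aligned_seqs) < 2:
        raise ValueError("at least two sequences are needed")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must have equal length")
    pairs = list(combinations(aligned_seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)
```

With only two plants sampled per population, a single polymorphism such as the psbJ-petA inversion is enough to push Hd to its maximum, which is one reason small samples make these estimates hard to interpret.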

The phylogenetic tree estimated with Bayesian methods is similar to the haplotype network and the bootstrapped confidence values are high for all relationships (**Figure 6B**). As in the network, there is a clear separation between eastern and western groups. There is one important difference: the plants of var. chichagui 1 that grouped with the eastern Amazonian populations in the network are grouped with var. chichagui 3—the incipient domesticate—among the western Amazonian populations, even though half of them contain the inversion.

The phylogenetic network and tree support a single domestication event in southwestern Amazonia, probably in the upper Madeira River basin of modern Bolivia. The argument is most readily observed in the haplotype network (**Figure 6A**): the out-group (B. simplicifrons) is most closely related to the upper Madeira River populations, most of which have only one mutational difference from all of the other eastern Amazonian populations. Although var. chichagui type 1 does not have exactly the same haplotype as the upper Madeira River populations, both populations occur in the same general area and share the ancestral psbJ-petA chloroplast sequence.

# Dispersal of the Landrace Complex Interpreted with Nuclear Microsatellites

The 17 most informative SSR loci among the 39 tested detected 302 alleles in the 173 plants analyzed, with a mean of 17.8 alleles per locus. The two accessions of var. chichagui type 1 had slightly lower heterozygosities than the two type 3 accessions, probably because the type 3 accessions are from different populations, both in sympatry with type 1, whereas the type 1 accessions are from the same population (**Table 3**). This explains the difference in inbreeding coefficients also. The highest values of observed heterozygosities are in landraces or undesignated populations within the distributions of var. chichagui types 1 and 3, which suggests introgression. The lowest heterozygosities occur in the two landraces furthest from the center of domestication in southwestern Amazonia: Utilis in Central America and Pará in eastern Amazonia.
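The per-population statistics discussed above (observed heterozygosity Ho, expected heterozygosity He, and the inbreeding coefficient Fis) can be illustrated for a single diploid locus as follows. This is a minimal sketch with a hypothetical function name and made-up genotypes; it assumes the simple uncorrected estimators, whereas population-genetics software often applies sample-size corrections, so values need not match Table 3.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (Ho, He, Fis) for one diploid locus.

    genotypes: list of (allele_1, allele_2) tuples, one per plant.
    Ho  = proportion of heterozygous plants
    He  = 1 - sum(p_i^2) over allele frequencies p_i (uncorrected)
    Fis = 1 - Ho / He
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes supplied")
    ho = sum(a != b for a, b in genotypes) / n
    allele_counts = Counter(allele for g in genotypes for allele in g)
    total_alleles = 2 * n
    he = 1 - sum((c / total_alleles) ** 2 for c in allele_counts.values())
    fis = 1 - ho / he if he > 0 else float("nan")
    return ho, he, fis


# Mean number of alleles per locus reported in the text: 302 alleles over 17 loci.
mean_alleles_per_locus = 302 / 17
```

A positive Fis indicates fewer heterozygotes than expected under random mating, and a negative value indicates an excess, which is one possible signature of recent admixture or introgression.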

Based on 173 plants of the Core Collection, the best grouping of accessions with the Structure program was found for K = 2, with interesting groupings at K = 3 and 4 (**Figure 7**). At K = 2, the southwestern to eastern populations were distinguished from all other populations (**Figure 8A**). At K = 3, the Utilis landrace of Central America was discriminated from the other western populations (**Figure 8B**). At K = 4, the western Amazonian populations were divided into two groups (**Figure 8C**): a southern group containing the Pampa Hermosa and Juruá landraces and the Ucayali River populations; a northern group containing the two macrocarpa landraces (Putumayo and Vaupés) and the mesocarpa Cauca landrace of western Colombia. Note that the southern group is in sympatry with var. chichagui types 1 and 3, while the macrocarpa Putumayo has only minor areas of sympatry and the macrocarpa Vaupés has none (**Figure 1**). The origin of domestication encompasses the eastern half of the southern western group (populations 7, 8, 9) and the western part of the eastern group (populations 10, 11). At K = 10 (data not shown) some landraces in **Table 3** are relatively well distinguished, but others present considerable admixture with adjacent landraces and populations, as is already evident at K = 4 (**Figure 8C**).

Although Structure offers robust simulations, it is based on the presuppositions of the Hardy-Weinberg equilibrium (Pritchard et al., 2000), many of which do not hold for small populations, especially for domesticated populations, nor for groups of gene bank accessions. Hence, we used spatial Analysis of Principal Components (sPCA), which does not rely on HWE presuppositions, to examine the relationships of the 156 var. gasipaes plants in the Core Collection with 17 SSR. The high variance and Moran's I recorded for the first three global principal components highlight the existence of global structure (data not shown), which is corroborated by the significance of the Monte-Carlo simulations (p < 0.001). Due to the low variances and Moran's I index (data not shown), as well as the

weak significance of the Monte-Carlo simulations (p = 0.015) within the set of local components, the local structure is not discussed.
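Because the distinction between global and local structure rests on Moran's I, a short sketch of that statistic may help. The code below computes Moran's I for a vector of scores (for example, the scores of sampling sites on one principal component) given a spatial weight matrix. The function name, the toy scores, and the chain-shaped neighbor matrix are hypothetical; the connection network actually used in the sPCA is not described in this section.

```python
def morans_i(values, weights):
    """Moran's I = (n / W) * sum_ij w_ij * d_i * d_j / sum_i d_i^2,

    where d_i is the deviation of value i from the mean and W is the
    sum of all weights. weights is an n x n list of lists (w_ii = 0).
    """
    n = len(values)
    if n == 0 or len(weights) != n or any(len(row) != n for row in weights):
        raise ValueError("weights must be an n x n matrix matching values")
    mean = sum(values) / n
    dev = [v - mean for v in values]
    w_total = sum(sum(row) for row in weights)
    denom = sum(d * d for d in dev)
    if w_total == 0 or denom == 0:
        raise ValueError("Moran's I is undefined for these inputs")
    num = sum(weights[i][j] * dev[i] * dev[j]
              for i in range(n) for j in range(n))
    return (n / w_total) * (num / denom)


# Toy example: four sites along a line, each linked to its immediate neighbors.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
gradient_i = morans_i([1.0, 2.0, 3.0, 4.0], chain)      # smooth gradient
alternating_i = morans_i([1.0, -1.0, 1.0, -1.0], chain)  # neighbors differ
```

In this toy case the smooth gradient yields a positive I (neighbors are similar, as in a global structure), whereas the alternating pattern yields a negative I (neighbors differ, as in a local structure).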

The global structure obtained with these SSR data provided a clear interpretation of the spatial distribution of genetic diversity and is similar to the Structure analysis (**Figure 8**). The first



n, number of plants analyzed; A, number of alleles; P, private alleles; Ho, observed heterozygosity; He, expected heterozygosity; Fis, intra-population inbreeding coefficient.

spatial principal component differentiated eastern Amazonia, as in the Structure analysis at K = 2 (**Figure 9A**), the second differentiated Central America from the other western populations (**Figure 9B**), as in the Structure analysis at K = 3, and the third was less efficient at differentiating northern western Amazonia from southern western Amazonia (**Figure 9C**), probably because of the abundant gene flow. The spatial synthetic projection of the 3 global components (**Figure 9D**) suggests that eastern Amazonia is not as clearly related to southwestern Amazonia as in the Structure analysis, although this may be due to the lack of sampling along the middle and lower Madeira River. The relationship among populations in southwestern Amazonia is also much clearer than in the Structure analyses in that the upper Madeira and Ucayali Rivers are more clearly related.

Although the Core Collection is quite small and there appears to be abundant gene flow among these populations at different scales (**Figures 8**, **9**), the Nei genetic distances among these groups are informative (**Figure 10**). The deepest divergence is between the southwestern to eastern populations, including var. chichagui 1, and all of the western populations, as in **Figure 8A**. However, the var. chichagui 1 population from Rio Branco, Acre, is not at the root of this group, suggesting that it is not the original source population for domestication. The western cluster contains the three other groups defined by Structure (**Figure 8**), with a very interesting organization. The cluster is rooted in var. chichagui type 3, the incipient domesticate, and has the microcarpa Juruá landrace and Ucayali River populations in sequence, followed by the mesocarpa Pampa Hermosa landrace. All of these populations are sympatric with var. chichagui type 1 also. The next cluster is derived from the previous, as expected by dispersal of domesticated types northward. The macrocarpa Putumayo landrace in western Amazonia is associated with the mesocarpa Cauca landrace in western Colombia, suggesting that there is gene flow over the Andes, perhaps in southern Colombia. Both Cauca and the western part of Putumayo are sympatric with the incipient domesticate (var. chichagui type 3). Also in the northwestern Amazonia Structure group (**Figure 8C**), the Vaupés landrace is the only one that is not sympatric with any wild populations, which may explain why it is the larger-fruited of the two macrocarpa landraces, since there is no introgression to slow response to selection. What is curious in this cluster is that the mesocarpa Utilis landrace of Central America is derived from the same lineage that gave rise to Vaupés, but this may only be an artifact of small sample sizes, although there is gene flow (or gene bank error?) visible in the Structure analyses (**Figures 8B,C**). 
Another possibility is that Utilis is derived from a different dispersal than Cauca, with the latter a dispersal over the Andes from the upper Putumayo River and the former a dispersal along the northeastern flank of the Andes and then into Central America.

AMOVA estimated that 87% of the total genetic variation accessed with these 17 SSR in the Core Collection is found within landraces and populations, while 13% is found among them. Other genetic divergence indices [Fst (0.13), Rst (0.19) and Gst (0.13)] agree with the AMOVA estimate of variation among populations. When comparing only the var. gasipaes accessions at K = 2 (**Figure 8A**) and the deep dichotomy in the dendrogram (**Figure 10**), AMOVA estimated 92% within and 8% among landraces and populations.

# DISCUSSION

# Distributions of Wild and Domesticated Peach Palm

Galluzzi et al. (2015) were the first to use ecological niche modeling with Bactris gasipaes, but our results cannot be compared with theirs for several reasons. Although they accepted the hypothesis that var. chichagui type 3 represents the incipient domesticate, they pooled all three types of var. chichagui into a 55-record "wild" sample, rather than maintain type 3 separate. They then added all samples of var. gasipaes that fall in the same climatic envelope (polygon in their Figure 2) to allow increased precision for LGM modeling, without considering that the niches of wild, feral, and cultivated plants cannot be interpreted in the same way. Furthermore, the resulting sample is biased, because their choice of a climatic envelope was determined partly by the two most extreme var. chichagui outliers, as well as the great majority of observations concentrated on the opposite convex side of the polygon in their PCA. This is important because distribution models are determined not only by the overall climatic space, but also by the distribution of observations within it.

We followed a different approach, where wild peach palm's distribution was modeled from the observations of truly wild peach palm, i.e., a subsample including only observations of B. gasipaes var. chichagui that could be assigned to types 1 and 2, even though their number is modest. Observations of feral and escapes (respectively from var. chichagui and var. gasipaes), as well as cultivated populations, were only kept for purposes of comparison, using them in the PCA comparative climatic characterization, and contrasting their realized distribution with that of wild peach palm.

Although the geographic distributions of var. chichagui types 1 and 2 are widely separated, the characterization and projections of their respective climatic spaces appear consistent with their infra-specific taxonomic status as two morphotypes of the same botanical variety (Henderson, 2000), with little apparent divergence in their ecologies. The fact that both models predict presence in the inter-Andean valleys of Colombia supports this hypothesis of limited ecological differentiation. Sample size is less of a problem when the truly wild types of var. chichagui are pooled in the analysis, for a total of 60 observations. The resulting model shows an excellent correspondence between these observations and their disjunct potential distribution (Henderson, 2000; Figure 29B, p. 71), and allows visualizing the more humid equatorial geographic barrier hampering gene flow between them.

Our extrapolation of the wild peach palm distribution during the LGM gave variable results according to the climatic model. In terms of suitable habitat across southern Amazonia, our CCSM4 and MIROC-ESM modeled distributions fit reasonably well into the ecotone between the evergreen broad-leaf and the deciduous broad-leaf forests modeled by Mayle et al. (2004) for the LGM, which reflects the ecological adaptation of type 1 (Clement et al., 2009b). The CCSM4-based modeled LGM distribution was the most consistent with the modern wild peach palm distribution, showing the same potential separation of favorable habitats of types 1 and 2. Varela et al. (2015) caution that different general circulation models offer different predictions for the tropics, which may explain why MPI-ESM-P produced such a divergent modeled distribution.

# The Origin of Domestication

The origin of domestication of cultivated populations of any species should be sought in the distribution of its wild populations. In the case of peach palm, var. chichagui types 1 and 2 have the smallest fruits and are considered truly wild. The molecular analyses that included type 2 concluded that it was not involved in the domestication of peach palm (Hernández-Ugalde et al., 2011), as did cladistic analysis of morphological traits (Ferreira, 1999). For reasons presented in the Introduction and reinforced by the ecological niche models, type 3 is not wild. Hence, the origin of domestication is expected in the geographic area of sympatry between type 1 and the incipient domesticate (**Figure 1**). Chloroplast sequences were used to examine this expectation.

Intergenic spacers in the chloroplast genome are important sources of information in plant systematics, but are often insufficiently variable at low taxonomic levels to differentiate populations (Shaw et al., 2005, 2007). This is clear in Bactris, where interspecific chloroplast variation is scarce in the Bactris species closest to B. gasipaes and even scarcer within the species, as Couvreur et al. (2007) failed to discriminate between var. chichagui and var. gasipaes with commonly used chloroplast sequences trnD-trnT and trnQ-rps16, and a sequence that they designed (psbC-trnfM), all located in non-coding regions. We also failed to discriminate var. chichagui from var. gasipaes or B. gasipaes from B. riparia with the psbJ-petA and psaI-accD sequences. We did find variation in psbJ-petA, but this sequence is less variable than trnD-trnT (1,066 base pairs with 36 variable) and trnQ-rps16 (1,046 base pairs with 47 variable) (Couvreur et al., 2007). The 13 bp inversion that we found (**Figure 5**) is intermediate in size between the minute (4 bp) and middle-sized (20 bp) inversions in Bactris trnD-trnT (Couvreur et al., 2007), although neither allowed discrimination within the Bactris gasipaes-riparia complex.

Both the haplotype network (**Figure 6A**) and the tree (**Figure 6B**) show a clear separation between eastern and western populations of var. gasipaes, reflecting the deep divergence found in all previous molecular analyses (Rojas-Vargas et al., 1999; Rodrigues et al., 2005; Cristo-Araújo et al., 2010; Hernández-Ugalde et al., 2011). Since this analysis is with a single chloroplast sequence, it is evident that this inversion does not explain the deep divergence observed with nuclear markers, but it does provide a parallel marker.

The network and the tree also support a single domestication event in southwestern Amazonia, probably in the upper Madeira River basin of modern Bolivia, which had already been proposed as the center of domestication (Huber, 1904; Clement, 1995; Rodrigues et al., 2005; Cristo-Araújo et al., 2010, 2013; Galluzzi et al., 2015). This is also one of the areas identified by Mora-Urpí (1993, 1999) and Hernández-Ugalde et al. (2011). This chloroplast analysis does not provide support for additional domestication events outside of southwestern Amazonia, contrary to the hypotheses of Mora-Urpí (1993, 1999), Morcote-Rios and Bernal (2001), and Hernández-Ugalde et al. (2011), nor for the idea of secondary domestications suggested by Galluzzi et al. (2015), although this may be because of the small amount of variation found to date. These unsupported hypotheses all depend upon the distribution of var. chichagui type 3, the incipient domesticate.

# Dispersal of the Landrace Complex

In domesticated peach palm, two dispersals out of the center of domestication in southwestern Amazonia were hypothesized (Rodrigues et al., 2005), one down the Ucayali River into western Amazonia and beyond, and one down the Madeira River into eastern Amazonia; if they occurred, neutral molecular markers should reveal these trends. The highest values of observed heterozygosity are in landraces or undesignated populations within the distributions of var. chichagui types 1 and 3 (**Table 3**), which suggests introgression (Couvreur et al., 2006; Hernández-Ugalde et al., 2011). The lowest heterozygosity values occur in the two landraces at the extremes of the two hypothesized dispersals: Utilis in Central America at the end of the western dispersal and Pará in eastern Amazonia at the end of the eastern dispersal. The low heterozygosity in the Utilis landrace is surprising, given the existence of type 3 populations in the region and observed introgression (Hernández-Ugalde et al., 2011), suggesting that Utilis represents an independent dispersal and not an in situ development from the local var. chichagui type 3 populations.

While validation of morphometrically defined landraces has been a preoccupation of the molecular genetic analyses in the Brazilian germplasm bank (Sousa et al., 2001; Clement et al., 2002; Rodrigues et al., 2005; Cristo-Araújo et al., 2010; Santos et al., 2011), these analyses often had insufficient or unbalanced numbers of each population to work with. When sufficient numbers were available, they validated some landraces and did not validate others, specifically the Guatuso and Tuira landraces in Central America, and the Solimões landrace in central-western Amazonia (Rodrigues et al., 2005). Hence, Structure was used to study the relationships among accessions in the Core Collection.

K = 2 (**Figure 8A**) identified the deep divergence detected in previous molecular analyses (Rojas-Vargas et al., 1999; Rodrigues et al., 2005; Cristo-Araújo et al., 2010; Hernández-Ugalde et al., 2011). In K = 3, the Utilis landrace of Central America was distinguished from the other western populations (**Figure 8B**), probably because allelic richness is lower (**Table 1**) and some alleles are locally common (Hernández-Ugalde et al., 2011), although this was not detected by Galluzzi et al. (2015) in their reanalysis of Hernández-Ugalde et al.'s dataset. At K = 4 a considerable amount of admixture was detected (**Figure 8C**), probably the reason for the poor discrimination between the groups.

The K = 4 grouping also identifies either long distance dispersal events or germplasm bank errors; the latter have been detected before in the INPA germplasm bank with molecular analyses (Sousa et al., 2001; Rodrigues et al., 2005; Cristo-Araújo et al., 2010). Two accessions from the relatively spineless Guatuso populations of the Utilis landrace in Central America are assigned to the northern western Amazonia group, suggesting that spinelessness may not have been selected in situ but may be due to long distance dispersal, and the accessions from Coari, previously classified with the Putumayo landrace (Rodrigues et al., 2005), are assigned to the southern western Amazonia group, some 500 km to the west. Seed exchange networks had previously been identified within the Pampa Hermosa landrace in southern western Amazonia (Adin et al., 2004), as well as long distance gene flow between the southern and the northern western Amazonian groups (Cole et al., 2007), so this admixture is not surprising and may not be due to germplasm bank error.

Overall, the Structure analyses confirm part of the landrace hierarchy proposed originally by Mora-Urpí and Clement (1988) and validated by previous molecular analyses (Rodrigues et al., 2005; Cristo-Araújo et al., 2010). They also confirm the commonness of both long and middle distance gene flow and introgression reported by various authors (Adin et al., 2004; Couvreur et al., 2006; Cole et al., 2007; Hernández-Ugalde et al., 2011). The fact that they do not fully validate the landrace hierarchy can be attributed to the design of the Core Collection (Cristo-Araújo et al., 2015), since this was not designed primarily to study the origin and dispersal of peach palm, but to support the management of peach palm germplasm at INPA.

The spatial analysis of principal components (**Figure 9**) generally agreed with the Structure analyses (**Figure 8**), but also suggested a very interesting relationship among the upper Madeira River populations and the Ucayali River populations, which was not evident in the Structure analyses. The area spanning these two basins is the primary region of sympatry between var. chichagui types 1 and 3 (**Figure 1**), and is where var. gasipaes microcarpa populations have the smallest fruit. In fact, the headwater tributaries of the Ucayali River and those of the Madre de Dios River, the major northern tributary of the upper Madeira River, are quite close in southern Peru, allowing relatively easy human passage. It follows that the upper Ucayali River basin cannot be ruled out as part of the center of origin of domestication. Further prospection of peach palm in the upper parts of both basins will allow better resolution of these analyses.

The Neighbor-joining dendrogram of Nei's genetic distances (Nei, 1978) among the landraces and populations of the Core Collection (**Table 3**, **Figure 10**) is quite similar to all previous dendrograms based on molecular analyses. The eastern cluster contains the upper Madeira River populations and the Pará landrace, as observed by Rojas-Vargas et al. (1999) and Hernández-Ugalde et al. (2011), and in agreement with morphological similarities (Mora-Urpí, 1999), since both have microcarpa fruit types. This cluster is associated with var. chichagui type 1, as observed by Rodrigues et al. (2005) and Hernández-Ugalde et al. (2011), but is not rooted in the Rio Branco population of type 1. Hence, the exact origin of domestication remains to be identified, although the general region of origin is clear.

As expected, the western cluster in **Figure 10** is similar to that reported previously (Rodrigues et al., 2005; Cristo-Araújo et al., 2010), since it uses some of the same accessions, and is quite different from Hernández-Ugalde et al. (2011), whose analyses identified the divergence between the Cauca and Utilis landraces and the western Amazonian landraces at a lower level in the dendrogram. However, given the numerous additional analyses included here (**Table 1**, **Figures 6**–**8**), plus the reinterpretation of var. chichagui type 3 as the incipient domesticate, this interpretation of **Figure 10** appears to represent a more robust hypothesis of the origin and dispersal of domesticated peach palm.

The two dispersals proposed here also resulted in two quite different fruit types with different cultural importance. The western dispersal down the Ucayali soon generated starchy fruit, possibly quite early, since Mora-Urpí (1984) observed that very starchy microcarpa fruit were common in Pucallpa and Contamana, along the Ucayali River in Peru. Starchy fruit are easily fermented, much like sweet manioc (Manihot esculenta) or maize (Zea mays) (Patiño, 1963, 1992, 2002). Because starch is much less energy intensive than oil, even unconscious selection for starchy fruit quickly results in increases in fruit size (Clement et al., 2009a), resulting in the mesocarpa Pampa Hermosa and Tigre landraces in central Peru. As the dispersal of this type of fruit continued down the Ucayali and Amazonas into northwestern Amazonia, the cultivated peach palms were taken out of sympatry with type 1 and the previously distributed incipient domesticate (type 3), and the very starchy macrocarpa fruits of the Putumayo and Vaupés landraces could appear. However, the oldest archeological record in Colombian Amazonia is the Abejas site, along the Caquetá River, with pollen dated to 1,535 BP (Morcote-Rios and Bernal, 2001), which suggests that this dispersal may have been rather late. Throughout central-western and northwestern Amazonia peach palm was cultivated both in homegardens and in swiddens, yielding large amounts of starchy fruit for fermentation that became the centerpiece of yearly harvest festivals (Patiño, 1992). As the dispersal continued northwestward into Central America, the cultivated populations were again sympatric with the incipient domesticate (type 3) and the starchy mesocarpa Cauca and Utilis landraces appeared. 
During the conquest of Panamá and Costa Rica, European adventurers felled tens of thousands of peach palms in the Sixaola River valley in order to subdue the native peoples, which resulted in the first court case of the Spanish crown against a group of conquistadors and is the reason that such dramatic numbers of palms are known to have existed (Patiño, 1963). Given the enormous numbers of palms involved, we can assume that peach palm was as important in southern Central America as in western and northwestern Amazonia.

The eastern dispersal appears to have been quite different, since the fruit retained considerable quantities of oil and was never selected even to mesocarpa size, even though the majority of the dispersal was outside of the distribution of the wild type. Oily fruit do not ferment well and there are no early historical records of harvest festivals with abundant fermented peach palm, as there are in western Amazonia. Bates (1962) observed fruits typical of the Pará landrace during his trip along the Amazon River, and commented that they increased in size once he started up the Solimões River, confirming the confluence of the two dispersals in Central Amazonia mentioned above. It is even possible that Bates' observations represent the final expansion of the eastern dispersal, since the earliest European reports from eastern Amazonia analyzed by Patiño (1963) seldom mention peach palm. Patiño's (1963) map (p. 131) supports this supposition of a late expansion into eastern Amazonia. At the Hatahara archeological site, lower Solimões River, 20 km from its confluence with the Negro River to form the Amazon River, Bactris-Astrocaryum phytoliths increase in number continually from the lowest levels (∼1,000 BP) to the time of conquest (500 BP) in terra preta middens; since peach palm is the only cultivated palm in Central Amazonia, these phytoliths may represent peach palm (Bozarth et al., 2009), although managed Astrocaryum aculeatum cannot be ruled out. It follows that the eastern dispersal may have started later than the western dispersal.

# CONCLUSIONS

The argument developed here starts from the reanalysis of Henderson's (2000) taxonomic revision of Bactris gasipaes and is based on expectations that arise from the domestication process. Precisely because domestication is a process, gradual changes from the wild type to the domestication continuum of incipient to semi-domesticated to domesticated are expected, and a species with abundant domesticated populations, such as peach palm, is expected to contain populations along the whole continuum. The reinterpretation of var. chichagui type 3 from wild to the incipient domesticate fills the gap in the continuum that had been lacking. The ecological niche modeling and climatic PCA suggest that var. chichagui types 1 and 2 do not differ significantly in their climatic space, which contrasts with the wider adaptation of type 3 and the even wider adaptation of var. gasipaes. The incipient domesticate (type 3) was able to maintain populations in anthropized forests, under more humid conditions as it was dispersed from southwestern Amazonia into western Amazonia and beyond. The ecological niche models of wild peach palm's potential distribution during the Last Glacial Maximum suggest that it was present in southwestern Amazonia when people arrived. This identification of the incipient domesticate also narrowed the search for the origin of its domestication to southwestern Amazonia, where it is sympatric with var. chichagui type 1, as expected if it originated there.

Although only one of the 12 chloroplast sequences tested was informative within peach palm, the inversion in psbJ-petA paralleled the deep divergence in nuclear molecular genetic variability observed in all previous analyses. The patterns observed in the geographic distribution of nuclear genetic diversity are those expected during dispersal from the origin of domestication to peach palm's present distribution throughout the lowland Neotropics from Bolivia to Nicaragua, even though the INPA Core Collection does not have samples from numerous areas in this ample distribution. It does, however, have enough samples in strategic locations to confirm two major dispersals: the first out of southwestern Amazonia down the Ucayali River into western Amazonia and beyond, which resulted in the complex landrace hierarchy of that region and the very starchy fruit that could be fermented and become important to pre-conquest indigenous cultures; the second out of the same region down the Madeira River into eastern Amazonia, which did not result in a complex landrace hierarchy, perhaps because the starchy-oily microcarpa fruit were used more for snacks than as a starchy staple.

The current analysis obviously has limitations, principally due to the modest number of samples for such an ample distribution, even though these were carefully chosen to be representative of the samples available in the Brazilian peach palm collection via the creation of the Core Collection. Further studies of the two wild types and the incipient domesticate are needed. A revision of the infra-specific relationships and nomenclature of Bactris gasipaes will be required to assist botanists and plant breeders with this new proposal.

# MATERIALS AND METHODS

# Core Collection for Genetic Analysis and Niche Modeling

We used 174 plants from 36 accessions (3–5 plants per accession) of domesticated peach palm (var. gasipaes) and four accessions (2–5 plants per accession) of wild peach palm (var. chichagui types 1 and 3). These accessions belong to the Core Collection designed by Cristo-Araújo et al. (2015) within the Peach palm Active Germplasm Bank, maintained by the Instituto Nacional de Pesquisas da Amazônia (INPA), located at km 38 of the BR-174 highway, Manaus, Amazonas, Brazil (latitude 2° 38′ 34.28″ S and longitude 60° 2′ 33.63″ W). An accession is the progeny obtained from seed of a single open-pollinated bunch from a palm sampled in a traditional farmer's property. All sampling was done with prior informed consent before the Convention on Biological Diversity. Five samples each of Bactris riparia, a very close relative of peach palm, and B. simplicifrons, a distant relative (Henderson, 2000; Couvreur et al., 2007), were also genetically characterized to serve as out-groups.

# Geographic Coordinates for Niche Modeling

In addition to the geo-referenced samples of the Core Collection, we used some B. gasipaes var. gasipaes from the Peach palm Active Germplasm Bank and downloaded geo-referenced occurrence records of var. gasipaes and var. chichagui from the Global Biodiversity Information Facility (GBIF) data-portal (http://data.gbif.org) on 21 August 2013 (see Supplementary Material 2.1). The Instituto de Ciencias Naturales, Bogotá, the Herbario Nacional de Bolivia and the Herbarium of the University of Aarhus kindly supplied additional coordinates and/or information to confirm the type of var. chichagui contained in the GBIF database. Only the samples that could reasonably be classified to a specific type of var. chichagui were used for wild peach palm ENM. We also used geo-referenced samples reported in Clement et al. (2009b) for which we have personal information, i.e., we did not use possible var. chichagui from the RADAM database, because these could not be identified as to type. The data set selected for wild peach palm niche modeling includes 38 type 1 and 22 type 2 observations (Supplementary Material 2.1). The information gathered on other observations of B. gasipaes was used for comparison with feral and cultivated materials, including 29 type 3 and 202 var. gasipaes observations, as well as 25 observations involving feral peach palms that could not be assigned to a particular morphotype and are probable escapes from cultivation. All geographic coordinates were assigned or verified using the Geonames gazetteer (http://www.geonames.org/) and Google Earth.

# Ecological Niche Characterization and Modeling

For the environmental layer input, we used 19 bioclimatic variables at a 2′ 30″ grid resolution (corresponding roughly to 4.4 × 4.6 km at the Equator), for current conditions (∼1950–2000) (http://www.worldclim.org/current) (Hijmans et al., 2005) and for past conditions, the Last Glacial Maximum (LGM; ∼21,000 years BP) (http://www.worldclim.org/past from http://pmip2.lsce.ipsl.fr/). For the LGM models, we used CCSM4.0 (Community Climate System Model), MIROC-ESM (Model for Interdisciplinary Research on Climate, Earth System Model), and MPI-ESM-P (Max Planck Institute Earth System Model).

The geo-referenced samples were used to model the geographic area that would be most likely to meet the climatic requirements of wild and cultivated peach palm (Phillips et al., 2006). The Maxent program identifies potential distribution areas on the basis of their similarity in climatic conditions compared to those at the sites where the species has already been observed, hence modeling where conditions are suitable for their survival. It infers the probability distribution of maximum entropy (i.e., closest to uniform) subject to the constraint that the expected value of each environmental variable (or its transform and/or interactions) under this estimated distribution matches its empirical average (Phillips et al., 2006; Thomas et al., 2012). To model the distribution of the realized niche of wild peach palm, Maxent was run on the following subsamples: (1) var. chichagui type 1 (Bioclim coverage 5–17° S, 49–76° W), (2) var. chichagui type 2 (2–12° N, 70–76° W), and (3) var. chichagui types 1 and 2 (18° S–16° N, 48–86° W). It was also run on the whole sample (18° S–16° N, 48–86° W), dominated by cultivated peach palm data, to approach the distribution of the fundamental niche (Supplementary Material 1.2). A logistic threshold value equivalent to the 10th percentile training presence was retained to separate climatically favorable areas from marginally fit areas. Thresholds of 33 and 67% training presence were used to discriminate "very good" and "excellent" climates for the production of comparable climate suitability maps. For the LGM distribution models, we used the combined var. chichagui types 1 and 2 sample, and excluded Bioclim variables 14 and 15 that show a high level of discrepancy between LGM climate models (Varela et al., 2015).
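The thresholding step can be illustrated with a minimal Python sketch. It assumes that the 10, 33, and 67% thresholds are percentiles of the logistic scores at the training presence points, and it uses a nearest-rank percentile rule. The function names, the rank rule, and the label "favorable" for the class between the 10% and 33% thresholds are illustrative assumptions, not taken from the Maxent software or from the analysis described above.

```python
def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile (pct is an integer from 1 to 100)."""
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, kept at 1 or above.
    rank = max(1, -(-pct * len(ordered) // 100))
    return ordered[rank - 1]


def suitability_classes(training_scores, grid_scores):
    """Label each grid score relative to thresholds taken from training presences.

    training_scores: logistic scores at the training presence points.
    grid_scores: logistic scores of the map cells to classify.
    """
    t10 = percentile_nearest_rank(training_scores, 10)
    t33 = percentile_nearest_rank(training_scores, 33)
    t67 = percentile_nearest_rank(training_scores, 67)
    labels = []
    for score in grid_scores:
        if score >= t67:
            labels.append("excellent")
        elif score >= t33:
            labels.append("very good")
        elif score >= t10:
            labels.append("favorable")  # hypothetical label for the lowest retained class
        else:
            labels.append("marginal")
    return labels


# Hypothetical scores, for illustration only.
training = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0]
print(suitability_classes(training, [0.05, 0.35, 0.69, 0.95]))
```

Taking the thresholds from the training presences rather than fixing them in advance is what makes maps from different models comparable, since each class then covers the same share of known occurrences.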

We performed a principal component analysis (PCA) on the whole dataset, to characterize and compare the climatic envelopes of wild, feral, and cultivated peach palms, retaining those variables that contributed to the Maxent model and applying a varimax normalized rotation. The maximal temperature from the warmest month (Bio5) was discarded as it can be deduced from the minimal temperature of the coldest month (Bio6) and the annual range (Bio7). The different categories of peach palm populations were then plotted onto the principal plane.
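The redundancy of Bio5 follows from the WorldClim definition of the annual temperature range, which is the difference between the maximum temperature of the warmest month and the minimum temperature of the coldest month:

$$
\mathrm{Bio7} = \mathrm{Bio5} - \mathrm{Bio6} \quad\Longrightarrow\quad \mathrm{Bio5} = \mathrm{Bio6} + \mathrm{Bio7}
$$

Any one of the three variables is therefore a linear combination of the other two and adds no independent information to the PCA.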

# Analysis Using cpDNA

Twelve chloroplast sequences (Shaw et al., 2007) were tested and two were informative (psbJ-petA and psaI-accD), but only psbJ-petA was used, because psaI-accD was only useful for discriminating B. simplicifrons from the B. gasipaes/riparia complex. The alignment of sequences obtained in both directions (forward and reverse) and the creation of the consensus sequence of each pair were performed using BioEdit 7.0.5.3 (Hall, 1999). Bayesian phylogenetic reconstructions were conducted in MrBayes v2.01 (Huelsenbeck and Ronquist, 2001), which uses the Metropolis-coupled Markov Chain Monte Carlo (MCMC) method to estimate the posterior probability distribution (Schmidt, 2009). Two runs with 10 million generations applied substitution models determined for each partition in MrModeltest v.2.2 (Nylander https://www.abc.se/~nylander/). In order to estimate posterior probabilities, 25% of the trees were discarded as a burn-in stage, observing when average standard deviation of split frequency (ASDSF) values dropped below 0.01. Phylogenetic Network v.4.5.1.6, developed to estimate phylogenetic networks with maximum parsimony, was used to build a network of haplotypes with the Median Joining algorithm (Bandelt et al., 1999). This method combines features of Kruskal's algorithm, which finds minimum spanning trees by favoring short connections, with the maximum parsimony heuristic algorithm of Farris, and adds vertices called median vectors that represent extinct or un-sampled haplotypes in populations (Bandelt et al., 1999). Chloroplast genetic diversity across taxa was estimated with DNAsp 5 (Librado and Rozas, 2009).
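One ingredient of the Median Joining approach, Kruskal's preference for short connections, can be sketched in Python as follows. This is only a minimum spanning tree over pairwise Hamming distances between aligned haplotypes. It does not add median vectors or apply the parsimony heuristic, and it is not the implementation used by the network software. The haplotype names and sequences are hypothetical.

```python
from itertools import combinations


def hamming(seq_a, seq_b):
    """Number of differing positions between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a != b for a, b in zip(seq_a, seq_b))


def kruskal_mst(haplotypes):
    """Return one minimum spanning tree as a list of (name_a, name_b, distance)."""
    names = sorted(haplotypes)
    parent = {name: name for name in names}

    def find(name):
        # Follow parent links to the set representative, compressing the path.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    # Consider the shortest connections first.
    pairs = sorted(
        (hamming(haplotypes[a], haplotypes[b]), a, b)
        for a, b in combinations(names, 2)
    )
    tree = []
    for dist, a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:  # the link joins two separate components
            parent[root_a] = root_b
            tree.append((a, b, dist))
    return tree


# Hypothetical aligned haplotypes, for illustration only.
example = {"H1": "AAAA", "H2": "AAAT", "H3": "AATT", "H4": "TTTT"}
print(kruskal_mst(example))
```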

# Analysis Using SSR Markers

We tested 39 SSR loci developed for peach palm (Martinez et al., 2002; Billotte et al., 2004; Rodrigues et al., 2004) to select loci with clear and informative amplification profiles. PCR reactions were performed according to Rodrigues et al. (2004). The SSR data are in Supplementary Material 2.2. Allele frequencies and private alleles of all loci were calculated using the Convert program (Glaubitz, 2004). We estimated the genetic distances of Nei (1978) between landraces defined by Rodrigues et al. (2005).
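For readers unfamiliar with the distance measure, the following Python sketch computes Nei's (1978) unbiased genetic distance from per-locus allele frequencies. It assumes diploid individuals and the same sample size at every locus within a population. The function name and input layout are illustrative choices, not the software used for the analysis.

```python
import math


def nei_unbiased_distance(freqs_x, freqs_y, n_x, n_y):
    """Nei's (1978) unbiased genetic distance between populations X and Y.

    freqs_x, freqs_y: one dict per locus mapping allele -> frequency.
    n_x, n_y: number of diploid individuals sampled in each population.
    """
    if not freqs_x or len(freqs_x) != len(freqs_y):
        raise ValueError("both populations need the same non-empty set of loci")
    loci = len(freqs_x)
    g_x = g_y = g_xy = 0.0
    for fx, fy in zip(freqs_x, freqs_y):
        hom_x = sum(p * p for p in fx.values())
        hom_y = sum(q * q for q in fy.values())
        # Small-sample correction of the within-population identities.
        g_x += (2 * n_x * hom_x - 1) / (2 * n_x - 1)
        g_y += (2 * n_y * hom_y - 1) / (2 * n_y - 1)
        # Between-population identity needs no correction.
        g_xy += sum(p * fy.get(allele, 0.0) for allele, p in fx.items())
    g_x, g_y, g_xy = g_x / loci, g_y / loci, g_xy / loci
    if g_xy == 0:
        return math.inf  # no shared alleles at any locus
    return -math.log(g_xy / math.sqrt(g_x * g_y))


# Hypothetical single-locus example, for illustration only.
pop_a = [{"A": 1.0}]
pop_b = [{"A": 0.5, "B": 0.5}]
print(nei_unbiased_distance(pop_a, pop_b, 50, 50))
```

Because of the small-sample correction, the estimate can be slightly negative for very similar populations; such values are conventionally treated as zero.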

# Bayesian Analysis of Population Structure

Pritchard et al. (2000) developed a Bayesian method (implemented in the program Structure) to model the number of groups (K) of individuals based on their multi-locus genotypes. One advantage of this method is that populations need not be defined a priori, but will be identified by the data generated with SSR markers. This is very important when studying samples from germplasm banks, which often do not contain samples that can be considered representative of populations. Hence, it is even more important when analyzing core collections. The parameters used were: burn-in of 10,000 permutations, Markov Chain Monte Carlo (MCMC) with 100,000 permutations, the admixture ancestry model, where each individual can have more than one ancestral population, and independent allele frequencies (λ = 1). The best K was identified by LnP(D) values and ΔK, following Evanno et al. (2005), from 15 simulations for each possible K from 1 to 12 (the number of hypothesized landraces and non-designated populations).
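The ΔK statistic of Evanno et al. (2005) can be sketched in Python as below. The sketch assumes the common simplification of taking the absolute second-order difference of the mean LnP(D) across replicate runs and dividing it by the standard deviation of LnP(D) at that K. The function name and the example values are hypothetical and are not results from this study.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute Delta K for every K that has both neighbors K - 1 and K + 1.

    lnp_by_k: dict mapping K -> list of LnP(D) values from replicate runs.
    Returns a dict mapping K -> Delta K.
    """
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in lnp_by_k or k + 1 not in lnp_by_k:
            continue  # Delta K is undefined at the smallest and largest K
        runs = lnp_by_k[k]
        spread = stdev(runs)
        if spread == 0:
            continue  # avoid division by zero when all replicates agree
        second_diff = abs(
            mean(lnp_by_k[k + 1]) - 2 * mean(runs) + mean(lnp_by_k[k - 1])
        )
        delta[k] = second_diff / spread
    return delta


# Hypothetical LnP(D) values, two replicates per K, for illustration only.
example = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-890.0, -892.0],
    4: [-888.0, -890.0],
}
delta_k = evanno_delta_k(example)
print(max(delta_k, key=delta_k.get), delta_k)
```

Because ΔK needs both neighbors, it cannot be evaluated at K = 1 or at the largest K tested, which is one reason to inspect the LnP(D) values as well.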

# Spatial Principal Components Analysis

We used spatial principal components analysis (sPCA) (Jombart et al., 2008), implemented in adegenet 1.3–2 (Jombart, 2008) with R (R Development Core Team, 2011) to visualize continental-scale gradients in the genetic diversity of peach palm. This method uses a matrix with allele frequencies of genotypes and a spatial weighting matrix containing measurements of spatial proximity among entities based on a connection network to produce scores that summarize the spatial structure and the genetic variability among groups of individuals across geographic space (Jombart et al., 2008). Various types of connection networks are available in adegenet. Given the continental scale of our study, we used the Gabriel graph network, because (1) it avoids unlikely connections (e.g., between eastern Amazonia and Central America), unlike Delaunay triangulation, and (2) it allows possible connections at regional scale (e.g., southern and northern Western Amazonia), unlike the relative neighbors network.
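The rule that defines a Gabriel graph can be sketched in a few lines of Python. Two sites are connected only if no third site falls inside the circle whose diameter is the segment joining them. The sketch assumes planar coordinates and treats points exactly on the circle as non-blocking. It is an illustration of the rule, not the adegenet implementation, and the coordinates are hypothetical.

```python
from itertools import combinations


def squared_distance(a, b):
    """Squared Euclidean distance between two (x, y) points."""
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def gabriel_graph(points):
    """Return the set of index pairs (i, j), i < j, that are Gabriel neighbors."""
    edges = set()
    for i, j in combinations(range(len(points)), 2):
        d_ij = squared_distance(points[i], points[j])
        # A third point k lies strictly inside the circle of diameter ij
        # exactly when d(i,k)^2 + d(j,k)^2 < d(i,j)^2.
        blocked = any(
            squared_distance(points[i], points[k])
            + squared_distance(points[j], points[k]) < d_ij
            for k in range(len(points))
            if k not in (i, j)
        )
        if not blocked:
            edges.add((i, j))
    return edges


# Three collinear hypothetical sites: the middle one blocks the long link.
print(gabriel_graph([(0, 0), (1, 0), (2, 0)]))
```

This is why the graph suits a continental-scale sample: a long link is dropped whenever an intermediate site exists, while links between neighboring regions are retained.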

Moran's I is used to measure spatial autocorrelation in allele frequency values of samples. More specifically, sPCA optimizes the product of the variance of a few synthetic variables and of Moran's I, and generates two sets of axes: one with positive eigenvalues and the other with negative eigenvalues. Positive eigenvalues correspond to global structures, while negative eigenvalues are indicative of local patterns (Jombart et al., 2008). Abrupt decreases in either set of eigenvalues indicate that the corresponding global or local structure should be interpreted. The significance of "global" and/or "local" spatial structure was assessed using Monte Carlo simulations implemented in the global.rtest() and local.rtest() functions, respectively, from adegenet 1.3–2 (Jombart et al., 2008). We proceeded with 9999 permutations per test. After having identified local and/or global structure and selected the number of components to consider, the position of samples on each component was plotted onto the geographic space. As several principal components were retained, we also used the colorplot() function from adegenet 1.3–2 (Jombart et al., 2008) to summarize the spatial gradient of peach palm genetic diversity. This function uses the scores of 1 to 3 components to compose a color per sample based on the RGB (Red Green Blue) system.
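The distinction between global and local structure can be illustrated by computing Moran's I directly. The sketch below is a simplified, hypothetical example with binary neighbor weights on four sites along a line; it is not the adegenet code, which typically works with a standardized weighting matrix derived from the connection network.

```python
import numpy as np

def morans_i(values, weights):
    """Moran's I = (n / S0) * (z' W z) / (z' z),
    where z are centred values and S0 is the sum of all weights."""
    x = np.asarray(values, dtype=float)
    w = np.asarray(weights, dtype=float)
    z = x - x.mean()
    return (len(x) / w.sum()) * (z @ w @ z) / (z @ z)

# Hypothetical neighbour matrix: sites connected as 0-1, 1-2, 2-3
w = np.array([
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
], dtype=float)

gradient = [0.1, 0.2, 0.3, 0.4]      # smooth cline: neighbours are similar
alternating = [0.1, 0.4, 0.1, 0.4]   # neighbours differ strongly

i_global = morans_i(gradient, w)     # positive autocorrelation
i_local = morans_i(alternating, w)   # negative autocorrelation
```

A smooth cline gives a positive I, the situation associated with global structure, whereas strong differences between neighbors give a negative I, the situation associated with local structure.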

# AUTHOR CONTRIBUTIONS

CC designed research, obtained funding, contributed to analysis, wrote the article. MC-A, VMR, and DP-R executed research, contributed to analysis, wrote the article. GC and RL contributed to analysis, wrote the article.

# ACKNOWLEDGMENTS

We thank the Brazilian National Research Council (CNPq), Universal projects 47.6189/2003-9 and 47.5219/2006-6, and CT-Amazonia project 57.5588/2008-0, for financial support, the Foundation for the Support of Science in the State of Amazonas (FAPEAM) (project 062.00831/2015) and the Agence Inter-établissements de Recherche pour le Developpement (AIRD) bilateral cooperation agreement, project 2074/2011, for financial support and training, the National Research Institute for Amazonia (INPA) for a PCI scholarship for MC-A (2008–2013), Rodrigo Bernal, Universidad Nacional de Colombia, Birgitte Bergmann and Henrik Balslev, Herbarium AAU, Aarhus University, Viviana Vargas and Monica Moraes, Herbario Nacional de Bolivia de la Universidad Mayor de San Andrés, for information on var. chichagui, Yves Vigouroux, Institut de Recherche pour le Développement (IRD), and Evert Thomas, Bioversity International, for useful criticism and suggestions on earlier versions of this paper, and CNPq for a research fellowship for CC.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2017.00148/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Clement, Cristo-Araújo, Coppens d'Eeckenbrugge, Reis, Lehnebach and Picanço-Rodrigues. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Domestication and Genetics of Papaya: A Review

#### Mariana Chávez-Pesqueira<sup>1</sup> \* and Juan Núñez-Farfán<sup>2</sup>

<sup>1</sup> Unidad de Recursos Naturales, Centro de Investigación Científica de Yucatán A. C., Mérida, Mexico, <sup>2</sup> Laboratorio de Genética Ecológica y Evolución, Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de Mexico, Mexico City, Mexico

A wealth of plant species used by humans for different purposes, but mainly as food, originated and were domesticated in the Mesoamerican region. Papaya (Carica papaya) is the third most cultivated tropical crop worldwide, and it has been hypothesized that Mesoamerica is the most likely center of its origin and domestication. In support of this, many wild populations of papaya occur throughout Mesoamerica and hence represent the pool of genetic variability for further evolution and future crop management. Despite its importance, a dearth of information exists regarding the status of wild populations of papaya, as compared to the extent of knowledge of, and interest in, domesticated varieties. We review the evidence on the extant wild populations of papaya, as well as its origin and distribution. We also synthesize what is known about the domestication history of the species, including the domestication syndrome that distinguishes wild and domesticated papayas. Moreover, we give an account of the use of genetic markers to assess the genetic diversity of wild and domesticated papaya, and discuss the importance of papaya as the first species with a transgenic cultivar to be released for human consumption, and one that has its complete genome sequenced. Evidence from different disciplines strongly suggests that papaya originated and was domesticated in Mesoamerica, and that wild populations in the region still possess high genetic diversity compared to domesticated papaya. Finally, we outline papaya as an excellent model species for genomic studies that will help gain insight into the domestication process and improvement of papaya and other tropical crops.

Keywords: Carica papaya, wild papaya, domestication, center of origin, distribution, genetic diversity, genomics, Mesoamerica

# CARICA PAPAYA

Papaya (Carica papaya L.) is a fast-growing, short-lived, tropical tree, cultivated for its fruit, papain, pectin, and antibacterial substances (Niklas and Marler, 2007). Nowadays papaya is grown widely in tropical and subtropical lowland regions around the world, and its trade amounted to nearly \$200,000 million by 2009 (Evans and Ballen, 2012).

Carica papaya is a member of the Caricaceae family and is the most economically important species in the family (Carvalho and Renner, 2012). C. papaya has been the only member of the genus Carica since the rehabilitation of Vasconcellea, which was considered part of Carica until the year 2000 (Badillo, 2000). The Caricaceae family originated in Africa, where two extant species occur. The dispersal from Africa to Central America occurred ca. 35 million years ago (MYA), possibly on floating vegetation carried by ocean currents (Carvalho and Renner, 2012).

#### Edited by:

Alejandro Casas, Instituto de Investigaciones en Ecosistemas y Sustentabilidad, Universidad Nacional Autónoma de Mexico, Mexico

#### Reviewed by:

Shabir Hussain Wani, Michigan State University, United States Joao Paulo Fabi, University of São Paulo, Brazil

#### \*Correspondence:

Mariana Chávez-Pesqueira mariana.chavez@cicy.mx

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Ecology and Evolution

> Received: 30 June 2017 Accepted: 20 November 2017 Published: 01 December 2017

#### Citation:

Chávez-Pesqueira M and Núñez-Farfán J (2017) Domestication and Genetics of Papaya: A Review. Front. Ecol. Evol. 5:155. doi: 10.3389/fevo.2017.00155

The Caricaceae members arrived in South America from Central America between 27 and 19 MYA, when the Central American land bridge had already begun to form, facilitating the range expansion from Mexico to South America (Carvalho and Renner, 2012). It is estimated that C. papaya diverged from its sister clade about 25 MYA, and that it belongs to a small clade restricted to Mexico, Guatemala and El Salvador that includes four other species: three perennial herbs, Jarilla chocola, J. heterophylla, and J. nana, which occur in seasonal tropical forests, and the treelet Horovitzia cnidoscoloides, endemic to Oaxaca in Mexico (Carvalho and Renner, 2014).

Papaya possesses a morphological structure and development that conform to Corner's model of architecture (Hallé et al., 1978): a monopodial, single, orthotropic and nonbranching trunk constructed by one vegetative meristem, with axillary inflorescences, hence with indeterminate growth. Carica papaya produces huge palmate leaves. Female flowers are produced at the axils of the leaf petioles, and later the fruits occupy that position along the stem. Male individuals produce inflorescences. Three sex types of papaya are known: female, male, and hermaphrodite trees. In wild populations, most individuals are diclinous (i.e., the populations are dioecious), whereas cultivated papayas are dioecious or hermaphroditic (Carvalho and Renner, 2012; Chávez-Pesqueira et al., 2014). Cultivars can be inbred, resulting in stable characteristics across generations (Manshardt, 1992). For gynodioecious cultivated papaya, it has been reported that two-thirds of the plants are hermaphrodites and one-third are female plants, though dioecious cultivars exist (VanBuren et al., 2015). Sex determination for the three types of papaya (female, male, and hermaphrodite plants) is genetically regulated by the pairing of sex chromosomes (Carvalho and Renner, 2012), through a sex-linked region that behaves like an XY sex chromosome system. For male and hermaphrodite individuals, sex is controlled by slightly different Y chromosome regions: Y<sup>h</sup> in hermaphrodites and Y in males (VanBuren et al., 2015). Female plants produce flowers and fruits all year round in tropical regions; however, in subtropical areas, despite continuous flowering, fruit set decreases during drier seasons (Gonsalves, 1998).
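The two-thirds to one-third ratio reported for gynodioecious cultivars can be reproduced with a simple segregation sketch. It rests on an assumption that is not stated in the text above, although it is widely reported for papaya: embryos carrying two Y-type chromosomes (for example Y<sup>h</sup>Y<sup>h</sup>) are inviable. The genotype labels and the function name are illustrative.

```python
from fractions import Fraction
from itertools import product

# Viable genotypes and their sex type. Any combination of two Y-type
# chromosomes is ASSUMED inviable (not stated in the source text).
SEX = {
    ("X", "X"): "female",
    ("X", "Yh"): "hermaphrodite",
    ("X", "Y"): "male",
}

def offspring_ratios(parent_a, parent_b):
    """Expected sex ratios among viable offspring of a cross.
    Each parent is a pair of sex chromosomes, e.g. ("X", "Yh")."""
    viable = []
    for gametes in product(parent_a, parent_b):
        sex = SEX.get(tuple(sorted(gametes)))
        if sex is not None:  # skip inviable Y-Y type combinations
            viable.append(sex)
    return {s: Fraction(viable.count(s), len(viable)) for s in set(viable)}

# Hermaphrodite x hermaphrodite (or selfing): X Yh x X Yh
selfed = offspring_ratios(("X", "Yh"), ("X", "Yh"))

# Dioecious cross: female (X X) x male (X Y)
dioecious = offspring_ratios(("X", "X"), ("X", "Y"))
```

Under this assumption, one quarter of the zygotes from a hermaphrodite cross are lost, and the survivors segregate two hermaphrodites to one female, whereas a female by male cross yields equal proportions of the two sexes.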

Papaya is the third most cultivated tropical crop worldwide; Brazil and India are the largest producers, although Mexico is the main exporter (Evans and Ballen, 2012). Under cultivation, papaya trees grow fast, producing mature fruits within 9–12 months after planting. Commercially, a density of 1,500–2,500 trees per hectare can produce from 125,000 to 300,000 lbs per hectare per year (Gonsalves, 1998). Among common fruits, papaya is ranked first on nutritional scores for the percentage of vitamin A, vitamin C, potassium, folate, niacin, thiamine, riboflavin, iron, calcium, and fiber (Huerta-Ocampo et al., 2012). Moreover, fruits, stems, leaves and roots of papaya are used in a wide range of medical applications and in papain production (Ming et al., 2008). Commercially produced papain is used for protein digestion, mainly as a red meat tenderizer, for the brewing of beer, and for the skin treatment of warts and scars (Ming et al., 2012). Because of its rapid growth, continuous harvest and multiple uses, papaya is common in home gardens of tropical regions (Manshardt, 1992).

Wild, undomesticated populations of papaya still grow in many regions of Mesoamerica, and are common in naturally disturbed tropical forests (Chávez-Pesqueira and Núñez-Farfán, 2016). On occasion, wild papaya is also found in home gardens of ethnic groups in southern Mexico (Paz and Vázquez-Yanes, 1998), where it is used to make jams and dried crystallized fruit candy (Chávez-Pesqueira and Núñez-Farfán, pers. obs.). In the rain forests of southern Mexico, wild papaya trees behave like a typical fast-growing, short-lived, nomadic tree (Van Steenis, 1958); they establish rapidly and reach maturity and reproduction only in recent (1–5 years; Paz and Vázquez-Yanes, 1998) and moderately large canopy gaps in mature rain forest, as well as in early secondary forests or "acahuals," and man-made clearings. Because of this, wild papaya is rare in the rain forest, since it depends on the creation of new canopy gaps to persist and disperse (Chávez-Pesqueira and Núñez-Farfán, 2016). In mature forests, papaya plants mainly die due to shading by other trees that "close" the canopy gaps by lateral growth. Because of its nomadic nature, wild papaya represents a key element in the regeneration dynamics of tropical and sub-tropical forests along its natural distribution (Chávez-Pesqueira et al., 2014).

Two main differences, in flower morphology and fruit size, distinguish wild and domesticated papayas. Wild papayas bear either male or female flowers (i.e., they are dioecious; Chávez-Pesqueira et al., 2014), instead of hermaphrodite flowers, like most cultivated varieties. Wild papayas produce very small fruits (no more than 8 cm in diameter; Chávez-Pesqueira and Núñez-Farfán, pers. obs.), with numerous seeds and a thin mesocarp, almost lacking pulp (Carvalho and Renner, 2012; **Figure 1**). Carica papaya is mainly pollinated by sphingid moths (Vega-Frutis and Guevara, 2009) and skippers. Seeds are likely dispersed by birds, bats and small mammals that consume the fruits (Niklas and Marler, 2007; Chávez-Pesqueira and Núñez-Farfán, 2016).

In southern Mexico, isolated individuals of papaya, and some in home gardens, show fruit phenotypes intermediate between those of wild and cultivated plants (Paz and Vázquez-Yanes, 1998). These phenotypes suggest introgression due to occasional mating between population types (Chávez-Pesqueira et al., 2014). Since wild C. papaya represents the genetic reservoir of the species (Chávez-Pesqueira and Núñez-Farfán, 2016), exhaustive efforts should be made to study and preserve this important species in its wild form. Moreover, because papaya is one of the most economically important tropical crops in the world, and wild populations still occur naturally, the species represents an ideal system to study, in depth, the process of domestication. Here, we present a review of the existing knowledge on the origin, distribution and domestication of papaya, and discuss the use of genetic and genomic methods to study the domestication of this valuable species.

# ORIGIN AND DISTRIBUTION

Different authors suggest a Mesoamerican origin of C. papaya, comprising southern Mexico to Central America (Vavilov, 1926; Storey, 1976), although no direct archaeological evidence regarding the center of origin of papaya has been reported as yet (Fuentes and Santamaría, 2014). One reason for the lack of archaeological data is the difficulty of identifying papaya from phytoliths; in addition, its pollen grains have rarely been found (Carvalho and Renner, 2012). Together, the presence of undomesticated populations in Mexico and Central America and the cultivation of papaya in the region before the Spanish conquest of Mexico support the Mesoamerican hypothesis (Carvalho and Renner, 2014; Fuentes and Santamaría, 2014).

FIGURE 1 | (A) Wild female papaya tree with fruits in Yucatan, Mexico. (B) Wild population of Carica papaya in Yucatan, Mexico. (C) Wild fruits, intermediate form fruits (possible hybrid between wild and domesticated plants), and domesticated papaya fruit (Maradol variety).

Phylogenetic evidence from the Caricaceae family also supports a Mesoamerican origin of papaya. Carvalho and Renner (2012) obtained a molecular phylogeny of the Caricaceae family using chloroplast and nuclear data (4,711 bp) of the 34 species in the family. This is the only phylogeny that includes all members of the Caricaceae family. Their resulting phylogeny indicates that Carica papaya is more closely related to the genus Horovitzia, endemic to Mexico, and to Jarilla, endemic to Mexico and Guatemala, than to the South American genus Vasconcellea, as previously thought (Carvalho and Renner, 2014). Moreover, Carica, Jarilla and Horovitzia show a unilocular ovary, whereas the remaining South American Caricaceae possess 5-locular ovaries (Carvalho and Renner, 2012). This morphological synapomorphy supports the family phylogeny.

Further evidence is offered by the origin of the Y<sup>h</sup> chromosome in cultivated hermaphrodite individuals of papaya (VanBuren et al., 2015). By sequencing the entire male-specific region of the Y chromosome and comparing it with the previous sequences of the hermaphrodite-specific region of the Y<sup>h</sup> chromosome (Wang et al., 2012), VanBuren et al. (2015) found that the Y<sup>h</sup> chromosome possesses lower nucleotide diversity than the Y. This reduced variability is consistent with a genetic bottleneck scenario possibly brought about by domestication, and suggests that dioecy is ancestral in C. papaya. Given the ubiquity of dioecy in wild populations of papaya in Mesoamerica, an origin for the species in this region is the more parsimonious hypothesis. Further genomic studies can help determine the origin of some innovations related to domestication.

High values of genetic variation in wild populations and wild relatives are expected to be found in the centers of origin of crop species (Gepts and Papa, 2003). Additionally, domestication is expected to reduce genetic diversity and provoke selective sweeps in genes associated with traits targeted by domestication (Purugganan and Fuller, 2009). A recent phylogeographic analysis of 19 wild populations of papaya in Mexico, using nuclear and chloroplast markers, revealed high values of genetic diversity (Chávez-Pesqueira and Núñez-Farfán, 2016). The highest genetic diversity was found in locations of southern Mexico, suggesting this region as a genetic reservoir for the species (see below). In sum, the evidence strongly suggests that C. papaya originated in Mesoamerica, likely in southern Mexico.

The natural distribution of papaya has been suggested to range from the northern tropical limit of Mexico to Costa Rica in Central America (Aradhya et al., 1999; Carvalho and Renner, 2012). However, a precise assessment of its natural distribution is still lacking. One reason for this is the scarcity of studies of wild populations of papaya, and the paucity of herbarium specimens indicating whether a specimen belongs to a cultivated or wild individual. Fuentes and Santamaría (2014) performed searches in herbaria around the world to explore the distribution of papaya. Although they did not distinguish between wild and cultivated plants, their results show that most specimens come from Mexico and, to a lesser extent, from Central America. Unfortunately, the present rates of deforestation and habitat fragmentation within the proposed distribution range of wild papaya are high enough to endanger the persistence of plant species (Barlow et al., 2016). This, coupled with the lack of information on the state of many wild varieties of important crop species, warns us about the relevance of studying and conserving wild populations and wild relatives of papaya and other crop species.

# DOMESTICATION

The restricted occurrence of wild populations of C. papaya and its four closest relatives in Mexico and Central America is in line with the domestication of papaya in the Mesoamerican region (Carvalho and Renner, 2012). Mesoamerica is considered one of the world's centers of plant domestication (Harlan, 1971; Pohl et al., 1996). The Maya were the most important culture present in that region before the conquest of Mexico by Spain in the sixteenth century, and probably one of the first to cultivate and trade the fruits of C. papaya (Colunga-GarcíaMarín and Zizumbo-Villarreal, 2004). Moreover, it has been suggested that the enzyme papain was used by the Mesoamerican people to tenderize meat by wrapping it in papaya leaves, and that this knowledge was then taken to Europe after the Spanish colonization (Larqué-Saavedra, 2016). By the time of the conquest of Mexico, it is believed that papaya was cultivated by native people all the way from southern Mexico to the Isthmus of Panama, where it was locally known as olocoton (Storey et al., 1986).

In the sixteenth century the Spaniards were probably the first to spread papaya beyond Mesoamerica (Carvalho and Renner, 2014). It was introduced into the island of Hispaniola (nowadays Haiti and Santo Domingo) in 1521. There, it acquired the Carib Indian name ababai, which would then change to papaia, papia, papeya, and finally papaya. From this island, it spread to other West Indian islands, such as Jamaica, where it was known as pawpaw; to Cuba, where it was named fruta bomba; to Venezuela, where it is known as lechosa; and to Brazil and Argentina, where it is called mamão and mamón, respectively. In 1526 papaya was taken outside of America to Indonesia, and then spread commercially throughout the East Indies and tropical Asia (Storey et al., 1986). After that, it rapidly spread into other Asian countries and finally to Africa, brought by European colonial powers such as Portugal, Denmark, Great Britain and France (Manshardt, 2014).

Moreover, because papaya seeds have a moderate period of longevity, it is likely that this trait aided its rapid spread throughout the tropics, where it has existed practically since man has recorded modern history (Schroeder, 1958). Nowadays, there are many varieties of papaya cultivated in most tropical and subtropical regions of the world, differing in traits such as fruit size, color, flavor, and tree size (Moore, 2014).

Domesticated plants commonly display a suite of traits, defined as the "domestication syndrome," that distinguish them from their wild populations or relatives. In general, they differ in traits related to yield, food usage, and cultivation (Martínez-Ainsworth and Tenaillon, 2016). Domesticated papaya plants have become morphologically and physiologically different from their wild counterparts (Paz and Vázquez-Yanes, 1998). The principal characters of papaya that have been studied as targets of selection under domestication are tree size, fruit size, sex type, and the morphology and germinability of seeds. Whereas tree size has been selected to become smaller to facilitate fruit harvesting (Niklas and Marler, 2007), selection has focused on enlarging the fruit and thickening the ovary wall (pulp) for human consumption (Coppens d'Eeckenbrugge et al., 2005; Manshardt, 2014). As a result, several varieties of papaya are cultivated around the world.

Regarding sex types, in dioecious fruit crops, mutations inducing hermaphroditism have been associated with domestication; this is the case for species like strawberry, grape and papaya (Janick, 2006). In the Caricaceae family, dioecy is the ancestral state, and it has been suggested that hermaphrodite individuals in papaya resulted from a natural mutation in male plants and were likely selected by humans for their favorable fruit phenotype (Ueno et al., 2015). Moreover, the region of the chromosome that produces hermaphrodite papaya plants, which occurs only in cultivated varieties, arose only ∼4,000 years ago (95% highest posterior density: 1,400–6,700 years ago; VanBuren et al., 2015). Although crop plant domestication in Mesoamerica occurred around 6,200 years ago, the estimate coincides with the rise of the Maya civilization. Given that hermaphrodite individuals are rarely found in wild populations of Mesoamerica, this strongly supports the hypothesis that papaya was domesticated by the Mayans or other Mesoamerican cultures thousands of years ago. It is believed that the rare allele for hermaphroditism in wild populations was probably selected by early domesticators favoring mutants with red flesh color over the typical yellow/orange color of wild papayas (Manshardt, 2014).

Finally, seeds from wild and domesticated papayas differ in size, germination rate, dormancy, and light sensitivity. Paz and Vázquez-Yanes (1998) found that seeds of wild papayas are smaller than those of cultivated plants in southern Mexico, and that they can remain dormant if buried for long periods of time, reflecting their strong light requirement for germination. In cultivated papaya, seeds germinate with a higher probability, lessening the importance of specific environmental conditions to germinate (Paz and Vázquez-Yanes, 1998). This domestication syndrome has been reported for other crops (Doebley et al., 2006; Gepts, 2010).

# GENETICS AND GENOMICS

Understanding crop domestication is crucial to fulfill the demand for improving yield and quality of crops (Tang et al., 2010). Knowledge of the genetic diversity within crop species is essential to understand their origin, domestication and evolutionary relationships, and to efficiently develop strategies for the conservation of their genetic resources, and effective crop improvement (Moore, 2014). Furthermore, crops represent excellent systems for the study of rapid evolution. For the case of papaya, the species now represents an important model in genetic and genomic studies; papaya is one of the first plant species to have its genome sequenced (Ming et al., 2008) and the first transgenic cultivar released for human consumption (Manshardt, 2014). In this section, we briefly review some advances of genetic and genomic methods in papaya; from what is known about the state of wild populations, to recent advances in domesticated varieties. It must be noted, however, that recent specialized reviews about genomics in papaya have been published elsewhere (Ming et al., 2012; Ming and Moore, 2014; Tripathi et al., 2014).

A recurrent consequence of domestication is the reduction of genetic diversity due to the 2-fold effect of genetic drift and selection that operate during the domestication process (Doebley et al., 2006; Gepts, 2014). Assessing the functional variation of wild genetic resources to expand the usable genetic diversity is of utmost importance for breeding and improvement programs (Martínez-Ainsworth and Tenaillon, 2016). However, little information is available about the genetic diversity of wild varieties of important crop species, with relevance for conservation and management (Chávez-Pesqueira and Núñez-Farfán, 2016).

Early on, the use of genetics in tropical fruit crops was confined to the development of isozyme and dominant PCR-based markers and their use for germplasm diversity analysis and clonal fingerprinting; however, information from dominant markers is of limited use in genomic applications (Litz and Padilla, 2012). Since then, codominant markers, known as simple sequence repeat (SSR) markers or microsatellites, have been developed for several tropical fruit crops and used for parentage analysis, clonal fingerprinting, genetic diversity analysis, and development of genetic linkage maps. Nowadays, the increased capability of DNA sequencing by next generation methods is leading to increased interest in tropical tree fruit crops and to new opportunities to increase the rate of genetic gain in breeding programs (Litz and Padilla, 2012). Recent massive sequencing approaches, such as de novo genome sequencing, whole genome resequencing, reduction of genome complexity using restriction enzymes, transcriptome analysis, and epigenetic studies, are becoming of great importance in domestication studies (Guerra-García and Piñero, 2017).

Genetic knowledge of papaya has been accelerated with the advances in molecular markers, linkage and physical maps, comparative genomics studies, and the sequencing of its genome (Tripathi et al., 2014). For genetic diversity studies, some molecular markers have been developed for papaya, mainly SSRs (Ocampo et al., 2006; Eustice et al., 2008; De Oliveira et al., 2010a; Fang et al., 2016). However, most studies have focused on cultivated papaya, with a dearth of studies addressing the genetic diversity and structure of wild populations.

Our recent surveys of the genetic diversity and structure of wild papaya in its natural distribution in northern Mesoamerica (Mexico) (Chávez-Pesqueira et al., 2014; Chávez-Pesqueira and Núñez-Farfán, 2016) have revealed high levels of genetic diversity for the species in the wild. Using both nuclear SSRs and chloroplast DNA markers in 355 individuals of 19 natural populations, we found a mean observed heterozygosity of 0.681 (range of 0.409–0.783) and a mean haplotype diversity of 0.701 (range of 0.307–0.934) (Chávez-Pesqueira and Núñez-Farfán, 2016). The area with the highest genetic diversity for both markers occurs in southeast Mexico, near the region of the Tehuantepec Isthmus. With the chloroplast DNA, we found a lack of phylogeographic structure, whereas the microsatellite data revealed recent structuring (R<sub>ST</sub> = 0.149). At the local scale, we also found a negative effect of habitat fragmentation on wild populations of papaya in the fragmented Los Tuxtlas rainforest in southern Mexico (Chávez-Pesqueira et al., 2014). Populations that inhabit forest fragments showed reduced genetic diversity, higher population differentiation, and fewer migrants. Together, these studies suggest that wild papaya has maintained genetic connectivity among populations through time; however, populations have recently become structured, probably due to human disturbance of its natural habitat, such as habitat fragmentation, which raises important conservation concerns for the species in its wild form. Efforts to conserve the natural genetic resources of the species are also needed to assure the conservation of the domesticated varieties through genetic improvement.
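The diversity statistics reported above can be illustrated with a minimal sketch for a single locus. The function names, allele sizes and haplotype labels are hypothetical and are not data from the cited studies; the expected heterozygosity is computed without a small-sample correction, and haplotype diversity uses the usual n/(n − 1) correction.

```python
from collections import Counter

def observed_heterozygosity(genotypes):
    """Proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)

def expected_heterozygosity(genotypes):
    """1 - sum(p_i^2) over allele frequencies (no sample-size correction)."""
    alleles = [a for g in genotypes for a in g]
    n = len(alleles)
    return 1.0 - sum((c / n) ** 2 for c in Counter(alleles).values())

def haplotype_diversity(haplotypes):
    """n / (n - 1) * (1 - sum(p_i^2)) over haplotype frequencies."""
    n = len(haplotypes)
    homozygosity = sum((c / n) ** 2 for c in Counter(haplotypes).values())
    return n / (n - 1) * (1.0 - homozygosity)

# Hypothetical SSR genotypes (allele sizes in bp) for four individuals
ssr = [(150, 152), (150, 150), (152, 154), (150, 154)]
ho = observed_heterozygosity(ssr)
he = expected_heterozygosity(ssr)

# Hypothetical chloroplast haplotypes for four individuals
cp = ["A", "A", "B", "C"]
h = haplotype_diversity(cp)
```

In practice these quantities are averaged over loci and populations, which is how mean values such as those reported above are obtained.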

The genetic diversity of cultivated and feral papayas has been shown to be, in general, lower than that of wild populations: (1) using AFLP markers, a genetic similarity of 0.880 has been reported for 63 cultivated papaya accessions (Kim et al., 2002); (2) an expected heterozygosity (He) ranging from 0.54 to 0.69 in cultivars from the Caribbean region (Ocampo Pérez et al., 2005); (3) a mean He of 0.59 using 51 polymorphic loci from 30 accessions and 18 landraces from different countries, mainly Brazil (De Oliveira et al., 2010b); (4) He ranges of 0.51–0.64 in feral papayas of Costa Rica (Brown et al., 2012); and (5) a mean He of 0.011 in Maradol cultivars on the Pacific coast of Mexico (Chávez-Pesqueira et al., unpublished data). Years of selective breeding would explain the low genetic diversity in domesticated papayas, brought about by the domestication bottleneck.

During domestication, gene flow between wild and domesticated conspecific plants plays an important role in addition to other evolutionary forces such as genetic drift and selection. In many cases, crop plants and their wild progenitors belong to the same biological species (Gepts, 2014; Rendón-Anaya et al., 2017), as in papaya, or cohabit with close wild relatives. There is morphological evidence of gene flow between wild and domesticated forms of papaya in southern Mexico (Paz and Vázquez-Yanes, 1998; Chávez-Pesqueira and Núñez-Farfán, 2016; **Figure 1C**). However, there are no studies evaluating the ecological and evolutionary consequences of this phenomenon. The introduction of transgenic papayas within the natural distribution of wild populations of C. papaya (see below) would represent a potential ecological risk that demands consideration.

Papaya is known as the world's first transgenic cultivar to have been released for commercial production (in Hawaii, USA), in 1998, by public institutions in the USA (Tecson Mendoza et al., 2008; Manshardt, 2014). This transgenic variety was developed to control the papaya ringspot virus (PRSV). PRSV destroys the photosynthetic capacity of the leaves, leading first to a decrease in fruit quality and yield, then to loss of vegetative vigor, and eventually to the death of plants (Lius et al., 1997). After a spread of the virus in 1994 that killed nearly all cultivars in the Puna district of Hawaii, efforts to develop a variety resistant to PRSV began. Two varieties successfully controlled the virus infection and saved papaya production in Hawaii: Rainbow and SunUp. According to Gonsalves (2014), transgenic papaya accounted for about 85% of the papaya acreage in Hawaii in 2012. However, some concerns were raised regarding the potential effects of gene flow between transgenic and nontransgenic cultivars. Experiments were carried out to assess this; from 2004 through 2009, Gonsalves et al. (2012) studied transgene flow from "Rainbow" (transgenic) to "Kapoho" (nontransgenic) cultivars in Hawaii. Their results showed that observed transgene pollen dispersal to hermaphrodite "Kapoho" from neighboring plants of the "Rainbow" variety was very low, but from "Rainbow" to female "Kapoho" individuals it was higher (Gonsalves, 2014), confirming the existence of some gene flow among cultivars. Although cultivars of transgenic papaya are confined to Hawaii, little is known about the ecological, evolutionary and commercial consequences of possible transgene flow in the center of origin of papaya, where natural populations occur and where attempts to release transgenic cultivars have been made (Silva-Rosales et al., 2010). Nowadays, transgenic papaya from Hawaii is consumed in the USA, Canada, and Japan, but no studies have assessed the possible movement of seeds to papaya's center of origin.

Because transgenic papaya saved Hawaii's papaya industry, it has been recognized as a model species of translational biotechnology and the best-characterized commercial transgenic crop. Given that PRSV is widespread in papaya cultivars around the world, papaya SunUp could serve as a transgenic germplasm source to breed suitable virus-resistant cultivars (Ming et al., 2008). However, despite these achievements, many countries have rejected transgenic crops; in 2010, only transgenic papaya and plum had been approved for human consumption (Litz and Padilla, 2012).

With recent advances in next-generation DNA sequencing, great interest has been directed toward crops. Papaya was the fifth flowering plant (after Arabidopsis, rice, poplar, and grape) to have its genome sequenced (Kanchana-Udomkan et al., 2014). A draft genome (approximately 3x coverage) was obtained from a transgenic female plant of the cultivated SunUp variety from Hawaii, using a whole-genome shotgun approach (Ming et al., 2008). The assembled contigs accounted for 73% of the genome (271 Mb), with 370 Mb of spanning scaffolds (Wei and Wing, 2008). Further efforts were made to complete the draft genome of papaya; Yu et al. (2009) constructed a BAC (Bacterial Artificial Chromosome) based physical map to support the sequence assembly, with an estimated genome coverage of 95.8%. The papaya genome has been shown to be three times the size of the Arabidopsis genome while containing fewer genes, with a minimal angiosperm gene set of 13,311. Amplifications in the number of genes with roles in the evolution of tree-like habit, deposition and remobilization of starch reserves, attraction of seed dispersal agents, and adaptation to tropical day lengths have been identified (Ming et al., 2008). Moreover, the small genome and diploid nature (2n = 18) of the species make papaya an excellent model for genomics (Ming and Moore, 2014; Tripathi et al., 2014).

Recent genetic and genomic methods have revealed interesting facts about domestication in papaya. For instance, Wu et al. (2017) found evidence of a recent emergence of the y allele and its selection in red-fleshed cultivars. Analyzing the levels and patterns of genetic diversity at the CYC-b locus (y allele) and six loci in a 100-kb region flanking this locus, they found evidence of a strong selective sweep, as genetic diversity was reduced at the recessive y allele in comparison with the dominant Y allele present in the yellow-fleshed papayas (wild and cultivated). Based on a haplotype network, the authors suggest that the y allele likely originated in the wild and was introduced into the domesticated varieties during the domestication of papaya. However, they found shared haplotype structure among some wild, feral, and cultivated haplotypes near the y allele, suggesting a successive escape of the y allele from red cultivars back into wild populations, probably through feral intermediates in Costa Rica (Wu et al., 2017). In another study, Porter et al. (2009), using a genome-wide analysis, revealed that papaya has relatively few genes encoding nucleotide-binding site (NBS) proteins, but that these are structurally diverse and include a novel subgroup. This group of genes is the most common among those encoding plant disease resistance proteins. The authors argue that the absence of recent genome duplication and the relatively low gene number in papaya may explain the apparent scarcity of NBS genes, and that these genes could function in pathogen surveillance, making papaya suitable for functional studies and for a better understanding of plant resistance at the genetic level (Porter et al., 2009).

Furthermore, papaya has been recognized as an excellent model for studies of sex determination in plants (Ming et al., 2012; Aryal and Ming, 2014), sex chromosome evolution (Weingartner and Moore, 2012), the origin and evolution of dioecy in the Caricaceae family, and the identification of candidate genes and genome-wide DNA markers for papaya improvement (Ming et al., 2012). These studies have helped to expand knowledge of tropical fruit tree genomics. Moreover, phylogenetically, papaya belongs to the Brassicales and can serve as an excellent out-group for studying genome evolution in the Brassicaceae family, given that it shared a common ancestor with Arabidopsis about 72 MYA (Ming et al., 2012). Undoubtedly, the sequencing of the papaya genome has opened avenues to investigate the function of many genes about which little is known. The sequenced genome will facilitate knowledge about the genomic regions associated with diverse aspects of plant defense, plant growth, development of leaves, roots, and fruits, flowering, fruit ripening, and the circadian clock (Tripathi et al., 2014).

Regarding its commercial use, a variety of markers have been developed to distinguish between the different sexes in juvenile plants (Somsri et al., 1997; Deputy et al., 2002; Lemos et al., 2002; Urasaki et al., 2002; for a complete review of molecular markers in papaya see Kanchana-Udomkan et al., 2014). Most markers could distinguish only between male/hermaphroditic and female individuals. This aided in eliminating females, given that hermaphrodites are preferred for their bigger and elongated fruits over the rounded female fruits, which require greater container space for shipping (Tripathi et al., 2014). However, males are useless in commercial production, and these markers could not distinguish them from hermaphrodites. Recently, Liao et al. (2017) used a large male-specific retrotransposon insertion of 8396 bp to develop two papaya male-specific markers. Given that sex can otherwise be determined only around 6 months after germination, these markers greatly facilitate the elimination of undesired plants. Finally, sex markers are also useful in ecological and evolutionary studies that require the estimation of effective population sizes, which can be obtained from the sex ratio of populations (Chávez-Pesqueira et al., 2014).

In spite of these advances, many questions remain unresolved for papaya, mainly regarding its wild counterpart. For example, little is known about the consequences of gene flow from domesticated and/or transgenic populations. Moreover, few studies have evaluated genetic regions of wild individuals (standing variation) for crop improvement purposes (but see Vázquez Calderón et al., 2014).

The use of genomics in tropical crop breeding will greatly assist in solving many difficult challenges in the face of global warming and climate change (Guerra-García and Piñero, 2017). Compared to annual crops and temperate trees, genetic improvement of tropical fruit trees has been limited (Litz and Padilla, 2012). The use of genomics for the development of new tropical cultivars adapted to higher temperatures and increased droughts will become necessary to alleviate food insecurity in the coming years (Abberton et al., 2016). The genetic diversity found in wild relatives represents an important genetic resource that can facilitate the development of climate-tolerant cultivars (Gepts, 2014).

Finally, the continuous advances of genomic tools in crop species will also provide insights into the ecology and evolution of domestication in tropical crops. For instance, functional genomics will help assess how many and which genes show differences in expression between wild and domesticated types (Sarah et al., 2017) and which factors affect gene expression, and will support a better understanding of biotic interactions and how these have affected crop evolution, among many other questions (Gepts, 2014).

Plant domestication represents one of the most relevant events in human history. Today, papaya is one of the most economically important crop species worldwide. Understanding the ecology, evolutionary history and domestication process of such species is necessary to maintain food security in the future and counteract upcoming threats of overpopulation and climate change.

# FINAL REMARKS

The understanding of the origin and domestication of crop species has become a topic of great interest and has advanced considerably by combining approaches from diverse disciplines (Martínez-Ainsworth and Tenaillon, 2016). Papaya is the third most produced crop in the tropics worldwide and has important commercial uses. It most probably originated and was domesticated in Mesoamerica, where wild populations still occur in Mexico and Central America. In nature, wild papayas play an important role in the regeneration of their natural habitat and possess high levels of genetic variation, representing the genetic wealth and evolutionary potential of the species. Finally, papaya is an excellent model for genomic studies, as it is one of the first plant species to have its complete genome sequenced. However, much more knowledge is still needed about this important species, mainly about its wild populations and the evolutionary process of domestication.

# AUTHOR CONTRIBUTIONS

MC-P and JN-F conceived the review and the outline, and searched for literature. MC-P wrote the manuscript. JN-F reviewed drafts of the manuscript and contributed to writing the final version. Both authors were involved in the final editing and review of the paper.

# FUNDING

Funding was provided by Comisión Nacional para el Conocimiento y Uso de la Biodiversidad (CONABIO) and Dirección General del Sector Primario y Recursos Naturales Renovables of Secretaría de Medio Ambiente y Recursos Naturales (SEMARNAT), project WQ003 "Análisis para la determinación de los centros de origen y diversidad genética de Carica papaya (Caricaceae)."

# ACKNOWLEDGMENTS

We thank Rosalinda Tapia-López for her help in logistics and lab work in our studies with papaya.

# REFERENCES

Abberton, M., Batley, J., Bentley, A., Bryant, J., Cai, H., Cockram, J., et al. (2016). Global agricultural intensification during climate change: a role for genomics. Plant Biotechnol. J. 14, 1095–1098. doi: 10.1111/pbi.12467

Aradhya, M. K., Manshardt, R. M., Zee, F., and Morden, C. W. (1999). A phylogenetic analysis of the genus Carica L. (Caricaceae) based on restriction fragment length variation in a cpDNA intergenic spacer region. Genet. Resour. Crop Evol. 46, 579–586. doi: 10.1023/A:1008786531609

male-hermaphrodite differentiation in papaya (Carica papaya) trees. Mol. Genet. Genomics 290, 661–670. doi: 10.1007/s00438-014-0955-9


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling Editor declared a shared affiliation, though no other collaboration, with one of the authors JN and states that the process nevertheless met the standards of a fair and objective review.

Copyright © 2017 Chávez-Pesqueira and Núñez-Farfán. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Firewood Resource Management in Different Landscapes in NW Patagonia

#### Daniela V. Morales<sup>1</sup>, Soledad Molares<sup>1</sup> and Ana H. Ladio<sup>2</sup>\*

<sup>1</sup> Centro de Investigación Esquel de Montaña y Estepa patagónica, CONICET Universidad Nacional de la Patagonia San Juan Bosco, Esquel, Argentina, <sup>2</sup> Instituto Nacional de Investigaciones en Biodiversidad y Medio Ambiente, CONICET Universidad Nacional del Comahue, S.C. de Bariloche, Argentina

Ecosystems, their components, processes and functions are all subject to management by human populations, with the purpose of adapting the environments to make them more habitable and ensuring the availability and continuity of subsistence resources. Although a lot of work has been carried out on resources of alimentary or medicinal interest, little has been done on associating processes of domestication with firewood extraction, a practice considered to be destructive of the environment. In the arid steppe of NW Patagonia, inhabited and managed for different purposes for a long time by Mapuche-Tehuelche communities, the gathering of combustible plant species has up to the present time played a crucial role in cooking and heating, and work is required to achieve sustainability of this resource. In this study we evaluate whether environments with less landscape domestication are more intensively used for firewood gathering. Using an ethnobiological approach, information was obtained through participant observation, interviews and free listing. The data were examined using both qualitative and quantitative approaches. Twenty-eight firewood species are gathered, both native (75%) and exotic (25%). The supply of firewood mainly depends on gathering from the domesticated (10 species), semi-domesticated (17 species) and low human intervention landscapes (17 species). In contrast to our hypothesis, average use intensity is similar in all these landscapes despite their different levels of domestication. That is, the different areas are taken advantage of in a complementary manner in order to satisfy the domestic demand for firewood. Neither do biogeographic origin or utilitarian versatility of collected plants vary significantly between the different landscape levels of domestication. 
Our results show that human landscape domestication for the provision of firewood seems to be a socio-cultural resilient practice, and shed new light on the role of culture in resource management. This approach may offer new tools for the development of firewood and cultural landscape management, and conservation planning.

#### Keywords: gathering practices, socio-cultural resilience, complementarity, management

#### Edited by:

Urs Feller, University of Bern, Switzerland

#### Reviewed by:

Andrey S. Zaitsev, Justus Liebig Universität Gießen, Germany

Milton Kanashiro, Embrapa Amazonia Oriental (Embrapa Eastern Amazon), Brazil

\*Correspondence: Ana H. Ladio ahladio@gmail.com

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Ecology and Evolution

Received: 23 June 2017 Accepted: 01 September 2017 Published: 15 September 2017

#### Citation:

Morales DV, Molares S and Ladio AH (2017) Firewood Resource Management in Different Landscapes in NW Patagonia. Front. Ecol. Evol. 5:111. doi: 10.3389/fevo.2017.00111

# INTRODUCTION

Ecosystems, their components, processes and functions are all subject to transformation or management by local populations. Traditional ecological management is understood to be a set of interventions and transformations which are the result of community decisions regarding the natural and artificial systems, with the explicit purpose of adapting the landscape to make it more habitable and ensuring the availability of certain natural resources (Casas et al., 2014). According to Toledo et al. (2003), resource management is an adaptive response to the uncertainty of their availability. It is part of the human desire to domesticate or control whatever is affected by this condition of uncertainty, so as to ensure the continuation of social and cultural life (Blancas et al., 2014).

The subject of landscapes as biocultural systems has been studied widely over recent years (Berkes et al., 2000; Miller and Davidson-Hunt, 2010; Casas et al., 2014; Lins Neto et al., 2014). They are considered as physical spaces which reflect the use, values, learning and cosmovisions of the societies inhabiting them over time (Capparelli et al., 2011; Castro et al., 2012), and they also acquire a symbolic character as they are accorded significance by the different cultures (Greider and Garkovich, 1994; Boillat et al., 2013; Fernández-Llamazares et al., 2016).

The landscapes have been, and still are, constantly changed by their human inhabitants, leading to processes of domestication through both individual and collective subsistence practices which are reproduced and recreated in a dynamic way. This leads to socio-environmental constructions impregnated with everyday experience, in a permanent state of adaptation to the surroundings and their changing conditions (Berkes and Davidson-Hunt, 2006; Haber, 2006; Kareiva et al., 2007; Castro et al., 2012; Lema, 2014). It is currently considered that the study of these practices in contexts that have undergone processes of change represents a key way of understanding the value of resources and their potential conservation (Washington, 2013; Blancas et al., 2014).

Local strategies for managing certain space-time resources depend on perception of which resources are useful and can be taken advantage of and preserved, and which are not, according to the necessities and perspectives arising over time among the population (Blancas et al., 2014). Shaanker et al. (2003) highlight that locals' ecological knowledge may have important implications for the long term conservation of plant resources, and that communities with a greater degree of this knowledge might be more "prudent" in their use of the environment, through adoption of less destructive harvesting practices, particularly when economic interests are affected, and even when being "prudent" implies poorer short term returns.

The study of the body of fuelwood knowledge which forms part of the accumulative cultural heritage of local communities, referred to conceptually as traditional ecological knowledge (TEK) (Berkes et al., 2000), may contribute to the construction of perspectives on the sustainable management of these ecosystems. This involves in-depth study of the knowledge and practices associated with use of the environment, which are related to the maintenance of ecological cycles, and therefore ensure sustainability, thus making possible the coexistence and even the evolution of inhabitants along with their natural resources (Berkes et al., 2000; Shaanker et al., 2003; Berkes, 2004; Tiwari et al., 2010). Despite the progress made in this subject with regard to the use of firewood (Cardoso et al., 2012, 2015; Arre et al., 2015; Morales et al., 2017), the processes associated with domestication of the arid cultural landscape leading to the promotion and supply of these scarce resources are still unknown. The studies refer mainly to firewood extraction as a destructive practice that shows little care for the environment or resource renewal (Tabuti, 2007).

Among the practices of anthropization of the environment, ex situ and in situ management of vegetation (Casas et al., 2007, 2008, 2014; Lins Neto et al., 2014), control of ecological succession, diversion of watercourses and the gathering of useful plants (Davidson-Hunt and Berkes, 2003; Molares and Ladio, 2009; Pirondo and Keller, 2014) have been mentioned as strategies which increase the supply of species of interest and reduce the uncertainty of their availability throughout the year and in succeeding years (Barrera-Bassols and Toledo, 2005; Molares and Ladio, 2009). For example, taking advantage of ruderal and exotic species that appear rapidly on land where clearings have been produced by disturbances of moderate intensity is a practice shared by many cultures (Molares and Ladio, 2014).

The species diversity constructed through different management practices may be taken advantage of in different ways, utilitarian versatility being considered as the sum of the different uses presented by a certain resource. It has been found that the species with highest use consensus in these societies often present high utilitarian versatility, a pattern based on profound knowledge of the best-known species, which promotes caring attitudes, since the people value the potential of the species in a holistic way (Richeri et al., 2013). Studies performed on medicinal plants have shown that plants native to a certain place generally offer greater use diversity than exotic species, due to the longer period of time they have been in contact with local populations (Molares and Ladio, 2009).

Since ancient times indigenous communities in Patagonia have bonded with the resources in their environment, which has not only allowed them to develop a deep knowledge of nature, but has also favored the creation of diverse landscapes with different levels of domestication (Ladio and Molares, 2014; Sedrez dos Reis et al., 2014). These communities have viewed nature as a fundamental part of their life system. In particular, the Andean environment, across its entire altitudinal gradient, is taken advantage of as a source of natural resources through different gathering practices, although practices tending toward tolerance, facilitation and promotion are also observed (Ladio and Molares, 2014), generating a gradient/continuum of landscapes with different levels of human intervention. These spaces were, and still are, valued differentially according to their vegetation and geomorphological characteristics, as well as their utilitarian and symbolic attributes (Ladio and Lozada, 2003, 2004; Ladio et al., 2007).

The gathering of woody species for fuel is a subject of great socioeconomic relevance on a global level (Tabuti et al., 2003; Chettri and Sharma, 2007; Ramos et al., 2008). In Patagonia it is also an important area of study, considering the cultural value of heating with firewood, the severity of winter at this latitude, the scarcity of available petroleum-based alternatives, and their high cost (Cardoso et al., 2012). As a result, most households, and the rural or semi-rural ones in particular, depend principally on plants for cooking and heating (Cardoso et al., 2012; Arre et al., 2015). Recent ethnobotanical studies in Patagonia have documented consumption patterns of firewood species, the attributes of the wood, the preferences shown by locals, oriented mainly toward native species (Cardoso et al., 2012, 2013, 2015; Arre et al., 2015), and the new practices employed to counteract scarcity of the firewood resource, such as the use of exotic species of more or less recent appearance in the region (Cardoso et al., 2017). This evidence reflects constant processes of hybridization of knowledge, which make possible the incorporation of new woody elements to compensate for deficiencies (Cardoso and Ladio, 2011).

The objective of this work was to analyze the gathering patterns of firewood plants and their relationship with the different domesticated landscapes that form part of daily life in two rural communities of the Patagonian steppe. Our questions are: (1) whether environments with a lower level of landscape domestication are used more intensively for firewood gathering, in terms of species richness and use consensus; (2) whether in these environments a higher proportion of native species, and a higher number of species that were used in the past, are gathered; and (3) whether in these environments the woody species are used in a more versatile way.

# MATERIALS AND METHODS

# Study Area

The study area comprises the communities of Gualjaina (42° 4′ S and 70° 32′ W) and Paraje Costa del Lepá (42° 34′ S and 71° 03′ W), in the northwest of Chubut province, Argentina (**Figure 1**). They are neighboring communities, approximately 15 km apart, and have close links in terms of family, lifestyle and work, among other factors. The area is located in the Central Plateau Region, in the Lepá and Gualjaina river basins, at an altitude of between 545 and 760 m.a.s.l.

These communities are located in environments dominated by grass-shrub steppe. The predominant species are Mulinum spinosum (Cav.) Pers., Senecio filaginoides DC., Nassauvia axillaris (Lag. ex Lindl.) D. Don, Berberis microphylla G. Forst., and Schinus johnstonii F.A. Barkley. This area corresponds to the phytogeographical region of Patagonian Province, Central District (Cabrera, 1976).

The climate is dry and cool. Average annual precipitation is 119 mm, mainly concentrated between May and September, with mean temperatures of 17.5 °C in summer and 2.6 °C in winter; the dominant winds are from the west (Mereb, 1990).

Gualjaina has an estimated population of 1,183 inhabitants, and a total population of 2,500 people when the 17 smaller communities that have formed close to it (one of which is Costa del Lepá) are included; it covers an area of ca. 2,870 km<sup>2</sup>, based on the population census of 2010 (Instituto Nacional de Estadística y Censos, 2010). In this region a proportion of the population lives in the town, while other residents are distributed along the edges of rivers or in rural areas. Most people are descendants of Mapuches and Creoles (ECPI, 2007). There are some institutions in the communities, such as police, a hospital, a school and some churches (Catholic and Evangelical). In addition, some national institutions from outside the community work in the area (National Institute for Agricultural Technology; Development Corporation of Chubut province). They have provided technical assistance and seeds of exotic plants with agricultural uses. The main economic activity is livestock breeding, followed to a lesser degree by agriculture, handcraft sales, commerce and tourism; some people are supported by social assistance from the state. Firewood is the most significant fuel in these communities, where many households use it for domestic purposes because they have no access to gas.

# Data Collection

Fieldwork for this study was carried out between February and November 2015. A total of 33 household heads consented to participate in the interviews. Each person was asked to sign the Free and Clarified Consent Form according to the Code of Ethics of the International Society of Ethnobiology (ISE, 2006).

The adult population was composed of men and women ranging in age from 30 to 90, with a mean age of 59.8. The informants were selected at random, each one representing a household unit. Free listing was used to determine the composition and species richness of firewood plants known by each informant (Alburquerque et al., 2010). In addition, open and semi-structured interviews were carried out in order to comprehend the landscape dynamic used to obtain firewood plants. We asked the local inhabitants about places for gathering firewood species (i.e., scrubland, cultivated land, areas close to water bodies). Furthermore, informants were consulted as to any additional uses of the firewood species named, strategies for acquiring them and the continuity of use of the species. Analysis of use continuity was based on the methodology proposed by Fernández-Llamazares et al. (2016); participants were asked whether each firewood species mentioned had been used in the past and up to the present, or if it was used only in the past, or only in the present. The point of reference used to represent the past was the informant's childhood.

Data were supplemented using other ethnobotanical techniques, such as participant observation, in-depth interviews and field excursions with key informants in order to collect botanical material of the species cited in the interviews (Guber, 2006). Plant samples were prepared as herbarium specimens and deposited in the Esquel Mountain and Patagonian Steppe Research Center. The botanical reference material was identified according to Correa (1971, 1978, 1984) and its scientific names were updated following Zuloaga and Morrone (1999) and the International Plant Names Index (www.ipni.org).

# Data Analysis

Composition and richness of species and botanical families were estimated using the total number of species mentioned by the Mapuche informants.

The biogeographic origin of species was classified according to the Darwinion Institute catalog of vascular plants (Zuloaga and Morrone, 1999) and the following were considered as etic categories: native to Patagonia (indigenous species that grow approximately from 37° S southward) and exotic to Patagonia.

Use consensus (UC) was obtained considering the frequency of use citations for each species in the total population. It was calculated by dividing the number of informants who used species i ($n_i$) by the total number of informants interviewed (N = 33) and multiplying by 100: $UC_i = (n_i / N) \times 100$ (Molares and Ladio, 2012). This index is used as an indirect measure of the importance of use of cited species.

The versatility of use of the species was estimated using the index $UV_s = \sum_i UV_{si} / n_s$ (Phillips and Gentry, 1993), where $UV_{si}$ is the number of uses registered by informant i for species s, and $n_s$ is the number of people who mention species s. The values obtained were put into two categories: (1) fuel use only; (2) fuel use plus one or more additional uses (i.e., construction, fodder, medicinal, edible).
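As an illustration of the two indices described above, the following minimal sketch computes use consensus (percentage of informants citing a species) and the use-versatility index (mean number of uses per citing informant). The record layout, the function names and the example data are hypothetical and are not taken from the study; each record is assumed to be a unique informant-species pair.

```python
from collections import defaultdict

def use_consensus(records, total_informants):
    """UC per species: informants citing the species / all informants x 100."""
    informants_by_species = defaultdict(set)
    for informant, species, _uses in records:
        informants_by_species[species].add(informant)
    return {
        species: 100.0 * len(informants) / total_informants
        for species, informants in informants_by_species.items()
    }

def use_value(records):
    """UV per species: total uses reported / number of informants citing it."""
    total_uses = defaultdict(int)
    informants_by_species = defaultdict(set)
    for informant, species, uses in records:
        total_uses[species] += len(uses)
        informants_by_species[species].add(informant)
    return {
        species: total_uses[species] / len(informants_by_species[species])
        for species in total_uses
    }

# Hypothetical interview records: (informant, species, set of reported uses).
records = [
    ("inf1", "Schinus johnstonii", {"fuel", "medicinal"}),
    ("inf2", "Schinus johnstonii", {"fuel"}),
    ("inf3", "Schinus johnstonii", {"fuel", "construction"}),
    ("inf1", "Salix sp.", {"fuel", "construction", "fodder"}),
    ("inf4", "Salix sp.", {"fuel"}),
]

uc = use_consensus(records, total_informants=4)
uv = use_value(records)
for species in uc:
    print(f"{species}: UC = {uc[species]:.1f}%, UV = {uv[species]:.2f}")
```

A species with UV equal to 1 here would fall into the "fuel use only" category, while any value above 1 indicates that at least one informant reported an additional use.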

Acquisition strategies of firewood resources were classified etically as: gathering, purchase, exchange or state subsidies.

The landscape used for gathering wood was analyzed considering the different management processes used by the population and the different signs of domestication. These local observations of the Mapuche landscape were then grouped into the following etic categories.

Landscape with low human intervention: the environment with the lowest human influence or management (i.e., hills and plains with mainly grasses and low bushes, with scattered clumps of native shrubs approximately 1.50 m in height, all intertwined; extensive livestock grazing; no separation of areas or zoning of pastureland; no earth moving or irrigation, and only scarce to moderate quantities of dung; located at a distance of no less than 3 km from human settlements).

Semi-domestic landscape: moderate human intervention (i.e., more animal signs and dung than in the previous environment, and areas with bare soil; includes areas next to rivers, with small clumps of exotic Salix sp.; few, widely separated irrigation canals; temporary infrastructure such as cattle sheds and wind shelters for the animals, constructed with corrugated iron, poles and tethering posts).

Domestic landscape: the highest level of human intervention (i.e., around dwellings, where outbuildings, corrals, gardens and other cultivated spaces such as vegetable gardens or small fruit and vegetable farms are situated).

The continuity of use of each woody species was categorized as: current use (i.e., the person has used it within the last year); past use (i.e., the species mentioned was used only in the informant's childhood); and current-past use (i.e., the species has been used continuously from the informant's childhood up to the present time).

For general analysis, the total richness of native and exotic species was compared using a binomial test (p < 0.05), and their use consensus was compared with the Mann-Whitney U-test (p < 0.05).
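The binomial comparison of native and exotic richness can be sketched as follows. This is a minimal, standard-library illustration of an exact two-sided binomial test, not the authors' procedure: the null proportion of 0.5 is an assumption, and the counts (21 native, 7 exotic) are derived from the 28 species and the 75%/25% split reported in the abstract.

```python
from math import comb

def binomial_test_two_sided(successes, trials, p_null=0.5):
    """Exact two-sided binomial test.

    Sums the probabilities of all outcomes that are no more likely
    than the observed one under the null proportion p_null.
    """
    pmf = [
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(trials + 1)
    ]
    observed = pmf[successes]
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(q for q in pmf if q <= observed * (1 + 1e-7))
    return min(1.0, p_value)

# 28 firewood species, 75% native: 21 native versus 7 exotic.
p_value = binomial_test_two_sided(successes=21, trials=28)
print(f"p = {p_value:.4f}")
print("significant at 0.05" if p_value < 0.05 else "not significant at 0.05")
```

Under these assumptions the native and exotic counts differ significantly from an even split; a statistics package would normally be used for the same calculation.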

The richness of firewood species in the different types of landscape was compared by means of the Chi-squared test (p < 0.05), and the binomial test (p < 0.05) was used to compare the biogeographic origin and use versatility of the species gathered in each type of landscape. Species similarity between environments with different levels of domestication was analyzed using the Jaccard index, $JI = c / (a + b + c) \times 100$, where c is the number of species common to two environments, a is the number of unique species in environment A, and b is the number of unique species in environment B (Höft et al., 1999).
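The Jaccard index defined above can be sketched with set operations. The function name and the two species lists below are hypothetical and serve only to show how a, b and c are counted; they do not reproduce the species assignments found in the study.

```python
def jaccard_index(species_a, species_b):
    """Jaccard similarity (%) between two species lists: c / (a + b + c) x 100."""
    set_a, set_b = set(species_a), set(species_b)
    c = len(set_a & set_b)   # species common to both environments
    a = len(set_a - set_b)   # species unique to environment A
    b = len(set_b - set_a)   # species unique to environment B
    total = a + b + c
    if total == 0:
        raise ValueError("both species lists are empty")
    return 100.0 * c / total

# Hypothetical species lists for two landscape types.
domestic = ["Salix sp.", "Populus alba", "Pinus spp."]
low_intervention = ["Schinus johnstonii", "Berberis microphylla", "Salix sp."]

print(f"JI = {jaccard_index(domestic, low_intervention):.1f}%")
```

With one shared species and two unique species on each side, the example gives c = 1 and a = b = 2, so one fifth of the pooled species is shared.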

The UC of the species was compared for different types of landscape, according to the following categories: biogeographic origin, use versatility and continuity of use, by means of the non-parametric Kruskal-Wallis test (p < 0.05). In addition, for each landscape, comparisons were made of UC in relation to versatility, using the Mann-Whitney U-test (p < 0.05). Species richness was compared between the continuity of use classes using the Chi-squared test (p < 0.05).

In addition, a multinomial logistic regression analysis was carried out with the SPSS 22.0 program in order to complement the information and obtain a model to explain how the proportion of plants varies between the different levels of landscape domestication (dependent variable category) according to their biogeographic origin and use versatility (independent variable category) (Chan, 2005; Ladio and Molares, 2013). In this analysis we excluded the species obtained through purchase or subsidies (i.e., Nothofagus antarctica, Pinus spp.) and the variables (i.e., life form, continuity of use) that did not register as significant in the model. In this kind of regression, tendencies are established relative to a reference category; in this case, the domestic landscape. A description of the variables in the model and the distribution of all cases are shown in **Table 2**. The model was found to be significant (p < 0.05). The odds ratios (i.e., the probability that an event will happen in relation to the probability that it will not), calculated as e<sup>β</sup> = Exp(B), are shown in **Table 3** (Chan, 2005; Ladio and Molares, 2013).
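The relation between a regression coefficient B and its odds ratio Exp(B) can be illustrated with a short Python sketch. The function names and the coefficient values below are hypothetical and are not taken from Table 3; the sketch only shows the exponentiation step, not the SPSS model fit.

```python
import math


def odds_ratio(beta):
    """Exp(B): exponentiate a logistic-regression coefficient."""
    return math.exp(beta)


def coefficient_for(ratio):
    """Inverse of odds_ratio: the coefficient B that yields a given Exp(B)."""
    return math.log(ratio)


# Hypothetical coefficients, not taken from Table 3.
for beta in (0.0, 0.7, -0.7):
    print(f"B = {beta:+.1f} -> Exp(B) = {odds_ratio(beta):.2f}")
```

A coefficient of zero gives Exp(B) = 1, meaning no difference from the reference category; positive coefficients give ratios above 1 and negative coefficients give ratios below 1.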

# RESULTS

# The Role of Firewood in the Lives of Costa del Lepá and Gualjaina Inhabitants

Firewood plants provide the main fuel for heating and cooking for almost all households in Gualjaina and Costa del Lepá (100% of informants). The gathering and use of plants is an activity carried out by both men and women, all year round. Collection patterns consist of gathering (97% of informants) a mixture of dead and green wood for the purpose of long-lasting fires, or fallen pieces of wood in open wild areas, and pruned branches from forest plantations. The firewood is then transported in bundles carried on the shoulders, or the load is carried by an animal or by human-powered carts (**Figure 2**).

Approximately half the community complement firewood gathering with purchase (48%), and they mainly buy firewood from Sub-Antarctic forest such as N. antarctica. To a lesser extent (15%), locals are aided by government firewood programs called "the heat program" which provide home heating assistance in the form of 6 m<sup>3</sup> of firewood in winter, composed mainly of N. antarctica, Pinus spp., Salix sp. and Populus alba. The exchange of firewood for animals (3%) also takes place, both within the community and with neighboring communities.

In addition to the above, in some situations where wood is limited, other fuel types such as liquefied petroleum gas (LPG) and animal by-products, including dung from cows and sheep, are used. In general, people reported using LPG (82%) only for cooking or heating water, due to its elevated cost.

The communities convey the important role that firewood plays in their culture; they do not simply view it in terms of meeting energy needs. The cooking stoves provide the highly appreciated space heating that allows families to sit around the fire and socialize. As well as this, the taste of some traditional meals is associated with the selection of firewood used in its preparation. For example, the embers of Nassauvia axillaris are used for cooking bread. In addition, according to informants, heating a home with firewood is healthier than using other forms of heating. However, for most informants the supply of firewood now and in the future is a constant source of worry, and they feel responsible for finding alternatives due to the current scarcity. As some locals commented: "the people always used to go out to collect firewood, now you don't see it so much..." (D.A). "It's good that you folk who come from outside tell us what can be done to improve the situation, things that we don't realize..." (E.O).

FIGURE 2 | Gathering firewood of Salix sp., P. alba and a mixture of branches from different trees with human-powered carts in Costa del Lepá, Chubut.

# The Gathering of Native and Exotic Firewood Plants

Plant composition: Locals gather a total of 28 firewood species, belonging to 24 genera, and grouped into 14 families (**Table 1**), of which Asteraceae (5 species) and Salicaceae (4 species) are the most collected. The most frequently collected species are the native S. johnstonii (use consensus of 78.8%) and N. axillaris (54.5%), and the exotic Salix sp. (87.9%) and P. alba (54.5%) (**Table 1**). As expected, considering biogeographic origin and not taking into account the gathering environments they come from, or their different levels of domestication, the total richness of plants used as firewood was higher for native species (75%, 22 species), while exotic species accounted for 25% (6 species) (Binomial test, p = 0.02). However, it is notable that with respect to use consensus, inhabitants appear to use exotic species (27.3% ± 14.6) more than native species (18.2% ± 4), although this difference is not significant (Mann Whitney U-test, p = 0.97).

# Domesticated Landscapes and Firewood Gathering

## Species Richness According to the Level of Landscape Domestication

The richness of firewood species is not higher in environments with less domestication, and does not vary significantly between different landscape categories (Chi-squared test p = 0.328) (**Figure 3**). The richness found in domestic landscapes is represented by 10 species, whereas in the semi-domestic and low intervention landscapes it is composed of 17 species.
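As a rough check of the richness comparison, the sketch below runs a Chi-squared goodness-of-fit test against equal expected richness. Two assumptions are made here that the text does not state explicitly: that the test was a goodness-of-fit test with equal expected counts, and that the semi-domestic and low intervention landscapes hold 17 species each. The function names are hypothetical.

```python
import math


def chi_squared_equal_expected(counts):
    """Chi-squared goodness-of-fit statistic against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


def p_value_df2(statistic):
    """Upper-tail p-value of a chi-squared statistic with 2 degrees of freedom.

    For df = 2 the survival function has the closed form exp(-x / 2).
    """
    return math.exp(-statistic / 2)


# Assumed richness per landscape: domestic, semi-domestic, low intervention.
richness = [10, 17, 17]
statistic = chi_squared_equal_expected(richness)
print(round(statistic, 3), round(p_value_df2(statistic), 3))
```

Under these assumptions the p-value rounds to 0.328, in line with the value reported above. The closed-form p-value holds only for two degrees of freedom, that is, for exactly three landscape categories.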

With regard to the similarity of woody species between different domesticated landscapes, the highest values were obtained for the low domestication and semidomestic landscapes (50% of the species). The domestic and semidomestic landscapes, meanwhile, showed only 13% similarity. No similarity whatsoever was found between the domestic and low intervention landscapes (0%).

Informants mentioned (19% of cites) that the domesticated landscapes, which have undergone most human intervention, are taken advantage of for both native and exotic firewood species (Binomial test, p = 0.75), of which a high proportion (60%, 6 species) are cultivated exotic species, while the remainder (40%) are native species (**Figure 3**). The most frequent species are P. alba and Salix sp.

The semi-domestic areas, made up of river valleys that run through the communities (rivers Gualjaina and Lepá, **Figure 1**) together with the wetlands (flood meadows and water holes), also represent areas recognized for their provision of firewood. Not only do locals find wood here for cutting, but also driftwood from the rivers, locally known as "resaque," consisting of a mixture of native and exotic species, which may be used thanks to local knowledge of water pulses. From this landscape mostly native species are obtained (88%, 15 species) (Binomial test, p = 0.002) (**Figure 3**). The exotic species Salix sp. is the most collected.

#### TABLE 1 | Firewood species used in the rural communities of Costa del Lepá and Gualjaina, in the Patagonia steppe.

UC, Use Consensus; N, Native; E, Exotic; \*Non-local landscape plants.

The landscapes with least human intervention are also widely known and explored by locals, who reported frequenting them since they were children, when they would go out with their parents to gather firewood and do other activities such as checking on the herd. The highest proportion of inhabitants' cites (47%) refers to the use of this environment, from which only native species are obtained (**Figure 3**). Schinus johnstonii is the most collected; according to informants its roots provide very good quality firewood. Nevertheless, they do show concern about its increasing scarcity, as it is slow growing and suffers great pressure of use.

### Comparison between Life Forms

Most species used were shrubs (70%, 21 species), and were found mainly in semi-domestic and low human intervention landscapes, while trees (30%, 9 species) were found mostly near watercourses (semi-domestic) or cultivated by man as hedges, for wind protection and other uses in the domestic landscapes. N. antarctica and Pinus sp. are trees whose wood constitutes an important part of the firewood used by inhabitants; however, it is not obtained in the local landscape, but is purchased by the truckload or received through state aid programs.

## Use Consensus of Species According to Level of Landscape Domestication and Biogeographic Origin

The three types of landscape were used with the same intensity (Kruskal-Wallis test, p = 0.61). Interestingly, the domestic landscape presented a higher average UC (11.72% ± 3.71) than both the semi-domestic landscape (10.75 ± 2.49), and the landscape with least human intervention (8.52 ± 1.78) (**Figure 4**).

Results relating to biogeographic origin also went against our hypothesis, as no differences were found in the use of the different landscapes for the gathering of native (Kruskal-Wallis test, p = 0.160, **Figure 4**) or exotic firewood plants (Kruskal-Wallis test, p = 0.399, **Figure 4**). Nevertheless, the average value for UC of the native species is higher in the semi-domestic than in the domestic landscape, and the average UC value for the exotic species is higher in the semi-domestic and domestic areas than in those of low human intervention.

# Versatility of Use of Firewood Species


Of all informants' cites, 71.2% referred to plants which were used only for firewood (cooking and heating) (**Table 2**). The remaining cites made reference to firewood species which were also used for other purposes, with two to three different uses (i.e., construction materials, fodder, dye, medicine, food). The species of highest use versatility are exotic: P. alba, Salix sp., Prunus cerasus, and Malus domestica.

## Versatility of Use of Firewood Plants According to the Level of Landscape Domestication and Biogeographic Origin

The average use versatility of the firewood species showed no variation between the different landscapes in terms of their level of domestication (Kruskal-Wallis test, p = 0.19, **Figure 5**). Neither did it show significant differences between types of landscape in terms of biogeographic origin (Kruskal-Wallis test, p > 0.05, **Figure 5**). In the domestic area, 60% of the species (6 species) are used for multiple purposes, while 4 are used only for fuel (40% of the total), although these differences are nonsignificant (Binomial test, p = 0.75). In the semi-domestic areas, in contrast, fewer of the firewood plants are used for multiple purposes (18%, 3 species) than those used for fuel alone (82%, 14 species) (Binomial test, p = 0.013). The landscape with least human intervention follows the same pattern.
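The binomial comparisons above can be reproduced with an exact two-sided test. The sketch below assumes a null proportion of 0.5 (equal numbers of multi-use and fuel-only species), which the text does not state explicitly; the function name `binomial_two_sided_p` is hypothetical.

```python
from math import comb


def binomial_two_sided_p(successes, trials, p=0.5):
    """Exact two-sided binomial test p-value.

    Sums the probabilities of every outcome that is no more likely
    than the observed one under the null proportion p.
    """
    pmf = [comb(trials, k) * p**k * (1 - p) ** (trials - k)
           for k in range(trials + 1)]
    observed = pmf[successes]
    # Small relative tolerance so that ties survive floating-point rounding.
    return min(1.0, sum(q for q in pmf if q <= observed * (1 + 1e-9)))


# Counts reported in the text: multi-use species out of all species gathered.
print(round(binomial_two_sided_p(6, 10), 2))   # domestic: 6 of 10 multi-use
print(round(binomial_two_sided_p(3, 17), 3))   # semi-domestic: 3 of 17 multi-use
```

Under the assumed null proportion, the two p-values round to 0.75 and 0.013, matching the figures reported for the domestic and semi-domestic landscapes.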

In the domestic areas, the UC of multi-use combustible species (17.2% ± 7.50) was higher than the species used exclusively for fuel (3.4% ± 0), although these differences were not significant (Mann Whitney U-test, p = 0.11, **Figure 6**). The multi-use combustible species with highest UC was P. alba (48.3%). The semi-domestic landscape followed a similar pattern, since the multi-use combustible species (19.5% ± 12.8) presented higher UC than the fuel-only species (8.9% ± 1.2), but without significant differences (Mann Whitney U-test, p = 0.59, **Figure 6**). Salix sp., a multi-use species, was the most frequently cited (44.8%).

#### TABLE 2 | Summary of cases for variables of multinomial logistic model representing the firewood species gathered by the Mapuche communities of Costa del Lepá and Gualjaina.

Frequency represents the total number of plant cites for each sub-category.

In contrast, in the low intervention landscape the UC of fuel-only species (9.6% ± 2.05) was higher than the multi-use species (3.4% ± 0), although not significant (Mann Whitney U-test, p = 0.15, **Figure 6**). Schinus johnstonii (31%) and N. axillaris (17.2%), fuel-only species, were the most frequently cited plants.

## Continuity of Use of Firewood Species

Of all the combustible species registered, only three correspond to plants currently out of use (10.7% of the total); inhabitants reported using them during their childhood. According to informants, the native shrub species Prosopis denudans var. patagonica, Retanilla patagonica, and Atriplex lampa are not in current use, a fact associated with their lack of regeneration and lack of access to the plants.

Despite the limited nature of their firewood resources, Mapuche communities allow neighbors to have free access to the semi-domestic and low anthropic intervention landscapes for firewood gathering, without establishing previous agreements within the communities to regulate this use. However, some sectors in the region have been appropriated by the previously mentioned livestock companies, which forbid the extraction of firewood plants and are drastically altering provision strategies.

The remaining species either have a history of continuous use (11 species; 39.3%), amongst which S. johnstonii, N. axillaris and Corynabutilon bicolor are worthy of special note, being found mainly in the low anthropic intervention environments, or have been incorporated gradually (16 species in current use, 50% of the total) owing to their environmental availability (i.e., cultivated species such as P. alba, P. cerasus, M. domestica, and Ulmus minor) and/or commercial availability (Pinus spp. and N. antarctica). The firewood plants which have been used for a long time have a higher UC (31.2% ± 6.2) than both those recently incorporated (12.5% ± 5.6) and those which are no longer used (7.3% ± 2.1) (Kruskal-Wallis test, p = 0.003).

FIGURE 5 | Average use versatility (UV) of species according to biogeographic origin, and total value for the different landscape categories (domestic, semi-domestic and low human intervention): Key: UV of native plants; UV of exotic plants; Total UV. A: No significant differences to versatility between different categories of landscape (p > 0.05). a: No significant differences to versatility according to biogeographic origin of the firewood species in each different categories of landscape (p > 0.05).

firewood species for the different landscape categories (domestic, semi-domestic and low human intervention): Key: Fuel-only species; Multiple use species. a: No significant differences according to versatility of use of firewood species for the different landscape categories (p > 0.05).

# Differential Use of the Landscape According to Biogeographic Origin and Use Versatility of Combustible Species

Descriptions of the variables analyzed and the distribution of all cases are detailed in **Table 2**. The model explaining variation in the proportion of plants with different biogeographic origins and use versatility across the landscapes analyzed together was found to be significant (χ² = 110, df = 4, p < 0.05). Other models that considered continuity of use and life form did not give significant results, so these variables were excluded. This multinomial analysis, which evaluates the weight of each variable as a function of the other variables, showed that the biogeographic origin of the firewood plants varied between the different classes of landscape (χ² = 20.6, df = 2, p < 0.05). This result contributes more information than the univariate analysis (**Figure 4**). In the landscape with low human intervention, the probability of using native plants is 100 times greater than in the domestic landscape (p < 0.05, **Table 3**). Furthermore, the semidomestic landscape showed no variation in biogeographic origin in relation to the domestic landscape (p = 0.07, **Table 3**). In contrast, for exotic species the pattern is reversed; the domestic areas have 100 times as many exotic plants as those with low human intervention. The use versatility of firewood plants did not vary with the different levels of landscape domestication (χ² = 0.8, df = 2, p = 0.66). As was found previously with the univariate analysis, the availability of combustible species with greater use versatility does not seem to be associated with domestication of the landscape.

# DISCUSSION

Our results show that the landscapes with a lower level of domestication are not the most intensively used, whether in terms of number of species, use intensity or species with higher use versatility. For the two communities living on the Chubut steppe, gathering patterns of firewood plants depend entirely on landscapes with different levels of domestication, which are used in an articulated way. This gathering pattern is dynamic and possibly evolves with the changes that come about in the landscape and other factors such as livestock breeding. All these environments created according to the needs of locals are used in a complementary way, possibly indicating "prudent" (Shaanker et al., 2003) and diversified use of these resources, so that the impact of gathering is distributed across the entire environmental gradient (Cardoso et al., 2017). This reflects an integrated concept of the landscape, as has been widely documented for different traditional peoples in Argentina (Crivos et al., 2004; Pirondo and Keller, 2014). In other words, the landscape units present spatial continuity in response to ecological gradient, type of management and the multiple interconnections generated by the inhabitants and their domestic animals, all of which is accompanied by global exploitation (Capparelli et al., 2011; Molares and Ladio, 2012, 2014).

The communities of Costa del Lepá and Gualjaina obtain in total 28 different firewood species, a richness similar to that registered by Cardoso et al. (2012), who recorded 27 species in other rural zones on the Patagonian steppe. This richness is the result of diverse use of the landscape and the integration of new supply strategies, such as purchase and social welfare plans (Cardoso et al., 2013). The species are shared little between environments, considering the intentional addition of exotic and new species in the most domesticated landscape, mainly for firewood purposes. However, the low diversity of species in these environments (10 species) coincides with domestication patterns that tend toward a reduction in the number of species, monoculture and specialization of species (Amico, 2002). The three types of environment analyzed showed no significant differences in the proportions of native and exotic species gathered, indicating that gathering pressure is shared between the different landscapes. Nevertheless, the multinomial analysis showed that the probability of collecting native plants is 100 times higher in environments with low domestication, while the reverse is true for exotic species. The inhabitants have incorporated a high proportion of exotic species through cultivation, in addition to those species which have begun to grow in the wild. This could be favored by the decreasing availability of native resources and the strong influence of external government organizations which provide technical support that promotes the incorporation of exotic resources (Eyssartier et al., 2013). The use of exotic species in domestic landscapes in Patagonia and other regions of the world has been highlighted by several authors to be a local solution to various needs, since in general they are easily available, fast growing, and more tolerant than native species (Eyssartier et al., 2009; Dos Santos et al., 2010; Richeri et al., 2013; Rovere et al., 2013), in addition to complementing native species, covering new needs, and diversifying the offer of useful plants (Albuquerque, 2006; Medeiros, 2013; Ladio and Albuquerque, 2014).

B, Beta; Wald is the chi-square that tests the null hypothesis, df, degrees of freedom; Sig., level of significance and Exp (B) = Odds ratios calculated by exponentiation of the coefficients (the probability that an event will happen in relation to the probability it will not).

<sup>a</sup>The reference category is: Domestic landscape.

<sup>b</sup>This parameter is established as zero since it is redundant.

\*This symbol indicates significant differences.

The semi-domesticated landscape, which includes mainly river valleys, with an intermediate level of human intervention, was also a significant site for the gathering of firewood. In these environments the species richness of native plants gathered is higher than that of exotics, although the latter are the most used. The more frequent use of the exotic species may be associated with their abundance, accessibility and quality, as observed in other studies (Albuquerque et al., 2009; Cardoso et al., 2012). One notable example of this is Salix sp., an exotic species of rapid growth that quickly colonizes and becomes established on most Patagonian riverbanks, in many cases displacing the native species Salix humboldtiana, and competing successfully for resources and suitable regeneration sites (Bozzi et al., 2014).

In contrast, the landscapes where there had been relatively little anthropic intervention presented the greatest number of cites for native firewood plants (100 times more than the most domesticated environment). Previous investigations carried out in similar environments in Patagonia have highlighted the important use of native species for firewood, compared to exotics (Cardoso et al., 2012). Schinus johnstonii is the most frequently gathered resource in this landscape. The cultural importance of this species over time has been documented in archaeobotanical studies and in recent work in the region, highlighting this species' characteristics of high calorific value and long-lasting embers (Ancibor and Pérez de Micou, 1995; Cardoso et al., 2013; Morales et al., 2017).

The results also indicate that there is no variation in pattern of use of the landscape in terms of use versatility, and that in general terms the combustible plants are not very versatile, possibly due to their fuel uses (heating and cooking) being prioritized in the interview. Even so, it must be mentioned that the firewood plants, as well as providing energy, are a source of animal fodder, are used in making shelters for the animals and in construction, as also found by Cardoso et al. (2012) and Morales et al. (2017). These authors highlight the fact that the introduction of rapidly growing exotic firewood plants in peri-domestic areas increases the richness of uses of the flora, revealing mechanisms that improve inhabitants' self-sufficiency.

It is interesting to note the dynamics of use of the different landscape areas over time. According to informants, during their childhood the semi-domestic and low anthropic intervention environments were of more importance for plant gathering than they are now, given their greater supply of species with high calorific value. At the present time, the extraction of firewood is affected by the work of large livestock companies, as well as the increasing problem of desertification due to overgrazing, amongst other factors, which cause damage to the soil and plant cover. These phenomena gradually configure a new landscape, thus generating the need for new strategies to obtain supplies (Marconetto, 2006).

Analysis of the interviews reveals that historically, the purchase of firewood species was infrequent in the studied communities, as was the use of gas. It has been recorded that half the population now buys firewood plants, mainly N. antarctica, which comes from the Sub-Antarctic Forest, and is one of the most commonly used and most commercially exploited species (Arre et al., 2015). The current tendency to purchase firewood, as a provision strategy, has also been observed in other rural communities (Ramos et al., 2008; Cardoso et al., 2012); in this way the factor of uncertainty in the availability of these subsistence resources, as conceptualized by Blancas et al. (2014), is decreased.

Together with the preferential use of species adapted to the disturbances typical of these environments (e.g., fires, overgrazing) (Morales et al., 2017), this suggests an articulated, dynamic system in which processes of climate and sociopolitical change (decrease in precipitation, mineral extractivism, etc.), flora and populations all coevolve. Thus, complementary use of the landscape, selection of species that grow quickly after harvesting and innovation in supply strategies (purchase), understood as local responses to the problem of change (Ladio, 2017), promote faster recuperation of the landscape and woody species, decreasing uncertainty and increasing biocultural resilience (Ladio, 2017; Morales et al., 2017). For example, some informants revealed that since many people have started buying firewood, regeneration of native plants such as Retanilla patagonica has been observed. This moment in the collective memory of the studied populations marks the beginning of an indirect process of promotion of the availability of this species, so that it can be used in the future; this is, therefore, management that will promote domestication of the environment.

# CONCLUSIONS

Firewood gathering has always been stigmatized and associated with practices considered as destructive, and carried out by poor communities. In our case study, making the firewood gathering practice visible from the perspective of landscape domestication allows us to reveal more complex, integral processes. Although native plants are much used and obtained from environments with a low level of domestication, inhabitants do not seem to concentrate their efforts on those environments alone. Medium and high domestication level landscapes contribute exotic (and native) plants which play a substantial role, alleviating use pressure on native species, and while this cannot prevent overexploitation, it does minimize it. The continuity of use of wood for combustible purposes is also favored, guaranteeing a supply of this resource, fundamental for subsistence, and an important component of the locals' lifestyle. Complementary use of the landscape is the underlying logic, but the incipient domestication of species present in the totally domesticated environments is also an interesting aspect that should be studied in more depth. The growth of these plants (10 species) is being encouraged with irrigation, thermal and wind protection, pruning, the adding of organic material from home composting, etc., which generates greater production of firewood and reduces impact in terms of extraction of native species.

In the arid region of Patagonia significant effects are expected in the coming years, in association with global climate change. These changes will no doubt affect the communities whose subsistence depends on their close bond with the landscape and its resources (Rabassa, 2010). Within this scenario it is essential to review government strategies and conservation plans, which are always top-down. The inclusion of bottom-up perspectives on the subject of firewood is necessary in order to find solutions that take local management and cosmologies into account in an inclusive way (Ladio, 2017). Our empirical information reveals that the inhabitants of Patagonia are creating new environments with richness and versatility of use, and seeking new alternatives in the face of firewood scarcity, through landscape domestication.

# AUTHOR CONTRIBUTIONS

AL and SM conceived of the study, designed the questionnaires, participated in data analysis and performed statistical analysis. DM carried out questionnaires among the Mapuche communities, analyzed the tales, performed statistical analysis and drafted the manuscript. The three authors read and approved the final manuscript.

# ACKNOWLEDGMENTS

Our thanks are due to the inhabitants of the Costa del Lepá and Gualjaina communities for their friendly disposition, generous help, and for sharing their knowledge and experience. We are also grateful to the two reviewers whose suggestions helped improve this work substantially. This research was supported by PIP 0466 of CONICET.

# REFERENCES

Zuloaga, F., and Morrone, O. (1999). Flora del Cono Sur. Catálogo de las Plantas Vasculares. Buenos Aires: Instituto de Botánica "Darwinion". Available online at: www.darwin.edu.ar/proyectos/floraargentina/fa.htm

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Morales, Molares and Ladio. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Management of Fruit Species in Urban Home Gardens of Argentina Atlantic Forest as an Influence for Landscape Domestication

Violeta Furlan<sup>1,2</sup>\*, María L. Pochettino<sup>3</sup> and Norma I. Hilgert<sup>1,2,4</sup>

<sup>1</sup> Instituto de Biología Subtropical, Universidad Nacional de Misiones-Consejo Nacional de Investigaciones Científicas y Técnicas, Puerto Iguazú, Argentina, <sup>2</sup> Centro de Investigaciones del Bosque Atlántico, Puerto Iguazú, Argentina, <sup>3</sup> Laboratorio de Etnobotánica y Botánica Aplicada, Facultad de Ciencias Naturales y Museo, Universidad Nacional de la Plata, Consejo Nacional de Investigaciones Científicas y Técnicas, La Plata, Argentina, <sup>4</sup> Facultad de Ciencias Forestales, Universidad Nacional de Misiones, Eldorado, Argentina

#### Edited by:

Ana Haydeé Ladio, INIBIOMA, Argentina

#### Reviewed by:

Ernani Machado De Freitas Lins Neto, Universidade Federal do Vale do São Francisco, Brazil Milton Kanashiro, Embrapa Amazonia Oriental, Brazil

> \*Correspondence: Violeta Furlan violetafurlan@gmail.com

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Plant Science

> Received: 15 July 2017 Accepted: 14 September 2017 Published: 28 September 2017

#### Citation:

Furlan V, Pochettino ML and Hilgert NI (2017) Management of Fruit Species in Urban Home Gardens of Argentina Atlantic Forest as an Influence for Landscape Domestication. Front. Plant Sci. 8:1690. doi: 10.3389/fpls.2017.01690

Home gardens are considered germplasm repositories and places for experimentation, thus they are key sites for the domestication of plants. Domestication is considered a constant process that occurs along a continuum from wild to managed to domesticated populations. Management may lead to the modification of populations and in other cases to changes in their distribution, altering population structure in a landscape. Our objective is focused on the management received in home gardens by perennial species of fruits. For this, the management practices applied to native and exotic perennial fruit species by a group of 20 women in the periurban zone of Iguazú, Argentina, were analyzed. In-depth interviews were conducted, as well as guided tours for the recognition and collection of specimens of species and ethnovarieties. Sixty-six fruit species managed in the home gardens were recorded. The predominant families are Rutaceae, Myrtaceae, and Rosaceae. The fruit species with the highest number of associated management practices are pitanga (Eugenia uniflora) and pindó (Syagrus romanzoffiana). The 10 species with the highest management intensity are (in decreasing order of intensity) banana (Musa x paradisiaca), palta (Persea americana), pitanga (E. uniflora), mango (Mangifera indica), cocú (Allophylus edulis), mamón (Carica papaya), guayaba (Psidium guajava), limón mandarina (Citrus x taitensis), güembé (Philodendron bipinnatifidum), and mandarina (Citrus reticulata). Among the families with the greatest modifications in their distribution, abundance and presence of ethnovarieties in domestic gardens are the native Myrtaceae and the exotic Rutaceae. The main management practices involved are, in decreasing order, cultivation, tolerance, transplant and enhancement. It can be concluded that in Iguazú, fruit species management shows, both in plant germplasm and in the environment, a continuum that through tolerance, transplant and cultivation lato sensu has resulted in a mosaic of species in different management situations, which in turn are representative of an anthropogenic landscape in constant domestication and change.

Keywords: landscape domestication, urban botanical knowledge, Frontier, periurban agriculture, Ethnobiology

# INTRODUCTION


The interactions between nature and culture formed the landscape, represented by the dynamic relationship between physical spaces, people, and natural resources throughout history. This relationship is constantly shaped by cosmovisions, values, and perceptions as well as by the biodiversity of the environment (Balée, 1998; Brodt, 2001; Pochettino et al., 2002; Davidson-Hunt and Berkes, 2003; Berkes and Turner, 2006; Toledo and Barrera-Bassols, 2009; Capparelli et al., 2011; Ladio, 2011). The transformation of the environment based on cultural criteria leads to the creation of a specific landscape. This co-created environment becomes a way of extending the domestic unit, where management and domestication of the species are primary tools (Stampella, 2015).

Family farming in Latin America is diverse, reflecting the high variability of cultural groups. The way in which settlers appropriate nature influences the resulting agroecosystems, both in plant diversity and in management (Paulus and Schlindwein, 2001; Toledo and Barrera-Bassols, 2009). From the constant relationship between people and plants, situated biocultural entities arise that are capable of transforming each other and, consequently, the inhabited landscape (Lema, 2013). In this sense, home gardens are important places for experimentation as part of an inhabited landscape (Pochettino et al., 2012). That is why they are internationally recognized as key sites for species domestication and as germplasm repositories (Huai et al., 2011).

Over the twentieth century, scientists tried to categorize cultural groups on the basis of the way they worked the land. However, archeological evidence showed that there were numerous intermediate forms of land management and strategies that did not fit into cultivation or gathering as they were understood at that time (Harris and Hillman, 1989). This mismatch triggered interest in unraveling other forms of management that could lead to the phenotypic and genotypic modification of a species. To understand these kinds of management, Casas et al. (1996), working in Mexico, proposed a categorization of practices observed in Nahua and Mixtec groups. Around the same time, Clement (1999) proposed a theory regarding landscape domestication phases together with plant domestication processes for Amazonian crops.

Home gardens are structured and maintained over time by the constant implementation of management practices such as tolerance, enhancement, protection, transplantation and planting of particular species or individuals (Casas et al., 1998). These practices lead to the selective maintenance of wild vegetation and species of cultural importance, encouraging the emergence of phenotypic divergence grounded in local preference criteria and in the domestication process itself (Casas et al., 1996).

The concept of perennial fruit species is used in this text following Miller and Gross (2011) to group those plants that are grown in home gardens mainly for fruit consumption (although they have multiple uses) and are generally long-lived perennials. Botanically, the group includes herbs (such as Musa section), epiphytes (such as Philodendron bipinnatifidum), palms (such as Syagrus romanzoffiana), and shrubs and trees (such as Malpighia emarginata and Psidium guajava, respectively).

Numerous studies document the process of domestication in perennial fruit species. Some well-known examples belong to the botanical families Cactaceae, Lauraceae, and Anacardiaceae, as well as Amazonian species of the family Annonaceae (Miller and Schaal, 2006; Bost, 2009; Clement et al., 2010; Parra et al., 2010; Blancas et al., 2013; Aguirre-Dugua et al., 2013; Lins Neto et al., 2014).

Within the Argentine Atlantic Forest, in the province of Misiones, there are four principal cities according to their economic and political importance (Instituto Nacional de Estadísticas y Censos [INDEC], 2010). These cities are Posadas, Oberá, Eldorado, and Puerto Iguazú. The last one is surrounded by natural protected areas and is part of a green corridor called "Corredor Verde Misionero" (García Fernández, 2002); it also shows a very complex cultural composition (Belastegui, 2004; Furlan et al., 2016). For these reasons this contribution focuses only on Puerto Iguazú as a case study. The landscape in Puerto Iguazú, mostly in the periurban area, is defined as a domesticated landscape. The main characteristics of this landscape correspond to the intensity of domestication proposed by Clement (1999) as a cultivated area with swidden/fallow structure. All forms of landscape domestication are practiced in the region, and many of them occur simultaneously. The domestication process occurs with different intensities. This intensity is related to the complexity of the management practices applied to the plants, the number of practices carried out and the number of people who carry them out in a particular population (González-Insuasti and Caballero, 2007). By recognizing management intensity, mediated by the biological characteristics of the species in question, the cultural importance of a species can be assessed (González-Insuasti and Caballero, 2007; Blancas, 2013). Previous work by the research group (Furlan, 2017) highlighted the importance of fruit species in the domestic gardens of Puerto Iguazú. The word fruit comes from the Latin "fruor", which means to enjoy (Simpson and Ogorzaly, 1995). The main use given to perennial fruit species in Puerto Iguazú is associated with this perspective of enjoyment and as a complement to food and medicine. Most of the fruits are consumed as they ripen, without much preparation or preserving. The objective of this research is to determine which perennial fruit species are managed, and which management practices are most commonly applied to them, in home gardens of the periurban area of Iguazú.

# Study Area

### The Atlantic Forest in Misiones Province

The Atlantic Forest is classified as one of the biodiversity hotspots of the planet (Myers et al., 2000; Mittermeier et al., 2004), and Argentina holds the largest continuous remnant of this biome, which covers approximately 10,000 km<sup>2</sup> (Izquierdo et al., 2011). The province of Misiones is located at the southern limit of this ecoregion (Galindo-Leal and de Gusmão Câmara, 2003; Placci and Di Bitetti, 2006).

This biome extends for 3300 km along the Brazilian coast, southeastern Paraguay and northeastern Argentina. The area is characterized by a semi-deciduous forest with differentiated strata and an abundance of epiphytes, bamboos and lianas (Campanello et al., 2009; Montti et al., 2011). The climate of the region is humid subtropical without a dry season. The average annual rainfall is 2000 mm and the mean annual temperature is 20 °C (Campanello et al., 2009). Misiones province is one of the most diversified regions of Argentina (Placci and Di Bitetti, 2006). Nowadays the Atlantic Forest of Misiones hosts 1,000,000 inhabitants and 26,500 family agroforestry systems (Censo Nacional Agropecuario [CNA], 2002). The interactions between people and forest have been studied from an ethnobotanical perspective (Keller, 2008; Zamudio, 2012; Kujawska and Luczaj, 2015) and from an ecological perspective (Izquierdo et al., 2008, 2011).

### Socio-Cultural Characteristics of Misiones and Puerto Iguazú

The present research focuses on Puerto Iguazú, a city located in the northwest of Misiones, bordering Brazil and Paraguay (Nuñez, 2009). This area is known as the Triple Frontier (Rabossi, 2010). The city of Puerto Iguazú, together with Foz do Iguaçu (Brazil) and Ciudad del Este (Paraguay), forms an important center of attraction for the population of the province (Barreto, 2002). Moreover, among all the triple frontiers of Latin America, this one has the largest cities (Rabossi, 2010).

Family agroforestry systems in Misiones are called "chacras"; in each of them people make multiple use of resources (Chifarelli, 2010a). The activities that characterize them include diverse crops, forestry production, citrus production, and the extraction of timber and non-timber forest products (Chifarelli, 2010b).

The most important economic activities of the region are silviculture and agriculture, complemented by livestock farming (Instituto Nacional de Estadísticas y Censos [INDEC], 2010). Tourism, on the other hand, represents the main source of direct and indirect income for Puerto Iguazú (Nuñez, 2009). The area has a constant migration flow from neighboring rural areas (Izquierdo et al., 2008) and an ethnic composition similar to that of the rest of the province, a pluricultural context with influences of criollo, Guaraní, Eastern European, Brazilian and Paraguayan traditions (Furlan et al., 2016) (**Figure 1**).

The productive landscape, like the social setting of Misiones, is complex. The territory of the province was occupied by Guaraní linguistic groups long before provincial and even national organization. Most of these groups came to the area from the Amazon river basin (Cadogan, 1957), and they have inhabited it for at least 1200 years (Poujade, 1995). In spite of this, during the past century the region was considered one of the underpopulated areas of the country. That view led to colonization plans that brought together people from Eastern Europe, Argentinians from other regions, and Brazilian and Paraguayan migrants in the same place. Land ownership differed according to whether colonization was formal or informal. To this day there are serious tensions over land tenure and property rights for most of the local population (Schiavoni, 2006). Currently three languages are used in everyday life: Spanish, Guaraní and Portuguese (Instituto Nacional de Estadísticas y Censos [INDEC], 2010).

According to official records, Puerto Iguazú had a population of 32,038 inhabitants and 7,580 dwellings (Instituto Nacional de Estadísticas y Censos [INDEC], 2001). The functioning of the city depends on fundamental relations of interdependence with neighboring cities (Nuñez, 2009). Several land use plans have been designed but not implemented, and at present the city lacks proper planning (Cammarata and Gandolla, 2006).

The city is inhabited by a pluricultural population with diverse traditions that influence its ways of production. Settlers maintain family and work networks with neighboring cities (Izquierdo et al., 2008; Furlan et al., 2016). Conservation areas, as well as the Paraná and Iguazú rivers, limit the expansion of the city. Land use in Puerto Iguazú is organized in areas. The downtown area is dominated by the tourism industry (such as hotels and restaurants) and the periurban area is dominated by agricultural activities. Nevertheless, family farming activities can be found in domestic units of both areas (Furlan, 2017).

The periurban area is understood, according to Barsky (2010), as a territorial complex that includes elements of rural and urban land; it represents a transitional area whose borders are dynamic and depend upon the rhythm of urbanization. The expansion of the agricultural frontier that took place in the last century in Misiones was structured upon spontaneous occupation (Schiavoni, 1995). The same patterns of occupation were repeated during the expansion of the urban area, which was structured upon the constant mobility of local people (Nuñez, 2009; Furlan, 2017). People make these movements over time between different territories of the Triple Frontier, in pursuit of the most favorable conditions for their families. This constant change of the domesticated landscape has influenced the selection of plants managed in each domestic garden (Furlan, 2017).

Generally, women are the principal managers of home garden diversity, and the products generated are for internal use of the domestic unit and occasionally for sale (Furlan, 2017). Each home garden of Puerto Iguazú is formed by a variable number of microenvironments, the main ones being garden, park, orchard, chacra (plot area used for planting staples such as cassava, maize, and beans), monte (native forest area in different stages of conservation) and capuera (area of secondary forest formerly used for annual crops such as cassava) (Furlan et al., 2015). The detailed characteristics of these microenvironments are described in Furlan et al. (2015) and Furlan (2017). Specific information about the richness and composition of medicinal species of Iguazú home gardens can be found in Furlan et al. (2016).

Home gardens in Iguazú have a variable number of species that ranges from 50 to 150. Most of the species held in the domestic unit have local varieties. That is why the total number of ethnospecies is as high as 619 for the home gardens studied. The uses of the species reach a total of 747, with food and medicine being the principal uses (Furlan, 2017). All the gardens visited were larger than 450 square meters. From previous work it is known that home gardens in Puerto Iguazú are mostly found in plots larger than that size, and that women, especially those aged 30 or older, are more likely to maintain a garden. Seeds and plants for home gardens are obtained primarily by exchange with family and neighbors and are occasionally bought from local sellers (Furlan, 2017).

# MATERIALS AND METHODS

# Interviewing Methods

This paper is part of a larger project that involved the first author's doctoral thesis and postdoctoral fellowship. For this reason, the interviewees for this contribution were carefully selected from a larger sample (369 interviewees, 10% of Iguazú domestic units). For this contribution, field work was carried out during 2014–2015 with 20 women living in the periurban area of Puerto Iguazú. The criteria for their selection were: to have a wide diversity and variability of management practices in their gardens, to have been established in Puerto Iguazú for at least 30 years, and to be older than 30 years. All the women involved were asked to participate in this research, and the objectives, the researcher's participation and the destination of the information shared during interviews were explained to them and written in an informed consent note<sup>1</sup>. First contact with the women was made in 2012 as part of the larger project, so a prior relationship already existed for this contribution. All species records refer to plants present in their domestic units.

Semi-structured interviews were carried out along with in-depth interviews and guided tours through the home gardens. In each interview about plant management, we first asked which plants of the home gardens were classified as fruits. Once we had the group of plants locally considered as fruits, we asked each woman whether the plants were already there when she came to the chacra, whether she had moved them to another place in the domestic unit, and whether, on finding a new plant growing, she left it standing or not. We also asked whether they did anything to increase the number of plants of each species, what special care they gave to each one, and which species were planted, or removed when they were not wanted somewhere. Harvesting and pruning techniques were not specifically asked about; those practices arose during the interviews.

# Botanical Determination of Perennial Plants and Management Categorization

Voucher specimens of managed perennial fruit species were collected on farm. Plants were identified by the authors and stored in the Herbarium of Instituto de Biología Subtropical (IBSIHerb) in Puerto Iguazú and in the Herbarium of Instituto de Botánica del Nordeste (CTES) in Corrientes, Argentina. The botanical origin of species was checked against "Flora del Conosur" of Instituto de Botánica Darwinion<sup>2</sup>. The scientific names of plants were verified using the Plant List<sup>3</sup>, and the full names of plants and their botanical origin for the area are presented in Supplementary Table 1. For their categorization, the name of the species was maintained and the varieties recognized locally were taken into account.

<sup>1</sup>The consent was informed and written in all cases. Since our institution does not assess ethics in cultural studies with local populations, the informed consent form was reviewed by a specialist in bioethics. An ethics approval was not required as per institutional guidelines and national regulations, although the authors decided to do field work following the International Code of Ethics in Ethnobiology (ISE, 2006).

<sup>2</sup>http://www.darwin.edu.ar

Emphasis was placed on the management of all species, including ethnovarieties, without differentiating between already domesticated species and those that are only managed. This decision was made because management and diversification are constant processes that can occur in both domesticated and non-domesticated species, such as peaches in northwestern Argentina and citrus in northeastern Argentina (Stampella et al., 2013; Hilgert et al., 2014). Management practices for perennial fruit species were defined according to Casas et al. (1996) and Blancas et al. (2013) and were modified for this case study according to the concepts shown below.

Tolerance: It applies to the practice of keeping individuals during thinning (cleaning), pruning or previous management. This term is also used for new specimens that grow spontaneously in domestic units and are left to develop.

Protection: It involves actions to avoid damage caused by environmental factors (climatic factors, pathogens, herbivores) to the selected species, or to prevent small animals, either farm or wild, from eating the new shoots of plants. Chemical pest control systems were not considered among the protection techniques.

Enhancement: It consists of increasing the number of individuals of a species or variety, for example by eliminating competition, watering seeds, or consciously dispersing seeds to increase the abundance of a particular species. The improvement of soil quality and the use of fertilizers (organic or industrial) were not considered as enhancement.

Transplantation: It applies to individuals that were naturally established and then moved, or individuals that were tolerated and then relocated.

Sowing or planting: It refers to seed or vegetative propagation that involves establishing the species in a place favorable for its germination and growth. It also includes plants that are raised as seedlings and later transplanted. Vegetatively reproduced species such as pineapple, banana, güembé, strawberry, and tuna (Opuntia) are included in this group.

Removal: It refers to the elimination of individuals.

Harvesting and pruning were not proposed as management categories at the beginning; however, they were included afterward, only in those cases mentioned by the interviewees, since they were not specifically asked about. Harvesting was considered when people mentioned bringing fruits from the monte or capuera, and also when collecting from roads or between plots. Pruning was finally recorded as a particular management practice because settlers used it to obtain greater fruiting or flowering of a species or to maintain the architecture of the plant in the desired way.

In all cases, the management was registered only for plants, not for the microenvironments or productive spaces. Different life forms (trees, shrubs, and vines) were included as long as they were considered as perennial fruit suppliers for families.

# Data Analysis

In this contribution the recorded data were analyzed with descriptive tools, as detailed below. Testimonies obtained during interviews were also incorporated as part of the ethnographic record and were examined qualitatively. **Figure 3** was made using RStudio and the ggplot2 package (RStudio Team, 2015).

For quantitative data, exploratory and descriptive methods were applied. **Figure 2** shows the percentages of management practices used across all the domestic units of Puerto Iguazú.

For **Figure 3**, the relative frequency with which each species is managed under each management practice was considered. Each management practice is represented by a particular color, and the length of each color bar shows the relative frequency of that practice. For example, Eugenia uniflora is tolerated by 9 women, a relative frequency of 0.016.

Simplified management intensity is calculated as the sum of all relative frequencies of practices for each species.

$$\text{Relative frequency} = \frac{n_{ij}}{N_{ij}}$$

$$N_{ij} = \sum n_{ij}$$

$$\text{IM}_j = \sum \text{Relative frequency of practices by species}$$

where:

- i = Number of people applying each management practice
- j = Each one of the species managed
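The index defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' procedure: it reads n_ij as the number of interviewees reporting a given practice for species j, and takes the denominator as the total number of such reports over all practices and all species (a reading consistent with the example of 9 women giving a frequency of 0.016). The function name `management_intensity` and all counts are hypothetical.

```python
# Hypothetical counts: counts[species][practice] = number of interviewees
# reporting that practice for that species (n_ij). All values are invented
# for illustration and are NOT data from the study.
counts = {
    "Eugenia uniflora": {"tolerance": 9, "sowing": 7, "pruning": 2},
    "Musa x paradisiaca": {"sowing": 15, "transplantation": 5},
    "Celtis iguanaea": {"tolerance": 2},
}


def management_intensity(counts):
    """Return (relative frequencies, simplified intensity per species).

    Assumes the denominator N is the sum of all reports over every
    practice and every species.
    """
    total = sum(n for practices in counts.values() for n in practices.values())
    if total == 0:
        raise ValueError("no management reports recorded")
    # Relative frequency of each (species, practice) pair: n_ij / N
    rel_freq = {
        species: {practice: n / total for practice, n in practices.items()}
        for species, practices in counts.items()
    }
    # Simplified management intensity: sum of a species' relative frequencies
    intensity = {
        species: sum(freqs.values()) for species, freqs in rel_freq.items()
    }
    return rel_freq, intensity


rel_freq, intensity = management_intensity(counts)
# Species ordered from most to least managed under this index
ranking = sorted(intensity, key=intensity.get, reverse=True)
```

Under this reading the intensities of all species add up to 1, so the index expresses each species' share of all recorded management reports.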

For example, the simplified management intensity calculated for E. uniflora is the sum of all its relative frequencies: Tolerance: 0.016; Enhancement: 0.005; Protection: 0.002; Sowing: 0.012; Transplantation: 0.003; Pruning: 0.003; Removal: 0.003. That is to say, the management intensity is 0.045. This analysis was made to see, in a broad sense, whether a species was receiving more attention. Thus, a larger value of the management index indicates more management attention associated with that species. To identify which species could be of interest for future studies of domestication, this index can be taken into account as a preliminary way of selecting species. After that, the index of González-Insuasti et al. (2008), for example, could be applied, as is planned for future research.
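As a check on the arithmetic, the seven rounded values quoted for E. uniflora can be added directly:

$$\text{IM}_{E.\ uniflora} = 0.016 + 0.005 + 0.002 + 0.012 + 0.003 + 0.003 + 0.003 = 0.044$$

This is within 0.001 of the reported 0.045. The small gap may reflect rounding of the individual frequencies before printing; that is an assumption, since the unrounded values are not given here.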

# RESULTS

## General Findings

Sixty-six fruit species are managed in the domestic units of Iguazú, most of which are trees and shrubs. The predominant families managed are Rutaceae (12 species), Myrtaceae (11 species), and Rosaceae (6 species).

Considering the incidence of management practices according to the number of fruit species that undergo each one of them (**Figure 2**), the main management strategies are sowing (40% of the species), tolerance (30%) and transplantation (14%). On the other hand, enhancement, protection and harvesting were each used for 2 to 5% of the species, while removal of individuals was applied to 2% of the species.

<sup>3</sup>http://www.theplantlist.org/

# Management Practices for Each Perennial Fruit Species

Many of the recorded species have a previous history of management, resulting from older interventions in the domesticated landscape. Thus, some species are subject to unexpected practices, for example tolerance, when, given their geographical origin, they would more likely be sown. An example of this is higo (Ficus carica). In that case, tolerance (of plants already present in the spaces where people settled) and harvesting are the main management practices applied by the interviewees. Given the climate requirements and natural distribution of this species, it is thought that the specimens were sown by someone before the new owner arrived.

The 12 fruit species with the highest number of associated management practices are (in decreasing order) pitanga (E. uniflora), pindó (S. romanzoffiana), mandarina (Citrus reticulata), limón mandarina (Citrus x taitensis), guayaba (Psidium guajava), araticú (Rollinia emarginata), cocú (Allophylus edulis), guabirá (Campomanesia xanthocarpa), mamón (Carica papaya), limón arrugado (Citrus x limon cv. rugoso), mango (Mangifera indica), and banana (Musa x paradisiaca). The number of management practices for all species can be seen in **Figure 3**.

Species dominance changes when the simplified management intensity is taken into consideration (**Figure 3**). The 12 species with the highest simplified management intensity are (in decreasing order) banana (Musa x paradisiaca), palta (Persea americana), pitanga (E. uniflora), mango (M. indica), cocú (A. edulis), mamón (Carica papaya), guayaba (Psidium guajava), limón mandarina (Citrus x taitensis), güembé (Philodendron bipinnatifidum), mandarina (Citrus reticulata), araticú (R. emarginata), and jabuticaba (Plinia trunciflora).

The local varieties of palta (P. americana var. americana) are selected for their large and creamy fruits, while palta anisada (P. americana var. drymifolia) is selected for the stronger aroma of its leaves, which are added to "mate" (a local beverage); its fruits are not of particular importance.

All the species removed are native. These species are frequent in open areas of the Atlantic Forest and are adapted to ruderal environments; therefore they are frequently present in domestic units that are close to forest areas, as is the case in the periurban area of Iguazú. The mamón (Carica papaya) is another species that is usually removed, in particular the male stem, but it did not appear in the interviewees' mentions. Of the 17 species enhanced, only two, mandarina and limón arrugado (Citrus reticulata and Citrus x limon, respectively), are naturalized exotics of great local importance. The native species with the highest number of enhancement reports are ubajay (E. myrcianthes) and guabirá (Campomanesia xanthocarpa), while mora de monte (Maclura tinctoria) is associated only with this practice. Tolerance is associated with the rest of the native species. Sowing is applied especially to mamón, guayaba and guavijú (Myrcianthes pungens). Protection is associated with several native species of the families Myrtaceae, Caricaceae, and Arecaceae and with exotic species of the families Rutaceae and Anacardiaceae. Within Myrtaceae, protection is applied first to E. myrcianthes and with equal frequency to E. uniflora, E. pyriformis, Psidium guajava, and Campomanesia guazumifolia. Among the species of Caricaceae, jacaratiá (Jacaratia spinosa) is selected according to the sweetness of the fruit and is managed (through pruning and cuts) to achieve wide stems and low, open plants, since its stem is used to make a preserve of commercial value known as "wood marmalade". S. romanzoffiana is also protected, being the only species of Arecaceae associated with this practice. All the protected Rutaceae have the same relative frequency of management for that practice, while in Anacardiaceae only M. indica is subject to protection.

Celtis iguanaea, known as talera, is the only species that is exclusively tolerated and has no other recorded management practices. Sowing is the only recorded practice for the following species: castaña de caju (Anacardium occidentale), carambola (Averrhoa carambola), caraguatá propio (Bromelia balansae), guaraná (Bunchosia argentea), pomelo (Citrus maxima), palmito or juçara (Euterpe edulis), ivapovó de monte (Melicoccus lepidopetalus), guaporoití (Plinia rivularis) and joaobolao (Syzygium cumini). It is worth emphasizing that the fruits of caraguatá, palmito and ivapovó de monte are highly appreciated, and the interviewees mentioned that it is very difficult to obtain seedlings or seeds from the nearby forest. For this reason they brought the species from other zones of Misiones province.

Among the ethnovarieties, those with the greatest proportions of management are mango chico (M. indica), palta con forma de pera (P. americana), palta redonda (P. americana), banana de oro (Musa x paradisiaca), banana petisa (Musa x paradisiaca), limón mandarina (Citrus x taitensis), and limón arrugado (Citrus x limon cv. rugoso).

# Some General Rules of Species Management According to the Local View

Several criteria that organize agricultural activities in the calendar were identified, serving as general rules for pruning, transplanting and sowing, for instance temperature or the influence of the moon. These criteria were recorded through the interviews and are frequent in the interviewees' speech (verbatim phrases in Spanish, with contextualized English translations):

"para podar y trasplantar hay que hacerle en los meses sin R, no importa lo que sea (la planta) sino no viene bien, se hela o se embicha (es afectada por alguna plaga)"- pruning and transplanting must be done in the months without R (in Spanish, May to August inclusive, that is, winter time), no matter what (plant) it is; otherwise it does not grow properly, it freezes or catches bugs (it is affected by some pest)-.

Regarding sowing it was mentioned: "para que las plantas vengan bien siempre hay que esperar la luna (1°) de agosto, ahí cuando se siembra la mandioca, después"-for plants to grow well, you always need to wait for the (1st) moon of August, which is when the cassava is planted, afterward-.

According to the interviewees, many plants can equally be "advanced in seedling or pots" so that they are ready when necessary. Others are planted in autumn to be harvested in winter (like certain leafy vegetables) or to "survive the frost" and then bear fruit, such as passion fruit (Passiflora alata and P. edulis).

Another regularity observed concerns the origin of the fruit trees that are pruned. Pruning is practiced particularly on citrus, mango (M. indica) and palta (P. americana), including ethnovarieties. According to the testimonies, native fruit trees are not usually pruned. The main management they receive is protection, which is implemented whenever a plant grows spontaneously.

# DISCUSSION

The results show that sowing is the main management practice overall. This is consistent with the type of environment, which has a high level of anthropization. Generally, in Latin American home gardens almost half of the species are food plants and half of those are fruits (Pulido et al., 2008). Martínez-Crovetto (1981) highlighted the importance of edible wild fruit trees for local people of the Argentinian Atlantic Forest. From this study, it can be added that perennial fruit species are also important for urban settlers in the southwest region of the Atlantic Forest. Here, the incorporation of species from the local forest, as well as exotic species, into home gardens is a way of ensuring the provision of a variety of resources. The importance of fruits in the diet of the inhabitants of the Atlantic Forest was also mentioned among caiçaras by Giraldi and Hanazaki (2014) and for descendants of Poles by Kujawska and Luczaj (2015). Regarding management, Keller (2008) also underlines the importance of fruits and their presence near the house among Mbya-Guarani populations, as does Stampella (2015) for criollo settlers of southern Misiones. The low availability of fruit species in areas of public use and markets, as well as the restriction on the use of native species in conserved areas (parks and surrounding natural protected areas) (Furlan, 2017), are likely to influence women's motivations to incorporate these species into their home gardens in Iguazú. González-Insuasti et al. (2008) showed that land tenure is another factor that influences the decision of which species to manage with greater effort and which not. In Iguazú, land tenure is precarious for all people living in the area; the security of staying on the plot is related to the negotiating capacity that a family has with respect to the different social actors. Therefore it is very difficult to determine the direct influence of this factor on the management of the species and its intensity in the area. However, in the new neighborhoods that are being opened, it could be an interesting variable to take into account in future studies. Emperaire and Eloy (2014) analyzed how the cultivation of açaí (Euterpe oleracea) in the plots of Santa Isabel do Rio Negro was considered an "improvement" of the property, and its importance for negotiation when the plot was put up for sale or transfer. They also pointed out that the cultivation of perennial species in the plots was a local strategy to overcome the precariousness of land tenure and achieve insertion in the urban land market. In Puerto Iguazú, the cultivation of certain perennial fruits, such as those subject to management, can also be understood, from the perspective proposed by Emperaire and Eloy (2014), as a strategy to raise the price of the land in case it needs to be sold. This is particularly the case of palta, mango, and citrus, which are always present in the domestic units and have multiple associated management techniques. In new neighborhoods, which usually present greater land conflicts than the old ones, the new settlers are likely to choose fast-growing species to establish themselves in the place and, over time, to incorporate others, obtaining greater structural complexity (Furlan, 2017). This characteristic is consistent with the maintenance of perennial fruit species as shown in this text. In addition, these species are important in the construction of the inhabited space, that is to say, in the construction of the territory understood from intentionality and based on exchange relations.

The importance of perennial fruit species is reflected both in the number of species managed and in the relative frequency of complex practices, such as sowing, in the domestic units.

Several species of the Myrtaceae family have been noted among the culturally important species for Poles in northern Misiones (Kujawska and Luczaj, 2015). In Iguazú, pitanga is one of the species with the greatest management intensity, which could also indicate high cultural importance. It is followed by guava (Psidium guajava), jabuticaba (Plinia trunciflora), and siete capotes (Campomanesia xanthocarpa), in contrast to the species reported as important by Kujawska and Luczaj (2015), which are, in order of importance, S. romanzoffiana, E. uniflora, E. involucrata, Campomanesia xanthocarpa, and A. edulis.

Citrus species and their varieties are largely shared with those reported by Stampella et al. (2013) for the Paraná and Uruguay basins. Stampella (2015) states that citrus in Misiones are cases of re-denomination of foreign species by local communities. The appropriation and recreation of the species and their associated knowledge are reflected in the diversity of local varieties and their uses. In Puerto Iguazú, citrus, along with other fruit species, can be included in that group. For example, the high management intensity of species such as P. americana, Musa x paradisiaca, and M. indica, and the presence of local varieties, demonstrate their importance as locally appropriated resources. At the same time, these results show the principal perennial species that are part of the domesticated landscape of Puerto Iguazú. The dynamism of diversification can be observed in these management practices and in their frequency.

Analyzing the number of management practices associated with a species helps to identify which elements of the landscape are under management pressure. Knowing which practices are applied, how complex they are, and in what proportion they affect a species allows a researcher to take into account the intensity of the species' management. The simplified management intensity index applied here gives a first indication of which perennial species could be of interest for future studies. These management practices can lead to changes in the frequency and distribution of species and local varieties in the domesticated landscape of Iguazú home gardens, as well as in the environment that contains it, the Argentinian Atlantic Forest.

Management is not the same for all individuals of the same species. This strategy is related to the search for diversification and to a certain logic of work among those who cultivate, which promotes individualized management of specimens and values the intrinsic heterogeneity of the species, as shown by Furlan (2017). Therefore, management activities, particularly pruning, removal, and harvesting, vary greatly in time and space. This means that the activities and tasks described here are only a small sample of the universe of management for the species mentioned, and that the ways and times in which they are carried out in Puerto Iguazú are usually variable.

Given the perennial nature of most of the fruit species managed in Puerto Iguazú, the concept of humanized biodiversity may be useful in characterizing species management. Humanized biodiversity is understood as the plants and animals whose biological characteristics, abundance, and distribution have been altered by humans. Perales and Aguirre (2008) develop this concept through several Mexican examples. Future studies are intended to continue using this terminology together with the analysis of management categories and their intensity as proposed by Casas et al. (1996), González-Insuasti et al. (2008), and Blancas et al. (2013). In Iguazú, as in the Andean region where Lema worked, the "crianza" concept (Lema, 2014) also reflects the porosity of spaces and shows a mosaic of perennial species in different management situations, which can influence landscape domestication over time. The diachronic study of this phenomenon is a line of research to pursue in future years.

Akinnifesi et al. (2010) showed how urban gardens can serve as a repository of native species, several of which are at risk of extinction in their original environments. Borges and Peixoto (2009) found that more than 50% of the known plants in villages within the Atlantic Forest are species from the forest. In Puerto Iguazú, in contrast to the findings of these authors, exotic species were more frequent. Species of the Myrtaceae family in particular (almost all native) were well represented, and their presence, management, and local importance may be the starting point for in situ conservation plans for species of the Atlantic Forest. The presence of various species of the family Myrtaceae in orchards has already been described for the same phytogeographic region by Peroni et al. (2016), who also highlighted their role in local conservation. The fruits of the Myrtaceae family are appreciated for direct consumption by local people of Iguazú. Their consumption has also been cited as of great importance for the diversification of the diet and for its nutritional contribution in other villages of the Atlantic Forest (Giraldi and Hanazaki, 2014). These species are seldom commercialized in other areas (Amaral and Guarim Neto, 2008; Kinupp and Barros, 2010) and have low availability in the Iguazú local market (Furlan, 2017). Nevertheless, their availability in home gardens is not negligible, as shown in the results section. The perennial fruit species studied are also consumed by a wide variety of birds and herbivores, so they can act as a bridge between conservation areas. The fact that they are consumed by herbivores has led hunters in Misiones to identify and use these species as decoys to attract their prey (Giraudo and Abramson, 2000).

The use and management of fruit species is widespread in the periurban area of Puerto Iguazú, and management practices are similar for native and exotic species. These results are consistent with those of Giraldi and Hanazaki (2014) for the coastal region of the Atlantic Forest. Bonicatto et al. (2015) suggested that the practice of conservation and management of traditional seeds was extended to commercial seeds, indicating the conservation of both groups of species and varieties. In the periurban area of Puerto Iguazú, native and exotic fruit species are conserved and maintained through their management. These practices can lead to the generation of new local cultivars. This cultural selection leads to a diversification of the landscape and is intimately linked with the cultural diversity of the place (Hilgert et al., 2014), in other words, to a domesticated landscape.

The Chiang Mai Declaration (WHO et al., 1988) establishes the propagation of native plants in agricultural systems as part of conservation strategies. The study of home gardens provides relevant information for thinking about the sustainable use and conservation of native flora, as well as for understanding local ecological knowledge (Martínez, 2015). The knowledge of the phenology of the species and of their management for the vigorous development of the plants, put into practice by the women of Iguazú, is a fundamental pillar for incorporating their perspective into local conservation strategies. The information generated on the species managed in the gardens of Puerto Iguazú can serve as a basis for designing in situ conservation strategies for the Atlantic Forest hand in hand with the cultivators. Regarding domestication itself, it is worth noting that in the home gardens of Puerto Iguazú, as part of agroforestry systems, perennial fruit species are one of the principal foci of management practices. Local management practices applied to species also tend toward the diversification of plants and landscape. Finally, this study reinforces the idea of urban gardens as the primary base where domestication takes place in cities.

# ETHICS STATEMENT

This study was carried out in accordance with the recommendations of the ISE (2006) Ethnobiology Code of Ethics, with written informed consent from all subjects. Although local law did not require informed consent for interviewing residents, we consider it important to obtain it, which is why we always apply international standards.

# AUTHOR CONTRIBUTIONS

Conceptualization, data curation, formal analysis, investigation, writing of the original draft, and review and editing were carried out by VF. Conceptualization, funding acquisition, methodology, and review and editing were carried out by MP and NH.

# FUNDING

This project was carried out with funding from the National Council of Scientific and Technical Research through a doctoral fellowship and with partial funding from USUBI ARGIS/G53 of PNUD, CONICET Project UE IBS # 22920160100130CO, and a scholarship program of UCAR-PIA 103.

# ACKNOWLEDGMENTS

VF is especially indebted to the women of Puerto Iguazú who opened the doors of their homes; without them this work would not have been possible. Thanks also to Lic. M. E. Iezzi and Biól. Florencia Restelli for help with figures, to Dr. Ana Ladio, who revised the first version of this manuscript, and to Dr. Alejandro Casas, who introduced me to the domestication world.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: http://journal.frontiersin.org/article/10.3389/fpls.2017.01690/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Furlan, Pochettino and Hilgert. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Management and Motivations to Manage "Wild" Food Plants. A Case Study in a Mestizo Village in the Amazon Deforestation Frontier

Gisella S. Cruz-Garcia<sup>1,2</sup>\*

<sup>1</sup> Decision and Policy Analysis Research Area, International Center for Tropical Agriculture, Cali, Colombia, <sup>2</sup> Botanical Research Institute of Texas, Fort Worth, TX, United States

#### Edited by:

Alejandro Casas, Instituto de Investigaciones en Ecosistemas y Sustentabilidad, Universidad Nacional Autónoma de México, Mexico

#### Reviewed by:

Jay Howard Samek, Michigan State University, United States

Milton Kanashiro, Embrapa Amazonia Oriental (Embrapa Eastern Amazon), Brazil

> \*Correspondence: Gisella S. Cruz-Garcia g.s.cruz@cgiar.org

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Ecology and Evolution

> Received: 12 June 2017 Accepted: 28 September 2017 Published: 26 October 2017

#### Citation:

Cruz-Garcia GS (2017) Management and Motivations to Manage "Wild" Food Plants. A Case Study in a Mestizo Village in the Amazon Deforestation Frontier. Front. Ecol. Evol. 5:127. doi: 10.3389/fevo.2017.00127

Human management of anthropogenic environments and species is tightly linked to the ecology and evolution of plants gathered by humans. This is certainly the case for wild food plants, which exist on a continuum of human management. Given alarming deforestation rates, wild food plant gathering is increasingly occurring in anthropogenic ecosystems, where farmers actively manage these species in order to ensure their availability and access. This study was conducted in a mestizo village in the Peruvian Amazon deforestation frontier, with the objective of documenting the management practices, including the human-induced movement of wild food plant species across the forest-agriculture landscape, and the motivations that farmers have to manage them, using a qualitative ethnobotanical approach. The results of focus group discussions showed that 67% of the 30 "wild" food plant species reported for the village were managed, and almost all plants that were managed have been transplanted. The strongest flow of transplanted material was from forest to agricultural field (11 species), followed by market to field (five species), and field to home garden (four species). Farmers argued that the main reason for transplanting "wild" food plants was to have them closer to home, because they perceived that the abundance of 77% of these species had decreased in recent years. Conversely, the most important reason for not transplanting a "wild" plant was the long time it takes to grow, stated for 67% of the species that have not been transplanted. Remarkably, more than half (57%) of the "wild" food plant species, including 76% of the species that are managed, have been classified as weeds by scientific literature. The "wild" food plant species were then classified into six mutually exclusive groups according to management form and perceived abundance. The study concluded that "wild" food plant management, including management of species classified as weeds by scientific literature, is a crucial adaptation strategy of farmers aimed at ensuring their food security in scenarios of increasing deforestation. Finally, the article reflects on the major implications of human management for the ecology and evolution of food plant species.

Keywords: Peru, domestication, transplanting, perceived abundance, wild food plant

# INTRODUCTION

Wild food plant gathering is a deeply rooted component of human heritage, with millions of people gathering these species around the world. Of the 250,000–300,000 known higher plant species, ∼5,000 have been managed at certain periods of time (Cotton, 1996; Heywood, 1999), but nowadays the diet of humanity largely depends on 53 crop commodities (Khoury et al., 2016). In a global context of increasing dietary homogenization (Khoury et al., 2014), the consumption of thousands of wild food plants and other underutilized food species plays a key role in food and nutritional security (Cruz-Garcia and Ertug, 2014). In addition, it has been documented that wild vegetables and fruits constitute a very important source of vitamins, minerals, and secondary metabolites (Johns, 2007), and many of these species are essential components of the diet during food scarcity periods (Scoones et al., 1992; Heywood, 1999; Cruz-Garcia and Price, 2014a).

Rural families gather wild food plants from highly intervened environments such as agricultural fields, more subsistence environments such as home gardens, and less intervened areas such as forests. They, however, increasingly collect wild food plants from anthropogenic ecosystems, given the alarming loss of natural habitats. For example, it has been documented that families that are more distant from forests (i.e., due to high deforestation rates) prefer to gather in areas closer to home (Price and Ogle, 2008). Ogle and Grivetti (1985) coined the term "botanical dietary paradox," explaining that farmers increasingly depend on agricultural "weeds" when the forest area decreases. In a study conducted in Swaziland, they documented that the area with higher management intensity presented a greater number of wild food plants. Likewise, Kosaka et al. (2006a,b) reported from research in Savannakhet (Laos) that households located closer to the forest depended more on forest foods, whereas those far from the forest relied more on wild food plants from agricultural fields to compensate for the lack of forest resources.

Wild food plants exist on a continuum of human management from "truly" wild to semi-domesticated and cultivated species (Casas et al., 1996; González-Insuasti and Caballero, 2007). Plant management can be defined as "the set of actions or practices directly or indirectly performed by humans to favor availability of populations or individual phenotypes within populations of useful plant species" (González-Insuasti and Caballero, 2007, p. 303). Certainly, human management is tightly linked to the ecology and evolution of species (Clement et al., 2010). The interactions of humans with plants are clearly contextualized in the continuum model for agricultural (Harris, 1989) and agroforestry systems (Wiersum, 1997b). This model explains that these interactions change in time and space along a gradient that is neither unidirectional nor deterministic. The levels of interaction are not necessarily preordained steps of increasing management intensity toward domestication; therefore most wild managed species are not necessarily becoming domesticated species (Harlan, 1975; Harris, 1989). In addition, while some plants that used to be intensely managed in the past are only tolerated or slightly protected at present, other wild food species are becoming domesticated ones (Harris, 1989).

Plant species can be grouped into three main categories according to forms of management intensity: (1a) gathered species, (1b) species with incipient management, and (1c) species cultivated ex situ. There is also a gradient within incipient management that includes: (2a) tolerance, (2b) protection, and (2c) promotion. Management practices include those related to protection, such as watering and fertilizing; practices related to promotion, like pruning and weeding; and practices related to ex situ cultivation, such as (trans)planting and sowing (Casas et al., 1996; González-Insuasti and Caballero, 2007). Additionally, incipient management practices can take place in situ, i.e., in the original place occupied by the plant, or ex situ, when the plant is transplanted to another place (Casas et al., 1996). In this way, human-induced movement of wild food plants across the farming landscape, e.g., transplanting a plant from an agricultural field to a home garden, is a type of management (Cruz-Garcia and Price, 2014b). Domestication processes have (indirectly) promoted management practices such as propagation, protection, transplanting, and selective harvesting, which are important in order to ensure the availability of and access to useful plants that are at risk of decreasing or even disappearing (Price, 1997; Balemie and Kebebew, 2006; Daly, 2014). This plays a key role in the conservation of plant genetic resources, particularly in the deforestation frontier.

A species' management intensity and the types of management practices associated with it might vary from place to place (Cotton, 1996; Ogle, 2001; González-Insuasti and Caballero, 2007). Furthermore, local people and scientists might use different classifications for wild and domesticated species. For instance, a species might be classified as wild by one sociocultural group but as domesticated by another group, or by scientists, which has implications for research (Michon and De Foresta, 1997; Clement, 1999; Orwa et al., 2009). This might be the case for the Amazon, where, although plant domestication started earlier than 8,000 years ago (Levis et al., 2017), a substantial portion of the genetic heritage was lost when the indigenous population drastically declined after European contact (Clement, 1999). Nowadays domesticated plant species persist in the forests (Levis et al., 2017), and this might hypothetically imply that some of these species are not managed or are subject only to incipient management practices, and that newcomers (i.e., mestizo migrants) regard them as "wild" species.

According to Levis et al. (2017, p. 925) "domestication of plant populations is a result of the human capacity to overcome selective pressures of the environment by creating landscapes to manage and cultivate useful species." In order to better understand the processes of management and domestication it is necessary to incorporate socio-cultural aspects related to the use and valuation of a species (Casas et al., 1996; Blancas et al., 2013), which are distributed inter-culturally and intraculturally (González-Insuasti et al., 2011). Certainly, the values attributed to species by people will affect their incentives to manage them (Guijt, 1998) and to continue using them (Ogle, 2001). For instance, González-Insuasti and Caballero (2007) and González-Insuasti et al. (2008) demonstrated, from a study conducted in Tehuacán-Cuicatlán (Mexico), that management intensity depends on a species' cultural importance and biology, and these factors, together with land ownership, substantially influence farmers' decisions to intensify management practices. It has also been hypothesized by a number of authors (Stoffle et al., 1990; Cunningham, 1993; Price, 1997) that intensive management of wild food plant species in anthropogenic systems occurs when species have multiple use value and are perceived as rare. Furthermore, Price (1997) concluded from her research in Northeast Thailand that farmers increasingly manage wild food plant species with a high market value that are perceived as rare. Other authors have likewise documented that the local perception of a species' abundance, i.e., perceptions of its rarity, influences the decision to manage the species (Price, 1997; Blancas et al., 2013, 2014).

Domestication is a cultural process (Clement, 1999); therefore, the documentation of farmers' motivations to manage "wild" food plant species, taking into account the influence of their perceived abundance and cultural aspects related to these motivations, would contribute to the scientific study of domestication. Farmers' motivations explain why decisions are made and, ultimately, help to understand the distribution of useful species in the anthropogenic landscape. In other words, motivations are the human reasons underlying co-evolutionary domestication processes. The study of the motivations to manage food species is certainly necessary for communities in the deforestation frontier, where families are continuously adapting (or trying to adapt) to the loss of biodiversity, which provides an ideal setting for the study of contemporary domestication processes. This is certainly the case of Ucayali, which, together with Madre de Dios, has the highest deforestation rate in the country (Oliveira et al., 2007). The perspectives of mestizo farmers are of particular importance in Ucayali, since they constitute 80% of the population in this region (Porro et al., 2015), and have been frequently blamed for contributing to a major extent to the deforestation in the Peruvian Amazon (Alvarez and Naughton-Treves, 2003) as a response to secure tenure rights given a political-ecological context of high demand for land for extractive activities and industrial agriculture (Porro et al., 2015).

This paper presents an ethnobotanical perspective on domestication as an ongoing process, focused on the analysis of human factors influencing artificial selection operating on "wild" food species. The objective of this study was to document, in a mestizo village of Ucayali, the management practices, including the human-induced movement of "wild" food plant species across the forest-agriculture landscape, and the motivations mestizo farmers have for managing these species. The motivations include reasons not only for managing but also for not managing "wild" food plants. Whereas most research aimed at understanding the factors affecting plant management has been quantitative (e.g., González-Insuasti et al., 2008, 2011; Blancas et al., 2013), this study presents a qualitative approach. In addition, this article compares farmers' motivations to manage "wild" food plants in relation to their perceived abundance, and discusses the findings in a context of deforestation processes.

Given that this study was conducted from an ethnobotanical perspective, the inventory of "wild" food plants was built from those species classified as "wild" by local people. In this way, this study includes species that are not locally classified as domesticated, along a gradient of varying management intensity, from truly wild species (absence of management) to wild tolerated, protected, and/or promoted species and cultivated species. Management refers to practices and forms. Management forms include incipient management and ex situ cultivation (excluding gathering, unless indicated otherwise). Management practices include transplanting, watering, fertilizing, protecting, pruning, weeding, and mulching (González-Insuasti and Caballero, 2007). Transplanting includes sowing, planting, and actual transplanting. Protecting refers to conscious care activities other than watering and fertilizing. Toleration is not considered a management practice per se, given that it does not imply any activity specifically aimed at promoting the growth of a plant; toleration is considered a more incipient type of management form. The motivations focus on transplanting (why farmers transplant a species or not), which was the most common management practice and is related to the human-induced movement of these plants across the landscape. Finally, the farming landscape, or forest-agriculture landscape, includes anthropogenic ecosystems along different degrees of human intervention, encompassing agricultural fields, home gardens, and secondary forests (forests in the study site are mainly secondary).
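The operational definitions above can be sketched as a small lookup, offered only as an illustration. The practice names follow the text; the function name `management_form` and the precedence rule (any transplanting places a species under ex situ cultivation, otherwise any other practice places it under incipient management) are assumptions made for this sketch, not a rule stated by the author.

```python
# Practice groupings follow the definitions in the text.
# The precedence rule below is an assumption made for illustration.
EX_SITU_PRACTICES = {"transplanting"}  # covers sowing, planting, and actual transplanting
INCIPIENT_PRACTICES = {
    "watering", "fertilizing", "protecting", "pruning", "weeding", "mulching",
}


def management_form(practices):
    """Return a management form label for the practices recorded for one species."""
    recorded = set(practices)
    unknown = recorded - EX_SITU_PRACTICES - INCIPIENT_PRACTICES
    if unknown:
        raise ValueError(f"unrecognized practices: {sorted(unknown)}")
    if recorded & EX_SITU_PRACTICES:
        return "ex situ cultivation"
    if recorded & INCIPIENT_PRACTICES:
        return "incipient management"
    # Toleration is not a practice per se, so it cannot be told apart from
    # plain gathering on the basis of recorded practices alone.
    return "gathered or tolerated"
```

Under these assumptions, a species that is both watered and transplanted is labeled as cultivated ex situ, whereas a species that is only pruned falls under incipient management.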

# STUDY SITE

This study took place in the village of Pueblo Libre, located in Ucayali, Peruvian Amazon. In 2012, Ucayali had a total of 490,000 inhabitants. Twenty percent of the department is inhabited by indigenous communities, whereas most of the population consists of mestizos. Mestizos, who are migrants from non-Amazonian regions of Peru, are mainly settled along the Federico Basadre highway or the Ucayali river and its tributaries (Porro et al., 2015). The highway was built in 1945 and connects Pucallpa with Lima (860 km), the capital of the country (Pimentel et al., 2004). Sixty percent of the population of Ucayali lives in Pucallpa, the capital of the department. Pucallpa is the second most populated capital in the Peruvian Amazon (INEI, 2011a).

The main economic activities of Ucayali are agriculture, livestock farming and timber industry, contributing altogether to almost 20% of the gross domestic product (MINEM-GOREU, 2007; INEI, 2011b). Certainly, Ucayali is the main center of the Peruvian timber industry (Ramos Delgado, 2009). The staple crops are cassava, maize, plantain, rice, and beans. However, during the last decade, the region experienced an increase in palm oil and cacao plantations (Salisbury and Fagan, 2013). The mean annual rainfall in Ucayali ranges from 1,800 to 3,000 mm (Fujisaka et al., 2000), whereas the mean annual temperature is 25.7°C with 80% relative humidity (Lojka et al., 2008).

Peru, after Brazil, is the country with the largest extent of Amazonian forest (Lu, 2009). However, an average of 64,500 ha are deforested in Peru every year, mainly in the departments of Ucayali and Madre de Dios (Oliveira et al., 2007). Deforestation and land degradation are prevalent in Ucayali mainly due to the expansion of legal and illegal logging, land clearing and road construction (Galarza and La Serna, 2005; Miranda et al., 2014). For instance, by 2010 about 9% of the original forest area of Ucayali, which was 8.7 million ha, had been deforested (Porro et al., 2015). Ucayali's increasing rate of deforestation goes back to 2002, when half of the forest was declared fit for permanent production and given as concessions by the Instituto Nacional de Recursos Naturales (INRENA; INEI, 2011c). After 2002, the proportion of illegal logging increased when most forest concessions lost their licenses due to a change in regulations (Smith et al., 2006). Certainly, 80–95% of logging in Peru is illegal (Sears and Pinedo-Vasquez, 2011; Cossío et al., 2014).

Pueblo Libre (174 m. asl) is a mestizo upland village situated at km. 60 of the Federico Basadre highway, followed by 22 km. of dirt road (**Figures 1**, **2**). Pueblo Libre is inhabited by mestizos who migrated from the Peruvian highlands and coast, as well as people from other regions of the Amazon. It has a population of ∼75 families encompassing more than 350 inhabitants. Most of the houses have electricity and access to drinking water. There is telephone signal and there is an internet service in town, which was built in 2013 by a project funded by the United States Agency for International Development (USAID). The main source of income consists of the production of palm oil, cacao, plantain and, to a lesser degree, livestock. There is no communally owned land in the village, and the forest, which is privately owned, is fragmented and scattered across the different land properties of the families (Vael, 2015). It has been reported that villagers from Pueblo Libre consume "wild" food plants, which are mainly gathered from agricultural fields, forests and home gardens. People mostly consume "wild" fruits (mainly gathered by men), followed by tubers and roots (mainly collected by women; Cruz-Garcia and Vael, 2017).

# METHODS

This study took as a baseline a list of 30 "wild" food plant species belonging to 18 botanical families reported by Cruz-Garcia and Vael (2017) in a study conducted in Pueblo Libre. This list was constructed during focus group elicitations using local names of plants in Spanish, which is the main language spoken in the village. The villagers use the term planta silvestre alimenticia (wild food plant) as a cultural domain; accordingly, they identified the species on the list based on their local knowledge. According to Borgatti (1999), a cultural domain is a set of items that belong to the same category corresponding to a socio-cultural group. In this way, the informants explained that "wild" food plants are plants from the forest or transplanted from the forest near the house, as well as plants that grow in the home garden or agricultural field and do not require much care (Cruz-Garcia and Vael, 2017). The botanical identification of plant species was conducted by a local taxonomist from the Universidad Intercultural de la Amazonía Peruana in Pucallpa. Herbarium specimens of most identified species are deposited in the Herbarium of the University.

Fieldwork was conducted from August to September 2014 in Pueblo Libre, and encompassed three focus group discussion sessions in order to cover information for all 30 species. Focus groups were conducted with men and women ranging from 23 to 52 years of age, identified by the villagers themselves as knowledgeable about contemporarily gathered food plants. Each session lasted about 2 h and consisted of five to six informants, following Bernard's recommendations on the number of participants per focus group (Bernard, 2002). During focus groups, mestizo farmers were asked whether people in the village manage each "wild" food plant species from the list ("do you transplant the species?", "do you water it?", "do you protect it?", "do you fertilize it?", "do you prune it?", "do you weed it?", and "do you mulch it?"). They were also asked about the origin of planting material when a species was transplanted ("from where did you bring the planting material?"); the motivations for transplanting ("why do you transplant the species?" or, if the species was not transplanted, "why did you not transplant it?"); and perceived abundance ("was the species more abundant, less abundant, or equally abundant 10 years ago?"). This study was carried out in accordance with the recommendations of the guidelines of the International Society of Ethnobiology Code of Ethics. Participation was voluntary and all participants provided oral, informed consent, in accordance with the Code of Ethics.

Focus groups are well established in the area of development studies (Rifkin and Pridmore, 2001; Desai and Potter, 2006; Chambers, 2012). They are particularly useful when a study focuses on the everyday use of culture by a particular socio-cultural group (Morgan and Kreuger, 1993). In this study, focus groups did not aim to analyze how many people carry out a management practice, nor to quantify the proportion of farmers that share a particular opinion. Rather, the focus groups aimed at providing an exploratory assessment of practices and perceptions, and the results obtained were suitable neither for estimations of statistical significance or accuracy, nor for generalization to larger populations (Kumar, 2002; Chambers, 2012). This approach was chosen because of the lack of information on "wild" food plant management in mestizo villages; consequently, the study is exploratory. In this way, the present study paves the way for future in-depth and quantitative studies on this topic.

highway and surrounding roadsides. Source: Terra-i; map prepared by Paula Paz.

The list of "wild" food plants was compared to the Global Compendium of Weeds (HEAR, 2007). Growth form and endemicity were determined for each species through literature review (USDA, 2015a,b). The species were classified, based on their associated management practices, into different types of non-agricultural management forms following González-Insuasti and Caballero (2007) and Casas et al. (1996). A data matrix was prepared in Microsoft Excel where each row was a species and each column a variable (management practices, motivations, perceived abundance, weed category, growth form, and endemicity). Data analysis consisted of comparing frequencies of species in relation to the studied variables, and was conducted with Microsoft Excel.
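
The species-by-variable matrix and the frequency comparisons can be illustrated with a minimal Python sketch. The study itself used Microsoft Excel; the field names, category labels, and helper functions below are assumptions made for illustration, and the four rows are only a small subset whose values are taken from statements in the Results, not the full data set.

```python
from collections import Counter

# Illustrative subset of the matrix: one row per species, one key per variable.
# Field names and labels are hypothetical; values follow statements in the Results.
MATRIX = [
    {"species": "Artocarpus altilis", "growth_form": "tree",
     "endemicity": "introduced", "weed": True,
     "practices": {"transplanting", "weeding"},
     "perceived_abundance": "increased"},
    {"species": "Passiflora quadrangularis", "growth_form": "climber",
     "endemicity": "native", "weed": True,
     "practices": {"transplanting", "weeding"},
     "perceived_abundance": "decreased"},
    {"species": "Solanum sessiliflorum var. sessiliflorum", "growth_form": "shrub",
     "endemicity": "native", "weed": True,
     "practices": {"transplanting", "weeding"},
     "perceived_abundance": "decreased"},
    {"species": "Matisia cordata", "growth_form": "tree",
     "endemicity": "native", "weed": False,
     "practices": {"transplanting", "watering", "fertilizing", "weeding", "protection"},
     "perceived_abundance": "increased"},
]


def frequency(rows, variable):
    """Count species per category of a single-valued variable."""
    return Counter(row[variable] for row in rows)


def practice_frequency(rows):
    """Count species per management practice (a species may have several)."""
    return Counter(practice for row in rows for practice in row["practices"])


def cross_tab(rows, var_a, var_b):
    """Count species per combination of two single-valued variables."""
    return Counter((row[var_a], row[var_b]) for row in rows)
```

For example, `frequency(MATRIX, "endemicity")` counts native versus introduced species in the subset, and `cross_tab(MATRIX, "weed", "perceived_abundance")` relates weed status to perceived abundance, which is the kind of comparison reported in the Results.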

# RESULTS

# Management Practices and Movement of Planting Material across the Landscape

The results of the focus group discussions showed that two-thirds (20 of 30 species, 67%) of the "wild" food plant species have associated management practices (**Figure 3**). Almost all plants that have associated management practices have been transplanted (18 out of 20 species, see **Table 1**), with the exception of Mauritia flexuosa and Manilkara bidentata, which are only pruned. Sixty-one percent of all transplanted species received no additional management practices. Among those species that did receive additional management practices were Artocarpus altilis, Passiflora quadrangularis, and Solanum sessiliflorum var. sessiliflorum, which were transplanted and weeded; Bactris gasipaes and Myrciaria dubia, which were transplanted, watered, fertilized and weeded; Theobroma cacao, which was transplanted, fertilized, weeded and pruned; and Matisia cordata, a native fruit tree that not only was transplanted, watered, fertilized and weeded, but also protected from chickens by placing a little fence around it. None of the species was mulched. In contrast, species like Attalea phalerata and Oenocarpus bataua did not present any management practices (**Table 2**).

More than half (57%) of the "wild" food plant species, including 76% of the species that have associated management practices, have been classified as weeds in the scientific literature. In addition, weeds such as the introduced tree A. altilis, the native climber P. quadrangularis and the native shrub S. sessiliflorum var. sessiliflorum are weeded by the local population.

The most important origin of planting material was the forest, with 61% of transplanted species, followed by the market (28%) and agricultural fields (22%). The most common environments where species were transplanted were the agricultural field (83% of transplanted species), followed by the home garden (33%). This was reflected in the flows of transplanted material. For instance, the strongest flow of transplanted material was from forest to farm, encompassing 11 species; followed by market to agricultural field with five species, and agricultural field to home garden with four. Flows also included species that were transplanted from one place to another within the same environment (**Figure 4**).
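
The origin percentages above sum to more than 100%, which suggests that the categories are not mutually exclusive. The following Python sketch checks the arithmetic under the assumption that the per-origin species counts equal the sizes of the three strongest flows (11, 5, and 4 species); these counts are inferred from the reported percentages rather than copied from Table 1, and the variable names are illustrative.

```python
# Number of transplanted species (Table 1).
N_TRANSPLANTED = 18

# Species per origin of planting material. These counts are an assumption
# inferred from the reported percentages and flow sizes.
ORIGIN_COUNTS = {"forest": 11, "market": 5, "agricultural field": 4}


def percent(count, total):
    """Share of `total` represented by `count`, rounded to a whole percent."""
    return round(100 * count / total)


origin_percent = {
    origin: percent(count, N_TRANSPLANTED)
    for origin, count in ORIGIN_COUNTS.items()
}

# Origin records in excess of one per species: a positive value means the
# categories overlap, so the percentages need not sum to 100.
extra_records = sum(ORIGIN_COUNTS.values()) - N_TRANSPLANTED

print(origin_percent)
print(extra_records)
```

Under these assumptions the sketch reproduces the reported 61%, 28%, and 22%, and yields two more origin records than transplanted species, so at least one species must have been sourced from more than one origin.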

# Non-agricultural Management Forms for "Wild" Food Plants

A total of four species are managed in situ and 18 species are managed ex situ (**Table 3**). Of the species managed in situ, only two presented associated management practices (pruning, i.e., M. flexuosa and M. bidentata), whereas the other two were tolerated. The tolerated species were Passiflora acuminata and Physalis angulata, which are spared within agricultural fields and are not taken out when weeding the crop. Species cultivated ex situ include those transplanted from one environment to another (e.g., from forest to farm), within the same environment (e.g., from farm to farm) and from the market to an environment (e.g., from market to farm). All management forms, including incipient management in situ and management ex situ, include species classified as weeds in the scientific literature.

# Farmers' Perceived Abundance and Motivations to Transplant "Wild" Food Plant Species

Farmers perceived that the abundance of 77% of the "wild" food plants decreased during the last decade, whereas the abundance of 20% of the species increased, and the abundance of Pourouma cecropiifolia remained the same, because, as farmers clarified, they did not cut it down given that it takes too long to grow. Instead, they transplanted P. cecropiifolia from the forest to their farms to have it nearer to their homes. Fifty-seven percent of the plants that are less abundant have been transplanted, two are pruned but not transplanted, and the remaining are not managed. Four species that are more abundant have been transplanted and two have not. The most important motivation for transplanting a "wild" food plant was to have it close to home, mentioned for 72% of the species that have been transplanted (n = 18). Conversely, the most important reason for not transplanting a "wild" plant was the long time it takes to grow (cannot make immediate use of it), stated for 67% of the species that have not been transplanted (n = 12; **Tables 4**, **5**).

TABLE 1 | List of transplanted "wild" food plant species indicating botanical name, local name, growth form, endemicity, weed category, origin of planting material, environment where the species was transplanted, and additional management practices (n = 18).

8 Introduced from Australasia.

<sup>ř</sup>Introduced from the Caribbean.

U Introduced from Asia and the Pacific.

<sup>α</sup>AW, agricultural weed; CA, casual alien; CE, cultivation escape; EW, environmental weed; GT, garden thug; NW, noxious weed; SW, sleeper weed; W, weed.

<sup>Ç</sup>Pro, protection; Wa, watering; F, fertilizing; We, weeding; Pru, pruning.

Villagers explained that transplanting is a local strategy to ensure the presence of species that are less abundant than a decade ago, because nowadays they are rarely cultivated in the village, are used as firewood, have difficulty growing due to their ecological requirements, or are decreasing in availability because they are not frequently used. For example, farmers explained that they have transplanted Bixa orellana from the forests to their home gardens to give color to food and juices during carnival. They highlighted that they have transplanted B. gasipaes to their farms, because they cannot find it in other environments within the village anymore, given that they cut down the trees, which were very tall, to collect the fruits. Likewise, Pouteria caimito and Inga feuillei have been transplanted near the house to be used as firewood. Regarding the "wild" food plants that are more abundant than 10 years ago, villagers explained that although some species are characterized by their favorable ecological requirements, they have been transplanted to have them close to home. For example, Inga edulis has been transplanted from farms to home gardens, and A. altilis has been transplanted from forests to farms (**Table 4**).

There are species like P. acuminata and P. angulata that are perceived to be more abundant than 10 years ago, and have not been transplanted by villagers because, as they stated, these plants grow like "weeds" and, in the case of P. angulata, the birds bring the seeds so there is no need to propagate them. Conversely, there are species that despite their decreased abundance, have not been transplanted. Farmers explained that their abundance has decreased, because they cut down the trees to sell the timber, use the firewood, collect the fruit, or make a new agricultural field.


TABLE 2 | List of "wild" food plant species that are not transplanted indicating botanical name, local name, growth form, endemicity, and weed category (n = 12).

<sup>0</sup>This species is not transplanted but pruned.


<sup>8</sup>Cannot define the endemicity because the species has not been identified.

<sup>α</sup>AW, agricultural weed; CA, casual alien; CE, cultivation escape; EW, environmental weed; W, weed.

They argued that these species, with the exception of Genipa americana and M. flexuosa, have not been transplanted because it takes them too much time to grow. They also indicated that G. americana is not commonly used by the mestizos but mainly utilized by indigenous communities. Although they cut down the trees of M. flexuosa to collect the fruits, they explained that they cannot transplant it because it requires a lot of soil moisture, which they do not have in the village (**Table 5**).

# Groups of "Wild" Food Plants According to Management and Local Perceptions

The "wild" food plants gathered in Pueblo Libre (n = 30) could be classified into six groups, according to management forms and local perceptions (mainly in relation to species abundance). Four groups present species classified as weeds by scientific literature (**Figure 5**).

Group 1 includes species that are not managed although farmers perceived that they have decreased in abundance. Informants reported that the abundance of all species from this group, except for G. americana, has decreased because they cut down the trees to sell timber, to collect fruits, for firewood or when slashing for a new agricultural field; but they do not transplant them because it takes them too much time to grow. For example, farmers cut down the trees of O. bataua and Phytelephas macrocarpa to collect the fruits. Farmers mentioned that although G. americana decreased in numbers, it is not transplanted, because they do not use it frequently. The other species that belong to this group are Inga sp., Euterpe precatoria, Astrocaryum sp., Spondias mombin/Spondias venosa, and A. phalerata. This group includes three species classified as weeds (Inga sp., S. mombin/S. venosa and A. phalerata). All species in this group are native.

Group 2 encompasses species with non-agricultural incipient management, managed in situ and with a perceived increased abundance. This group includes the two species that are tolerated (P. acuminata and P. angulata), which farmers mentioned as growing like "weeds." P. angulata has also been classified as weed by the literature. Both species in this group are native.

<sup>a</sup>Protection includes watering, fertilizing and other types of protection.

<sup>b</sup>Promotion includes pruning and weeding.

<sup>c</sup>Transplanting also includes planting and sowing.

<sup>d</sup>All species with incipient management ex situ have been transplanted.

Group 3 includes species with non-agricultural incipient management, managed in situ and with a perceived decreased abundance. This group includes the two species that are pruned but not transplanted (M. flexuosa and M. bidentata). Informants stated that they have decreased in abundance because they cut down the trees to collect the fruit or to sell the timber; and they do not transplant them because M. flexuosa requires a lot of soil moisture and M. bidentata takes a long time to grow. None of these species has been classified as weed. Both species in this group are native.

Group 4 comprises species cultivated ex situ that farmers perceived as having decreased in abundance. The reasons for decreasing in numbers include explanations related to little use or knowledge about the uses of the plant, unfavorable ecological conditions, and cutting down the trees for firewood or to collect the fruit. The main motivations to transplant them are related to use-value (having the plant close to home, to sell it and to give color to juices). The species belonging to this group are B. orellana, M. dubia, Smallanthus sonchifolius, P. caimito, I. feuillei, Poraqueiba sericea, S. sessiliflorum var. sessiliflorum, P. quadrangularis, Dioscorea cf. trifida, Pachyrhizus tuberosus, C. allouia, Colocasia esculenta, and B. gasipaes. All the species from this group have been transplanted, and four of them presented additional incipient management practices (M. dubia, S. sessiliflorum var. sessiliflorum, P. quadrangularis, and B. gasipaes). Six species from this group have been classified as weeds (B. orellana, S. sessiliflorum var. sessiliflorum, P. quadrangularis, D. cf. trifida, P. tuberosus, and C. esculenta). C. esculenta and D. cf. trifida are the only species that have been introduced to the region (from Australasia and the Caribbean, respectively).

Group 5 includes a native species, i.e., P. cecropiifolia, which is cultivated ex situ and has a perceived unchanged abundance. In contrast to the species from the previous groups, as mentioned before, this is the only plant that farmers do not cut down because it takes a long time to grow. This plant has not been classified as weed.

Group 6 encompasses species that farmers cultivated ex situ although they perceived an increased abundance. They explained that the abundance of these plants increased because of favorable ecological conditions. The motivations for transplanting these species are related to use-value. The species that belong to this group are T. cacao, I. edulis, A. altilis, and M. cordata. All species, except I. edulis, presented additional incipient management practices. M. cordata is the only species that has not been classified as weed. All species are native, except for A. altilis, which has been introduced from Asia and the Pacific.
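
The six groups follow from two variables: management form and perceived abundance. The following Python sketch restates that rule and tallies the species listed in the text for each group. The labels ("not managed", "in situ", "ex situ") and the function name are illustrative assumptions; species names are written as they appear above.

```python
from collections import Counter

# (management form, perceived abundance) -> group number, as described in the text.
GROUP_RULES = {
    ("not managed", "decreased"): 1,
    ("in situ", "increased"): 2,
    ("in situ", "decreased"): 3,
    ("ex situ", "decreased"): 4,
    ("ex situ", "unchanged"): 5,
    ("ex situ", "increased"): 6,
}


def classify(management_form, perceived_abundance):
    """Return the group number, or None for a combination not observed in the study."""
    return GROUP_RULES.get((management_form, perceived_abundance))


# Species listed in the text, keyed by the two classification variables.
OBSERVATIONS = {
    ("not managed", "decreased"): [
        "O. bataua", "Phytelephas macrocarpa", "G. americana", "Inga sp.",
        "Euterpe precatoria", "Astrocaryum sp.",
        "Spondias mombin/Spondias venosa", "A. phalerata"],
    ("in situ", "increased"): ["P. acuminata", "P. angulata"],
    ("in situ", "decreased"): ["M. flexuosa", "M. bidentata"],
    ("ex situ", "decreased"): [
        "B. orellana", "M. dubia", "Smallanthus sonchifolius", "P. caimito",
        "I. feuillei", "Poraqueiba sericea",
        "S. sessiliflorum var. sessiliflorum", "P. quadrangularis",
        "Dioscorea cf. trifida", "Pachyrhizus tuberosus", "C. allouia",
        "Colocasia esculenta", "B. gasipaes"],
    ("ex situ", "unchanged"): ["P. cecropiifolia"],
    ("ex situ", "increased"): ["T. cacao", "I. edulis", "A. altilis", "M. cordata"],
}

# Number of species per group.
group_sizes = Counter()
for (form, abundance), names in OBSERVATIONS.items():
    group_sizes[classify(form, abundance)] += len(names)

# Share of all species perceived as less abundant (groups 1, 3, and 4).
total_species = sum(group_sizes.values())
decreased = group_sizes[1] + group_sizes[3] + group_sizes[4]
percent_decreased = round(100 * decreased / total_species)
```

The tally covers all 30 species, and the share perceived as less abundant agrees with the 77% reported for perceived abundance.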

# DISCUSSION

# General Reflections on Management of "Wild" Food Plants

This study, which was based on focus group discussions conducted with mestizo farmers, showed that two-thirds of the "wild" food plant species in the study village are managed, and most of them have been transplanted ex situ. Certainly, humans have intervened in the populations of wild species throughout the world, for example, changing the diversity and density of food plants by transplanting or introducing new species to an environment (Wiersum, 1997a; Daly, 2014; Parrotta et al., 2015). The use and management of wild food plants have already been reported in Latin America, for example, among the Nahua and Mixtec communities (Casas et al., 1996) and in Santa Maria Tecomavaca (González-Insuasti and Caballero, 2007) in Mexico, among the Mapuche in Chile (Daly, 2014), in the Monte region in Argentina (Ladio and Lozada, 2009), in the Bolivian Amazon (Reyes García et al., 2005; Thomas, 2012) and Andes (Vandebroek and Sanca, 2007), in Pernambuco in Brazil (Cruz et al., 2013), and in Cuba (Volpato and Godinez, 2007). Furthermore, the management of wild food plants has also been documented in Africa, for example in the Collines region in Central Benin (Avohou et al., 2012) and Central Shewa of Ethiopia (Feyssa et al., 2012); and Asia, for example in Northeastern Thailand (Cruz-Garcia and Price, 2014b), among others.

Defour and Wilson (1994) reported a total of 131 wild food plant species for all Amazonia, which mainly include trees and palms, consumed by indigenous communities. Certainly, the study of management and domestication of food plants in the Amazonia has largely focused on fruit trees (Clement and Villachica, 1994; Clement, 2006; Miller and Nair, 2006); and the documentation of food plant management in this region has mainly focused on indigenous communities (e.g., Reyes-García et al., 2006; Thomas, 2012), but not on mestizo villages. The Botanical Garden-Arboretum El Huayo (JBAH) from the Universidad Nacional de la Amazonía Peruana (National University of the Peruvian Amazon) located in Iquitos, Peru, possesses a collection of 46 species of edible fruit plants reported in surrounding communities (Freyre, 2003). Taking into account these numbers, the number of food plants documented by using focus group discussions for Pueblo Libre village (n = 30 species) might seem low. However, compared to other studies conducted in the Amazonia, the number of "wild" food plants documented for Pueblo Libre is higher than that reported by Reyes-García et al. (2006) for the Tsimane' communities of the Bolivian Amazon (n = 18), and very similar to the number of food species reported by Vásquez and Peláez (2015) for the inhabitants of Berlín village in Bagua Grande, Peru (n = 29).

TABLE 4 | Farmers' motivations for transplanting a "wild" food plant species in relation to perceived abundance (n = 18).

Ucayali has been strongly affected by high deforestation rates (Smith et al., 2006; Miranda et al., 2014; Porro et al., 2015). Land use change, deforestation, and unsustainable management of natural resources are the main drivers of loss of biodiversity and, consequently, of the decrease of wild food plants (Daly, 2014) and other underutilized species. This decline affects the food security of rural families directly (e.g., availability and accessibility of food) and indirectly (e.g., modifying the ecological conditions where species grow) (Van Noordwijk et al., 2014; Agarwal et al., 2015). Management practices (a) allow rural families to have sufficient food and nutritional diversity by increasing the availability of and access to wild food plants and other underutilized species, and (b) favor the conservation of these species in highly intervened anthropogenic environments when forests decline (i.e., by bringing planting material from forests to farms and home gardens). Certainly, the dietary diversity and nutritional quality of the diet of hundreds of millions of people in the world rely on the consumption of wild food plants and other underutilized edible plants (Grivetti and Ogle, 2000; Johns and Eyzaguirre, 2006; Heywood, 2011).

There are two recommendations for future research in relation to management and domestication of food plants by mestizo villagers (or immigrant communities). The first is to understand the historical process of knowledge acquisition about Amazonian flora by mestizos from indigenous peoples and/or by experimentation. The second is to assess the historical management of these species by mestizo villages. For instance, are the practices described in this study new ones that emerged as a response to deforestation and loss of biodiversity? Or are they practices that existed among mestizo villages before deforestation rates started to increase, and were adjusted to the new settings (i.e., intensifying human induced movement of biodiversity from forests to more intervened anthropogenic environments)? Such additional studies would shed light on historical processes of management and domestication by non-indigenous societies.

TABLE 5 | Farmers' motivations for not transplanting a "wild" food plant species in relation to perceived abundance (n = 12).

# Human Induced Movement of "Wild" Food Plants across the Farming Landscape as a Management Strategy

The findings of this study provide evidence that mestizo farmers transplant "wild" food plant species across different environments within the farming landscape. This emphasizes the spatial and seasonal complementarity of different anthropogenic ecosystems for food provision, which is particularly important for families living in the forest-agriculture interface. The importance of this complementarity for the food security of rural families has been also reported in other regions of the world, for example in Northeastern Thailand (Cruz-Garcia and Price, 2014a) and West Java in Indonesia (Abdoellah and Marten, 1986). Certainly, Frison et al. (2011) emphasized that a major basis of dietary diversity is the diversity of environments within farming landscapes, which was also illustrated by this study in Ucayali.

The key role played by forests, agricultural fields, and home gardens for provisioning food to rural households has been highlighted by various studies around the world. For instance, the importance of forests for food security has largely been acknowledged in scientific literature (e.g., Arnold et al., 2011; FAO, 2011; Sunderland et al., 2014); as well as the importance of multiple habitats, ranging from terrestrial to aquatic ones, that diversified agricultural fields encompass to facilitate the growth of multiple food species (e.g., Altieri and Anderson, 1987; Cruz-Garcia et al., 2016); and the importance of home gardens not only as source of food but also for processes of domestication and biodiversity conservation (e.g., Miller and Nair, 2006; Galluzzi et al., 2010; Cruz-Garcia and Struik, 2015; Freedman and Stoilova, 2015). Additionally, it is important to highlight that not only the farming landscape but also the market plays a key role in providing planting material, which has also been reported in other regions of the world (e.g., Ruiz-Pérez et al., 2004).

Ogle and Grivetti (1985) in their botanical dietary paradox highlighted that wild food plant gathering increasingly occurs in more intervened anthropogenic environments rather than less disturbed environments, i.e., forests, in the face of deforestation and land use change. However, the results of this study go beyond the botanical dietary paradox, showcasing that humans have a more active role in ensuring the availability of food species, for instance supporting the flows of planting material from less to more intervened environments. This was observed in the following findings: (a) the human induced movement of "wild" food plants across the landscape occurred for all transplanted species in the study site; (b) the most common source of planting material was the forest (for 61% of transplanted species); (c) informants constantly highlighted the need to have "wild" food plant species closer to home (for 72% of transplanted species), and (d) villagers usually bring planting material from less to more intervened environments, i.e., from forests to agricultural fields, from forests to home gardens, from agricultural fields to home gardens. Certainly, Zohary (2004) highlighted that human selection leads to the dispersal of plant populations toward more disturbed anthropogenic ecosystems. The movement of planting material is necessary in order to ensure the availability of wild food plants and other underutilized species given scenarios of increasing deforestation. Undoubtedly, the importance of agricultural fields and home gardens as recipients of plant genetic material has to be taken into account in agricultural interventions.

# Reflections on the Definitions of "Wild" Food Plants

The use of cultural domains, capturing the emic or native's point of view, is a starting point of ethnobotanical research (Borgatti, 1999) that can be very useful for the study of wild food plants and domestication, given that different socio-cultural groups perceive and conceive the world differently according to their own social, historical, cultural and environmental conditions and experiences (Brosius et al., 1986). Through comparing the list of native "wild" food plants from Pueblo Libre with the results of Clement (1999) and Levis et al. (2017), it was possible to see that 10 species that were categorized as "wild" by villagers, were domesticated before European contact, whereas nine were either semi-domesticated or presented incipient domestication by that time (see **Table 6**). Clearly, villagers and scientists have different classifications for "wild" and "domesticated," as previously highlighted by Michon and De Foresta (1997) and Clement (1999).

Interestingly, some species of "wild" food plants that were domesticated before European contact are nowadays treated as "wild," as emphasized in the mestizos' emic conceptualization of "wild" food plants where "wilderness" is associated with species that "do not require much care" (Cruz-Garcia and Vael, 2017). This was expected, given that a significant part of the crop genetic resources was lost, alongside traditional knowledge, with the eradication of 90–95% of the Amazonian population after European contact (Clement, 1999), and consequently it is still possible to see that domesticated species are nowadays dominant across the Amazon forest (Levis et al., 2017). Certainly, the domesticated species reported by this study did not exhibit management patterns of the intensity expected for domesticates, for instance fully depending on human intervention for survival (González-Insuasti and Caballero, 2007). For example, although B. gasipaes was domesticated during pre-Columbian times (Clement and Urpí, 1987), it was listed as a "wild" species by the informants. They transplant B. gasipaes to their agricultural fields, where they collect the chonta (inner core of the stem) and fruits for food (Cruz-Garcia and Vael, 2017), bringing the planting material not only from the market but also from the forest. This species is watered, fertilized, and weeded, but neither pruned nor protected.

Conversely, T. cacao was semi-domesticated before European contact, but nowadays most populations are domesticated. Families from Pueblo Libre, however, not only manage the species in the agricultural field with the objective to sell the seeds in the market, but also gather and consume the fruits of non-transplanted, non-managed individuals of T. cacao that are growing in the forest (Cruz-Garcia and Vael, 2017). Another example is I. edulis, which was semi-domesticated before European contact, but nowadays is domesticated throughout Amazonia for its fruits and wood (Clement et al., 2010).

TABLE 6 | "Wild" food plant species that were reported as domesticated (D), semi-domesticated (SD), and with incipiently domesticated populations (ID) in Amazonia at European contact (Clement, 1999; Levis et al., 2017).


<sup>δ</sup>Only for native species.

Undoubtedly, González-Insuasti and Caballero (2007) explained that a species could be managed differently in different places and times, and Cruz-Garcia and Price (2014b) stated that domestication is a locally differentiated concept and process. Clement et al. (2010) also explained that domestication as a process occurs at population level (not at species level); therefore it is not correct to say that a species is a domesticate unless all its wild populations are extinct, which is unusual. Instead they recommended to affirm that a species "exhibits domesticated populations" that, for this study in Ucayali, might be the case of T. cacao and I. edulis.

It would be interesting for future studies to evaluate the management of wild food plants, and the motivations to manage them, among the different indigenous communities in Ucayali, for instance, to assess how they conceptualize the cultural domain of "wild food plant," which species they classify as part of this cultural domain, and which management practices and forms they exhibit. Do they classify the species of this study as wild or as domesticated? Do they recognize different species as wild food plants? Do they manage those species that have been domesticated before European contact (according to Clement, 1999; Levis et al., 2017) with higher intensity than mestizos?

# Reflections on the Definitions of "Weeds"

Another term that deserves further discussion is "weed," which has historically been defined as "a plant growing where it is not wanted" (Mortimer, 1990), or, contrarily, as "a plant whose virtues are yet to be discovered" (Perrins et al., 1992). Although most agronomists and agricultural extension officers recommend their eradication to favor crop production, 89% of the most aggressive weeds in the world are edible (Rapoport et al., 1995) and several of them are highly nutritional (Duke, 1992). Certainly, the consumption of weeds has been reported throughout the world (Grivetti et al., 1987; Duke, 1992; Tanji and Nassif, 1995; Casas et al., 1996; Díaz-Betancourt et al., 1999; Pieroni, 1999; Turner et al., 2011; Cruz-Garcia and Price, 2012). Likewise, the results of this study showed that more than half of food plant species reported in the village have been classified as weeds in the scientific literature (HEAR, 2007). In addition, 76% of these "weeds" have been transplanted by the villagers in order to increase their availability and to have them closer to home. Indeed, "weeds" exhibited different management forms: (a) incipient management in situ (toleration) despite their increased abundance (group 2), (b) managed ex situ due to their decreased abundance (group 4), and (c) managed ex situ despite their increased abundance (group 6). Therefore, neither definition of "weed" fits the species documented by this study: they are tolerated or managed in the place where they are growing, and their virtues have already been discovered (i.e., as food).

It might also sound somewhat contradictory that species that have been classified as domesticates by scientists (as is the case for some food plants reported by this study) have also been classified as weeds by other scientists. However, on the one hand, this can be explained by the fact that weed classifications are controversial, given that the way scientists define weeds and classify species as weeds depends on their disciplinary background (Perrins et al., 1992). On the other hand, crops might become weeds in other parts of the world, depending on the biological characteristics of the species, the environment where it is growing and the associated management practices (Harlan, 1965).

# Farmers' Motivations to Manage "Wild" Food Plants

Mestizo farmers' motivations to manage "wild" food plant species affect management practices and forms. Management, including human induced movement of planting material across the landscape, influences: (a) artificial selection and, consequently, evolution; and (b) the availability and distribution of species across the farming landscape and, consequently, their ecology. The results of this study, based on focus group discussions, showed that farmers' motivations belong to two major groups: motivations related to cultural importance, particularly in relation to use-value, and motivations related to perceived abundance. These will be discussed in the following paragraphs.

Human management practices usually focus on the preservation of culturally important species. Although we did not directly ask informants about the cultural value of each species as part of this study, culture was captured through the study of motivations to manage "wild" food plants, particularly in relation to species use-value. This is reflected in the following: (a) villagers transplanted almost three-fourths of the transplanted species to have them close to home, which facilitates their availability and access for frequent use; (b) informants also emphasized the use of species as food and medicine, or their use for coloring food and drinks, and for firewood; (c) the most common reason for not transplanting a species, accounting for two-thirds of the species that have not been transplanted, was the long time it takes to grow, which implies that villagers cannot make immediate use of it. This is aligned with the findings of various authors who have documented the influence of cultural importance, including use-value, on farmers' incentives to manage a species (e.g., Casas et al., 1996; Guijt, 1998; Ogle, 2001; González-Insuasti et al., 2008, 2011; Blancas et al., 2013).

Human management practices also focused on the preservation of species perceived to decrease in abundance. For instance, more than half of the species that were perceived to be less abundant were transplanted, plus two species that were not transplanted but presented incipient management practices. Likewise, other studies have reported that decreased abundance (or the perception of a species' rarity) is directly related to farmers' decisions to manage a species (e.g., Stoffle et al., 1990; Cunningham, 1993; Price, 1997; Blancas et al., 2013, 2014).

Although the results of this study showed that human adaptation to rapidly changing environments (i.e., under scenarios of deforestation) promotes the management of food species and their movement across the farming landscape, the findings also revealed the presence of destructive harvesting practices. For instance, villagers cut down trees to collect fruits (four species), for timber (four species), for firewood (two species) and when slashing and burning (one species), accounting for almost half of the species that have decreased in abundance. Whereas some of these species are transplanted to home gardens (but cut down in forests and agricultural fields), others are not transplanted because, as villagers mentioned, they take a long time to grow. This is not surprising in the deforestation frontier, where multiple stakeholders, including villagers themselves, contribute to forest loss. For instance, it has been documented that timber production is facilitated by legal and illegal channels of trade, which are more available for mestizo upland villages in Ucayali than for mestizos living in lowlands or indigenous communities (Porro et al., 2015). The presence of unsustainable management practices related to useful wild plants has also been reported in other regions (e.g., González-Insuasti et al., 2011; Blancas et al., 2013). In these cases, it is necessary to promote sustainable management practices that favor the conservation of important wild food plants and other underutilized species. For example, local organizations are teaching the communities situated in the buffer zone of the Cordillera Azul National Park (Peruvian Amazonia) sustainable practices to collect the fruits of M. flexuosa without cutting down the trees (CIMA, 2012). Initiatives of this kind are important and necessary for ensuring the conservation of valuable food species, particularly those whose populations are decreasing in the Amazon deforestation frontier.
Furthermore, it is necessary to implement interventions that simultaneously aim at environmental conservation, social equity, and sustainable livelihoods (Porro et al., 2015).

# Limitations of this Study

The main constraint of this study was that the results obtained cannot be generalized to larger populations, given that the data collection was based on focus group discussions. A limitation of participatory methods, like focus groups, is that their outputs do not allow statistical estimations (Kumar, 2002; Chambers, 2012). However, focus groups capture group perspectives and provide reliable information on topics that are significant for marginalized communities (Bernard, 2002). It is important to emphasize that the results from this study should not constitute the endpoint for decision-support processes, but should rather constitute the exploratory and hypothesis-generating stage of future quantitative scientific projects. In this way, this study opens the possibilities for future in-depth and quantitative studies on management perspectives, for example aimed at understanding how motivations to manage wild food plants are shared within and among socio-cultural groups. In addition, future studies should take into consideration issues related to gender, education, wealth and geographical location, which affect the ways people interact with plants (and, consequently, people's motivations to manage them). This is certainly necessary, given that the study of people's motivations contributes to understanding the reasons behind management decisions and, ultimately, domestication.

# CONCLUSIONS

The results of this study on management practices, including the human-induced movement of "wild" food plant species across the forest-agriculture landscape, and the motivations that mestizo farmers have to manage them, magnify the already acknowledged importance of "wild" food plant management, including weed management, for ensuring the availability of "wild" food species. In this way, management practices see to rural families having sufficient food and nutritional diversity, and to these species being preserved in highly intervened anthropogenic environments when forests decline.

This research, conducted in a village in the Peruvian Amazon, provides empirical evidence that mestizo farmers transplant food plant species across different environments within the farming landscape, bringing planting material from less to more disturbed anthropogenic environments. This emphasizes the spatial and seasonal complementarity of different anthropogenic ecosystems for food provision, which is particularly important for families living in the forest-agriculture interface. The findings of this study showed that farmers' motivations to manage "wild" food plant species are related to their cultural importance, particularly in relation to use-value, and to their perceived abundance. However, the presence of unsustainable management practices was also reported; therefore, initiatives that support the conservation and sustainable use of these species are increasingly needed in the region.

# AUTHOR CONTRIBUTIONS

The author confirms being the sole contributor of this work and approved it for publication.

# ACKNOWLEDGMENTS

I am very grateful to Maria Elena Chuspe Zans from the Herbarium of the Universidad Intercultural de la Amazonía Peruana who did the taxonomical identification of the plants. I would like to thank Lore Vael and Madeleine Hancco who collected the data, José Sanchez-Choy who coordinated data collection in Ucayali, and Paula Paz from the Terrai team from CIAT who prepared the map. I am grateful to Hub Peters who revised the English of the manuscript. I would like to extend my thanks to the villagers from Pueblo Libre who participated in the study. Funding was partly provided by the International Center for Tropical Agriculture (CIAT). This work was associated with the 'Attaining Sustainable Services from Ecosystems using Tradeoff Scenarios' project (ASSETS; http://espa-assets.org/; NE-J002267-1), and partly funded with support from the United Kingdom's Ecosystem Services for Poverty Alleviation (ESPA) programme. ESPA receives its funding from the Department for International Development (DFID), the Economic and Social Research Council (ESRC) and the Natural Environment Research Council (NERC).

# REFERENCES



Thailand. NJAS Wageningen J. Life Sci. 78, 1–11. doi: 10.1016/j.njas.2015.12.003


Desai, V., and Potter, R. B. (2006). Doing Development Research. London: Sage.

Díaz-Betancourt, M., Ghermandi, L., Ladio, A., López-Moreno, I. R., Raffaele, E., and Rapoport, E. H. (1999). Weeds as a source for human consumption. A comparison between tropical and temperate Latin America. Rev. Biol. Trop. 47, 329–338.

Duke, J. A. (1992). Handbook of Edible Weeds. Boca Raton, FL: CRC Press.


**Conflict of Interest Statement:** The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Cruz-Garcia. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Incipient Domestication Processes in Multicultural Contexts: A Case Study of Urban Parks in San Carlos de Bariloche (Argentina)

Romina Betancurt <sup>1</sup> , Adriana E. Rovere<sup>2</sup> \* and Ana H. Ladio<sup>3</sup>

*1 INIBIOMA, CONICET, Universidad Nacional del Comahue, Bariloche, Argentina, <sup>2</sup> CONICET-Universidad Nacional del Comahue, Bariloche, Argentina, <sup>3</sup> Ethnobiology Group, INIBIOMA CONICET, Universidad Nacional del Comahue, Bariloche, Argentina*

Edited by:

*Urs Feller, University of Bern, Switzerland*

#### Reviewed by:

*Qinfeng Guo, United States Forest Service (USDA), United States Karl Kunert, University of Pretoria, South Africa*

> \*Correspondence: *Adriana E. Rovere adrirovere@gmail.com*

#### Specialty section:

*This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Ecology and Evolution*

> Received: *06 September 2017* Accepted: *11 December 2017* Published: *22 December 2017*

#### Citation:

*Betancurt R, Rovere AE and Ladio AH (2017) Incipient Domestication Processes in Multicultural Contexts: A Case Study of Urban Parks in San Carlos de Bariloche (Argentina). Front. Ecol. Evol. 5:166. doi: 10.3389/fevo.2017.00166*

Up to now, the processes of domestication of urban landscapes have been little studied. The public green spaces in the city of Bariloche, an enclave with growing urbanization which lies within the Andino Norpatagonica Biosphere Reserve, offer an opportunity to evaluate cultural molding of the environment. We analyzed different management methods of woody species, both *in situ* and *ex situ*, in parks located in sectors with different environmental, socioeconomic, size, age and administration characteristics. Our hypotheses were: (1) Species richness will be higher for exotic plants, in accordance with global patterns of ornamental species selection. (2) Species richness and type of management practice will vary according to the kind of environment, the socioeconomic profile of the neighborhood, the age and size of the park, and type of administration (bottom-up or top-down). (3) Bottom-up park administration will lead to a different landscape than top-down administration. Thirty randomly selected parks of both local council and neighborhood administration and varying environmental and socioeconomic conditions were examined and the composition of their woody species identified. In addition, semi-structured and free interviews were carried out with those responsible for park management, both *in situ* (tolerance, enhancement, protection) and *ex situ* (sowing, use of cuttings or transplanting). In accordance with our hypothesis, the processes of domestication of the urban landscape show a tendency toward an anthropized diversity of 130 species, mainly exotic in origin (72%), and principally from the Holarctic region (67%).
However, multinomial logistic analysis revealed that tolerance of native species is 13 times higher in parks under neighborhood administration than in parks administered by the local council. Species richness increases along an environmental and socioeconomic gradient, and with the age of parks, but does not vary with size. We conclude that urban parks are constructed cultural niches which, as in an agroforestry system, are scenarios which reveal processes of incipient domestication that reflect different cosmovisions and drivers typical of multicultural contexts.

Keywords: biosphere reserve, cultural niche construction, domestication, ornamental species, urban spaces, woody plant diversity

# INTRODUCTION

Human beings have been the principal drivers of change in the earth's ecological systems since the Holocene, 10,000 years ago (Smith and Zeder, 2013; Zeder, 2015). Molding of landscapes and domestication of plants and animals are some of the most important humanization processes (Kareiva et al., 2007). Due to the fact that the world population is concentrated in cities, these environments are currently the focus of particular interest (Pataki, 2015).

Cities are often described as new ecosystems or "novel ecosystems," since there were no analogous natural ecosystems previous to human population (Hobbs et al., 2006), and they are even called "novel biomes" (Pincetl, 2016). These urban ecosystems contain microenvironments and biological ensembles (such as urban parks), which in contrast to natural remnant ecosystems are constructed and designed by their inhabitants (Grimm et al., 2000). Urban vegetation is unique in that it consists of new assemblages of native and exotic tree species influenced by the biophysical conditions of the site, such as climatic factors, and also human drivers such as management or planting preferences (Aronson et al., 2015).

Urban parks can be seen as landscapes that synthesize the multiple interests (material, symbolic, emotional, etc.) of urban societies with their plant surroundings. They are essential spaces in the lives of city dwellers for various reasons, whether ecological, sociocultural or scenic (Finol, 2005; Tella and Potocko, 2009). Parks which have vegetation as a basic element of their composition can be defined as "green spaces," that is, public spaces which, due to the presence of plants, contribute to the wellbeing of their users and provide optimal conditions for sports or games, relaxation and rest (Saillard, 1962; Rodríguez-Laredo, 2008; Linhares de Souza et al., 2012). These spaces also have a role to play in the ecosystem, since they can act as a reservoir of biodiversity and wildlife (Ladio and Damascos, 2000; Rudd et al., 2002; McKinney, 2006; Nagendra and Gopal, 2011), as biological corridors facilitating connectivity with nearby conservation units (Rudd et al., 2002; Rovere and Molares, 2012), and as water flow regulators (Tella and Potocko, 2009; Argañaraz and Lorenz, 2010). Their role is currently so significant that the presence of a number of parks in a city is included as one of the principal indicators of urban habitability (Van Herzele and Wiedemann, 2003).

Even though parks have been managed in different ways to favor the wellbeing of those who use them for diverse reasons (Rozzi et al., 2003; Luck et al., 2009), they have been little studied using an ethnoecological approach. From this perspective we consider as management practices those actions which lead to the maintenance of, or even increase in, the biodiversity or abundance of plant species in an area, in such a way that their sustainability over time is favored (Berkes and Davidson-Hunt, 2006; Moreno-Calles et al., 2010). The domestication of species and landscapes is a particularly important consequence of the management of species of interest, and also those of no interest, given that they are often eliminated. In the case of species of interest, a process is carried out through artificial selection whereby humans choose individuals with certain heritable characteristics, which modifies the genetic, morphological and functional composition of the populations (Casas et al., 2014, 2016). Specifically, this incipient domestication process refers to a state that depends on the level of management intensity of plant populations, where the average phenotype of the selected character is maintained within the range of variation found under natural conditions (Clement, 1999).

These interventions, transformations or decisions regarding the elements and functional processes of natural or artificial systems have explicit socio-cultural objectives (Casas et al., 2014). According to current thinking, these practices are linked to the idea of niche construction (Zeder, 2015), a process through which organisms, by means of their activities and options, modify their own niches in order to transform the pressure of natural selection. This concept is particularly pertinent in the case of humanization processes, where considerable environmental modification is occasioned through cultural practices (Laland and O'Brien, 2011; Smith, 2011).

Casas et al. (1996) have identified different landscape management practices: in situ, which are carried out within the distribution area of the plant species, and ex situ, in sites where a species is not normally found. The first case includes situations where different components, whether native or exotic flora, are tolerated and allowed to grow, or are protected, particularly from frost and/or pests, or are weeded and pruned to favor flowering or fruiting, amongst other cultural care practices (González-Insuasti and Caballero, 2007; Eyssartier et al., 2011; Moreno-Calles et al., 2012; Parra et al., 2012). In ex situ management, on the other hand, plants may be specially cultivated by seed, cuttings and/or seedlings (Blancas et al., 2010; Duque-Brasil et al., 2011) obtained from wild areas (Eyssartier et al., 2011; Moreno-Calles et al., 2012; Parra et al., 2012), plant nurseries and/or nearby gardens.

Urban flora has been studied from different points of view, but principally from the perspective of urban ecology. It has been found that socio-economic and environmental drivers are important in determining patterns of urban plant richness (Luck et al., 2009; Avolio et al., 2015); for example, proprietors' incomes correlate positively with plant richness, a relationship which has been defined as the "luxury effect" (Hope et al., 2003). Urban flora can thus be considered as a mosaic of small, public or private plots or patches along social and environmental gradients, such that each plot has its own set of drivers, and is molded by the kind of management or administration it is subject to (Avolio et al., 2015).

Analysis of changes or the impact of urbanization on local ecosystems has been related to socioeconomic (e.g., income, level of education) and/or environmental (e.g., precipitation, soil fertility) drivers (Luck et al., 2009; Avolio et al., 2015). Plurality of owners and custody types, for example of urban parks, can therefore contribute enormously to the diversity of management styles of the vegetation, thus influencing the structure, composition and distribution of plant communities throughout the urban landscape (Avolio et al., 2016). For example, decisions on management may be top down, like those imposed by planning guides, conservation obligations or owners' associations, or they can be bottom-up, such as those carried out by individuals hired to manage the space, local groups of friends or individual residents and garden owners (Kendal et al., 2012).

Preliminary studies show that the selection of plant species for green spaces in cities is not a random process, but depends rather on decisions taken principally by municipal governments (Informe Ambiental Annual, 2008; Tella and Potocko, 2009; Rovere et al., 2013). However, those using the spaces often participate as principal actors, individually or collectively, whether representing neighborhood initiatives, NGOs, societies or institutions, public or private, and they also play a significant role in the conformation, care and use of these spaces. This landscape, therefore, is created and recreated by urban dwellers under different forms of custody, following diverse motives that may be sociocultural, economic, conservationist and/or symbolic in nature, amongst others (Rovere et al., 2013).

It has been found that global selection patterns for ornamental plants guide societies, principally in temperate areas, with the result that their floras are very similar to each other, and are dominated by the Rosaceae and Asteraceae families (Rovere et al., 2013). Added to this are cultural forces, such as the desire to imitate landscape criteria used in northern hemisphere countries, a phenomenon observed, for example, in the creation of different Patagonian cities in Chile and Argentina, generating a selection by inhabitants that is oriented toward exotic species (Rozzi et al., 2003; Rovere et al., 2013).

Characterization of the use of urban spaces, in this case urban parks, is complex and dynamic, and involves numerous factors that could determine detectable gradients (Juri and Chani, 2005). In this sense, urban green spaces could be scenarios where, through the action of their inhabitants, the horizontal and vertical diversification of species in the landscape is enriched, or simply where small patches of native plants from the surrounding area are conserved (Ospina-Ante, 2003). The function and ecological sustainability of urban landscapes is strongly influenced by the composition and structure of the local plant community (Threlfall et al., 2016). However, they may also be a source of propagules of invasive species which can cause serious environmental and economic damage (Ladio and Damascos, 2000; Rovere and Molares, 2012; Rovere et al., 2013). The detailed study of practices, selection and maintenance of species in urban parks is therefore of considerable importance in terms of conservation, as well as contributing to reflection on the role of humans in the construction of this kind of niche.

This particular study was carried out in the city of San Carlos de Bariloche, situated within the Andino Norpatagonica Biosphere Reserve (UNESCO, 2010). The urbanization of this enclave is relatively new and in the process of growth, although with a certain commitment to conservation due to its situation within a nature reserve, which gives us an opportunity to evaluate the cultural molding of this environment due to human action. Our objectives were to analyze the species composition and the different in situ and ex situ management methods of woody species in parks of San Carlos de Bariloche. The parks are distributed in sectors with environmental, socioeconomic, size, age, and management (bottom-up vs. top-down) differences. The principal hypotheses were: (1) Species richness will be greater for exotic species, in accordance with global selection patterns for ornamental species; (2) Species richness and type of management practice will vary according to the environmental category, socioeconomic profile of the neighborhood, age and size of the park and type of administration; (3) Bottom-up administration of parks will lead to a different landscape than top-down administration in San Carlos de Bariloche.

# METHODS

# Study Location and Data Collection

This study was carried out in the city of San Carlos de Bariloche (41° 08′ S and 71° 18′ W), in the northwest of Argentine Patagonia (Rio Negro province). The city is situated in an Andean Patagonian forest environment that lies within the limits of Nahuel Huapi National Park. The city extends 45 km (the widest extension in the country) and covers a total surface area of 22,376 ha. The west-east gradient is very marked, principally due to a decrease in precipitation caused by the orographic effect of the Andes. Climate in the region is temperate-cold and humid, with a Mediterranean-like precipitation regime, with rains and snow principally in winter. The city, founded in 1902, has a population of 133,500 inhabitants (INDEC, 2010). The community is pluricultural, with Mapuche inhabitants (the principal indigenous group of Patagonia), European immigrants and new immigrants from other urban areas of the country. Tourism is the main economic activity (Chebez, 2005). A large part of the city adjoins native forest or is situated within small isolated wooded areas with different levels of anthropic disturbance, while other neighborhoods lie within the forest-steppe ecotone (Dzendoletas et al., 2006).

There are many parks distributed throughout the urban territory, constructed by different organizations, of different sizes, age, and types of management and administration. This study was carried out in different stages: firstly, an inventory was made of the woody species present in 30 parks within the city area (**Figure 1**), and secondly, participant observation and interviews were carried out.

# Composition of the Vegetation

Identification of the woody flora was carried out during January and February of 2015, and from October 2015 to January of 2016, which correspond to spring and summer in the southern hemisphere, in order to find the highest possible number of species bearing flowers or fruit, since this aids identification. The botanical determination followed Correa (1971, 1984, 1988) and Dimitri (1977, 1978). All scientific names were updated using databases of the Instituto Darwinion (Zuloaga et al., 2008) and the Missouri Botanical Garden (www.tropicos.org). Plant specimens were placed in the Ecotono-University of Comahue herbarium. For each park, the name, size, geographical location, height, species composition and richness were recorded. Using the Instituto Darwinion and Missouri Botanical Garden databases, the botanical family each species belonged to was registered, plus its biogeographical origin (native or exotic). We considered a species as native if its natural distribution was found within Argentina, whereas those which arrived in the country accidentally or intentionally were considered exotic species (Zuloaga et al., 2008). The biogeographic kingdoms of origin of the species (Holarctic, Paleotropical, Neotropical, Cape, Australian, or Antarctic) were determined by consulting bibliographical sources.

# Ethnoecological Field Methods

Firstly, the consent of participants was obtained according to the Society of Ethnobiology Code of Ethics guidelines (ISE, 2006). The work was based on participant observation in the parks and personal interviews with key informants, preferably those either completely or partly in charge of managing these spaces. The key informants were specialists who had knowledge and experience of the practice in question (Albuquerque et al., 2010).

Special emphasis was placed on getting to know the people who actually looked after the parks, recording whether they were council employees, employees hired by the neighborhood committees or residents from the surrounding areas. Each of these individuals was consulted as to their responsibility regarding selection of the plants used, the criteria and type of management, age of the park, and type of administration. All this information was validated through participant and nonparticipant observation (compilation of records, photographs, articles, etc.) so as to become well informed as to the use of these spaces (Höft et al., 1999; Albuquerque et al., 2014).

# Age and Type of Administration

With regard to age of the parks, 6 categories were used: from 1-10 years old (1), 10-20 (2), 20-30 (3), 30-40 (4), 40-50 (5), and over 50 years (6). For analysis of the type of administration, based on the interviews carried out and journalistic and council information, the administration types were classified as council (top-down) and neighborhood (bottom-up). The first corresponds to spaces maintained exclusively by squads of personnel from the Council Parks and Gardens department, while the parks with neighborhood management are administered and maintained by members of associations or neighborhood committees (either the members themselves or hired workers) and residents of the area.
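The six age categories above can be written as a small lookup. This is only a sketch: the source ranges share their endpoints (10, 20, and so on), so the function below assumes that a boundary age falls in the lower category, and the name `age_category` is hypothetical.

```python
def age_category(age_years: float) -> int:
    """Return the park age category (1 to 6) described in the text.

    Assumption (not stated in the source): ages that sit exactly on a
    shared boundary (10, 20, 30, 40, 50) are placed in the lower category.
    """
    if age_years <= 0:
        raise ValueError("age_years must be positive")
    upper_bounds = [10, 20, 30, 40, 50]  # upper limits of categories 1 to 5
    for category, upper in enumerate(upper_bounds, start=1):
        if age_years <= upper:
            return category
    return 6  # over 50 years
```

For example, under this assumption a 15-year-old park falls in category 2 and a 60-year-old park in category 6.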

# Environmental and Socio-Economic Gradient

Categories were established for the environmental conditions where the parks are situated. These categories were based on previous knowledge and observation of each space, taking into account factors such as exposure to wind, precipitation and soil characteristics, which form part of the biophysical characterization of other, similar studies (Luck et al., 2009; Avolio et al., 2015). Values from 1 to 4 were established, based on the level of environmental stress, where (1) is high, with direct exposure to strong winds, precipitation at the low end of the range for the area (less than 800 mm annually) and poor soil; (2) is a medium to high level, characterized by considerable winds, medium to low precipitation (800–1,000 mm) and rather stony soil; (3) is medium to low, grouping sites with relatively low wind exposure, medium to high precipitation (1,000–1,400 mm) and soil which is less than favorable; finally, (4) represents low environmental stress, with low wind exposure, high levels of precipitation (1,400–2,000 mm) and fertile soils rich in organic material. The characteristics of wind exposure, precipitation and soil type were obtained from Dzendoletas et al. (2006) and Pereyra (2007).

The different socioeconomic levels of the city parks were also classified into categories, thus establishing possible socioeconomic gradients. To this end we considered the grouping carried out in 2009 and 2013, presented in reports from the Regional Studies Centre of the Universidad Fasta in Bariloche, in which socioeconomic categories were drawn up from information obtained through cadastral surveys in different neighborhoods of the city. This information was cross-checked with the income of the active working population and the value of the basic food basket, thus obtaining the following classification: low class, corresponding to the sector of the population whose income is no more than U\$S 980 per month; lower middle class, with an income of between U\$S 980 and U\$S 1,230; mid-middle class with U\$S 1,230 - U\$S 2,430; upper middle class with an income of U\$S 2,430 to U\$S 2,700; and finally, upper class, with salaries over U\$S 2,700. Following the criteria detailed above, each park was assigned one of the following categories: low class (1), lower middle class (2), mid-middle class (3), upper middle class (4) or upper class (5).
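As an illustration of the income thresholds listed above, the following sketch maps a monthly income to the five ordinal categories. In the study the categories were assigned to parks from neighborhood-level reports, so this function is only a hypothetical restatement of the thresholds; the treatment of incomes exactly at U\$S 1,230 and U\$S 2,430 is an assumption, since the source ranges share those endpoints.

```python
from bisect import bisect_left

# Upper income limits (US dollars per month) of categories 1 to 4, from the text.
INCOME_UPPER_LIMITS = [980, 1230, 2430, 2700]


def socioeconomic_category(monthly_income_usd: float) -> int:
    """Return the socioeconomic category (1 = low class ... 5 = upper class).

    Assumption: an income exactly equal to a limit belongs to the lower
    category. This matches "no more than 980" and "over 2,700" in the text,
    but is a guess for the 1,230 and 2,430 limits.
    """
    if monthly_income_usd < 0:
        raise ValueError("income cannot be negative")
    return bisect_left(INCOME_UPPER_LIMITS, monthly_income_usd) + 1
```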

# Management Practices

For analysis of management methods and cultural care, Casas et al. (1996) were followed, whose categories are based on consideration of the different human interventions in the landscape, divided into: in situ and ex situ. Management in situ refers to interactions carried out within the space occupied by the plant species and can be grouped as follows: tolerance (the vegetation established prior to human intervention is maintained in the space), enhancement (seeks to increase the population density of plants, an example being irrigation) and protection (refers to actions such as elimination of competing or predatory species, application of fertilizers, pruning, protection against frost, the use of stakes, etc.). In contrast, ex situ methods include human-plant interactions that take place outside the areas normally occupied by a plant species; these are habitats created and controlled by humans. Two classes are distinguished: the sowing of seeds or planting of cuttings (artificial propagation of reproductive or vegetative structures) and transplanting (the movement of complete individuals from natural sites or other humanized spaces, such as plant nurseries). This information was obtained during interviews with those in charge of maintenance and/or participant observation. A questionnaire was used to identify the different actions taken in relation to the species and the selection processes of the flora, and questions were asked about the design and maintenance work carried out in each park. This information was verified through direct observation of the cultural care currently in force and the richness of species established in each park, and was then entered in a database. The presence or absence of different practices was classified for each park as 1 or 0, respectively.

# Data Analysis

Quali-quantitative analysis of the ethnoecological data was carried out (Albuquerque et al., 2014). Due to the categorical nature of the data, the analyses were principally non-parametric and multinomial logistic, using the SPSS 23 package for Windows. Species richness was calculated per park and for all the parks studied; the number of native and exotic species was also determined, as well as the richness of botanical families per park and in total (binomial test and χ², p < 0.05; Conover, 1971). The mean richness of species per park was compared for exotics and natives (Mann-Whitney, p < 0.05). The degree of invasion (DI) per park was compared under different management types (Mann-Whitney, p < 0.05), with DI defined as follows:

$$DI = \frac{\text{exotic richness}}{\text{(exotic} + \text{native) richness}}$$
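As a minimal sketch of the ratio above, DI can be computed per park from the counts of exotic and native species. The function name and the example counts are illustrative assumptions, not values taken from the study.

```python
def degree_of_invasion(n_exotic: int, n_native: int) -> float:
    """Return DI = exotic richness / (exotic + native) richness."""
    total = n_exotic + n_native
    if total == 0:
        raise ValueError("park has no recorded species")
    return n_exotic / total


# Hypothetical park with 12 exotic and 4 native species (not study data).
di_example = degree_of_invasion(12, 4)
print(di_example)  # 0.75
```

DI ranges from 0 (no exotic species) to 1 (only exotic species), so values from parks of different total richness remain comparable.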

In addition, a Poisson regression analysis was performed in order to evaluate relationships between exotic and native species for each type of administration (top-down and bottom-up). The relation between total species richness, the number of native and exotic species, and the area and age of the parks was evaluated (Spearman Correlation, p < 0.05; Höft et al., 1999). The use consensus or cultural importance (CI) index, frequently used for analysis of ornamental species (Rovere et al., 2013), was calculated for the species and/or families (Ladio and Lozada, 2008; Molares and Ladio, 2009):

$$CI = \frac{\text{no. of parks containing the species or family}}{\text{total no. of parks}} \times 100$$
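The index can be sketched directly from the 1/0 presence coding described for the park database. The function name and the presence vector below are hypothetical and serve only to illustrate the arithmetic.

```python
from typing import Sequence


def cultural_importance(presence: Sequence[int]) -> float:
    """CI = (parks containing the taxon / total parks) * 100.

    `presence` holds one entry per park, coded 1 (present) or 0 (absent).
    """
    if not presence:
        raise ValueError("no parks supplied")
    return 100 * sum(presence) / len(presence)


# Hypothetical presence/absence record for one species across 10 parks.
ci_example = cultural_importance([1, 1, 0, 1, 0, 0, 1, 1, 0, 1])
print(ci_example)  # 60.0
```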

The Spearman correlation test (p < 0.05) was used to evaluate the association between total richness, including native and exotic species, and the categories describing the different environmental and socioeconomic conditions. Parks with bottom-up administration were compared with top-down (council) parks in terms of total species richness, richness of native and exotic species, and that of botanical families (χ<sup>2</sup>, p < 0.05; Conover, 1971). In addition, similarity in species composition (native, exotic, and total, i.e., native plus exotic) was analyzed using the Jaccard index (IJ):

$$IJ = \frac{c}{a+b+c} \times 100$$

where c is the number of species common to both types of management, a is the number of species present in only one of the management categories and b is the number of species present in only the other type of management (Höft et al., 1999).
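A short sketch of this index using Python sets follows. The two species lists are invented for illustration (the names appear in the study, but the assignment to management types here is hypothetical).

```python
def jaccard_percent(species_a: set, species_b: set) -> float:
    """Jaccard similarity as a percentage: 100 * c / (a + b + c)."""
    c = len(species_a & species_b)   # species common to both management types
    a = len(species_a - species_b)   # species found only in the first type
    b = len(species_b - species_a)   # species found only in the second type
    if a + b + c == 0:
        raise ValueError("both species lists are empty")
    return 100 * c / (a + b + c)


# Hypothetical species lists, not the recorded composition of the parks.
council = {"Rosa sp.", "Betula pendula", "Sorbus aucuparia"}
neighborhood = {"Maytenus boaria", "Betula pendula",
                "Sorbus aucuparia", "Schinus patagonicus"}
print(jaccard_percent(council, neighborhood))  # 40.0
```

Here c = 2, a = 1 and b = 2, so the index is 100 × 2 / 5 = 40%.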

In addition, a multinomial logistic regression analysis was carried out with the SPSS 23.0® program in order to obtain a model describing how the probability of different management types varied with the type of administration, bottom-up or top-down (dependent categorical variable; Agresti, 1996; Chan, 2005). Thus, the algebraic model was

$$\ln\left(\frac{p}{1-p}\right) = \beta_0 + \beta_1\,\text{tolerance} + \beta_2\,\text{enhancement} + \beta_3\,\text{protection} + \beta_4\,\text{sowing} + \beta_5\,\text{transplanting}$$

where p/(1 − p) is the odds, i.e., the probability that an event will happen in relation to the probability that it will not. In this kind of regression, the tendencies are established relative to a reference category (in this case, the category bottom-up). The model allows us to see the impact of each factor while controlling for the other factors, and so the probability of each event occurring can be established. The model we found was significant (p < 0.05), with a high goodness of fit measure (Pearson and Deviance indices, p > 0.05). The odds ratios (i.e., the multiplicative change in the odds associated with each practice) were calculated as e<sup>β</sup> = Exp(β) (Agresti, 1996; Chan, 2005).
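To illustrate how a fitted model of this form translates into a probability, the sketch below inverts the logit for a single park. The intercept and coefficients are hypothetical placeholders chosen for easy arithmetic; they are not the values estimated in the study, and the function name is an assumption.

```python
import math


def logit_to_probability(intercept: float, coefs: dict, practices: dict) -> float:
    """Invert ln(p / (1 - p)) = b0 + sum(b_i * x_i) to recover p."""
    eta = intercept + sum(coefs[name] * practices[name] for name in coefs)
    return 1.0 / (1.0 + math.exp(-eta))


# Hypothetical coefficients (NOT the fitted values from the study).
coefs = {"tolerance": 1.0, "enhancement": 0.0, "protection": 0.0,
         "sowing": 0.0, "transplanting": 0.0}
# Presence (1) or absence (0) of each practice in one hypothetical park.
practices = {"tolerance": 1, "enhancement": 1, "protection": 0,
             "sowing": 0, "transplanting": 1}

p = logit_to_probability(-1.0, coefs, practices)
print(p)  # 0.5
```

With these placeholder values the linear predictor is −1 + 1 = 0, which corresponds to odds of 1 and a probability of 0.5.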

# RESULTS

# Vegetation Composition

As expected, considering global patterns in relation to useful plants, a total richness of 130 species was registered (Supplementary Table 1), of which 72% were exotic and 28% native species (Binomial test, p < 0.05). These plants belong to 36 botanical families. We also found that mean exotic richness is significantly higher than for native species (Mann-Whitney test, U: 233, p < 0.05, **Figure 2**). However, no significant differences were found in degree of invasion (DI) between the different types of management (Mann-Whitney test, U: 69, p = 0.57). No relationship was found between richness of native and exotic plants for bottom-up (r<sup>2</sup> = 0.01; p = 0.708) or top-down (r<sup>2</sup> = 0.36; p = 0.157) administration (**Figure 3**). The majority of plants present were identified to species level, except for 3 unidentified specimens, some other specimens which belonged to the Picea, Prunus, and Pyrus genera, some belonging to the Cupressaceae family and those belonging to the genus Rosa (except Rosa rubiginosa). The most frequently represented botanical families in the parks were Rosaceae (93%) and Pinaceae (70%) (**Figure 4**). The species with the highest CI were the native Maytenus boaria (57), the exotic Cytisus scoparius (53), Sorbus aucuparia (47), Betula pendula (43), Rosa sp. (43), the native Schinus patagonicus (43) and the exotic Prunus cerasifera (40) (Supplementary Table 1).

With regard to the biogeographical kingdom of origin of the species, 68% come from the Holarctic, 29% from the Antarctic, 9% from the Paleotropical, and 5% from the Neotropical kingdom (Supplementary Table 1). No species were identified in this case from the Cape or Australian kingdoms.

# Richness in Relation to Age and Size

The parks studied range in size from 804 to 21,592 m<sup>2</sup>, with an average size of 4,676 (± 4,601) m<sup>2</sup>. A positive relationship was observed between the area of the park and both total species richness and exotic species richness (p < 0.05; r = 0.455 and r = 0.534, respectively). Furthermore, total richness and exotic species richness were found to be positively correlated with the age of the park (p < 0.05; r = 0.644 and r = 0.675, respectively). No significant correlation was found between the number of native species and the area or age of the park (p = 0.0820; r = −0.043 and p = 0.924; r = 0.018 respectively).
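For readers unfamiliar with the Spearman coefficient used here, the following pure-Python sketch computes it as the Pearson correlation of average ranks. The areas and richness values are hypothetical, the function names are assumptions, and no p-value is computed; the study itself used SPSS.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation: Pearson correlation applied to the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sx * sy)


# Hypothetical park areas (m2) and total species richness, not study data.
areas = [1000, 2500, 4000, 9000, 20000]
richness = [5, 9, 8, 15, 20]
rho = spearman_rho(areas, richness)
print(round(rho, 3))  # 0.9
```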

# Richness across the Environmental and Socioeconomic Gradients

The total species richness of parks was found to increase with the more favorable categories in both environmental (r = 0.418, p < 0.05) and socioeconomic (r = 0.636, p < 0.05) terms. In addition, richness of both native and exotic species increased with the socioeconomic gradient category (r = 0.453, p < 0.05 and r = 0.496, p < 0.05, respectively). When considering the environmental gradient, no relation was found between this and the number of native species (r = 0.252, p = 0.18), or exotic species (r = 0.217, p = 0.250).

# Bottom-Up vs. Top-Down Administration and Management Practices

The management systems of the parks in Bariloche fall into 5 categories (both ex situ and in situ), the most frequent being protection (87%), transplanting (87%), tolerance (60%), and enhancement (60%). Cases of seed sowing in parks were not registered in this work. In the case of protection, the activities recorded were staking (33% of the total number of parks) and pruning (77% of all parks).

Most parks (23, 77%) are neighborhood-run (bottom-up), and are spaces whose maintenance is carried out by members of the corresponding neighborhood committees, associations or individual residents. Seven parks (23%) are top-down managed, in the charge of squads from the Parks and Gardens department of Bariloche Local Council. The council-run parks hold 79 species in total, of which most are exotic (78%). In the neighborhood parks, on the other hand, 99 species were found, of which 65% were exotic and 35% native. Comparing the council-run (top-down) and neighborhood-run (bottom-up) parks, it was found that the former had significantly higher total richness (Mann-Whitney test, U: 39, p < 0.05, **Figure 5**) and exotic species richness (Mann-Whitney test, U: 31, p < 0.05, **Figure 5**). No significant differences were found in the number of native species (Mann-Whitney test, U: 79, p = 0.962), or in the area covered by the green spaces (Mann-Whitney test, U: 56, p = 0.245) when the two types of management were compared. Total species similarity for the two types was 36% (IJ), for native species 46% (IJ), and for exotic species 32% (IJ).
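The U values reported above come from the Mann-Whitney test. As a rough sketch of how such a statistic is obtained, the code below counts pairwise comparisons between two groups and returns the smaller of the two U values, which is one common reporting convention (an assumption here, since the text does not state which U is reported). The richness values are hypothetical, and no p-value is computed.

```python
from typing import Sequence


def mann_whitney_u(sample_a: Sequence[float], sample_b: Sequence[float]) -> float:
    """Return the smaller of the two Mann-Whitney U statistics (ties count 0.5)."""
    if not sample_a or not sample_b:
        raise ValueError("both samples must be non-empty")
    u_a = 0.0
    for a in sample_a:
        for b in sample_b:
            if a > b:
                u_a += 1.0
            elif a == b:
                u_a += 0.5
    u_b = len(sample_a) * len(sample_b) - u_a
    return min(u_a, u_b)


# Hypothetical species richness per park for each administration type.
council_richness = [30, 25, 28]
neighborhood_richness = [10, 12, 27, 8]
u_stat = mann_whitney_u(council_richness, neighborhood_richness)
print(u_stat)  # 1.0
```

A small U relative to the product of the group sizes (here 3 × 4 = 12) indicates that values in one group tend to exceed those in the other.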

As expected, the parks with bottom-up administration apply different management practices from those with top-down administration. In the council-administered (top-down) parks, the most frequent practices are transplanting, enhancement and the planting of cuttings (**Figure 6**). The practices of protection, enhancement and transplanting, however, are more frequent in the neighborhood parks (bottom-up). The practice of protection is markedly similar in both park types (χ<sup>2</sup> = 0.007, p = 0.933, **Figure 6**). In the council parks the only means of protection registered is pruning, whereas in the neighborhood parks pruning was registered (74%), but also the use of stakes (43%).

Similarly, no significant differences were found between council and neighborhood parks for the practices of transplanting (χ<sup>2</sup> = 1.405, p = 0.236) or enhancement, which includes irrigation and removal of competing species (χ<sup>2</sup> = 2.516, p = 0.113). In the neighborhood parks both irrigation and removal of competing species were registered, while in the council parks only irrigation was carried out. This occurs because the latter have areas covered by lawn, and so the appearance of other woody species is unlikely. Park irrigation is generally carried out by the residents/those in charge, by hand, with water being provided either from the park itself, outside sources (e.g., a house), or spray irrigation systems. Irrigation is used in 86% of council and 39% of neighborhood parks. As to the removal of competitive species, in this work the extraction of two exotic invasive species, retama (Cytisus scoparius) and rosa mosqueta (Rosa rubiginosa), was registered in 5 neighborhood parks (3 of which had no irrigation). The practice of tolerance is significantly more frequent in bottom-up managed parks than in top-down ones (Binomial test, p < 0.05), while the planting of cuttings is more frequent in council parks, where pruning remains are taken advantage of, thus reducing the need to purchase plants (Binomial test, p < 0.05).

The practice of protection is similar in both types of park, although in the case of council administration this percentage represents pruning carried out to prevent branches breaking due to the weight of snow in winter. No staking, however, was registered in these parks, since they are older than the other ones in the city, and the plants are therefore adult specimens. In contrast, the neighborhood parks use the protection techniques of pruning and staking; the presence of stakes indicates that there are much younger plant specimens here. In relation to transplanting, no significant differences were found between the two types of park, although there was a tendency for this to be more common in council parks, given the high proportion of Rosa sp. introduced through cuttings; these are in fact the only species to be introduced in this way.

The multinomial logistic regression model (**Table 1**) confirms the tendency mentioned above, considering all the factors together. The results show that although there are variations in forms of management depending on who is in charge of maintaining the parks, only tolerance differs significantly in the presence of the other forms of management (**Table 1**), being 13 times greater in neighborhood parks than in council ones (β = 2.583).
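The "13 times" figure follows from exponentiating the reported coefficient, since the odds ratio is Exp(β). A quick check of that arithmetic, with variable names chosen here for illustration:

```python
import math

# Coefficient for tolerance as reported in the text (beta = 2.583).
beta_tolerance = 2.583

# Odds ratio = Exp(beta): the multiplicative change in the odds.
odds_ratio = math.exp(beta_tolerance)
print(round(odds_ratio, 1))  # 13.2
```

Strictly, this is a ratio of odds rather than of probabilities, so "13 times greater" is best read as "the odds are about 13 times greater".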

# DISCUSSION

# The Total Species Richness of Parks in Bariloche Reflects a Europeanizing Cosmovision

In agreement with our first hypothesis, the species richness of parks in Bariloche is mainly composed of exotic species (78%), an aspect which appears to be shared with other parks worldwide (Nagendra and Gopal, 2011; Linhares de Souza et al., 2012; Kowarik et al., 2013). The species found in Bariloche parks are common in local urban flora (Damascos et al., 1999) and in parks and public spaces throughout Argentina (Alvarez et al., 2009). In line with this, in terms of ornamental plant availability in local and regional plant nurseries, exotic species predominate over native ones. This could be a consequence of lack of knowledge of native plants or the techniques required for their cultivation, and also the preferences mentioned previously. In addition, exotic flora often present more rapid growth than many native species, and reproduce very well through vegetative propagation at a low cost (Rosa, Cotoneaster, Photinia, Ligustrum, etc.). These plants are very attractive, both to tourists and residents, due to their diversity and contrasting leaf colors, and the size of their fruit and flowers.

Nevertheless, the exotic species used for ornamental purposes do present disadvantages in terms of cultural care, since they often require fertilizing, protection from frost, and more frequent watering. Their major impact, however, is doubtless their naturalization and invasion of ecosystems (Rovere et al., 2015). Native species, in contrast, have developed in the area and generally require less cultural care. They have co-evolved with pollinators and dispersers, and are very important for biodiversity. Native species form part of the local biocultural heritage, and so their diffusion not only helps to preserve natural ecosystems, minimizing the spread of exotic species, but also promotes appreciation of and respect for the local biocultural system.

The total richness of woody species registered in Bariloche parks (130 species) is similar in number to the ornamental woody species (137) used for hedging in eight Patagonian cities, including Bariloche (Rovere et al., 2013). These two situations reflect similarity in the value placed on exotic plants as environmental assets of cultural importance, and reveal processes of construction of an anthropized landscape that have prioritized the showy species of Eurasia. Most species found in Bariloche parks correspond to the Holarctic biogeographical kingdom (68%), following this tendency. This result coincides with the findings of Rozzi et al. (2003) in their analysis of native and exotic trees in the parks of Magallanes, in the south of Chile, where between 70 and 100% of the species are of European or North American origin.

TABLE 1 | Multinomial logistic regression model, considering type of administration as the dependent variable. The presence or absence of management practices (tolerance, enhancement, protection, planting of cuttings and transplanting) was considered the independent variable.

*(A) Likelihood ratio test. (B) Parameter estimates. Odds ratios are calculated as e<sup>β</sup> = Exp(β). β, beta; S.E., standard error; Wald, the value of the Chi-squared test; df, degrees of freedom; Sig., level of significance; Exp(β), calculated odds ratios.* \**Significant values.*

It is important to note that the parks in Bariloche, of both administration types, are mainly used by local children and young people, while the top-down parks are more frequently visited by tourists, given that they are mainly located in the downtown area. Nonetheless, we found no evidence to suggest that the preferences of tourists are considered by those who take care of the parks; decisions are made according to the particularities and preferences of local management.

Our results demonstrate in these landscapes the projection of conceptions and values of the hegemonic cultures which dominate the market, with species mainly from the northern hemisphere. In agreement with Rovere et al. (2013) in the case of hedges, the preferences of the first Swiss, German, and Austrian immigrants have been strongly expressed in the city's parks, reproducing the "Argentine Switzerland" idea of landscape domestication.

The values found here for total richness indirectly show a set of actions over time, from the foundation of the city 115 years ago, which has molded the urban vegetation and favored certain plants over others. Based on a Eurocentric conception that what is most beautiful and valuable comes from that region, native biological and cultural diversity has been underestimated and undervalued (Rozzi et al., 2003; Roger et al., 2014). Where the city of San Carlos de Bariloche lies today, there lived indigenous Mapuche communities which were decimated during the Desert Campaign organized by the Argentine Government at the end of the nineteenth century (Moyano, 2017), and so all the biocultural richness of this region was also decimated and discriminated against.

Nevertheless, this bias in the domestication of park vegetation toward the construction of landscapes that are very distant from the native forests of the region seems to have been undergoing a process of change in recent times. This is due to multiple factors rooted in changes occurring in the cosmovision of urban society, now moving toward its own ideas of conservationism. Although no significant relation was found between the age of the parks and native plant richness, we did demonstrate a luxury effect in relation to native plants, such that parks situated in neighborhoods with higher socioeconomic levels are associated with higher native plant richness. In the neighborhood-administered parks, which are generally newer, the odds of finding practices of tolerance are 13 times greater, evidencing greater interest in native plants.

It is to be expected that the presence of almost 30% species of Antarctic origin in urban parks of Bariloche will increase in the future, especially with the spread of conservationist ideas on native flora, due to the mass media and environmental education programs in schools of the region. The temperate forest area of Argentina and Chile is home to many attractive, ornamental native species, which in fact are used in other parks and gardens around the world (Puntieri and Grosfeld, 2009). Historical records reflect patterns of colonialism in Patagonian urban flora, which has already been pointed out by Rapoport (1988) and reveals how profound processes of cultural domination markedly mold the landscapes, even those used for recreation.

The parks in San Carlos de Bariloche therefore constitute a model to be followed, in terms of species selection and management, at both regional and national levels. Even though exotic species still predominate, the parks are slowly being enriched with ornamental species from the temperate forest, and local residents are becoming involved in the care of these parks.

# Incipient Domestication of Native Plants?

Although more research is required, our work enables us to assert that the different management practices and the use of some native ornamental species, as well as exotic species, reflect processes of incipient domestication in urban plots. For example, the native tree species Maytenus boaria, of great cultural importance in Bariloche parks and subjected to different practices of tolerance, protection and enhancement, is a case worthy of future study. This species is also the most frequently used for hedging, according to a study carried out in 8 Andean Patagonian cities (Rovere et al., 2013). It is a perennial tree with numerous uses (Rapoport et al., 2001), which tolerates constant pruning of its foliage in the maintenance of hedges and parks, and thermal protection due to urbanization. It has a high capacity for post-disturbance regeneration (Damascos et al., 1999) and its seeds are dispersed by urban birds (Amico and Aizen, 2005), attributes which confer a great adaptive capacity and make it ideal for domestication.

This case shows how domestication is a collective process, with fluctuations in progress over time. Further research along these lines could be carried out to determine the genetic variability of native species growing in urban environments in comparison with those found in the surrounding wild areas.

It should be pointed out that the parks of Patagonia are important spaces for the conservation of our patrimony, both natural and cultural (Rozzi et al., 2003); they can contribute to the conservation of endemic and/or endangered species, where domestication takes on added value in terms of conservation of biodiversity.

# The Largest, Oldest Parks Hold More Anthropized Exotic Vegetation

The average area of Bariloche parks is highly variable (4,676 ± 4,601 m<sup>2</sup> ), but in comparison with a study carried out on 22 parks in the city of Aracaju (Brazil), the area occupied by each park is significantly less (Linhares de Souza et al., 2012), revealing the need to increase the number or size of parks in the city.

In accordance with our second hypothesis, and as a consequence of the reasons mentioned above, the largest and oldest parks are those which have highest species richness, particularly considering exotic species, a pattern which has been found in other parks around the world. For example, in the parks of Aracaju (Brazil), 58% of the plants are exotic species (Linhares de Souza et al., 2012). Changes in plant species have also been documented across urban gradients (Aronson et al., 2015; Threlfall et al., 2016). On analyzing the pattern of urbanization and the species richness of native and non-native woody species in the metropolitan region of New York, an urban-rural gradient was registered such that native species decrease and non-native species increase with urban coverage, and the flora is dominated by non-native species (Aronson et al., 2015), a pattern similar to that registered in the city of Berlin (Kowarik et al., 2013), and in other European cities.

In this work, however, native species richness did not vary with size or with the age of the park. Park age was not related to greater richness of native species since the older parks were designed in sectors with no vegetation, under the cosmovision of the time of their creation, and so mainly exotic species were used. Park size was not related to native species richness, but was possibly related to their abundance, given the higher density of species observed in larger parks. Since our analysis did not consider species abundance, our results are limited. Nevertheless, we can say indirectly that native plants continue to be represented in the parks due to the effects of particular sites which compensate for processes of anthropization with exotic plants, but there is probably a lower number of individuals in each of these areas.

On the other hand, most of Bariloche's urban parks (93%) are more than 10 years old, which allows us to say that the selection of exotic over native species has been prevalent in the history of Bariloche parks. Although the value of native flora is currently being given more importance, the effects of this will be seen in the medium or long term, together with an increase in new parks following a more conservationist cosmovision. In this study, in the older parks we have not seen the replacement of exotic trees with native trees; they have not been replaced because the trees are still healthy, and Bariloche society respects them as part of their tree cover. All we have seen is a woody species being eliminated if this species has been causing problems.

In general terms we can say that human activity, through management practices such as cultivation, protection, and enhancement, tends to increase the diversity of species in the zones where they are carried out (Berkes and Davidson-Hunt, 2006). In India the oldest parks have been recorded as having few large trees, but a higher diversity of species than newer or recently established parks, since the large trees are gradually replaced by trees of other, smaller species that are easier to maintain (Nagendra and Gopal, 2011). In our case, since Bariloche is a city with a short history (115 years since its foundation), these replacements have not been necessary, given that the longevity of the exotic species exceeds the age of most of the parks.

# Not All Parks Are the Same: Environmental and Socio-Economic Gradients

In support of our second hypothesis, the total species richness of Bariloche's parks increases with better environmental and socioeconomic conditions. This coincides with the "luxury effect" mentioned by Hope et al. (2003). The neighborhoods with a better socioeconomic position, in general situated in the more ecologically favorable areas of the city, usually count on trained gardeners who are hired exclusively to maintain these green spaces.

Similarly, a study on urban trees in southern California found that socioeconomic situation was a more important driver than the environmental aspect, registering in general higher richness of trees in richer neighborhoods (Avolio et al., 2016). In addition, this shows that the species richness of urban trees is influenced by the preferences and perceptions of both managers and residents (Avolio et al., 2016). In our case this has a substantial effect, given that in the less environmentally and socioeconomically favorable sites, those involved in administering the parks have commented on limitations in terms of irrigation, pruning and adequate maintenance of the parks. The interviews suggest that if conditions were more favorable for maintenance of the species, the parks would be better managed and cared for than they are at present. We can therefore affirm that landscape domestication in urban parks has drivers which have been strongly established by the inequality of conditions found throughout the city.

Although we did not study in more detail whether educational level is an important factor in the tendency to change to parks with more native plants, we do not think that there is a linear relationship with the level of education. Our studies in rural areas with illiterate inhabitants always show that inhabitants value and care deeply for the environment.

# Top-Down vs. Bottom-Up Park Administration

Council-run parks present higher total species and exotic species richness than neighborhood-run, but the two types of park are similar in terms of native species richness and degree of invasion. Furthermore, no relationship was found between richness of native and exotic species for each type of administration. Perhaps if species abundance (density or cover) had been considered, the effects of management would have been more evident. There is no doubt that the action of the council (as assessor or species donor) in the forestation of these green spaces is the most relevant factor in determining the composition of exotic species. The council parks, most of which are older and situated in sectors that were originally devoid of vegetation, are found ornamented mainly with foreign species. This arose mainly as a consequence of action carried out in the 1940s by the Isla Victoria National Plant Nursery (Vivero Nacional de la Isla Victoria), which formed part of the National Parks administration in its last years of existence, during which time it donated a percentage of its production to parks, public areas, schools and gardens (Vallmitjana, personal communication).

In contrast, the neighborhood parks, generally newer, had no support of this kind, and are scenarios where the practice of tolerance of some native plants is most frequent, together with enrichment with new exotic species. One of the most interesting aspects is that native species richness does not vary between the two types of administration, probably because both types have low richness with high standard deviation, for example, when compared with forest remnants on vacant urban lots (Damascos et al., 1999). Most of the council-run parks had no woody specimens at the time of their creation, since the zone where the city now lies was converted from an agricultural colony, and the land had been totally cleared of vegetation. Therefore, tolerance is low in these parks because there were no species to tolerate; the sites were instead reforested. The neighborhood-run parks, on the other hand, already had much more plant coverage at the time of their creation, which was maintained, according to informants in the interviews. This was mainly due to conservationist criteria and/or in some cases, because of a lack of economic resources to extract the plants.

In support of our third hypothesis, bottom-up park management practices are different to top-down ones, although only tolerance differs notably. It has been suggested that the increase in use of exotic species for ornamentation of hedges and gardens brings with it the need for more management practices (Rovere et al., 2015). Since the exotic plants are not adapted to the climate, they require more cultural care, generating high costs and considerable maintenance time (Ramírez-Hernández et al., 2012). In contrast, the conservation of native species calls for fewer and less frequent management practices, principally enhancement and protection, since these species are adapted to the environmental conditions (Puntieri and Grosfeld, 2009). This would explain why enhancement practices are the most frequent in neighborhood parks, conserving in situ species that form part of the Andean Patagonian forests.

# CONCLUSION

Landscape domestication processes in cities respond to sociobiocultural concerns. They generate a unique molding of local biodiversity, with possible ecological and genetic effects on the species selected. In particular, the comparison of different management practices documented in the bottom-up and top-down administrations reveals certain effects caused by man (whether through action or omission) in different processes of landscape domestication, the product of their varying possibilities and circumstances. The incipient domestication of some species reflects different cosmovisions and drivers, specific to particular pluricultural contexts, which should be studied further. We conclude that urban parks are unique cultural niches which, similar to agroforest systems, are scenarios that manifest the historical processes at work in their creation, and eventual cultural domination, but also processes of biocultural change. These unique urban niches are not static, but vary with the demands and experience of a Latin American city that gradually recognizes the value of its native plants, and the value of local biocultural diversity. Niches like those in Bariloche constitute a model that helps in understanding how multiple factors mold the landscape, due mainly to changes in cosmovision. In Bariloche it seems there is a tendency to value local, native elements more highly than before. Through management, the relation between native and exotic richness in each park can be modified. In the design of new parks or maintenance of existing ones, the strategy of gradually replacing exotic species with native ones should predominate, so as to maintain, enrich and recreate native environments in the urban matrix, in line with local socio-cultural interests. We trust that this work will help to guide management decisions toward conservation of the Andino Norpatagonica Biosphere Reserve, articulating top-down and bottom-up administration.

# AUTHOR CONTRIBUTIONS

AR and AL conceptualized the study and planned the data design and data analysis. RB, AR, and AL collected the data and wrote the manuscript. All the authors edited and revised drafts of the manuscript, approved the final version and agree to be held accountable for the work.

# ACKNOWLEDGMENTS

This research was carried out with partial funding from the National Council of Scientific and Technical Research of Argentina (CONICET), PIP: 0466 and PIP: 0196, assigned to Ana Ladio and Adriana Rovere respectively, and funding from Universidad Nacional del Comahue. We dedicate this work to the memory of our dearly loved Dr. Eddy Rapoport, who passed away in May 2017, and who has left us an immense legacy of work in pursuit of biocultural conservation.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2017.00166/full#supplementary-material

# REFERENCES

"chichipera" cactus forest in the arid Tehuacán Valley, Mexico: their management and role in people's subsistence. Agrofor. Syst. 84, 207–226. doi: 10.1007/s10457-011-9460-x

Saillard, M. (1962). "Infraestructure," in Urbanisme (Paris).


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Betancurt, Rovere and Ladio. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# How People Domesticated Amazonian Forests

Carolina Levis<sup>1,2</sup>\*, Bernardo M. Flores<sup>3</sup>, Priscila A. Moreira<sup>4</sup>, Bruno G. Luize<sup>5</sup>, Rubana P. Alves<sup>1</sup>, Juliano Franco-Moraes<sup>6</sup>, Juliana Lins<sup>7</sup>, Evelien Konings<sup>2</sup>, Marielos Peña-Claros<sup>2</sup>, Frans Bongers<sup>2</sup>, Flavia R. C. Costa<sup>8</sup> and Charles R. Clement<sup>9</sup>

<sup>1</sup> Programa de Pós-graduação em Ecologia, Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil, <sup>2</sup> Forest Ecology and Forest Management Group, Wageningen University & Research, Wageningen, Netherlands, <sup>3</sup> Departamento de Biologia Vegetal, Instituto de Biologia, Universidade Estadual de Campinas, Campinas, Brazil, <sup>4</sup> Programa de Pós-graduação em Botânica, Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil, <sup>5</sup> Programa de Pós-graduação em Ecologia e Biodiversidade, Instituto de Biociências, Universidade Estadual Paulista (UNESP), Rio Claro, Brazil, <sup>6</sup> Programa de Pós-graduação em Ecologia, Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil, <sup>7</sup> Instituto Socioambiental, São Gabriel da Cachoeira, Brazil, <sup>8</sup> Coordenação de Pesquisas em Biodiversidade, Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil, <sup>9</sup> Coordenação de Tecnologia e Inovação, Instituto Nacional de Pesquisas da Amazônia, Manaus, Brazil

Edited by: B. Mohan Kumar, Nalanda University, India

#### Reviewed by:

Bharath Sundaram, Nalanda University, India; Jean Kennedy, Australian National University, Australia; Louis S. Santiago, University of California, Riverside, United States

> \*Correspondence: Carolina Levis carollevis@gmail.com

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Ecology and Evolution

> Received: 31 July 2017; Accepted: 13 December 2017; Published: 17 January 2018

#### Citation:

Levis C, Flores BM, Moreira PA, Luize BG, Alves RP, Franco-Moraes J, Lins J, Konings E, Peña-Claros M, Bongers F, Costa FRC and Clement CR (2018) How People Domesticated Amazonian Forests. Front. Ecol. Evol. 5:171. doi: 10.3389/fevo.2017.00171

For millennia, Amazonian peoples have managed forest resources, modifying the natural environment in subtle and persistent ways. Legacies of past human occupation are striking near archaeological sites, yet we still lack a clear picture of how human management practices resulted in the domestication of Amazonian forests. The general view is that domesticated forests are recognizable by the presence of forest patches dominated by one or a few useful species favored by long-term human activities. Here, we used three complementary approaches to understand the long-term domestication of Amazonian forests. First, we compiled information from the literature about how indigenous and traditional Amazonian peoples manage forest resources to promote useful plant species that are mainly used as food resources. Then, we developed an interdisciplinary conceptual model of how interactions between these management practices across space and time may form domesticated forests. Finally, we collected field data from 30 contemporary villages located on and near archaeological sites, along four major Amazonian rivers, to compare with the management practices synthesized in our conceptual model. We identified eight distinct categories of management practices that contribute to form forest patches of useful plants: (1) removal of non-useful plants, (2) protection of useful plants, (3) attraction of non-human animal dispersers, (4) transportation of useful plants, (5) selection of phenotypes, (6) fire management, (7) planting of useful plants, and (8) soil improvement.
Our conceptual model, when ethnographically projected into the past, reveals how the interaction of these multiple management practices interferes with natural ecological processes, resulting in the domestication of Amazonian forest patches dominated by useful species. Our model suggests that management practices became more frequent as human population increased during the Holocene. In the field, we found that useful perennial plants occur in multi-species patches around archaeological sites, and that the dominant species are still managed by local people, suggesting long-term persistence of ancient cultural practices. The management practices we identified have transformed plant species abundance and floristic composition through the creation of diverse forest patches rich in edible perennial plants that enhanced food production and food security in Amazonia.

Keywords: cultural forests, patch formation, dominance, Amazonian useful species, indigenous management, landscape domestication, Terra Preta de Índio

# INTRODUCTION

The notion of pristine rainforests has been questioned by increasing archaeological and ecological evidence suggesting long-term human activities across even the most intact forests worldwide (Denevan, 1992; Van Gemerden et al., 2003; Willis et al., 2004; Ross, 2011; Boivin et al., 2016; Roberts et al., 2017). Amazonia is no exception — over thousands of years with humans living in the region, forest composition has been altered significantly (Clement et al., 2015; Levis et al., 2017b). Many dominant species in Amazonian forests are widely used as food resources by native indigenous peoples (ter Steege et al., 2013), and at least 85 tree and palm species were domesticated to some degree during pre-Columbian times (Clement, 1999; Levis et al., 2017b). Plant domestication is a long-term process that results from the capacity of humans to overcome environmental selection pressures with the purpose of managing and cultivating useful plants (Kennedy, 2012; Boivin et al., 2016; Levis et al., 2017b), leading to significant changes in natural ecosystems and plant communities across landscapes (Clement, 1999; Terrell et al., 2003). First, useful individuals are managed in situ (Rindos, 1984; Wiersum, 1997a) and later humans select the best varieties with more desirable morphological traits for cultivation (Darwin, 1859; Rindos, 1984; Clement, 1999). Over time, humans create a mosaic of domesticated landscapes to favor numerous useful plant populations, each domesticated with different intensities and outcomes (Wiersum, 1997b). In modern Amazonian forests, legacies of past human societies are evident in the surroundings of archaeological sites, where humans enriched the forest with useful, especially edible, and domesticated plants (Balée, 1989; Erickson and Balée, 2006; Junqueira et al., 2010; Levis et al., 2017b). 
These pre-Columbian legacies suggest that Native Amazonians interacted with natural ecological processes and shaped the distribution of plants and entire forest landscapes across the region (Balée, 2013).

In Amazonia, as in any other ecosystem, natural ecological processes drive the formation of plant assemblages and communities (Keddy, 1992; Zobel, 1997; Lortie et al., 2004; ter Steege et al., 2006). The first ecological process described to structure plant communities is the plant's capacity to disperse its seeds across landscapes (Ricklefs, 1987; Lortie et al., 2004), which depends on the regional species pool and multiple dispersal strategies, including occasional events of long distance dispersal (Ricklefs, 1987; Nathan et al., 2008). In wet Neotropical forests, animal dispersal is used by 75–98% of the tree species (Howe and Smallwood, 1982; Muller-Landau et al., 2008) and mammals disperse large-seeded species over long distances (Jordano, 2017). Once a propagule arrives in a given location, the second ecological process is related to how plants are able to overcome local environmental filters to successfully germinate and survive (Lortie et al., 2004). Plants compete with their neighbors for limited amounts of resources, such as light, nutrients and water (Moles and Westoby, 2006). The understory of a tropical forest is typically light-limited, forcing trees to either grow tall or survive in shady conditions (Poorter et al., 2003). Soils are also limited in water and nutrients, and plants need to compete in the rooting zone (Barberis and Tanner, 2005; Schnitzer et al., 2005). The third ecological process structuring plant assemblages is interaction with other organisms, such as herbivores and pathogens (Lortie et al., 2004; Bagchi et al., 2014). These multiple environmental and biological filters act simultaneously, resulting in trade-offs. For instance, species that grow fast under high light conditions tend to produce leaves that are less protected from herbivores, compared to the tougher and more resistant leaves of shade-tolerant species (Coley, 1983).
In the long run, these ecological processes result in the selection of numerous adaptive plant traits (Reich et al., 2003), allowing species to thrive in complex and highly diverse systems, such as Amazonian forests. The high diversity of tropical ecosystems is in part maintained by natural disturbances and local biotic interactions, sometimes promoted by herbivores and pathogens that reduce the abundance of the most effective competitors, creating space for other species (Connell, 1978; LaManna et al., 2017).

Nonetheless, a few tree species often dominate plant assemblages forming oligarchic forests in diverse tropical forests (Connell and Lowman, 1989; Peh et al., 2011), including Amazonia (Peters et al., 1989; Pitman et al., 2001, 2013; ter Steege et al., 2013), Africa (Hart et al., 1989; Hart, 1990; Peh et al., 2011), Mesoamerica (Campbell et al., 2006), and Asia (Connell and Lowman, 1989; Peh et al., 2011). Natural and anthropogenic origins for the hyperdominance of tree species in Amazonian forests have been proposed. Aggregated patches of a few pioneer species occur after human or natural disturbance, while aggregated patches of a few shade-tolerant species may occur due to dispersal limitations (Valencia et al., 2004). Other hypotheses to explain why some species dominate large areas of Amazonian forests include: the species' ability to tolerate multiple environmental conditions, and to disperse over long distances (Pitman et al., 2001, 2013); and, in the case of useful species, the intentional or non-intentional enrichment promoted by past and contemporary human societies (Balée, 1989, 2013; Peters et al., 1989; ter Steege et al., 2013; Levis et al., 2017b).

During the Holocene, useful plant populations benefited from a new set of interactions when humans started to transform landscapes (Denevan, 1995; Smith, 2011; Boivin et al., 2016), and manage plant populations, consciously or not (Rindos, 1984; Wiersum, 1997a,b; Peters, 2000). Indigenous management practices were formally defined by Wiersum (1997a, p. 7) as "the process of making and effectuating decisions about the use and conservation of forest resources within a local territory." When humans consciously manage forest resources, the underlying intention of their actions is not to domesticate forests, but to achieve certain short-term objectives, for instance to favor individual plants in the forest and promote their regeneration. Although changes in forest composition may not be the main goal of human actions, management practices also modify forest composition and structure beyond the targeted species in a long-term process. In tropical and subtropical forests worldwide, native societies have managed plants and landscapes, promoting oligarchic forests dominated by useful plant species, also defined as cultural or domesticated forests (Balée, 1989, 2013; Peters et al., 1989; Campbell et al., 2006; Michon et al., 2007; Reis et al., 2014; Morin-Rivat et al., 2017).

Today, many indigenous and traditional peoples recognize the handprints of their ancestors in the landscape (Frikel, 1978). Indigenous people are defined here as the descendants of native ethnic groups and members of an indigenous community that retains historical and cultural connections with the social organization of pre-Columbian indigenous societies (https://pib.socioambiental.org)<sup>1</sup>. Traditional peoples can be understood as culturally differentiated and recognizable groups that have their own forms of social organization using knowledge, innovations and practices generated and transmitted by tradition, but that are not recognized as members of indigenous communities (Brazilian Federal Decree No. 6.040)<sup>2</sup>. In Amazonia, traditional peoples are generally descendants of migrants who intermarried with local indigenous peoples and they often exchange practices, objects and knowledge with members of indigenous communities. Although contemporary indigenous and traditional societies both cultivate fruit trees in their territory, they also take advantage of the aggregated patches of fruit trees created by the practices of previous generations (Frikel, 1978; Balée, 1989, 2013). These ancient cultivated landscapes were probably created by integrated agroforestry systems that included homegardens, swiddens and managed fallows in which tree and non-tree crops were intertwined (Denevan et al., 1984; Stahl, 2015). Such integrated systems were likely more efficient, in terms of food production, than long-fallow shifting cultivation systems when only stone axes were used to clear the forest in the past (Denevan, 1992). This is supported by the fact that past indigenous tree cultivation (arboriculture) was a common and widespread practice covering large areas of forest-savanna transition zones in Amazonia (Frikel, 1978).

Because trees persist in the forest following management (Levis et al., 2017b) and annual crops disappear after human abandonment (Clement, 1999), contemporary indigenous and traditional people commonly attribute the aggregated distribution of useful perennial plants to the action of their ancestors. Based on this knowledge, they sometimes select a new place to settle in the forest (Frikel, 1978; Politis, 2007; Rival, 2007; Zurita-Benavides et al., 2016). For instance, the Nukak Indians in Colombian Amazonia prefer camping around sororoca plants (Phenakospermum guyannense), because they believe that these plants were brought by their ancestors to "their living world," and they discard a large quantity of seeds around their temporary camps, contributing to form new patches (Politis, 2007). Given that multiple human generations have moved around through time, places like riverine settings and archaeological sites were frequent dispersal routes of people and their cultures, and consequently of useful plants in pre- and post-Columbian times (Denevan, 1996; Hornborg, 2005; Guix, 2009; Heckenberger and Neves, 2009; Clement et al., 2010; Levis et al., 2017a,b). The intimate connections between Native Amazonians, their ancestors and their plants can reveal how persistent pre-Columbian forest management practices (Balée, 2000) contributed to the large-scale vegetation patterns we observe in modern forests (Pitman et al., 2011; Levis et al., 2017a,b).

Our study aimed to unravel how people interacted with natural ecological processes to transform pristine forests into domesticated forests with different degrees of human intervention through unintentional and intentional management practices. How indigenous and traditional peoples have used and shaped Amazonian forests is described in ethnographical, ethnobotanical, archaeological, paleoethnobotanical, paleoecological, and ecological publications. Here we used a historical-ecological perspective to evaluate the available information about how Native Amazonians have affected the distribution of plant species used mainly as food resources. Based on the information gathered from the literature, we developed an interdisciplinary conceptual model of how multiple management practices transformed pristine forests into domesticated forests, considering temporal and spatial contexts. In the field, we collected data about management practices and the composition of forest patches dominated by useful plants surrounding 30 contemporary villages, settled on or near archaeological sites. We compared field and literature data by documenting the multiple management practices known by 33 informants from two villages along the lower Tapajós River, and by relating these practices to the distribution and composition of the forest patches surrounding all 30 villages.

# MATERIALS AND METHODS

# Construction of the Conceptual Model of Forest Domestication

We reviewed the scientific literature for evidence of management practices of 22 useful perennial species (mainly used as food resources) that occur in forest patches in different parts of the Amazon basin (see Supplementary Table 1 for information about the species). These species were also chosen because the authors had previous field knowledge about them and they include a variety of useful plants with wild, cultivated and domesticated populations. Although our review focused on edible perennial plants, we used the general concept of useful plants to define plant species that are currently used for any purpose or have been used by any human group in the past. Eighty-one studies in ethnographical, ethnobotanical, archaeological, paleoethnobotanical, paleoecological and ecological publications, including books, scientific articles and dissertations, were analyzed (Supplementary Data). The literature review was conducted using the scientific name, English name and Portuguese name of each species as keywords in Web of Science and as title in Google Scholar.

<sup>1</sup>https://pib.socioambiental.org/files/file/PIB\_institucional/No\_Brasil\_todo\_mundo\_é\_índio.pdf

<sup>2</sup>http://www.planalto.gov.br/ccivil\_03/\_Ato2007-2010/2007/Decreto/D6040.htm

Based on the information gathered for the 22 species, we classified the multiple management practices into eight categories that consist of a summary of all practices reported in the literature (**Table 1**): (1) removal of non-useful plants, (2) protection of useful plants, (3) attraction of non-human dispersers of useful plants, (4) human transportation of useful plants, (5) selection of phenotypes useful to humans, (6) fire management, (7) planting, and (8) soil improvement. The literature review provides examples to identify the role of management practices, in many cases multiple practices acting together, in the formation and persistence of domesticated forests in Amazonia.
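As an illustration only, the coding scheme described above can be pictured as a small lookup structure in which each literature record pairs a species with one of the eight categories. The sketch below is not the authors' dataset or workflow: the names `Practice`, `RECORDS`, `practices_by_species` and `species_with_multiple_practices` are hypothetical, and the handful of records simply reuses examples mentioned in this article.

```python
from collections import defaultdict
from enum import Enum


class Practice(Enum):
    """The eight management-practice categories named in the text."""
    REMOVAL = "removal of non-useful plants"
    PROTECTION = "protection of useful plants"
    ATTRACTION = "attraction of non-human dispersers of useful plants"
    TRANSPORTATION = "human transportation of useful plants"
    SELECTION = "selection of phenotypes useful to humans"
    FIRE = "fire management"
    PLANTING = "planting"
    SOIL = "soil improvement"


# Illustrative (species, practice) records only; an assumption for this
# sketch, reusing examples from the article rather than the full review.
RECORDS = [
    ("Bertholletia excelsa", Practice.PROTECTION),
    ("Bertholletia excelsa", Practice.TRANSPORTATION),
    ("Bertholletia excelsa", Practice.PLANTING),
    ("Caryocar villosum", Practice.SELECTION),
    ("Caryocar villosum", Practice.PLANTING),
    ("Oenocarpus distichus", Practice.ATTRACTION),
]


def practices_by_species(records):
    """Group the coded practice categories by species."""
    grouped = defaultdict(set)
    for species, practice in records:
        grouped[species].add(practice)
    return dict(grouped)


def species_with_multiple_practices(records):
    """Return species coded under more than one category, sorted by name."""
    grouped = practices_by_species(records)
    return sorted(name for name, found in grouped.items() if len(found) > 1)
```

Grouping records this way makes it easy to see which species are reported under several categories at once, which is the situation the review highlights when it refers to multiple management practices.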

We combined different management practices into a category depending on: (1) what people want to achieve; (2) whether the effects of the practice are directional or not in the way they fundamentally shape plant species assemblages; and (3) whether the practices result in similarities in terms of forest composition, abundance and distribution of useful species. For instance, practices that remove non-useful plants in the forest, such as opening the canopy, clearing the understory, weeding and cutting lianas, are used to selectively benefit useful species or enhance their growth rate by reducing the competition of non-useful plants around the targeted plants. As a side effect, humans increase light availability in the forest and tend to favor light demanding species that may therefore be protected if useful. More similarities are expected inside each category than between them because each category leads to a unique type of interference in natural ecological processes. Nonetheless, their interactions may result in a diverse composition of useful species with different or even contrasting adaptations. Below we detail each of these eight categories, providing a definition, interaction with ecological processes and some examples.

### Removal of Non-useful Plants

The most common practices used to remove non-useful plants in the forest are: opening the canopy; clearing the understory; weeding; cutting lianas; and removing unproductive individuals of useful species. These practices are used to selectively benefit useful species by reducing the costs of competition, and are expected to increase the performance of the selected useful plants. Competition can be reduced either by controlling the abundance of non-useful species (directly excluding them), or increasing the amount of available resources (e.g., light or space). Practices that reduce leaf and root density of lianas, for example, can release the growth of some trees (Schnitzer et al., 2005), and increase fruit production (Kainer et al., 2014). Similar to other small-scale natural disturbances (Connell, 1978), these long-term management practices may increase the diversity of plants between plant communities at a regional scale (beta-diversity) (Balée, 2006). The Hotï Indians from northern Amazonia act as ecological disturbance agents by constantly creating and managing gaps that increase the amount of light inside the forest necessary to cultivate light-demanding useful plants (Zent and Zent, 2004). In southern Amazonia, the Kayapó Indians create forest islands by managing savanna landscapes, increasing the heterogeneity of the landscape and the resource abundance for humans, game animals and plants (Posey, 1985). The Nukak Indians from western Amazonia constantly move between old camps for hunting and gathering activities; when returning to old camps, they selectively clear the understory and canopy, altering plant composition and benefiting useful and domesticated plants by promoting their growth and reproduction (Politis, 1996).

### Protection of Useful Plants

Humans protect plant seedlings, juveniles, adults and their fruits by keeping them alive through several practices: taking care of fruits, seedlings and adult plants; using non-destructive extractive practices; avoiding fire near useful trees; pruning; and repelling leaf-cutting ant species. Protection can be targeted to individuals with specific traits or to whole plant populations, by reducing the abundance of herbivores, predators, and natural disturbances. For instance, the Kayapó Indians in southern Amazonia use Azteca ants to repel leaf-cutting ants that eat useful species' leaves (Posey, 1987). The Huaorani Indians in western Amazonia and Hotï Indians in northern Amazonia increase the abundance of several useful plant species by keeping fruit trees alive in their territory (Rival, 1998; Zent and Zent, 2012). Aggregated patches of many useful plants are spared when clearing the forest for crop cultivation (Shanley et al., 2016), increasing the survival rates of these plants. This practice protects useful plant populations of Amazon nut trees (Bertholletia excelsa), uxí trees (Endopleura uchi), tucumã palms (Astrocaryum aculeatum), and açaí palms (Euterpe oleracea) in different parts of Amazonia (Shanley et al., 2016). Babaçu palms (Attalea speciosa) with more inflorescences are also protected in agroforestry systems of eastern Amazonia (Anderson et al., 1991).

### Attraction of Non-human Dispersers of Useful Plants

The natural process of seed dispersal can be enhanced by human practices. Leaving some fruits under the mother tree for animals in domesticated landscapes and cultivating large-seeded species to attract game are common practices in traditional communities of Amazonia (Shanley et al., 2010). Although humans were responsible for population declines, and even local extinctions of large vertebrates across Neotropical forests (Guimarães Jr. et al., 2008), humans have also positively interacted with terrestrial animals by increasing their food availability via cultivation and protection of fruit trees in domesticated landscapes (Balée, 1993), thus increasing the dispersal capacity and distribution of useful plant species. Dispersal strategies among large-seeded species and their dispersers may result in aggregated distributions of Amazonian plant species. For instance, forest patches of inajá palm (Attalea maripa) are associated with tapir latrines, suggesting that tapirs are partly responsible for the aggregated distribution of this palm in Amazonian forests (Fragoso et al., 2003). Seeds of bacaba palm (Oenocarpus distichus) persist in secondary forests of Ka'apor Indians after abandonment, because game is attracted to these food resources and disperse even more seeds within these forests (Balée, 1993, 2013). Attracting animals to domesticated landscapes may indirectly contribute to form and maintain multi-species patches of useful plants from ancient homegardens and swiddens (Balée, 2013).

TABLE 1 | Examples of all management practices classified into eight categories.

Lines refer to the eight categories of management practices. Columns present examples of management practices from the literature for each category, the useful species that were involved in each example of a practice and the references used in the literature review. See Supplementary Data for the complete reference list corresponding to each number and Supplementary Table 1 for the complete scientific names of all species.

### Human Transportation of Useful Plants

Human transportation is the intentional or non-intentional movement of seeds and plants by humans from one place to another, outside or within the geographical limits of the plant population. Examples include planting seedlings or dispersing seeds, intentionally or non-intentionally, along forest trails, in swiddens and in homegardens. During the Holocene, humans may have acted as primary long-distance dispersal vectors by transporting seeds of useful plants over long distances, often surpassing natural evolutionary barriers (Hodkinson and Thompson, 1997; Nathan et al., 2008). Past humans intentionally transported seeds, seedlings and clones of useful plants over long distances across the world (Boivin et al., 2016). As a consequence, the expansion of sedentary farming populations in Amazonia is associated with the dispersal of important native crops across the basin, such as manioc (Manihot esculenta) (Arroyo-Kalin, 2012), Amazon nut trees (Shepard and Ramirez, 2011; Thomas et al., 2015), and cacao trees (Theobroma cacao) (Thomas et al., 2012). Over short distances, human seed dispersal occurs when plants are exchanged among groups (Eloy and Emperaire, 2011), during periodic movements of groups to new areas (Posey, 1993), systematic movements between forests and settlements (Ribeiro et al., 2014), and between temporary camps (Politis, 2007). Short distance dispersal within a plant population's range is also reported, when seeds are scattered along trails during hunting and gathering activities, often non-intentionally (Zent and Zent, 2004; Ribeiro et al., 2014). The Hotï spend days in the forest to collect large quantities of umirí (Humiria balsamifera) fruits, many of which drop from baskets on the way back to the village, explaining its high abundance surrounding their villages (Zent and Zent, 2004).
Similarly, the Kayapó transport large amounts of Amazon nut seeds, suggesting that the high density of seedlings along trail margins results from seeds accidentally dropped during transport (Ribeiro et al., 2014). Extensive trail systems were described in the Kayapó territory where they intentionally plant, transplant and spread useful species (Posey, 1993), forming landscapes full of useful plant species.

### Phenotypic Selection of Useful Plants

Trait selection practices are motivated by human preferences for specific phenotypes, for instance, fruits with larger sizes or larger contents of desirable properties, such as sugar, starch and oil. Humans often protect individuals previously selected for their preferred traits and they propagate these individuals outside their original population (see section Human Transportation of Useful Plants), resulting in plant domestication (Rindos, 1984; Clement, 1999). Phenotypic selection promotes morphological and genetic divergence from the ancestral population based on human criteria (Clement, 1999). The set of phenotypic traits that distinguish domesticated from wild plant populations is called the domestication syndrome (Hammer, 1984; Harlan, 1992; Meyer et al., 2012). Selection does not necessarily imply intentionality; however, if unconscious practices lead to changes in plant traits, followed by selection and propagation, these actions start to be systematically repeated (Rindos, 1984; Zeder, 2006). Human criteria for selecting plant traits vary across geographical regions, through time and with cultural interests (Meyer et al., 2012), and depend on the availability of useful populations in the landscape and the knowledge to interpret and manage morphological variation (Terrell et al., 2003). 
In Amazonia, some studies have described domestication syndromes for useful plants: variation in the toxicity of manioc roots that were selected for different soil types (McKey et al., 2010; Fraser et al., 2012); peach palm (Bactris gasipaes) may have been first selected for its small oily fruits or wood, and later for large starchy fruits with better fermentation qualities (Clement et al., 2009); the selection of annatto (Bixa orellana) with increased pigment yield from its seeds, and changed fruit dehiscence (Moreira et al., 2015); the high morphological variation of pequí fruit (Caryocar brasiliense) varieties selected by the Kuikuro Indians of the upper Xingu River (Smith and Fausto, 2016); selection of varieties of Virola elongata with exudates of different hallucinogenic qualities, and varieties of Cyperus articulatus with rhizomes having different medicinal properties selected by Yanomami groups in Northwestern Brazil (Albert and Milliken, 2009). Along the lower Tapajós River, traditional people selected non-bitter fruits of Caryocar villosum, domesticating them accidentally or intentionally (Alves et al., 2016). The importance of selection for promoting agrobiodiversity in Amazonia is underscored in ethnographies of cultivated plants, such as manioc (Boster, 1984; Rival and McKey, 2008) and pequí (Smith and Fausto, 2016).

### Fire Management

Fire has been a land management tool since pre-historical times (Pausas and Keeley, 2009). People have used prescribed fire in forests or swiddens mainly for cultivation, and also highly controlled fire for waste management near their houses. People manage fire for hunting activities, group communication, rituals, and to prevent uncontrollable fires (Mistry et al., 2016). Fire was intensely managed by pre-Columbian peoples in homegardens or settlement areas for domestic activities, such as cooking and burning waste. This domestic use may have contributed in the long run to fertilize the soil, producing the Terra Preta de Índio (TPI or Amazonian Dark Earths – ADE) (Smith, 1980; Schmidt et al., 2014) found throughout the Amazon basin (McMichael et al., 2014). Fire was also managed in swiddens to improve soil fertility with intensive cultivation techniques in ancient times, forming fertile dark brown soils, a soil slightly less fertile than TPI (Denevan, 2001; Woods et al., 2013). Management practices involving fire also increase availability of other resources, such as light, by reducing the abundance of competitors, and promoting useful species that are more nutrient demanding, such as chili peppers (Capsicum spp.) (Junqueira et al., 2016a). Patches of burití palms (Mauritia flexuosa), for instance, are associated with fire history in the Gran Savana, where people have used fire to prevent forest re-expansion into savannas (Montoya et al., 2011). When people manage fire to reduce competition for cultivated plants, fire-adapted species are often selected (Jakovac et al., 2016a). Many plants, useful or not, have evolved to tolerate contact with fire, allowing them to persist through time in frequently burnt places (Bond and Midgley, 2001). Some examples are the light-demanding sororoca (P. 
guyannense) that resprouts after fire, cumatí trees (Myrcia splendens) that form patches in gaps managed with fire (Elias et al., 2013) and babaçu palms that persist in burnt sites due to cryptogeal germination (Jackson, 1974). The ancient connection between fire and humans (Bowman et al., 2011) and the intense fire history in Amazonian forests are revealed by the high charcoal abundance in forests around old settlements (Bush et al., 2015), which are expected to be dominated by fire-adapted species.

# Planting

Planting is defined here as the intentional sowing of seeds and the planting or transplanting of seedlings in cultivated landscapes. It is important to note that when seeds and seedlings are transported by humans (see section Human Transportation of Useful Plants) with the intention of planting, planting and human transportation overlap. When humans disperse seeds without this intention (e.g., when gathering fruits in the forest), the two do not overlap, which justifies treating them as separate categories of practices. Planting practices may increase a useful plant's performance, survival and reproduction because people usually take care of seedlings after planting. In Amazonia, several tree and palm species are planted mostly in agroforestry systems, forest gardens and forest gaps surrounding settlements (Denevan et al., 1984; Balée, 1993; Zent and Zent, 2012). In the past, indigenous groups also planted several perennial species, giving rise to patches of useful tree and palm species across the basin (Frikel, 1978). Therefore, the presence and abundance of edible trees and palms in Amazonian forests and their proximity to ancient settlements may indicate past indigenous planting activities (Balée, 2013; Levis et al., 2017b). Some examples in Amazonia are forest patches of Poraqueiba sericea (Padoch and De Jong, 1987; Franco-Moraes, 2016) in western Amazonia, C. brasiliense in the upper Xingu River (Smith and Fausto, 2016), C. villosum in the lower Tapajós River (Alves et al., 2016), and B. excelsa in Amapá (Paiva et al., 2011), all of which are associated with past indigenous planting.

# Soil Improvement

In some parts of the Amazon basin, terra-firme forests are poor in nutrients, which selected for plants with efficient nutrient-conservation mechanisms (Herrera et al., 1978). Amerindians, however, interfered with these processes by changing soil structure and increasing soil fertility (Kleinman et al., 1995). Soil improvement involves several practices, such as the addition of charcoal and ashes that release nutrients and carbon into the soil; the use of organic additives, such as human and animal wastes, ash, garbage, crop residues, leaves, compost, cleared weeds, seaweed, mulch, urine, ant nest refuse, turf, muck, and water; and the building of mounds in floodable landscapes (Denevan, 1995, 2001). The improvement of soil conditions was observed for piquiá trees inside the forest, under which local people accumulate leaf litter (Alves et al., 2016), and for açaí, uxí, and peach palm through organic additives (Shanley et al., 2016). Also, extremely fertile TPI were probably created in pre-Columbian refuse heaps in which ash and charcoal, human and animal wastes, and ceramics accumulated (Woods and McCann, 1999; Schmidt et al., 2014). Although TPI soils were a product of sedentary human settlement and cannot be classified as a management practice, modern people usually take advantage of these fertile soils to cultivate crops (Junqueira et al., 2016b). Brown soils were probably formed in cultivation zones with ash and charcoal that originated from frequent burning, and by composting and mulching the soil (Denevan, 1995). Unintentional and sometimes intentional soil improvement practices that resulted in the creation of TPI and brown soils were probably common in the past, since anthropogenic soils occur across most of the Amazon basin (Woods et al., 2013). The improvement of soil structure and fertility creates a new environmental filter that favors plants of interest and excludes species not adapted to the new soil conditions. Species with adaptations to resist or tolerate fire or to benefit from fertile soils may become dominant in improved soils. As a consequence, useful species adapted to fertile soils can form aggregated patches in TPI sites across the basin (Balée, 1989). This may be the case for H. balsamifera trees, which are dominant in previously burned soils in the upper Negro River (Franco-Moraes, 2016), and for palm species, such as Elaeis oleifera, Attalea phalerata, and Astrocaryum murumuru, which are indicators of anthropogenic soils along the Madeira River (Junqueira et al., 2011).

### Synthesis

As a synthesis of the information obtained about these eight management practices, their interactions and how each practice affects natural ecological processes, we present a new conceptual model that explains the process of Amazonian forest domestication. Following Goldberg et al. (2016), we describe a temporal continuum from the late Pleistocene until today. We also present spatial gradients from settlements through swiddens to domesticated forests, and from old-growth forests to domesticated forests, illustrating at which distances from settlements these different practices operate to form domesticated forests with different degrees of human intervention. Although Goldberg et al. (2016) modeled human population dynamics during the Holocene without data from Central Amazonia, their model is the only one available describing a temporal continuum of past human population in South America. We considered a temporal dynamic that starts in the Pleistocene when humans arrived, and follows human population growth rates during the Holocene (Goldberg et al., 2016). In our conceptual model, we considered pristine forests to exist when humans had not yet altered natural ecological processes (Denevan, 1992). Pristine forests were the norm during the Pleistocene and, with at least 13,000 years of growing human populations across the Amazon basin, pristine forests gradually disappeared (Clement et al., 2015), and old-growth forests (mature forests without recent human interference, but not necessarily pristine; Wirth et al., 2009) cover most of the basin today.

## Field Surveys

All authorizations to conduct the study were obtained before field work. The study was approved by the Brazilian Ethics Committee for Research with Human Beings (Process nº 10926212.6.3001.5020, 2013), the Federation of the Indigenous Organizations of the Negro River (FOIRN) and the Regional Coordinator of the Brazilian National Indigenous Foundation (FUNAI), and the Brazilian System of Protected Areas (SISBIO, process nº 47373-1, 2014). In each village, we obtained the informed consent of each local traditional or indigenous leadership at the beginning of the study.

In the field, we studied 30 contemporary villages settled on river banks distributed in nine sub-basins of four major rivers (Madeira, Solimões, Negro, Tapajós) across Brazilian Amazonia (see Supplementary Table 2 for names of the villages visited and their distances to archaeological sites). We visited from 2 to 10 villages in each sub-basin and selected villages located on or near archaeological sites with TPI. Archaeological sites with anthropogenic soils are ancient sedentary settlements (Neves et al., 2003), and they were chosen for our study because they indicate long-term human occupation, where rich soils, new landforms and domesticated plants accumulated through time in response to human agency (Clement et al., 2015). In each village, from March 2013 until March 2015 (3 months per year during the rainy season), we searched for indigenous and traditional ecological knowledge about the forest patches dominated by useful plant species in the surroundings of these villages.

Of the 30 contemporary villages along river banks, 27 are currently inhabited by traditional peoples (ribeirinhos) who have lived there for at least one generation; most of them are descendants of migrants who intermarried with local indigenous peoples. Their daily activities include farming, fishing, hunting, and the extraction of timber and non-timber forest products, and two villages are involved in community-based tourism. Three villages in the upper Negro River are inhabited by members of the Baré indigenous group, descendants of Arawak-speaking groups, who lost their original language and adopted the Tupi-based Nheengatu, taught by the missionaries.

In each village, we searched for patches of native forest species used mainly as food resources. We focused on edible fruits because previous studies showed that these resources accumulated around ancient indigenous villages (Frikel, 1978; Balée, 1989, 1993). We interviewed 56 local people (on average 2 per village) regarding the occurrence and distribution of these forest patches, and used participatory mapping techniques (Gilmore and Young, 2012) to locate these patches around the villages. We used the suffix "zal" or "al," which means abundance, aggregation or patches in Portuguese, and "tíwa" (in the Nheengatu language) to communicate with local people. These terms are used by contemporary people, who associate the suffix with the name of the dominant species and identify a forest patch of useful species based on their traditional knowledge. For instance, a patch of bacaba palm (Oenocarpus bacaba/O. distichus) is named a bacabal in Portuguese and an iwakátíwa in Nheengatu.

All patches of useful species were mapped with participatory mapping and complemented with the information collected during guided tours (Gilmore and Young, 2012; Albuquerque et al., 2014). Participatory mapping techniques are used to map local knowledge about the landscape, and to translate indigenous and local representations into techno-scientific language (Chapin et al., 2005; Heckenberger, 2009; Gilmore and Young, 2012). All local residents were invited to participate in a participatory mapping workshop that occurred during one morning or afternoon in each village. People were encouraged to draw and identify first the main local rivers, second TPI sites, and third different patches of useful species on maps made with georeferenced grids on top of recent cloud-free LANDSAT TM images of the area. With participatory mapping, we obtained the approximate location and size of TPI sites, and patches of useful species surrounding the villages. With guided tours, we validated the location of at least one TPI site and/or one patch of useful species per village. Village members chose one person to guide us on a visit to the most accessible forest and TPI site. During the guided tours, we collected geographical coordinates of TPI sites and useful forest patches, and documented all useful species observed according to local knowledge.

The botanical species were pre-identified in the field using books on fruit trees and palms (Henderson, 1995; Cavalcante, 2010), and when possible, botanical material was also collected for final identification. The botanical identification was confirmed by José Ramos, a parataxonomist at INPA (Instituto Nacional de Pesquisas da Amazônia). Some plants were only identified to genus level in the field due to logistical limitations. The distribution of all forest patches identified around the villages was documented during the interviews, participatory mapping and guided tours. In total, we visited 21 patches with local informants; these are dominated by 14 different useful species, as some of the patches visited share the same dominant species. Forest patches are located up to 5 km from archaeological sites, and we documented a minimum of four useful species, a maximum of 21, and a median of seven useful species per patch. In each of the nine sub-basins visited in the field, we documented a minimum of six useful forest patches dominated by different species, a maximum of 14 and a median of nine patches.

We compared our results obtained from field surveys and the literature review with field data from two villages along the right margin of the lower Tapajós River, where we documented all management practices performed by local people with the species that dominate local forest patches. This comparison served as ground truth for our conceptual model. During free-listing interviews and guided tours (Albuquerque et al., 2014), local informants described the practices through which they benefit useful species found in patches of this sub-basin. In January and February of 2015, we interviewed 33 informants who know and use forest species in Maguarí and Jamaraquá villages in the Tapajós National Forest (FLONA). We also walked approximately 80 km along trails in the FLONA Tapajós with the seven most experienced informants to identify useful species in the forest. During these guided tours, the informants explained how they manage the useful species found in forest patches. With information about how local residents manage useful species, we compared the number and frequency of the practices obtained in the field with the same information obtained from the literature review.

We used ArcGIS software to map the information collected in the field with participatory mapping and GPS. The shortest (minimum) and longest (maximum) linear distances from each patch of useful species to the closest TPI site were measured manually using a digital ruler. We calculated the frequency of forest patches in 1-km intervals of minimum distance to the nearest TPI site. Using the minimum distance from forest patches to the closest TPI sites, we compared the spatial gradient of our conceptual model (settlements, swiddens or old-growth forests) with the location of the forest patches found in the field: patches on top of TPI sites were associated with pre-Columbian settlements, those located in fallows close to TPI sites were associated with past swiddens, and forest patches more distant from TPI sites were associated with old-growth forests, an association confirmed by local knowledge and the presence of large trees.
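For readers who wish to reproduce this kind of measurement programmatically, the distance and frequency calculations can be sketched in a few lines of Python. This is an illustration only and not the procedure used in this study, where distances were measured by hand with a digital ruler. The sketch swaps in great-circle (haversine) distances between coordinates, and it assumes that each patch is represented by a list of mapped outline points and each TPI site by a single point. All function names and coordinates are hypothetical.

```python
import math
from collections import Counter

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, used by the haversine formula


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) points in decimal degrees."""
    lat1, lon1 = map(math.radians, a)
    lat2, lon2 = map(math.radians, b)
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, h)))


def patch_distances_km(patch_points, tpi_sites):
    """Return (minimum, maximum) distance from a patch outline to its closest TPI site.

    Assumption: the closest TPI site is the one nearest to any outline point.
    """
    closest_site = min(
        tpi_sites,
        key=lambda site: min(haversine_km(p, site) for p in patch_points),
    )
    distances = [haversine_km(p, closest_site) for p in patch_points]
    return min(distances), max(distances)


def distance_class_counts(min_distances_km, width_km=1.0):
    """Count patches per distance class; class k covers [k*width, (k+1)*width) km."""
    return Counter(int(d // width_km) for d in min_distances_km)


if __name__ == "__main__":
    # Hypothetical coordinates, for illustration only.
    tpi_sites = [(-2.00, -55.00), (-3.00, -55.00)]
    patch = [(-2.00, -55.00), (-2.00, -54.99)]
    nearest, farthest = patch_distances_km(patch, tpi_sites)
    print(f"min = {nearest:.2f} km, max = {farthest:.2f} km")
    print(distance_class_counts([nearest, 0.4, 1.2, 39.5]))
```

Class 0 in the resulting counts corresponds to patches located less than 1 km from the nearest TPI site, which is the interval most relevant to the settlement and swidden contexts of the conceptual model.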

# RESULTS

# A Conceptual Model of Forest Domestication in Amazonia

Our conceptual model shows how pristine forests were converted into domesticated forests by a long-term process involving the interaction between eight human management practices (**Figure 1**). The conceptual model presents three general aspects of the forest domestication process: (1) a time span since the Pleistocene (**Figure 1A**); (2) interactions among human practices (arrows in **Figure 1B**); and (3) a spatial zone of influence for each management practice (arrows in **Figure 1C**). First, our model proposes that the frequency of these management practices increases with human population in South America (Goldberg et al., 2016), resulting in more extensive domestication of Amazonian forests through the Holocene (**Figure 1A**). Second, each arrow presented in our conceptual model indicates interactions among a pair of categories of management, showing that one practice can positively affect others (**Figure 1B**). For instance, humans remove non-useful plants (Practice 1–P1) while often selectively protecting useful individuals with desirable phenotypes (P5), or plant selected individuals (P5) in forest gaps (natural or created by humans–P1), swiddens and homegardens (P7). Native Amazonians protect plants (P2) as sources of seeds for future planting (P7) and selection (P5), and also to attract animal dispersers (P3). A gradual transformation of the forest is expected to occur by the interaction between humans (P4) and non-human dispersers (P3). Seeds and seedlings of selected useful plants (P5) are transported by humans from natural to domesticated landscapes (P4), guaranteeing their planting and propagation (P7). Fire management (P6) is often used in association with protection of species (P2) with plants previously selected for traits of interest (P5). The combination of fire management (P6) with the protection of certain species (P2) in domesticated landscapes may allow even useful fire-sensitive plants to form patches in ancient cultivated systems. 
Ancient planting practices (P7) attract dispersers (humans and nonhumans; P3 and P4) and improve soil conditions (P8). The planting of useful edible trees (P7) attracts game animals that may disperse their seeds throughout the area (P3), thus increasing the abundance of the species locally. Indigenous people disperse seeds of plants (P4) and plant them in agroforestry systems and along forest trails (P7) when they move from one place to another, increasing food availability during long walks in the forests. Trees planted in agroforestry systems (P7) may enrich soil fertility (P8), reproducing the nutrient-conservation mechanism observed in the forest. By improving naturally nutrient-poor soils (P8), pre-Columbian societies enhanced food production in Amazonian landscapes, also allowing their population expansion.

outside of Amazonia (adapted from Goldberg et al., 2016). (B) Management practices (1–8), their interactions and their effects on the forest domestication process through time [from top (16 kyBP) to bottom (0 kyBP)]. Natural ecological processes operate during all moments in time and along a domestication gradient from pristine to domesticated forests. Management practices may have a positive direct effect (dark arrows) or hypothetical positive effect (light arrow) on other practices that intensify as human population increases (from light green to dark green). (C) The forest domestication process in a spatial context of human influence from settlements, through swiddens, domesticated forests to old-growth forests, which may have been domesticated in the past, but lack recent human intervention. Domesticated forests can originate (arrows) from settlements and swiddens, or from old-growth forests. Our model describes an open-ended process.

Third, the gradient of soil improvement is illustrated in the spatial representation in our conceptual model (**Figure 1C**). Five practices, removal of non-useful plants (P1), protection of useful plants (P2), attraction of non-human dispersers of useful plants (P3), human transportation of useful plants (P4), and selection of phenotypes useful to humans (P5) occur across the entire gradient of human influence from settlements, through swiddens, to domesticated forests to old-growth forests. Fire management (P6), direct planting (P7), and soil improvement (P8) are practices mainly used in swidden/fallows and settlements, giving rise to domesticated forests with useful plants related to these activities.

# Relationships among Management Practices: Evidence from the Literature and Field

We found that all eight categories of management practices described in the literature (**Table 1**) are also known by traditional people in the two villages along the lower Tapajós River that we studied (**Figure 2**). Transportation of plants by humans, planting of useful plants and selection of desirable phenotypes were the most frequent practices in the literature, whereas clearing the understory, cutting lianas and weeding (P1: removal of non-useful plants) and not cutting useful plants (P2: protection of useful plants) were the most cited practices in field interviews (**Figure 2**). Attraction of dispersers and soil improvement were the least frequent practices in the literature and field interviews, documented for fewer than 40% of the species investigated.

More than half of the useful plant species investigated in the literature and the field are managed with at least five practices. Based on the literature, four species (A. maripa, C. villosum, M. flexuosa, T. cacao) are managed with seven practices, and for these species at least five different uses were reported (Supplementary Table 1). Based on field data, two species (C. villosum and E. uchi) are managed with seven practices and used for several purposes, such as food, medicine and hunting (Supplementary Table 1). Local people reported that they do not clear the land or use fire in places where aggregated patches of these species occur, with the purpose of protecting the whole population. One species, M. splendens, with only two uses reported in the literature (manufacturing and fuel), is managed with only one practice (P6 - fire management) based on the literature.
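The tallies reported in this section (the share of species for which each practice is documented, and the species managed with at least a given number of practices) follow from a simple species-by-practice table. The Python sketch below shows one way such a tally could be computed. It is a minimal illustration under stated assumptions, not the analysis script of this study: the function names are hypothetical and the example records are placeholders rather than data from Supplementary Table 1.

```python
from collections import Counter

# The eight categories of management practices, numbered as in the text.
PRACTICES = {
    1: "removal of non-useful plants",
    2: "protection of useful plants",
    3: "attraction of non-human dispersers of useful plants",
    4: "human transportation of useful plants",
    5: "selection of phenotypes useful to humans",
    6: "fire management",
    7: "planting",
    8: "soil improvement",
}


def practice_frequency(records):
    """Return, for each practice, the share of species for which it is documented.

    `records` maps a species name to the set of practice numbers reported for it.
    """
    if not records:
        raise ValueError("records must contain at least one species")
    counts = Counter(p for practices in records.values() for p in set(practices))
    return {p: counts.get(p, 0) / len(records) for p in PRACTICES}


def species_with_at_least(records, k):
    """Return the species managed with at least k different practices, sorted by name."""
    return sorted(s for s, practices in records.items() if len(set(practices)) >= k)


if __name__ == "__main__":
    # Placeholder records, for illustration only (not data from this study).
    records = {
        "species_a": {1, 2, 4, 5, 7},
        "species_b": {2, 4},
        "species_c": {1, 2, 3, 4, 5, 6, 7},
        "species_d": {6},
    }
    for number, share in practice_frequency(records).items():
        print(f"P{number} ({PRACTICES[number]}): {share:.0%}")
    print(species_with_at_least(records, 5))
```

Keeping the literature and field records in two separate tables of this form would allow the same functions to be applied to each source, so that the two sets of frequencies can be compared directly.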

# Multi-species Patches of Useful Plants

We found multiple forest patches of useful species surrounding the 30 contemporary villages visited in Amazonia (**Figure 3**). In total, people cited 35 patches with different names, corresponding to 38 useful species (Supplementary Table 1). The most common patches were açaízal (E. precatoria), bacabal (O. bacaba), castanhal (B. excelsa), piquiázal (C. villosum), patauázal (O. bataua), and uxízal (E. uchi) (**Figure 4**). Most patches are common in more than one of the sub-basins visited, and a few are common in only one; examples of such localized patches are cf. Neoxythece elegans in the lower Madeira River basin, Duguetia stenantha in the upper Solimões River basin, H. balsamifera in the upper Negro River basin, and Hymenaea parvifolia in the lower Tapajós River basin. Detailed information on the regional differences of forest patches across Amazonia is given in Supplementary Tables 1 and 3. Of all species that dominate the patches, 90% are used for more than one purpose (**Figure 3**).

Although forest patches are dominated by the one species after which they are named, they concentrate multiple useful species that dominate forest patches in different sub-basins of Brazilian Amazonia (**Table 2**). We visited 21 patches that are dominated by 14 of the 38 useful species that form patches across the basin. Palm species of the genus Oenocarpus occur in 75% of the 21 forest patches visited across the basin. We found regional differences in the composition of useful palm species that occur in the forest patches: A. maripa was found in most patches of the Madeira River basin, E. precatoria in most patches of the Solimões River basin, O. bataua in most patches of the Negro River basin and O. distichus in most patches of the Tapajós River basin. Forest patches dominated by B. excelsa are the most common and the most diverse patches: they concentrate 5–8 useful species that are also dominant species in other forest patches in different parts of the basin (**Figure 4** and **Table 2**). In total, 87 useful species were cited in the patches visited (Supplementary Table 3) and the number of useful species cited increases with the number of patches visited (Supplementary Figure 1).

Most patches are small in size (less than 1 km<sup>2</sup>), and occur at various distances from archaeological sites (0–40 km), implying that they may have originated from all spatial contexts: settlements, old swiddens, or old-growth forests (**Figure 5**). Few patches are restricted to TPI sites and old villages. Half of all patches are located up to 1 km from the archaeological sites, although some patches can be found up to 40 km away from these sites (**Figure 5** and Supplementary Figure 2). As a common pattern and according to local people, patches dominated by useful palm species are more common in valley forests, whereas patches dominated by tree species occur commonly in other environmental settings, such as plateau forests and white-sand forests (campinaranas).

patches presented in the figures are based on local knowledge descriptions and local drawings. See Supplementary Table 1 for more information about the forest patches presented in this figure. Archaeological sites are ancient sedentary settlements with anthropogenic soils (TPI) and have been re-occupied by contemporary peoples.

# DISCUSSION

Based on our multidisciplinary approach, we provide a framework for understanding how human practices have led to the formation of patches of useful perennial plant species across Amazonian forests. Our conceptual model portrays how Amazonian peoples manage forests in multiple ways through eight categories of management practices that interfere with natural ecological processes and promote domesticated forests around human settlements. The similarities between ethnographic descriptions of management practices across the basin and our field observations of two villages indicate the commonness of these practices, suggesting that pre-Columbian and contemporary peoples transformed forest composition at varying distances from their settlements by multiple management practices. In the field, we confirmed that multiple diverse patches of useful species, currently managed by indigenous and traditional peoples, occur mainly near these settlements. Overall, our results support the view that these diverse patches of useful plant species were created and maintained by human actions.

Our conceptual model also reflects positive long-term interactions between humans and plants (Smith, 2011), as described in other tropical regions worldwide (Wiersum, 1997a; Michon, 2005; Kennedy, 2012; Reis et al., 2014; Boivin et al., 2016; Roberts et al., 2017). Previous models had suggested that the plant and forest domestication processes are associated with the cultivation of domesticated tree crops (Wiersum, 1997a,b). Although our model is inspired by previous studies (Harris, 1989; Wiersum, 1997a,b), we present a new framework to understand the domestication of Amazonian forests that simplifies the complex network of interactions between human actions and natural ecological processes. Because these interactions cannot be understood by separately assessing only individual management practices or species, the intricate groups of management practices shown in our model illustrate how multiple human actions interact to shape Amazonian forests. Species-specific details are scattered in the literature, and here we synthesized this information into a single model that can be tested with individual site-specific situations.

In our model, forest domestication is defined as an open-ended process (Rival, 2007; Kennedy, 2012), in which domesticated forests can originate through varying degrees of human intervention from settlements and swiddens, and also from old-growth forests. This perspective makes the typical distinction between hunter-gatherers vs. farming groups inappropriate for the Amazonian context (Terrell et al., 2003; Kennedy, 2012), as most ancient Native Amazonians (often characterized as hunter-gatherers) were actually practicing many activities, including planting tree species (Frikel, 1978). Amazonian forests that were once cultivated and domesticated are often transformed into swiddens or settlements as a cyclic pattern that has also been observed in Indonesian forests (Michon, 2005). Because early successional species usually depend on forest gaps for recruiting, they are maintained with management practices, similar to fully domesticated plant populations that require human care for survival and reproduction (Clement, 1999).

January 2018 | Volume 5 | Article 171

#### Levis et al. Amazonian Forest Domestication

TABLE 2 | List of useful species that occur in the 21 forest patches visited during

review), (settlements, swiddens or old-growth forests) are described in this table. Use category: (F) Food, (C) Construction, (T) Thatch, (Fu) Fuel, (M) Medicinal, (Ma) Manufacturing or Technology, (Co) Commerce, (A) Attractive for game, (Af) Animal food, (R) Ritualistic, and (O) Other. Management practices: (1) removal of non-useful plants, (2) protection of useful plants, (3) attraction of non-human dispersers of useful plants, (4) human transportation of useful plants, (5) selection of phenotypes useful to humans, (6) fire management, (7) planting, and (8) soil improvement. See Supplementary Table 3 for the complete scientific name of all species.

Although it is likely that current management practices maintain the legacy of past societies (Junqueira et al., 2017), the effects of past forest domestication have been detected in forests even without recent management activities (Van Gemerden et al., 2003; Dambrine et al., 2007; Ross, 2011; Levis et al., 2017b). The persistent effect of pre-Columbian plant domestication on modern forest composition has been revealed in Amazonian old-growth forests (Junqueira et al., 2017; Levis et al., 2017b), secondary forests (Junqueira et al., 2010) and even in highly dynamic homegardens growing in archaeological sites (Lins et al., 2015). Domesticated species adapted to stable soil conditions created by management practices, such as TPI, may persist for a long time after abandonment (Quintero-Vallejo et al., 2015). This may explain why domesticated palms dominate modern forests growing on pre-Columbian mounds, anthropogenic soils and geoglyphs abandoned more than 400 years ago (Erickson and Balée, 2006; Quintero-Vallejo et al., 2015; Watling et al., 2017b). Another possible explanation for this persistence is the continuous recruitment of useful and domesticated plants present in the forest seed bank (Lins et al., 2015). Pre-Columbian peoples may also have played a major role in disseminating large multi-seeded fruits within and across Neotropical biomes during the Holocene, resulting in the spread of diverse patches of useful plants associated with human settlements and trails (Guix, 2005). Human-mediated dispersal of invasive plants is well-documented (Hodkinson and Thompson, 1997; Nathan et al., 2008); however, ecological studies frequently overlook this mechanism when considering native species (Levis et al., 2017a).

Modern Amazonian peoples who live on pre-Columbian settlements seem to have inherited indigenous knowledge, including these management practices that benefit useful and domesticated plant populations. Our field data show that most useful species dominant in forest patches occur in more than one sub-basin visited, suggesting a widespread use and management of forest resources by past and contemporary peoples. The forest domestication process was assimilated by contemporary societies through the transmission of indigenous knowledge from one generation to another, as described for indigenous groups from Ecuadorian Amazonia (Zurita-Benavides et al., 2016) and traditional people in Brazilian Amazonia (Alves et al., 2016). Villages with homegardens that were occupied by several pre-Columbian cultures contain a higher beta diversity of useful plants compared to villages with homegardens occupied by a single culture (Lins et al., 2015), suggesting that previously existing useful plants were incorporated into new agroforestry systems when old villages are re-occupied (Miller and Nair, 2006). Some practices, however, have changed in intensity and extension through time. Slash-and-burn agriculture, for instance, has increased since the arrival of European societies that introduced metal tools to cut down the forest (Denevan, 2001). In pre-Columbian times, sedentary societies frequently improved soil conditions by managing fire in their habitation and cultivation zones (Denevan, 2001; Neves et al., 2003; Woods et al., 2013). Sedentary societies with high human population densities were responsible for the formation of anthropogenic soils that are no longer being created on a broad scale (Neves et al., 2003). These same anthropogenic soils, however, are widely used by modern societies to cultivate crops, allowing the diversification and intensification of food production in Amazonia (Woods et al., 2013; Junqueira et al., 2016b).

species because people couldn't determine the location of these patches in the maps we used.

Amazonian societies managed fire, planted useful species and improved soils, which resulted in substantial transformations of forests close to their homes. Although some scholars argue for a localized impact involving these three practices in pre-Columbian Amazonia, associating them with the margins of the main rivers (McMichael et al., 2012, 2014; Bush et al., 2015; Piperno et al., 2015), the impact of long-term management practices has been detected in the forests of interfluvial areas (Levis et al., 2012; Franco-Moraes, 2016; Watling et al., 2017b) and across the Amazon basin (Levis et al., 2017b). These findings suggest that even in remote areas, far from known archaeological sites, contemporary people also manage the forest, protecting useful species and removing the non-useful, which are the most frequent practices reported by contemporary societies. Logistical limitations constrain our ability to detect the long-term effects of these practices away from current human settlements (Stahl, 2015), and even the participatory techniques used in this study are based on current knowledge about the forest, requiring ethnographic projection to infer the impact of past peoples. For instance, patches of rubber tree (Hevea brasiliensis) have been managed by modern societies driven by economic interest since the mid-nineteenth century (Schroth et al., 2003), but were probably managed differently before that time. Although several socio-economic factors push contemporary peoples to concentrate their activities on market-oriented forest resources (Jakovac et al., 2016b), they occasionally use and manage forest patches located up to 40 km from their villages for hunting animals and gathering fruits (**Figure 5**; Franco-Moraes, 2016). As an alternative approach, the abundance and richness of useful plants, especially of domesticated species, might be used to predict the location of ancient human settlements in these remote Amazonian areas (Levis et al., 2017b).

Future multidisciplinary studies that combine alternative methods may help to reconstruct forest composition dynamics (Stahl, 2015), as Watling et al. (2017b) did in the geoglyph region of Acre, revealing more details of the influence of past peoples in Amazonian forests. The integration of paleoecology, archaeology, archaeobotany and forest ecology is a promising combination (Mayle and Iriarte, 2014; Iriarte, 2016; Watling et al., 2017a,b). In southwestern Amazonia, archaeobotanical remains have revealed that past peoples consumed a rich diet, including many palm fruits (Dickau et al., 2012). The increase in palm abundance is also visible in soil profiles of archaeological sites across the region (McMichael et al., 2015; Watling et al., 2017b), suggesting that past societies enriched the forest with useful palms to improve food production. Today, useful and domesticated palms are dominant in southwestern Amazonian forests (Levis et al., 2017b), growing on abandoned pre-Columbian mounds, anthropogenic soils and geoglyphs created by past management practices (Erickson and Balée, 2006; Quintero-Vallejo et al., 2015; Levis et al., 2017b; Watling et al., 2017b). Many palm species were found in most of the forest patches investigated here, suggesting long-term human management. Regional contrasts in palm and other plant species composition across Amazonia may reveal different human practices or specific environmental conditions that should be investigated in detail.

We conclude that our literature review, conceptual model and field results help explain how domesticated forests were formed in Amazonia, in part by revealing how integrated categories of management practices interfere with natural ecological processes that shape plant communities in tropical forests. Different degrees and types of management, cultural preferences and environmental conditions may lead to a wide variety of outcomes and explain why diverse combinations of useful species were found in Amazonian forest patches. Insights from agroforestry systems in tropical and sub-tropical regions confirm that indigenous management practices have been used worldwide to domesticate plant species and entire forest landscapes (Wiersum, 1997a,b; Michon, 2005; Kennedy, 2012; Reis et al., 2014). Learning about indigenous knowledge of forest management is important not only to understand the plant and landscape domestication processes, but also to guide policies for forest conservation, local people's empowerment, and food production (Michon et al., 2007; Roberts et al., 2017). In Amazonia today, millions of people live in rural landscapes, with partial dependence on forest resources for their well-being, and with profound local knowledge that should be incorporated in environmental conservation and management plans.

# AUTHOR CONTRIBUTIONS

CL conceived the study; CL, BF, PM, BL, RA, JF-M, JL, and EK collected data; CL, BF, PM, BL, RA, JF-M, JL, EK, FB, MP-C, FC, and CC designed the analyses; CL, BF, PM, BL, RA, JF-M, JL, and EK performed the analyses; CL, BF, PM, BL, RA, JF-M, JL, EK, FB, MP-C, FC, and CC discussed further analyses; CL, BF, PM, BL, RA, JF-M, JL, FB, MP-C, FC, and CC wrote the manuscript.

# FUNDING

Fundação de Amparo a Pesquisa do Estado do Amazonas - FAPEAM Universal proc. no. 3137/2012 and 062.03137/2012; Conselho Nacional de Desenvolvimento Científico e Tecnológico - CNPq Universal proc. no. 473422/2012-3 and 458210/2014-5.

# ACKNOWLEDGMENTS

We thank local residents for their participation, the Instituto de Desenvolvimento Agropecuário e Florestal do Amazonas, the Centro Estadual de Unidades de Conservação do Amazonas, the Instituto Chico Mendes de Conservação da Biodiversidade, the Instituto de Desenvolvimento Sustentável Mamirauá, the Instituto Socioambiental de São Gabriel da Cachoeira, the Cooperativa Mista da Flona do Tapajós and the Federação das Organizações Indígenas do Rio Negro for field assistance, and Sara Deambrozi Coelho for information about the uses of species. CL thanks CNPq for a doctoral scholarship, RA and JL thank INPA and CNPq for research scholarships, JF-M thanks CNPq for a master's scholarship, FC and CC thank CNPq for research fellowships, BF thanks São Paulo Research Foundation (FAPESP) for grant #2016/25086-3. BL thanks FAPEAM for research fellowship and FAPESP for grant #2015/24554-0.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2017.00171/full#supplementary-material

# REFERENCES


Xingu (Brasil). Bol. Mus. Para. Emílio Goeldi. Cienc. Hum. 11, 87–113. doi: 10.1590/1981.81222016000100006


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer BS and handling Editor declared their shared affiliation.

Copyright © 2018 Levis, Flores, Moreira, Luize, Alves, Franco-Moraes, Lins, Konings, Peña-Claros, Bongers, Costa and Clement. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Domesticated Landscapes in Araucaria Forests, Southern Brazil: A Multispecies Local Conservation-by-Use System

Maurício S. Reis<sup>1</sup>\*, Tiago Montagna<sup>1</sup>, Andréa G. Mattos<sup>1</sup>, Samantha Filippon<sup>1</sup>, Ana H. Ladio<sup>2</sup>, Anésio da Cunha Marques<sup>3</sup>, Alex A. Zechini<sup>1</sup>, Nivaldo Peroni<sup>4</sup> and Adelar Mantovani<sup>5</sup>

<sup>1</sup> Núcleo de Pesquisas em Florestas Tropicais, Programa de Pós-Graduação em Recursos Genéticos Vegetais, Universidade Federal de Santa Catarina, Florianópolis, Brazil, <sup>2</sup> Grupo de Etnobiología, INIBIOMA, Universidad Nacional del Comahue, Bariloche, Argentina, <sup>3</sup> Floresta Nacional de Três Barras, Instituto Chico Mendes de Conservação da Biodiversidade, Três Barras, Brazil, <sup>4</sup> Laboratório de Ecologia Humana e Etnobotânica, Departamento de Ecologia e Zoologia, Universidade Federal de Santa Catarina, Florianópolis, Brazil, <sup>5</sup> Centro de Ciências Agroveterinárias, Universidade do Estado de Santa Catarina, Lages, Brazil

### Edited by:

Pan Kaiwen, Chengdu Institute of Biology (CAS), China

#### Reviewed by:

Fernando José Cebola Lidon, Faculdade de Ciências e Tecnologia da Universidade Nova de Lisboa, Portugal

Indira Devi Puthussery, Kerala Agricultural University, India

> \*Correspondence: Maurício S. Reis msedrez@gmail.com

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Ecology and Evolution

> Received: 29 July 2017 Accepted: 22 January 2018 Published: 07 February 2018

#### Citation:

Reis MS, Montagna T, Mattos AG, Filippon S, Ladio AH, Marques AdC, Zechini AA, Peroni N and Mantovani A (2018) Domesticated Landscapes in Araucaria Forests, Southern Brazil: A Multispecies Local Conservation-by-Use System. Front. Ecol. Evol. 6:11. doi: 10.3389/fevo.2018.00011

Araucaria forest is a sub-tropical component of the Atlantic Forest Biome, occurring between 18 and 30° S latitude, and 500–1,800 m altitude in Southern and Southwestern Brazil and Northwestern Argentina. In recent history (twentieth century), this forest has undergone non-sustainable exploitation and a reduction in area due to conversion to agricultural and forestry use. However, smallholders continue using several resources from this forest, even adapting management systems. The main system is geared toward the production of yerba mate (Ilex paraguariensis) under araucaria trees (Araucaria angustifolia), which holds economic, cultural, and social relevance for thousands of farmers. Seeking evidence of domestication and conservation of the resources managed in this system, we compared different landscapes on the Northern Plateau of Santa Catarina in Southern Brazil. Focusing on three species, namely yerba mate, araucaria and caraguatá (Bromelia antiacantha), we characterized management practices (interviews and guided tours), demographic structure (permanent plots in farming zones and in a protected area), and genetic diversity in populations from the same places. Demographic structure and genetic diversity from different landscapes were compared to evaluate the system's potential for conservation. The results indicated that the three species are intentionally promoted with practices of protection, transplanting and/or selection, in different ways and with different functions (caraguatá hedges, yerba mate harvesting, and collection of pinhões, the seed-like nuts of araucaria).
Landscapes are managed for yerba mate harvesting and cattle raising, for both economic and cultural reasons, with a consequent reduction in the density of most other plant species. In all cases the genetic diversity of the species was high for most of the sampled properties, and the set of farmers' populations did not differ from the protected area. The set of populations of each species operates as a metapopulation and local management practices contribute to conservation. Thus, the farmers' management systems and practices maintain the landscape with productive forest fragments, favoring the conservation-by-use of these species. The system requires these management practices, which bring about changes in various species and are motivated by cultural and economic factors, in order to maintain the landscapes domesticated.

Keywords: cultural landscapes, Araucaria angustifolia, Ilex paraguariensis, Bromelia antiacantha, genetic diversity, ethnobotany, local management system, non-timber forest products

# INTRODUCTION

Recent literature has presented several examples of a human signature imprinted on the landscape, reflecting cultural strength in the prehistoric/historical determination of composition and structure of forest formations (Denevan, 2001; Balée, 2006; Bitencourt and Krauspenhar, 2006; Balick, 2007; Behling and Pillar, 2007; Reis et al., 2014; Clement et al., 2015; Iriarte et al., 2017; Levis et al., 2017; Roberts et al., 2017). The example of such human action with the greatest accumulation of evidence is the Amazonian Dark Earth (Terra Preta de Índio). Several studies have revealed that Amazonian Dark Earth is a cultural mark in forest formations that are sometimes considered pristine (Clement and Junqueira, 2010), a mark perceived since the arrival of European botanists in the Americas (Clement and Junqueira, 2010; Clement et al., 2015; Levis et al., 2017).

These advances encourage a new perspective on the dynamics of supposedly natural ecosystems and the ecology of species, as well as possibilities for conservation and use. In particular, two aspects emerge from this new vision:


Thus, domestication systems involving different intensities of use/intervention initiated in the past remain in the present, associated with the distinct needs and perspectives of different peoples and cultures. Management practices entailing protection, promotion and selection are present in today's "extractivism" systems, and imply effective adjustments of landscapes and populations to the dimensions of current use (see Wiersum, 1996, 1997b; Martins, 2005; Emperaire and Peroni, 2007; Miranda and Hanazaki, 2008; Capparelli et al., 2011; Parra et al., 2012; Dawson et al., 2013; Steenbock and Reis, 2013; Roberts et al., 2017).

Plant domestication in forest environments begins with a process mediated by humans which favors the occurrence of the chosen species to the detriment of others (Wiersum, 1997a). Later, in the same environment, the planting of selected species is common (Wiersum, 1997a). Therefore, from the conceptual reference of Clement (1999), domestication can be understood as a process acting on the landscape as a result of human intervention and influenced by cultural background. Thus, landscapes are produced where the degree of human interference can lead to different forest structures and different genotypic and phenotypic frequencies in plant populations of interest, some also being affected negatively. In the same landscape humans can target one or several resources, at different intensities and in different microenvironments, thus producing cultural landscapes (sensu Berkes et al., 2000).

These cultural landscapes (sensu Berkes et al., 2000; Berkes and Folke, 2002; Berkes and Turner, 2006; Ladio, 2011) are more common than previously imagined, and this perception brings the need for a change in posture regarding the possibility of conservation and use. The human presence and its cultural values can be determinant in the landscape perceived today, and favor both conservation and use at the same time. Furthermore, several recent studies (Heywood et al., 2007; Clement et al., 2010; Capparelli et al., 2011; Shepard and Ramirez, 2011; Steenbock et al., 2011; Parra et al., 2012; Dawson et al., 2013; Steenbock and Reis, 2013) demonstrate the importance of management systems and local practices associated with obtaining resources from forest environments, as discussed by Roberts et al. (2017). These studies have been carried out with a view to understanding different forms of landscape domestication, or in the sense of recognizing and valuing this strategy for obtaining income, and the diversification of family farming (Kubo et al., 2008; Vieira-da-Silva and Reis, 2009; Vieira-da-Silva and Miguel, 2014).

The main innovation of these kinds of studies is a more realistic understanding of how the composition of species and the structure of landscapes are determined. This understanding encompasses the active presence of humans and their interests, allowing more consistent actions for conservation and use of landscapes, including public policies (see Dawson et al., 2013; Roberts et al., 2017). Studies with this emphasis are recent and scarce in the Brazilian Atlantic Rainforest, despite its importance for the sustainable use of species domesticated by the native people, and for the establishment of public policies for conservation and use of biodiversity. In this context the Araucaria forest (Ombrophilous Mixed Forest), a forest formation of the Atlantic Rainforest, has been the object of several studies (see Reis et al., 2010; Zechini et al., 2012; Steenbock and Reis, 2013; Vieira-da-Silva and Miguel, 2014; Adan et al., 2016).

Before the arrival of Europeans in South America, the Araucaria forest occupied an estimated area of 200,000 km<sup>2</sup> in Brazil and Argentina (Hueck, 1972). However, since the beginning of the twentieth century this ecosystem has suffered more than 100 years of wood exploitation (Trajano, 1996; Nodari and de Carvalho, 2010), intensive agriculture and livestock production, and the effects of habitat fragmentation (Guerra et al., 2002; Ribeiro et al., 2009). This has left the Araucaria forest restricted to only 12% of its original area in Brazil (Ribeiro et al., 2009), and prompted several legal restrictions on its use.

However, smallholders (such as family farmers) continue to use several resources from this forest, even adapting management systems. The main system is geared toward the production of yerba mate (Ilex paraguariensis) under araucaria trees (Araucaria angustifolia), known as the native yerba mate system (NYMS), which holds economic, cultural, and social relevance for thousands of farmers (Marques, 2014). For instance, in 2014 alone, more than 8,700 tons of pinhões (seed-like nuts from A. angustifolia) were gathered in Brazil, representing a total product value of R\$ 19,325,000.00 (IBGE, 2014; at that time over US\$ 8.2 million).

According to the Brazilian Institute of Geography and Statistics (IBGE, 2014), in 2014 Brazil produced 333,017 tons of native yerba mate, and it was the main non-timber product in terms of production volume (IBGE, 2014). Production of native yerba mate is carried out in 5,150 establishments (predominately family farmers) in Santa Catarina State (EPAGRI/CEPA, 2016).

In this context we investigated landscapes with NYMS, characterizing the principal practices and motivation of family farmers, seeking evidence of landscape domestication and evaluating whether the system conserves the principal resources managed.

Our study has focused on three species in the NYMS: yerba mate (I. paraguariensis), araucaria (A. angustifolia), and caraguatá (Bromelia antiacantha). If on the one hand these species represent significant resources for the subsistence of family farmers, on the other hand they are interesting case studies to exemplify how these farmers play an active role in the conservation of plant populations. In addition, in order to evaluate whether the NYMS can conserve genetic diversity of the studied species, we analyzed genetic diversity in the yerba mate, araucaria and caraguatá populations from NYMS and from a protected area.

# MATERIALS AND METHODS

# Study Area

The study was conducted in southern Brazil, on the Northern Plateau of Santa Catarina state. Plant population and ethnobotanical studies were carried out on farmers' properties situated in five different municipalities, and in a Protected Area (PA; Três Barras National Forest) (**Figure 1**). The climate in this region is described as subtropical humid (Cfa), according to Köppen's classification, with mean annual temperature ranging from 15.5 to 17 °C, and annual precipitation between 1,360 and 1,670 mm (EPAGRI, 2001). The soil in all sampled plots (further described below) is classified as a sedimentary, nutrient-poor, deep, red latosol; the relief is gently undulating and the mean altitude is 750 m (Santa-Catarina, 1986).

The Northern Plateau of Santa Catarina covers 10,466.70 km<sup>2</sup>, is composed of 14 municipalities and has an average HDI of 0.79 (IBGE, 2006). The total population of the territory is 351,332 inhabitants, of which 23.76% live in rural areas (IBGE, 2014). The Northern Plateau produces 27.2% of Santa Catarina's native yerba mate, and 38.33% (1,974) of the state's productive rural properties that grow native yerba mate are situated here (EPAGRI/CEPA, 2016).

# Study Species

Araucaria angustifolia, popularly known as Paraná pine, pinheiro-do-paraná, or pinheiro-brasileiro, is a conifer that has been intensively exploited since the beginning of the twentieth century for wood production (Guerra et al., 2002; Nodari and de Carvalho, 2010), and is listed as critically endangered by the IUCN (IUCN, 2017). The species is dioecious and wind pollinated with a reproductive cycle of 2–3 years (Mantovani et al., 2004); it covered approximately 200,000 km<sup>2</sup> in southern Brazil and northern Argentina (Reitz and Klein, 1966; Mattos, 1994; Guerra et al., 2002). Nowadays this species is economically and culturally important (Vieira-da-Silva and Reis, 2009; Assis et al., 2010; Zechini et al., 2012; Reis et al., 2014; Adan et al., 2016), especially for the production of seeds (pinhões), which are used as food and provide a source of income for many family farmers in southern Brazil (Guerra et al., 2002; Vieira-da-Silva and Reis, 2009; Vieira-da-Silva et al., 2011; Adan et al., 2016).

Ilex paraguariensis is a tree species native to South America and popularly known as yerba mate, erva-mate or mate (Edwin and Reitz, 1967). The leaves are used to make a traditional tea-like beverage (chimarrão) in Brazil, Argentina, Uruguay, and Chile, and it is also a medicinal plant. It is a dioecious species with entomophilous pollination and zoochorous dispersal (mainly by birds) (Ferreira et al., 1983; Pires et al., 2014; Mattos, 2015). In Brazil it is the main non-timber product (in tons) from extractive activity (IBGE, 2014), with great economic and social importance in southern Brazil, contributing to the conservation of forest remnants through NYMS (ervais, that is, landscapes with yerba mate) (Marques, 2014; Mattos, 2015). The NYMS, however, presents a great diversity of situations, due to the different management methods and meanings these systems can have for the farmers, resulting in different landscapes (Marques, 2014; Mattos, 2015).

Bromelia antiacantha (caraguatá) is a bromeliad native to the Atlantic Forest, and occurs in different forest formations, including Araucaria forest (Reitz, 1983). This plant has attracted the attention of researchers (Santos et al., 2004; Andrighetti-Fröhner et al., 2005; Brehmer, 2005; Duarte et al., 2007; Filippon et al., 2012a,b) due to its medicinal, ornamental and industrial potential. Studies show that the use of B. antiacantha forms part of the history of Northern Plateau communities (Duarte et al., 2007; Filippon et al., 2012a,b). The caraguatá is found in Araucaria forest fragments (Hanisch et al., 2006, 2010; Mattos, 2011; Mello, 2013), areas where cattle graze, areas where yerba mate is grown and in hedges (Filippon, 2009).

# Field and Laboratory Methodology

This study integrates different field methodologies, including interviews with family farmers, as well as demographic and genetic studies of the above-mentioned species in a Protected Area and on farmers' properties. In all cases the plant populations sampled were the same as in the demographic and ethnobotanical studies. Below we detail each of the analyses:

### Ethnobotanical Studies

Ethnobotanical studies were conducted through semi-structured interviews with family farmers who employ the NYMS on their properties (eight interviews focused on A. angustifolia; 93 on I. paraguariensis and 41 on B. antiacantha), which sought to identify and develop questions directly related to management of the species and local use. Before each interview we obtained prior informed consent in accordance with the code of ethics of the International Society of Ethnobiology. The interviews addressed topics associated with the naming, use and management practices employed for the three species. Guided tours were also conducted (Albuquerque et al., 2008) with the farmers to describe the management systems and practices employed in each area. We intentionally selected the interviewees, considering only farmers who use and/or manage the target species. Interviewees were contacted through the snowball method described by Bailey (1994). The results are presented in a descriptive interpretative form, based on interviews and observations, and expressed as percentage of occurrence or average values with standard deviation, when necessary.

### Demographic Studies

For characterization of the population structure of the three studied species, permanent plots (40 × 40 m) were marked out in forest fragments of farmers' properties, in distinct landscape units, and in the PA (in an area not managed in the last 50 years). All the trees and shrubs were marked and diameter at breast height (DBH) was measured. For the analysis we determined the density of I. paraguariensis, A. angustifolia and all species of the Myrtaceae family, considered the key structuring species (Assis et al., 2010; Mello and Peroni, 2015).

For the demographic characterization of I. paraguariensis and A. angustifolia, sampling was carried out in two different situations on the farmers' properties, according to the local management mentioned in the interviews, and in the Protected Area:


Additionally, for A. angustifolia, one sample of 60 female plants, 30 on farmers' properties and 30 in the PA, was evaluated annually (2010–2012) in terms of number of pinhas (reproductive structures) per plant, and productivity per pinha (only 2010).

For the demographic characterization of B. antiacantha, permanent plots of 20 × 40 m were used in forests and 20 linear meters to study the caraguatá hedges. All rosettes of the species were counted, classified as vegetative or reproductive (considering the presence of inflorescences or infructescences) and leaf length was measured. Bromelia antiacantha populations were studied in the following areas:


The results are expressed as average frequencies estimated per hectare, with respective standard deviations.
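As an illustration of the per-hectare scaling described above, the following minimal sketch converts plot-level counts into densities per hectare and summarizes them across plots. The counts are hypothetical (not study data), the function name `density_per_hectare` is introduced here only for illustration, and the use of the sample standard deviation is an assumption, since the text does not state which estimator was applied.

```python
from statistics import mean, stdev


def density_per_hectare(count, plot_width_m, plot_length_m):
    """Scale a plot-level count to individuals per hectare (10,000 m^2)."""
    plot_area_m2 = plot_width_m * plot_length_m
    return count * 10_000 / plot_area_m2


# Hypothetical counts of one species in three 40 x 40 m plots (not study data).
counts = [80, 96, 112]
densities = [density_per_hectare(c, 40, 40) for c in counts]

mean_density = mean(densities)   # average density across plots
sd_density = stdev(densities)    # sample standard deviation (assumed estimator)

print(f"{mean_density:.1f} +/- {sd_density:.1f} individuals/ha")
```

A 40 × 40 m plot covers 1,600 m<sup>2</sup>, so each individual counted corresponds to 6.25 individuals per hectare; for the 20 × 40 m plots used for B. antiacantha the factor is 12.5.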

### Genetic Characterization

The three studied species were genetically characterized using allozyme markers, in starch gel (13%), following the recommendations of Kephart (1990) and Alfenas (1998). The extraction protocol was the same for all species, macerating fresh leaves with an automatic homogenizer and using extraction solution no. 1 (Alfenas, 1998). The enzymatic systems scored for each species, as well as the buffers, are described in **Table 1**. Three buffers were utilized in the electrophoresis process: Tris Citrate pH 7.5 (Tris 27 g L<sup>−1</sup> and citric acid 16.52 g L<sup>−1</sup>), Morpholine Citrate pH 6.1 (7.68 g L<sup>−1</sup> citric acid) and Histidine pH 8.0 (105.82 g L<sup>−1</sup> sodium citrate tribasic).

Leaf tissues were collected from 50 individuals in each population. For A. angustifolia and I. paraguariensis sampling respected a minimum distance of 50 m between individuals, in order to minimize family structure. For B. antiacantha, adults/mature rosettes (leaves more than 2 m in length) were sampled, respecting at least 15 m distance between them and avoiding the collection of ramets from the same plant.

## Genetic Data Analysis

Based on the genotypes obtained in gel, the following genetic descriptors were estimated: allelic frequencies (Supplementary Table 1), number of alleles per population/group ($\hat{k}$), number of unique alleles per population/group ($\hat{A}_{un}$), mean allelic richness per locus based on the lowest sample size ($\hat{A}_{n}$), observed ($\hat{H}_{O}$) and expected ($\hat{H}_{E}$) heterozygosity, and fixation index ($\hat{f}$). Statistical significance ($p < 0.05$) of $\hat{f}$ was tested by permuting alleles between individuals within populations/groups. All these analyses were performed using the FSTAT program, version 2.9.3.2 (Goudet, 2002).
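To make the descriptors above concrete, the sketch below computes observed heterozygosity, expected heterozygosity and the fixation index for a single locus from diploid genotypes. It is a simplified illustration, not a reproduction of FSTAT: expected heterozygosity is taken as the uncorrected $1 - \sum p_i^2$ and the fixation index as $1 - H_O/H_E$, without the sample-size corrections that dedicated software may apply. The function name `locus_stats` and the genotypes are hypothetical.

```python
from collections import Counter


def locus_stats(genotypes):
    """Return (H_O, H_E, f) for one locus.

    genotypes: list of (allele, allele) tuples, one per diploid individual.
    H_E is the uncorrected 1 - sum(p_i^2); no small-sample correction is applied.
    """
    n = len(genotypes)
    # Observed heterozygosity: share of individuals carrying two different alleles.
    h_obs = sum(a != b for a, b in genotypes) / n
    # Allele frequencies from all 2n gene copies.
    allele_counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = 2 * n
    h_exp = 1.0 - sum((c / total) ** 2 for c in allele_counts.values())
    # Fixation index: relative deficit (positive) or excess (negative) of heterozygotes.
    f = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return h_obs, h_exp, f


# Hypothetical genotypes at one allozyme locus (alleles coded 1 and 2).
example = [(1, 1)] * 3 + [(1, 2)] * 2 + [(2, 2)] * 5
h_obs, h_exp, f = locus_stats(example)
print(h_obs, h_exp, f)
```

In this hypothetical sample the allele frequencies are 0.4 and 0.6, so a positive fixation index reflects fewer heterozygotes than expected under random mating.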

In order to evaluate whether farmers conserve species genetic diversity, all these descriptors were estimated, in each species, for groups of populations inside the PA and on farmers' properties. In the case of B. antiacantha and I. paraguariensis, in order to evaluate possible differences between landscape units as described in the Demographic Studies section, all genetic descriptors were also estimated for groups of populations in each landscape unit.

Possible genetic differences between farmers' properties and the PA, and between landscape units, were tested through 95% confidence intervals: for $\hat{A}_{n}$, by jackknifing the values across loci in the R language (R Development Core Team, 2015), and for $\hat{H}_{E}$ and $\hat{H}_{O}$, through 1,000 bootstraps of individuals within population/group, also in R, using the "PopGenKit" package (Paquette, 2012).
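The two resampling schemes mentioned above can be sketched as follows. This is an illustrative Python analogue under stated assumptions, not the R code or the PopGenKit routines used in the study: the bootstrap interval is a simple percentile interval, the jackknife is applied to the mean of per-locus values, and all function names and data are hypothetical.

```python
import random


def expected_heterozygosity(genotypes):
    """Uncorrected expected heterozygosity (1 - sum p_i^2) at one locus."""
    total = 2 * len(genotypes)
    counts = {}
    for a, b in genotypes:
        counts[a] = counts.get(a, 0) + 1
        counts[b] = counts.get(b, 0) + 1
    return 1.0 - sum((c / total) ** 2 for c in counts.values())


def bootstrap_ci(genotypes, statistic, n_boot=1000, alpha=0.05, seed=0):
    """Percentile interval from resampling individuals with replacement."""
    rng = random.Random(seed)
    n = len(genotypes)
    values = sorted(
        statistic([genotypes[rng.randrange(n)] for _ in range(n)])
        for _ in range(n_boot)
    )
    lower = values[int((alpha / 2) * n_boot)]
    upper = values[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


def jackknife_over_loci(per_locus_values):
    """Leave-one-locus-out means and the jackknife standard error of the mean."""
    n = len(per_locus_values)
    total = sum(per_locus_values)
    loo_means = [(total - v) / (n - 1) for v in per_locus_values]
    centre = sum(loo_means) / n
    variance = (n - 1) / n * sum((m - centre) ** 2 for m in loo_means)
    return loo_means, variance ** 0.5


# Hypothetical population of 50 individuals at one locus (not study data).
population = [(1, 1)] * 15 + [(1, 2)] * 20 + [(2, 2)] * 15
low, high = bootstrap_ci(population, expected_heterozygosity)

# Hypothetical per-locus allelic richness values.
loo_means, se = jackknife_over_loci([1.0, 2.0, 3.0, 4.0])
print(low, high, se)
```

Under this approach, two groups would typically be judged different for a descriptor when their intervals do not overlap; whether the study applied exactly that criterion is an assumption here.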

TABLE 1 | Enzymatic systems used for genetic characterization of Araucaria angustifolia, Bromelia antiacantha, and Ilex paraguariensis populations.


EC, Enzyme commission number; CM, Morpholine Citrate; HIS, Histidine; TC, Tris Citrate. 6PGDH, 6-phosphogluconate dehydrogenase; ACP, acid phosphatase; DIA, diaphorase; GOT, glutamate oxaloacetate transaminase; GTDH, glutamate dehydrogenase; IDH, isocitrate dehydrogenase; LAP, leucyl aminopeptidase; MDH, malate dehydrogenase; ME, malic enzyme; NADH, NADH dehydrogenase; PGI, phosphoglucose isomerase; PGM, phosphoglucomutase; PO, peroxidase; SKDH, shikimate dehydrogenase.

# RESULTS AND DISCUSSION

# Structural Characteristics of Landscapes with NYMS

The forest structure of the landscapes with NYMS in the region is presented in **Table 2**, which also includes information on the total number of species, density of the araucaria and the total number of individuals of Myrtaceae and yerba mate, considered as structuring species of these landscapes (Assis et al., 2010; Mello and Peroni, 2015).

Demographic results varied greatly within landscape units (high standard deviations), showing peculiarities between and within the managed areas of each group (**Table 2**). Yerba mate represented 20.9% of the individuals (588.6 plants/ha); the Myrtaceae group represented 18% (507.0 plants/ha) and araucaria 6.7% (189.4 plants/ha). Together, these species, considered structuring species of the Araucaria forests, represented 45.6% of the existing plants (1,285.0 plants/ha). This result reinforces the perspective that these species are key cultural species in the context of landscapes with management of yerba mate, as mentioned in Assis et al. (2010). The result also demonstrates the high number of other species in the system (see also Mello and Peroni, 2015).
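The reported shares can be cross-checked with simple arithmetic. In the sketch below, the three densities and the combined share of 45.6% are taken from the text, whereas the overall density is not quoted in the text and is therefore back-calculated; it is a derived, approximate value used only for this consistency check.

```python
# Densities reported in the text (plants per hectare).
densities = {"yerba mate": 588.6, "Myrtaceae": 507.0, "araucaria": 189.4}

combined_density = sum(densities.values())   # expected to match 1,285.0 plants/ha
combined_share = 0.456                       # combined share reported in the text

# Derived value (assumption): overall density implied by the combined share.
implied_total = combined_density / combined_share

# Share of each group relative to the implied overall density, in percent.
shares = {name: 100 * d / implied_total for name, d in densities.items()}

for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

The recomputed shares round to the percentages given in the text, which suggests the reported figures are internally consistent.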

The three structuring species (Myrtaceae, araucaria and yerba mate) have a strong historical/prehistoric relationship with the family farmers, caboclos (mestizos), and the indigenous peoples, in terms of food production and other uses in forest environments (Brandt and Campos, 2008; Gerhardt, 2009; Reis et al., 2014). This reinforces the recognition of human contributions to changes in vegetation patterns. Human activity influences species dynamics, which can lead to a new equilibrium and directional changes, or result in vegetation with different characteristics (Wiersum, 1997a; Clement, 1999; Trombulak et al., 2004; Casas et al., 2007; Steenbock et al., 2011).

# Local Perceptions of Landscapes with NYMS

Historically on the Northern Plateau (SC) there is a link between native landscapes with the presence of yerba mate and harvesting practices carried out by farmers (Valentini, 2003; Barreto, 2009; Gerhardt, 2009; Grzebieluka and Sahr, 2009; Schuster and Sahr, 2009). This link should be considered in order to understand how farmers interpret their landscapes and make decisions with regard to their subsistence. These management practices are multi-dimensional and involve actions that are fundamental for maintenance of useful resources. This know-how constitutes part of the people's traditional ecological knowledge (TEK), which has been defined as a cumulative body of knowledge, beliefs and practices developed through adaptive processes, and involves cultural transmission over generations (Berkes et al., 2000).

For all the interviewed farmers the presence of A. angustifolia defines the landscape and holds connotations of regional identity associated with their ancestors. However, these farmers collect pinhão only for their own consumption, not for commercialization, since they consider yerba mate and cattle as the income-generating elements of the system. This differs from other regions, where gathering is carried out primarily for sale (Vieira-da-Silva and Reis, 2009; Vieira-da-Silva et al., 2011; Zechini et al., 2012; Adan et al., 2016).

Most of the informants (57%) referred to the forest environment with yerba mate harvesting using local terms, reflecting the close connection between farmers and their forest area. This form of naming consists of identification and classification of distinctive components of the managed environments. It contains empirical observations and information about the behavior and abundance of flora and domestic fauna. Farmers relied on such knowledge to ensure a supply of food, medicines and other subsistence resources. They mentioned four names: caíva (51.9%), potreiro (16.7%), invernada (16.7%), and mato nativo (14.7%). Caívas are ecosystems containing remnants of native Araucaria forests, with different levels of forest structure and grazing in the herbaceous stratum; pastures may be natural or naturalized (Marques et al., 2008; Reis et al., 2013). This landscape has been created due to the permanence, for decades, of a traditional productive system combining grazing of the herbaceous stratum with yerba mate and firewood extraction in the understory (Marques et al., 2008; Reis et al., 2013; Marques, 2014). Thus, the presence of cattle in many areas is a determining factor of this system, due to the value added by the sale of livestock. Mato nativo, potreiro and invernada are areas with some degree of Araucaria forest cover and yerba mate management; in the first there is no livestock production, in contrast to the other two.

(n), number of 40 × 40 m plots; SD, standard deviation; h, Ilex paraguariensis mean height (m); DBH, I. paraguariensis mean diameter at breast height (cm); Ip, I. paraguariensis mean density (individuals per hectare); Ip r, I. paraguariensis mean regeneration density; Aa, Araucaria angustifolia mean density; My, Myrtaceae individuals mean density; Total, total plants mean density (including all tree and shrub species).

Most of the NYMS areas (ervais) are located in places whose soil and topography give them excellent to good agricultural aptitude and which are consequently under high pressure for conversion into crops. Yerba mate is cited as one of the main reasons for non-conversion in 80% of cases, and the multiple-use forest (timber, firewood, and yerba mate) combined with livestock in 55% of cases.

On the Northern Plateau (SC), yerba mate is managed in several situations; for instance, Marques (2014) identified 13 typologies (including situations with planted yerba mate). However, the main landscape is forest with yerba mate management, which occupies about 80% of the area and in which the caívas stand out.

All farmers considered yerba mate important, mainly because it is a low-risk activity that demands little investment and labor and is associated with landscapes that have different uses. According to all interviewed farmers, the NYMS constitutes a stabilizing element on their property, capable of generating resources for forest areas and linked to strong cultural aspects; the great majority also described it as a pleasurable activity. These aspects reinforce the role of these family farmers as present-day maintainers of Araucaria forest fragments.

An additional relevant aspect is the preference shown by Brazilian consumers for a product (ground tea-like leaves for chimarrão) with a milder flavor (de Oliveira Suertegaray, 2002) from herbs grown in shade (native yerba mate), implying higher prices paid by the industry for the raw material originating from this system (EPAGRI/CEPA, 2017).

In these forest systems, rather than generating significant income in relation to total gross income, yerba mate constitutes an important reserve for 72% of the families, with the characteristics of a savings account, to be used in investments, emergencies and payment of debts.

# Local Management Practices in NYMS

Yerba mate occurs spontaneously in the forest understory, but farmers (98.9%) promote it through removal of certain tree species and periodic mowing of shrub and arborescent vegetation. According to farmers, the presence of animal grazing also contributes to the control of this vegetation. In addition, yerba mate is planted (59.1%) in many situations using seedlings removed from other forest sites (5.4%), produced on the property (23.2%) or even bought from commercial nurseries (71.4%). However, the observed yerba mate natural regeneration is low (**Table 2**), since the farmers tend to prune all the plants without worrying about leaving seed trees.

For araucaria, no practices other than the collection of pinhão were recorded in the interviews. However, farmers recognize at least four varieties of pinhão-producing female plants (population variation as detected by Adan et al., 2016 in another region): São José, Comum, Cayová, and Macaco. The differences are associated with the maturation times of pinhão, an aspect already highlighted in other works (Vieira-da-Silva and Reis, 2009; Zechini et al., 2012; Adan et al., 2016), and they are even classified as botanical varieties (Reitz and Klein, 1966; Mattos, 1994) (see Adan et al., 2016 for further discussion).

The average cone yield, evaluated from a sample of 30 trees on farms, was 11.1 pinhas/tree between 2010 and 2012, with 3.6, 8.5, and 21.1 pinhas/tree for the years 2010, 2011, and 2012 respectively. Average pinhão productivity over the years was 71.1 kg/ha, with 23.0 kg/ha in 2010, 54.9 kg/ha in 2011 and 135.3 kg/ha in 2012. In the PA sample, observations were made in 2011 and 2012, and the average production for these years was 17.5 pinhas/tree, with averages of 11.8 in 2011 and 23.2 pinhas/tree in 2012. The average productivity of 2011 and 2012 was 333.0 kg/ha, varying from 224.3 kg/ha in 2011 to 441.7 kg/ha in 2012.
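The multi-year figures above are consistent with simple unweighted means of the yearly values. The following minimal sketch recomputes them from the numbers quoted in the text; the variable and function names are illustrative only, and the assumption that the authors averaged yearly values without weighting is ours, not stated in the source.

```python
from statistics import mean

# Yearly values as quoted in the text (farms: 2010-2012; PA: 2011-2012).
farm_cones_per_tree = {2010: 3.6, 2011: 8.5, 2012: 21.1}
farm_kg_per_ha = {2010: 23.0, 2011: 54.9, 2012: 135.3}
pa_cones_per_tree = {2011: 11.8, 2012: 23.2}
pa_kg_per_ha = {2011: 224.3, 2012: 441.7}


def multi_year_mean(yearly):
    """Unweighted mean across years, rounded to one decimal place.

    Assumption: years are weighted equally, which matches the reported
    averages but is not stated explicitly in the text.
    """
    return round(mean(yearly.values()), 1)


if __name__ == "__main__":
    for label, series in [
        ("farms, pinhas/tree", farm_cones_per_tree),
        ("farms, kg/ha", farm_kg_per_ha),
        ("PA, pinhas/tree", pa_cones_per_tree),
        ("PA, kg/ha", pa_kg_per_ha),
    ]:
        print(f"{label}: {multi_year_mean(series)}")
```

Because the PA series covers only 2011 and 2012, a like-for-like comparison with farms would restrict the farm series to the same two years.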

Pinhão productivity per hectare estimated for all populations was higher than that found by Vieira-da-Silva and Reis (2009) in 2006 (44.3 kg/ha) and by Mantovani et al. (2004) in 2001, and similar to that found by these authors in 2002 (117 and 160 kg/ha, respectively). In general, differences in productivity are attributed to the number of female individuals per hectare and the degree of forest evolution (Vieira-da-Silva and Reis, 2009; Figueiredo-Filho et al., 2011). The number of female plants producing pinhas varied between years; considering the entire period evaluated, the average number of female plants in production was 13.9 on farms, while in PA this average was 29.6.

In economic terms, forests located in certain areas would yield (at the average price paid by regional markets of R\$ 2.00/kg of pinhão) on average, R\$ 142.00 per hectare, from commercialization of the pinhão. Although pinhão is less economically attractive than yerba mate in the region of Três Barras (Zechini et al., 2012), this resource represents an important source of income for many rural families in the state of Santa Catarina (Vieira-da-Silva and Reis, 2009; Adan et al., 2016).
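The per-hectare value quoted above follows from multiplying productivity by price. A minimal sketch of that arithmetic, using the farm figures and the R\$ 2.00/kg price given in the text; the function name is hypothetical, and the result is gross revenue only, since the text reports no harvesting costs.

```python
def gross_revenue_per_ha(yield_kg_per_ha, price_per_kg):
    """Gross revenue per hectare (R$/ha) = productivity (kg/ha) x price (R$/kg)."""
    return yield_kg_per_ha * price_per_kg


PRICE_PER_KG = 2.00  # average regional price quoted in the text (R$/kg)

# Farm productivity by year and for the whole period, as quoted in the text.
farm_kg_per_ha = {"2010": 23.0, "2011": 54.9, "2012": 135.3, "2010-2012": 71.1}

revenue = {
    period: round(gross_revenue_per_ha(kg, PRICE_PER_KG), 2)
    for period, kg in farm_kg_per_ha.items()
}

if __name__ == "__main__":
    for period, value in revenue.items():
        print(f"{period}: R$ {value:.2f}/ha")
```

The period mean gives about R\$ 142 per hectare, while the yearly values range from roughly R\$ 46 to R\$ 271 per hectare, which illustrates the between-year unpredictability discussed in the text.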

An important aspect is the great variation in pinhão productivity between years, as already mentioned by Mattos (1994), Mantovani et al. (2004), and Vieira-da-Silva and Reis (2009), and even between relatively close sites (Zechini et al., 2012). Farmers also cite this unpredictability of production as an argument for the lower value of pinhão as a trade product in the region.

Considering the management practices carried out by farmers, the population structure of yerba mate and structure of the forest fragments (considering the set of structuring species—araucaria, yerba mate and several Myrtaceae), two main landscapes were detected:


In both cases there are practices for protection of yerba mate plants, including possible transplanting and enrichment with plants from the same forest fragment or seedlings produced by farmers.

# Caraguatá Uses and Management in the NYMS

Caraguatá has been used for a long time in the region, for various purposes (Filippon, 2014). The fruits of this plant are used for medicinal treatment of, for example, pulmonary diseases (bronchitis/asthma) and influenza (Filippon, 2009; Filippon et al., 2012a). Uses of the species for making liquor and jelly were also cited in Filippon (2014) but the most frequently cited use (60%) was for hedges.

Caraguatá hedges were very common in the region from 1900 to 1960. According to farmers, this kind of fencing was used especially for pig farming; animals cannot pass through the hedge due to the high density of rosettes and the thorns on their leaves. Caraguatá hedges are also used for other kinds of livestock, such as sheep, horses and cattle. Nowadays this type of pig breeding is no longer common; however, the hedges still remain on the properties. Some farmers still plant new caraguatá hedges, following tradition and taking advantage of the efficiency of this type of fencing to hold animals in and establish property boundaries.

The density of B. antiacantha rosettes varies between studied areas. The unmanaged area presented the greatest density of rosettes and reproductive rosettes per hectare when compared to the managed areas (**Table 3**). Hedges possess, on average, the highest population density of all the sampled areas (316 times greater than that of the population in the unmanaged area), and also the highest number of the reproductive rosettes (**Table 3**).

Ramet proportions were high in all three situations; however, among these, the hedges stood out with up to 97% ramets (**Table 3**). In comparison to managed areas and hedges, the unmanaged area showed the highest average densities of genets per hectare. There was also variation in the proportion of reproductive rosettes. This variation is reflected in all landscape units including the hedges; that is, when the number of reproductive individuals was low in managed and unmanaged areas, it was also low in hedges (**Table 3**).

TABLE 3 | Rosette density (per hectare) of Bromelia antiacantha in unmanaged, managed, and hedge populations.


average values ± confidence interval; α = 0.05. \*Number of 20 × 40 m permanent plots; \*\*number of samples of "20 linear meters."

The caraguatá hedges were, and still are, made with rosettes harvested from the Araucaria forest fragments, where I. paraguariensis is extracted and where cattle graze. Sometimes the ramets used in hedges are donations from neighbors who want to "clean" the area to improve pasture for cattle, or want better conditions for working with yerba mate, since the caraguatá thorns make the farmers' work more difficult.

On these occasions the number of rosettes removed is generally high, and the farmers who receive them have an abundance of ramets to plant, so they can generally choose the strongest, lushest and youngest rosettes (0.8–1 m leaf length). This intention of making a hedge and the selection of rosettes allows us to highlight a domestication process. These management practices (choosing rosettes, building, and hedge maintenance) employed by local farmers generate a change in landscape, tending toward greater productivity and convenience.

In these areas caraguatá is not the main focus, but a means of adaptation of the landscape to facilitate the development of activities that generate income for the property, such as the harvesting of yerba mate and cattle grazing. Thus, from this perspective, management (mowing) of caraguatá can be seen as a consequence of this domesticated landscape for the production of yerba mate and/or cattle grazing.

# Genetic Diversity and Conservation on Farms

Genetic descriptors for groups of populations in PA or on farmers' properties (the same for the three species) are presented in **Table 4**. In general, the data indicate that genetic diversity of the three species is being maintained on the farmers' properties at levels similar to those in the PA. This result can be interpreted as arising from the common effect of past overexploitation; however, it is also an indication that the medium-term possibilities of maintaining diversity are similar in both situations, for the three species, reinforcing the possibility of conservation by use.

The mean expected heterozygosity (genetic diversity, $\hat{H}_E$) estimated for A. angustifolia varied between 0.079 and 0.072 and did not differ in the four situations evaluated (**Table 4**). Thirty-one distinct alleles were detected, ranging from 23 to 26 in the different situations, with unique alleles present in both PA (2) and farmers' properties (3). It should be noted that the number of alleles found in PA (26) was higher, on average, than that found on each of the properties; however, the number found for the set of properties (28) was higher, including 3 unique alleles. In all situations the fixation index ($\hat{f}$) was not different from zero (**Table 4**).

The estimated values were lower than the mean values obtained in a large study in SC (Reis et al., 2012), which evaluated 31 A. angustifolia populations (13 allozymic loci). The values found were lower than the SC average for $\hat{H}_E$ (0.124) and total number of alleles (51), but similar in terms of the regional (Northern Plateau) $\hat{H}_E$ mean (0.104) and total number of alleles detected (30). Thus, in general, for A. angustifolia, there were no differences between the diversity maintained in PA and on farmers' properties. In all the studied situations for A. angustifolia on agricultural properties, the practices associated with the species are similar, involving maintenance of the adult individuals and cone collection.

TABLE 4 | Genetic descriptors estimated for Araucaria angustifolia (A.a), Ilex paraguariensis (I.p), and Bromelia antiacantha (B.a) populations grouped per occurrence in Protected Area (PA) or on Farmers' Properties.

n, sample size; $\hat{k}$, number of alleles per group; $\hat{A}_{un}$, number of unique alleles per group; $\hat{A}_n$, mean allelic richness per locus based on the lowest sample size (51 for A.a, 50 for I.p and 41 for B.a); $\hat{H}_O$, observed heterozygosity; $\hat{H}_E$, expected heterozygosity; $\hat{f}$, fixation index; confidence intervals (95%) are shown in parentheses; \*p < 0.05.

For yerba mate populations, the mean $\hat{H}_E$ ranged from 0.255 to 0.243, with no difference between the situations evaluated (**Table 4**). For the set of populations 35 different alleles were detected, ranging from 24 to 31 in the different situations, with the presence of unique alleles in both PA (3) and farmers' properties (2). The set of populations on farm properties presented 32 alleles while 31 alleles were detected in PA. Estimated $\hat{f}$ was not different from zero in all situations (**Table 4**).

Caraguatá populations presented mean $\hat{H}_E$ ranging from 0.148 to 0.178, not differing between the studied situations (**Table 4**). Thirty-one different alleles were detected for the set of populations, varying from 24 to 27 in the different situations, and unique alleles were detected in both PA (1) and farmers' properties (5). The Protected Area unit presented 25 alleles while the set of farm populations harbored 29 alleles. The fixation index was significant (0.144) only in PA, not differing from zero on farms (**Table 4**).
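For readers less familiar with these descriptors, the sketch below shows the textbook definitions of expected heterozygosity and the fixation index for a single locus. It is a minimal illustration under stated assumptions: the allele frequencies and the observed heterozygosity are hypothetical, the function names are ours, and the exact estimators and software used by the authors (for example, whether a small-sample correction was applied) are not given in this section.

```python
def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity at one locus: H_E = 1 - sum(p_i ** 2).

    Textbook form without small-sample correction (an assumption;
    the estimator used in the study is not stated in this section).
    """
    if abs(sum(allele_freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    return 1.0 - sum(p * p for p in allele_freqs)


def fixation_index(h_obs, h_exp):
    """Fixation index f = 1 - H_O / H_E.

    f > 0 indicates a deficit of heterozygotes relative to random mating;
    f near 0 is consistent with random mating.
    """
    if h_exp <= 0:
        raise ValueError("expected heterozygosity must be positive")
    return 1.0 - h_obs / h_exp


if __name__ == "__main__":
    # Hypothetical two-allele locus, not data from the study.
    freqs = [0.9, 0.1]
    h_e = expected_heterozygosity(freqs)
    h_o = 0.15  # hypothetical observed proportion of heterozygotes
    print(f"H_E = {h_e:.3f}, f = {fixation_index(h_o, h_e):.3f}")
```

In practice these quantities are averaged over loci, and significance of $\hat{f}$ is judged from confidence intervals such as those reported in Table 4.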

# Genetic Diversity in Different Landscapes

For yerba mate, management practices and systems determine two distinct landscapes. Thus, **Table 5** presents the diversity indexes considering population groups sampled in both landscapes, as well as the indexes referring to two populations sampled in the Três Barras National Forest (PA), representing unmanaged populations. The mean values for $\hat{H}_E$ do not differ between situations; however, allelic richness ($\hat{A}_{96}$) was higher in the unmanaged area. On the other hand, $\hat{f}$ was significant only in the unmanaged area (**Table 5**). Unique alleles were found in two situations: 4 in non-managed populations and 3 in populations with yerba mate management and cattle grazing.
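Allelic richness standardized to the lowest sample size is commonly obtained by rarefaction, so that groups with different sample sizes can be compared. The sketch below shows that standard calculation for one locus. Whether the authors used exactly this estimator is an assumption, as is the treatment of the subsample size `n` as a number of gene copies; the allele counts are hypothetical.

```python
from math import comb


def rarefied_allelic_richness(allele_counts, n):
    """Expected number of distinct alleles in a random subsample of n gene copies.

    allele_counts: copies observed for each allele at one locus.
    Each allele contributes the probability that it appears at least once:
    1 - C(N - N_i, n) / C(N, n), where N is the total number of copies.
    """
    total = sum(allele_counts)
    if n < 1 or n > total:
        raise ValueError("n must be between 1 and the total number of copies")
    return sum(1 - comb(total - c, n) / comb(total, n) for c in allele_counts)


if __name__ == "__main__":
    # Hypothetical locus: one common allele (19 copies) and one rare allele (1 copy).
    counts = [19, 1]
    for n in (2, 10, 20):
        print(n, round(rarefied_allelic_richness(counts, n), 3))
```

Rare alleles contribute little at small `n`, which is why a group can hold more alleles in total and still show lower rarefied richness.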

Caraguatá populations presented some differences in heterozygosities ($\hat{H}_O$ and $\hat{H}_E$) between landscapes. Mean values of $\hat{H}_O$ and $\hat{H}_E$ for hedges were significantly higher than those estimated for the unmanaged landscape (**Table 6**). Hedges also presented higher numbers of alleles ($\hat{k}$ = 28), unique alleles ($\hat{A}_{un}$ = 3) and allelic richness ($\hat{A}_n$ = 2.15) than all other landscape units. Fixation indexes were significantly different from zero in the unmanaged landscape. In this case, the reduced number of rosettes when compared to the hedges may contribute to more crossing between relatives.

The higher $\hat{H}_O$, $\hat{H}_E$, $\hat{k}$, $\hat{A}_{un}$ and $\hat{A}_{48}$ values and the nonsignificant $\hat{f}$ estimated for hedges, compared to unmanaged populations, could be related to the origin of the plants (from the same or different locations) used for hedge construction and to the selection of plants for hedges. As mentioned earlier, on many occasions neighbors donated the seedlings for hedges, mainly when the land was cleared, so any single hedge could have been made with seedlings from different populations, possibly increasing their number of alleles and heterozygotes.

In this context, observing the genetic diversity indexes obtained for B. antiacantha, it can be inferred that the hedges presented greater genetic diversity than populations that had not been managed for over 50 years, in accordance with results from other studies (Otero-Arnaiz et al., 2005; Zizumbo-Villarreal et al., 2013). This aspect is possibly due to the way the hedges are constructed, with rosettes from sites of natural occurrence of the species (forests).

Demographic studies showed that either by greater exposure to light in the hedge or other factors such as soil and temperature, the hedge had more reproductive rosettes (**Table 3**). The rosettes may come from various populations/farms with different genetic characteristics, which may contribute to greater diversity in hedges. In this sense, caraguatá fences can be seen as a metapopulation of the species, where each population of a landscape unit is a subpopulation.

Finally, it is possible to argue that the decrease in genetic diversity, as one of the indicators of domestication generally suggested in the literature, must be taken into account in relation to management practices. Thus, the fact that intensively managed populations present greater genetic diversity than unmanaged ones does not imply that they are considered less domesticated than the others, in the sense of human intervention in the population and in the landscape.

PA, Protected Area; n, sample size; $\hat{k}$, number of alleles per landscape unit; $\hat{A}_{un}$, number of unique alleles per landscape unit; $\hat{A}_{96}$, mean allelic richness per locus based on the lowest sample size (96); $\hat{H}_O$, observed heterozygosity; $\hat{H}_E$, expected heterozygosity; $\hat{f}$, fixation index; confidence intervals (95%) are shown in parentheses; \*p < 0.05.

n, sample size; $\hat{k}$, number of alleles per landscape unit; $\hat{A}_{un}$, number of unique alleles per landscape unit; $\hat{A}_{48}$, mean allelic richness per locus based on the lowest sample size (48); $\hat{H}_O$, observed heterozygosity; $\hat{H}_E$, expected heterozygosity; $\hat{f}$, fixation index; confidence intervals (95%) are shown in parentheses; \*p < 0.05.

In addition, the farmers' way of life related to the gathering of rosettes in different populations is a result of management of the ervais (NYMS). In this sense there is an opportunity to "collect seedlings" from an activity directed by other managements which generate income for the property, like cattle and yerba mate. This fact differentiates the caraguatá from other plants cultivated and used in hedges: the use of these seedlings is desired, and in turn they are collected from different populations. Thus, the fact that the yerba mate areas are managed together with the existence of farmers interested in making caraguatá hedges increases species diversity. Therefore, this domestication process has a tendency to increase genetic diversity.

Both ethnobotanical and genetic studies show that caraguatá hedges can be seen as a form of on-farm conservation of genetic diversity. Traditionally, the focus of conservation has been the creation of reserves, and much has been debated about the size, form and number of reserves that must be created (Wiens, 1997). However, more recently this trend has been directed toward seeking alternatives that allow not only conservation, but also use, maintenance, and even an increase in population diversity (Clement et al., 2007). In this sense, on-farm conservation has been seen as a great opportunity for the conservation of species of human interest (Dawson et al., 2013).

# CONCLUSION: CONSERVATION-BY-USE IN ARAUCARIA FORESTS

The management systems and practices conducted by farmers maintain the landscape with productive forest fragments, favoring the conservation of these species. The different landscape units, which include the three studied species at the same time, occur predominantly in fragments of less than 50 hectares (mean = 45.8 ± 33.4 ha), and represent, on average, 45.8% of the total property area. Several landscapes and management systems generally occur on the same property, forming a mosaic of situations (Marques, 2014; Mattos, 2015). These distinct fragments of a wider landscape form a unique situation that allows a high level of gene flow between the various units, favoring maintenance of the genetic diversity of each species and of the whole, as a metapopulation. Each landscape unit studied has structural peculiarities that influence ecological and genetic patterns. The units also have an interconnection with allele movement (pollen, seeds and individuals) mediated by fauna and man, allowing the maintenance of diversity in the metapopulation. However, such metapopulation is created and maintained for cultural and economic motives, in addition to ecological ones.

Thus it is possible to state that caívas are fundamental pieces in conservation of the species and landscapes with araucaria, mainly due to different management methods that result in mosaics of vegetation with different demographic, and consequently genetic, structures. This factor (management of areas) reinforces the idea of conservation-by-use of fragments, where in addition to conserving the species, the conditions that allow the development of new germplasm are also maintained. On-farm conservation is not based only on the conservation of existing germplasm: the genetic variability maintained in this type of conservation (backyards, gardens and agroforestry systems) also allows the maintenance of biodiversity through evolutionary processes (FAO, 1996; Clement et al., 2007).

The system requires these management practices in the landscape, with changes in various species motivated by cultural and economic factors, in order to maintain domestication of these landscapes. However, contrary to expectation, domesticated landscapes for the production of yerba mate did not lead to an important reduction in genetic variability, since there is no conscious selection action in the species studied.

It is important to note that maintenance of Araucaria forest remnants in the region of native yerba mate exploitation is directly associated with the possibility of exploiting this resource, which is related to the conservation-by-use of these fragments (Petersen et al., 2001; Mazuchowski, 2004; Mattos, 2011, 2015; Marques, 2014). However, economic and sociocultural factors also pose a potential risk to the system. The prices paid for yerba mate show significant fluctuations between years (EPAGRI/CEPA, 2017), which may be a reason to replace the system with another agricultural or forestry activity. In addition, the new generations do not show interest in continuing farming activities, implying risks to the maintenance of regional cultural values (Marques, 2014).

Thus, the perspective of conservation-by-use of the multispecies system described here depends heavily on valorization of cultural aspects of the region, as well as on the economic valuation of yerba mate from this artisan system. Mechanisms of certification of origin could be an alternative of great regional importance, increasing the possibility of in situ, on-farm conservation of several autochthonous species from these Araucaria forests.

# AUTHOR CONTRIBUTIONS

MdR: Conceived the research, designed the research, was responsible for resource acquisition, directed AGM, SF thesis, AZ dissertation, codirected AdCM thesis and wrote the manuscript. TM: Conducted field and laboratory work, analyzed genetic data and wrote the manuscript. AGM: Conducted field work (including interviews), analyzed part of data and wrote the manuscript. SF: Conducted field work (including interviews), analyzed part of data and wrote the manuscript. AL: Conceived the research, provided comments on the manuscript. AdCM: Conducted fieldwork (interviews), analyzed part of data and provided comments on the manuscript. AZ: Conducted field and laboratory work and analyzed part of data. NP: Conceived the research, co-directed the AGM and SF thesis, and provided comments on the manuscript. AM: Conceived the research, codirected the AZ master dissertation, was responsible for resource acquisition and provided comments on the manuscript. This research is the product of AZ master's thesis and AGM, AdCM, and SF Ph.D. thesis.

# FUNDING

This study was supported by the Fundação de Amparo à Pesquisa e Inovação do Estado de Santa Catarina (FAPESC – process no. 4448/2010-2 and 11939/2009) and Empresa Brasileira de Pesquisa Agropecuária (EMBRAPA – Macroprojetos 2/2009). Conselho Nacional de Desenvolvimento Científico e Tecnológico provided a Productivity Scholarship for MdR (CNPq – 309128/2014-5) and NP, Coordenação de Aperfeiçoamento de Pessoal de Nível Superior (CAPES) provided doctoral scholarships for TM, AGM, and SF and a master's scholarship for AZ. Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET) supported AL.

# REFERENCES


# ACKNOWLEDGMENTS

We are profoundly grateful to all farmers for their kindness in sharing their knowledge with us, and for their hospitality. We would also like to thank Núcleo de Pesquisas em Florestas Tropicais (NPFT), Laboratório de Fisiologia do Desenvolvimento e Genética Vegetal (LFDGV), and Instituto Chico Mendes de Conservação da Biodiversidade (ICMBio) for logistical support.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fevo.2018.00011/full#supplementary-material




Santa-Catarina (1986). Atlas de Santa Catarina. Rio de Janeiro: Aerofoto Cruzeiro.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Reis, Montagna, Mattos, Filippon, Ladio, Marques, Zechini, Peroni and Mantovani. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Domesticated, Genetically Engineered, and Wild Plant Relatives Exhibit Unintended Phenotypic Differences: A Comparative Meta-Analysis Profiling Rice, Canola, Maize, Sunflower, and Pumpkin

#### Edited by:

Charles Roland Clement, National Institute of Amazonian Research, Brazil

#### Reviewed by:

Rubens Onofre Nodari, Universidade Federal de Santa Catarina, Brazil Shabir Hussain Wani, Michigan State University, United States

\*Correspondence:

Ana E. Escalante aescalante@iecologia.unam.mx

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Plant Science

> Received: 10 August 2017 Accepted: 14 November 2017 Published: 05 December 2017

#### Citation:

Hernández-Terán A, Wegier A, Benítez M, Lira R and Escalante AE (2017) Domesticated, Genetically Engineered, and Wild Plant Relatives Exhibit Unintended Phenotypic Differences: A Comparative Meta-Analysis Profiling Rice, Canola, Maize, Sunflower, and Pumpkin. Front. Plant Sci. 8:2030. doi: 10.3389/fpls.2017.02030

Alejandra Hernández-Terán<sup>1</sup>, Ana Wegier<sup>2</sup>, Mariana Benítez<sup>1,3</sup>, Rafael Lira<sup>4</sup> and Ana E. Escalante<sup>1</sup>\*

<sup>1</sup> Laboratorio Nacional de Ciencias de la Sostenibilidad, Instituto de Ecología, Universidad Nacional Autónoma de México, Mexico City, Mexico, <sup>2</sup> Laboratorio de Genética de la Conservación, Jardín Botánico, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico, <sup>3</sup> Centro de Ciencias de la Complejidad (C3), Universidad Nacional Autónoma de México, Mexico City, Mexico, <sup>4</sup> Facultad de Estudios Superiores Iztacala, Universidad Nacional Autónoma de México, Mexico City, Mexico

Agronomic management of plants is a powerful evolutionary force acting on their populations. The management of cultivated plants is carried out by the traditional process of human selection or plant breeding and, more recently, by the technologies used in genetic engineering (GE). Even though crop modification through GE is aimed at specific traits, it is possible that other non-target traits can be affected by genetic modification due to the complex regulatory processes of plant metabolism and development. In this study, we conducted a meta-analysis profiling the phenotypic consequences of plant breeding and GE, and compared modified cultivars with wild relatives in five crops of global economic and cultural importance: rice, maize, canola, sunflower, and pumpkin. For these five species, we analyzed the literature with documentation of phenotypic traits that are potentially related to fitness for the same species in comparable conditions. The information was analyzed to evaluate whether the different processes of modification had influenced the phenotype in such a way as to cause statistical differences in the state of specific phenotypic traits or grouping of the organisms depending on their genetic origin [wild, domesticated with genetic engineering (domGE), and domesticated without genetic engineering (domNGE)]. In addition, we tested the hypothesis that, given that transgenic plants are a construct designed to impact, in many cases, a single trait of the plant (e.g., lepidopteran resistance), the phenotypic differences between domGE and domNGE would be smaller than (or nonexistent compared with) those between the wild and domesticated relatives (either domGE or domNGE). We conclude that (1) genetic modification (either by selective breeding or GE) can be traced phenotypically when comparing wild relatives with their domesticated relatives (domGE and domNGE) and (2) the existence and the magnitude of the phenotypic differences between domGE and domNGE of the same crop suggest consequences of genetic modification beyond the target trait(s).

Keywords: genotype–phenotype, unintended phenotypic effects, phenotypic profiling, Oryza sativa, Brassica napus, Helianthus annuus, Zea mays, Cucurbita pepo

# INTRODUCTION

Plant domestication and the phenotypic modifications it produces have a long history with humans and have involved practices ranging from traditional management to genetic engineering (GE). Traditional practices, or human selection, are effective because the selected traits have a genetic basis that is phenotypically expressed in particular agroecological and cultural environments (Gepts, 2004; Meyer and Purugganan, 2013). Consequently, domestication processes, either with or without GE, may have important evolutionary effects in cultivated plants (Abbo et al., 2014; Hake and Ross-Ibarra, 2015). Genetically modified crops are also domesticated plants, since the genetic modifications are performed in isogenic lines of the crop of interest (Setlow, 1991). Nonetheless, these domestication processes are qualitatively different. On the one hand, in traditional plant breeding new genetic combinations are, in general, obtained by sexual crosses between individuals of the same species. In GE, on the other hand, DNA sequences (of potentially non-related organisms) are inserted into the crop of interest via biolistics or Agrobacterium tumefaciens-mediated transformation (e.g., to produce Bt crops, which carry genes from Bacillus thuringiensis) (Agrawal et al., 1999; Nodari and Guerra, 2001) and other novel techniques (e.g., CRISPR, RNAi) (McManus and Sharp, 2002; Gaj et al., 2013). Thus, the main differences between the two genetic modification techniques involved in domesticated plants are (i) the origin of the novel or foreign DNA that is incorporated in the modified organism, and (ii) the procedure to accomplish such incorporation (Gepts, 2001; Nodari and Guerra, 2001).

Agronomic modification, whether via human selection (domestication without genetic engineering, domNGE) or through genetic engineering (domGE), has phenotypic effects that may not correspond, in magnitude, to the associated genetic changes (Burke et al., 2007). Some of these phenotypic changes are unintended and are usually unrelated to the target traits (Filipecki and Malepszy, 2006). Some studies have attributed these unintended phenotypes to pleiotropic effects in which certain phenotypic traits may be linked and affected by the genetic modification of another trait (Filipecki and Malepszy, 2006), as well as to bottlenecks, selective sweeps, phenotypic plasticity, or gene × environment (G × E) interactions (Remington et al., 2001; Pozzi et al., 2004; Gunasekera et al., 2006; Doust et al., 2014). This phenomenon, in which the domesticated organisms show phenotypes that do not correspond to the target traits of domestication, has been documented in many crops, such as potato (Solanum tuberosum), soybean (Glycine max), and wheat (Triticum aestivum) (Dale and McPartlan, 1992; Gepts, 2004; Lenser, 2013). Some of these modified non-target traits have been found to be related to species fitness, which in turn can have a direct impact on the evolution of the plants in potentially unexpected ways (Meyer and Purugganan, 2013).

The unintended phenotypic effects and their evolutionary (and potentially ecological) consequences are of particular relevance if we consider that most of the modifications are done in economically important crops. As such, unintended changes in phenotypes have been observed in crops that are key for global food production, such as rice (Oryza sativa), canola (Brassica napus), sunflower (Helianthus annuus), pumpkin (Cucurbita pepo), and maize (Zea mays) (Snow et al., 1998; Spencer and Snow, 2001; Halfhill et al., 2005; Guadagnuolo et al., 2007; Cao et al., 2009). Moreover, for cases such as maize, pumpkin, and rice, their cultivation represents important sources of cultural value that involve practices related to their diversification, achieved through the traditional selection of ancestral populations (Purugganan and Fuller, 2009; Chen et al., 2015), and represent an important cultural and genetic repository (Altieri and Merrick, 1987).

Moreover, in the context of food security under climate change and high uncertainty scenarios, in situ conservation of agrobiodiversity is of key importance, including not only phenotypic and genetic diversity, but also the accompanying management practices and the environmental context that allows future adaptation (including wild relatives) (Kahane et al., 2013). Therefore, and beyond the merely evolutionary consequences of unintended phenotypic changes, agrobiodiversity studies that look into specific trait changes can help improve protocols of biosafety and risk assessment (Smyth and Mchughen, 2008).

Although the phenomena of the unintended effects of genetic modification have been widely reported, these observations are the product of many individual studies. Thus, we propose a meta-analysis profiling approach in order to perform an unbiased analysis with high statistical power. Meta-analysis profiling allows for the integration of large quantities of data in order to identify patterns among different studies that share a common theoretical framework, but that have been conducted independently (Fiehn et al., 2000). This approach represents a valuable tool that has been used to identify patterns in plant functional genomics (Fiehn et al., 2000) and in phenotypic traits related to growth in plants (Kjemtrup et al., 2003). In the present study, we aimed to profile as many observations as possible into a meta-analysis of the phenotypic consequences of agronomic improvement in five economically and culturally important crops: rice, canola, sunflower, pumpkin, and maize. For the analysis, we included functional phenotypic traits that are potentially related to plant fitness, independently of whether these traits were modified through traditional practices (domNGE) or genetic engineering (domGE), so we could determine whether there were unintended phenotypic and thus evolutionary consequences. This profile includes 120 scientific publications (110 papers and 10 theses), which cover the period 1990–2017.

# MATERIALS AND METHODS

# Data Collection

In order to determine whether genetic modifications have unintended phenotypic consequences in plants, we identified suitable studies for our analysis by looking for articles published in agricultural and ecological journals, as well as in the thesis database of the National Autonomous University of Mexico (UNAM) for the case of maize. We focused on five of the world's most economically important species: rice, canola, sunflower, pumpkin, and maize. We searched for information in the Scopus®, Google Scholar®, and UNAM theses databases. For the Scopus® and Google Scholar® databases, we employed Boolean operators for each crop, such as "Cucurbita [AND] wild [OR] domesticated [OR] GMO [AND] fitness."
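The per-crop search strings could be assembled as in the following minimal sketch. Only the Cucurbita pattern is quoted in the text; the genus list (taken from the five crops studied), the helper name `build_query`, and the explicit parentheses around the OR group are illustrative assumptions, not the authors' actual search syntax.

```python
# Hypothetical helper following the pattern quoted in the text:
# "Cucurbita [AND] wild [OR] domesticated [OR] GMO [AND] fitness".
# The genus list is an assumption derived from the five crops studied.
GENERA = ["Oryza", "Brassica", "Helianthus", "Cucurbita", "Zea"]
CATEGORY_TERMS = ["wild", "domesticated", "GMO"]


def build_query(genus, topic="fitness"):
    """Return a Boolean query; parentheses make the OR group explicit."""
    categories = " OR ".join(CATEGORY_TERMS)
    return f"{genus} AND ({categories}) AND {topic}"


# One query per crop genus.
queries = {genus: build_query(genus) for genus in GENERA}
for query in queries.values():
    print(query)
```

Database-specific field codes and operator precedence differ between search engines, so the strings would likely need adapting before use.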

To be included in the database, all publications had to satisfy three selection criteria: (1) An estimate of plant fitness between wild relatives and domesticated varieties with and without GE must have been measured; (2) Tests must have been performed under conditions in which the agent of selection was absent; and (3) The genetic background must have been controlled to minimize differences affecting the fitness traits being measured. In cases where experiments included extreme biotic and/or abiotic conditions (e.g., soil fertilization, heat, drought), only the data of the controls were used, since we considered these treatments as perturbations and not as natural environmental variation. In the case of maize, we also used information from thesis reports in which a yield comparison between wild relatives and domesticated relatives was performed. All thesis reports had gone through a peer review process [Reglamento General de Estudios de Posgrado (RGEP), UNAM]<sup>1</sup> and were obtained from the National Autonomous University of Mexico theses database. We applied these criteria rigorously, rejecting hundreds of comparisons that did not satisfy all of them.

Of all the available information, only 110 articles and 10 theses, representing 990 comparisons of wild relatives and domesticated varieties with and without GE of the five species, were incorporated into our dataset. The comparisons for each genotype and the number of analyzed publications by crop are shown in **Table 1**. The data reported in the articles were collected for the period 1990–2017, which spans the time from the first release of genetically modified organisms to date. Although the available literature sometimes reports more phenotypic traits, only six were chosen for the analysis presented here: height (cm), number of flowers, days to flowering, number of seeds, pollen viability (%), and number of fruits. These traits were chosen because they are functional traits with potential impacts on the survival and reproduction of the plants (Dafni and Firmage, 2000; Saatkamp et al., 2011; Huang et al., 2016; Williams and Mazer, 2016), and because they are available in most of the published studies. The full dataset can be found in the **Supplementary Data Sheet S1**.

# Data Analysis

To standardize data from different traits, we followed a procedure based on Song et al. (2004). The method consists of ordering all the values of a single trait from low to high and normalizing them between zero and one. Outlier data points were identified using the Viechtbauer and Cheung (2010) approach. In this approach, a multivariate detection method (Cook's distance) is used to calculate the distance among all data points, and then the data points that do not fall into the general model are identified as "influential data points" or outliers. Given the potential biological meaning of outliers (extreme phenotypes), we decided to investigate the experimental origin of each data point before removing it from the database. We considered that the only biologically meaningful outliers would be those that corresponded to common garden experiments of the domGE with their domNGE isogenic lines; in such cases, despite the outlier status of the data point with respect to the general model, we retained the data point in the rest of the analysis. This process was performed for all traits and all crops. As mentioned before, in most cases the genetic modification is performed in domesticated lines; we therefore separated the three categories in all crops with the labels "wild" for wild relatives, "domNGE" for domesticated organisms that have not gone through a GE process, and "domGE" for those that have been genetically modified to show new traits.
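The standardization and outlier screening described above can be sketched as follows. This is a simplified illustration, not the authors' script: the height values are made up, Cook's distance is computed for a model that fits one mean per category (the influence diagnostics of Viechtbauer and Cheung for meta-analytic models are more elaborate), and the 4/n cutoff is a common rule of thumb that the text does not specify.

```python
def min_max_normalize(values):
    """Rescale a list of trait values to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def cooks_distance_group_means(values, labels):
    """Cook's distance for a model that fits one mean per category.

    Every category needs at least two observations and the residual
    variance must be non-zero.
    """
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    means = {g: sum(vs) / len(vs) for g, vs in groups.items()}
    n, p = len(values), len(groups)
    sse = sum((v - means[g]) ** 2 for v, g in zip(values, labels))
    mse = sse / (n - p)  # residual variance
    distances = []
    for v, g in zip(values, labels):
        h = 1.0 / len(groups[g])  # leverage of a point within its group
        resid = v - means[g]
        distances.append((resid ** 2 / (p * mse)) * h / (1.0 - h) ** 2)
    return distances


# Made-up plant heights (cm); not taken from the study's dataset.
heights = [150, 160, 155, 158, 100, 105, 98, 102, 95, 99, 97, 140]
labels = ["wild"] * 4 + ["domNGE"] * 4 + ["domGE"] * 4

normalized = min_max_normalize(heights)
distances = cooks_distance_group_means(normalized, labels)

# Assumed cutoff (rule of thumb): flag points with distance above 4/n.
cutoff = 4.0 / len(normalized)
flagged = [i for i, d in enumerate(distances) if d > cutoff]
print(flagged)
```

In the workflow described in the text, a flagged point would then be traced back to its source experiment before deciding whether to drop it.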

TABLE 1 | Comparisons for each category [wild, domesticated without genetic engineering (domNGE), domesticated with genetic engineering (domGE)] and total number of publications analyzed in each crop (N = number of reviewed publications).

<sup>1</sup>http://www.ddu.unam.mx/index.php/reglamento-general-de-estudios-deposgrado

To determine statistical differences among wild, domNGE, and domGE categories within species, we used a Generalized Linear Model (GLM). In the cases where the p-value was less than 0.05, we carried out a Tukey post hoc test using the glht function of the R multcomp package (Hothorn et al., 2008). A graphic representation of the data was constructed as a spider chart using the R fmsb package (Nakazawa, 2014). In addition, to determine differences between categories (wild, domNGE, and domGE) within species, we conducted a Discriminant Analysis (DA) with the R MASS package (Venables and Ripley, 2002) using the genotypes as categories and the values of each trait as predictor variables. To test the significance of differences between categories of the DA per crop, we conducted a follow-up Multivariate Analysis of Variance (MANOVA). Finally, we delimited groupings by drawing 95% confidence interval ellipses around the centroids using the ggplot2 R package (Wickham, 2009). All the analyses were conducted in R (version 1.17.15) (R Core Team, 2013) and all the scripts utilized for the analyses are available online at https://github.com/LANCISescalante-lab/plant\_phenotype\_metaanalysis.
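As an illustration of how a 95% ellipse around a group centroid can be derived from discriminant scores, the sketch below fits a bivariate normal to a set of points and returns the ellipse geometry. It is an assumption-laden stand-in: the text does not state which ggplot2 function or distributional assumption was used, the function name `confidence_ellipse` is hypothetical, and the scores are made up.

```python
import math


def confidence_ellipse(points, level=0.95):
    """Return (centroid, semi-axes, angle) of the ellipse that covers
    `level` of a bivariate normal fitted to `points` (list of (x, y))."""
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Sample variances and covariance.
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Eigenvalues of the 2x2 covariance matrix (closed form).
    half_trace = (sxx + syy) / 2.0
    root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy ** 2)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)
    # The chi-square quantile with 2 degrees of freedom has a closed form.
    scale = -2.0 * math.log(1.0 - level)
    # Orientation of the major axis, in radians.
    angle = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    semi_axes = (math.sqrt(scale * lam_major), math.sqrt(scale * lam_minor))
    return (mx, my), semi_axes, angle


# Made-up (LD1, LD2) scores for one category; not from the study.
scores = [(1.0, 0.5), (1.5, 1.1), (2.1, 1.4), (2.4, 2.2), (3.0, 2.4)]
centroid, semi_axes, angle = confidence_ellipse(scores)
print(centroid, semi_axes, angle)
```

The ellipse describes the spread of a group's points around its centroid, so a larger ellipse corresponds to greater within-group phenotypic dispersion.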

# RESULTS

The results of the 990 comparisons show significant phenotypic differences among the three categories (wild, domNGE, and domGE) for almost all of the analyzed crops and the majority of the traits. With regard to outlier management, the number of outliers was small relative to the total number of comparisons for each crop. In the case of canola, the outliers represent 2% of the total comparisons, for sunflower 1.8%, for rice 4.3%, and for maize 12%. In the case of pumpkin, we did not find any outliers.

The differences between wild relatives and the domesticated categories (either domNGE or domGE) were expected, but unexpected differentiation between the domesticated categories (domNGE and domGE) was also observed in the analyzed traits (which were not the target of selection or genetic modification). Since the proportion of outliers within the dataset is relatively small, this general pattern is maintained regardless of the outlier treatment, with only some specific differences per crop (**Figure 1** and Supplementary Figure S1).

# Phenotypic Variation Can Be Identified As Wild, domNGE, and domGE

Through the DA of the phenotypic traits of all crops (height, days to flowering, number of seeds, pollen viability, number of flowers, and number of fruits), we found a clear distinction of phenotypic variation in three groups, which correspond with the wild, domNGE, and domGE categories [**Figure 1**, all (a) panels]. These three groups are different in size, position, and/or direction along the axes of the DA. In some cases, the overlapping of the groups is larger than in others. For instance, in canola, although the groups can be differentiated, the overlapping of the three groups is the largest compared with the other analyzed crops (MANOVA F(2,52) = 1.541, p = 0.166), while in maize (MANOVA F(2,116) = 8.5571, p = 1.058e−07) and rice (MANOVA F(2,100) = 11.284, p = 2.868e−11) the overlapping is the smallest of all, with pumpkin (MANOVA F(2,46) = 13.357, p = 1.066e−08) and sunflower (MANOVA F(2,48) = 4.1348, p = 0.00081) in an intermediate range of overlapping [**Figure 1**, all (a) panels]. Moreover, the percentage of variation explained by the discriminant axes varies considerably among crops, with the most extreme cases being maize and sunflower. For maize, the total phenotypic variation is distributed in the two discriminant axes (LD1 = 74%; LD2 = 25%), while in sunflower and canola, the variation is mainly explained by LD1 (93 and 90%, respectively). A more detailed analysis of the DA results shows that the dispersion of the phenotypic variants within groups is, in most cases, larger in wild groups than in domesticated ones (domNGE and domGE) [**Figure 1**, all (a) panels]. The only case where the phenotypic variation found in the wild group was less than that found in the domGE groups was in sunflower.
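The percentage of variation attributed to each discriminant axis is conventionally the share of between-group separation captured by that axis. A minimal sketch, assuming the percentages were obtained as the "proportion of trace" (squared singular values divided by their sum); the input values below are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def proportion_of_trace(singular_values):
    """Share of between-group separation captured by each discriminant axis."""
    squared = [s ** 2 for s in singular_values]
    total = sum(squared)
    return [sq / total for sq in squared]


# Hypothetical singular values for a three-category analysis (two axes).
shares = proportion_of_trace([3.0, 1.0])
for axis, share in enumerate(shares, start=1):
    print(f"LD{axis}: {share:.0%}")
```

With three categories there are at most two discriminant axes, so the two shares sum to one, apart from rounding in reported percentages.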

# Variation in Phenotypic Traits Changes from Wild to Domesticated Populations

The DA results show a change in the direction of variation between wild and domesticated (domNGE and domGE) categories [**Figure 1**, all (a) panels]. This observation implies that the traits that define the phenotypic variation within each group are different, at least between wild and domesticated categories [**Figure 1**, all (b) panels]. In fact, in almost all the cases, the phenotypic variation of the domesticated groups goes in the same direction, while the wild group is almost orthogonal, and more evenly distributed between the two axes. This observation holds for all of the five analyzed crops.

The weight of the different traits in the resulting grouping per crop is provided by the associated coefficients of each discriminant function (Supplementary Table S1). Thus, it is possible to identify the traits that are statistically more important in the observed differences among groups. For rice, "height" is the trait with the highest coefficient for LD1 and "days to flowering" for LD2. For canola, "number of seeds" is the trait with the highest coefficient for LD1 and "height" for LD2. For sunflower, "days to flowering" has the highest value for LD1 and "number of seeds" for LD2. For pumpkin, "number of fruits" and "number of seeds" were the traits with highest values for LD1 and LD2, respectively. Finally, for maize, "height" is the trait with the highest value for both LDs.

The GLM analysis identifies the traits that explain pairwise differences in phenotypic variation among groups and the results are shown in the (c) panels of **Figure 1**. For instance, for sunflower none of the four analyzed traits show significant differences between wild and domesticated populations. In contrast, for maize, pumpkin, and rice almost all of the analyzed traits show significant differences (days to flowering, number of seeds and height for maize, number of fruits, number of seeds and number of flowers for pumpkin, and height, number of seeds, and pollen viability for rice) [**Figure 1**, (c) panels].
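For a single trait, the comparison among categories can be illustrated with a one-way F statistic, which is what a GLM with a Gaussian error distribution and one categorical predictor reduces to. The text does not state the GLM family, so the Gaussian assumption, the function name `one_way_f`, and the trait values below are all illustrative.

```python
def one_way_f(groups):
    """F statistic and degrees of freedom for a one-way comparison.

    `groups` maps a category label to a list of trait values.
    """
    all_values = [v for vs in groups.values() for v in vs]
    n, k = len(all_values), len(groups)
    grand_mean = sum(all_values) / n
    means = {g: sum(vs) / len(vs) for g, vs in groups.items()}
    # Variation of category means around the grand mean.
    ss_between = sum(len(vs) * (means[g] - grand_mean) ** 2
                     for g, vs in groups.items())
    # Variation of observations around their own category mean.
    ss_within = sum((v - means[g]) ** 2
                    for g, vs in groups.items() for v in vs)
    f_stat = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f_stat, (k - 1, n - k)


# Made-up normalized values of one trait; not from the study.
trait = {
    "wild": [0.10, 0.20, 0.30],
    "domNGE": [0.20, 0.30, 0.40],
    "domGE": [0.50, 0.60, 0.70],
}
f_stat, dof = one_way_f(trait)
print(f_stat, dof)
```

A p-value would be obtained by comparing the statistic with an F distribution on the returned degrees of freedom, followed by pairwise post hoc tests when the overall test is significant.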

# Changes in Phenotypic Variation among Wild, domNGE, and domGE

The normal sequence of reduction of genetic (and potentially phenotypic) variation in the process of domestication and human interventions suggests that wild relative populations represent the largest pool of diversity, which is then reduced during domestication and genetic modification through GE (Flint-Garcia, 2013). Moreover, since GE constructs start from isogenic lines (representing the domNGE), and since the modifications are allegedly directed to modify specific phenotypic traits (not included in the present analysis), it was expected that the phenotypic variation of the analyzed populations would be a sequence of subgrouping and reduced phenotypic variation going from wild to domNGE and finally domGE. However, through the DA and GLM analyses [**Figure 1**, panels (a) and (c), respectively], we find evidence that, overall, supports these expectations for the comparison of wild and domesticated categories (domNGE and domGE), but not for the comparisons between domesticated categories (domNGE vs. domGE). A graphical representation of these results, showing only mean values for all traits and populations, is found in **Figure 1**, (b) panels.

Regarding the comparison between the wild and domesticated (domNGE and domGE) categories, we observe that only canola fits the expectation of subgrouping. In contrast, regarding the reduced phenotypic variation of domesticated groups compared with their wild relatives, 4/5 analyzed crops fit the expectation (sunflower was the exception). These four crops (rice, canola, pumpkin, and maize) show that, although the phenotypic variation is reduced in the domesticated groups, this is not a subgroup within the wild group. The exceptional case, of sunflower, shows that domGE groups have increased phenotypic variation compared with both domNGE and wild relative groups. The results of the GLM [**Figure 1**, (c) panels], which investigates pairwise differences between wild and domesticated groups (Wild-DomPp), show that rice, pumpkin, and maize have statistically significant differences for almost all traits.

Regarding the comparison of domNGE vs. domGE, we observe that, on the one hand, rice and canola are cases in which the results show some subgrouping of domGE within domNGE populations. On the other hand, maize, sunflower, and pumpkin represent almost the opposite scenario, with almost no overlap, nor subgrouping of the domGE within domNGE populations. Regarding the expectation of reduced phenotypic variation in domNGE, we observe a case of increased phenotypic variation, and specifically we found that domGE groups of sunflower have more variation than their domNGE relatives. Moreover, we also found statistically significant differences in the pairwise comparisons between domNGE and domGE groups in almost all crops and traits [**Figure 1**, (c) panels]. For rice, we found differences between domesticated groups in "pollen viability" and "height;" for canola we found differences in "pollen viability;" for pumpkin the differences were found in "number of fruits" and "flowers;" finally, in maize we found statistical differences in "days to flowering," "height," and "number of seeds." Overall, these results suggest unintended phenotypic effects, and no consistency in the specific traits that change due to human interventions in wild populations, either through domestication or GE modifications.

# DISCUSSION

Human interventions in plants of economic, cultural, or nutritional importance via traditional practices (domestication) and, more recently, via GE have a long history in crop management. Despite the major importance of the consequences of these human-driven interventions in crops, no systematic investigation of the actual consequences in plant populations exists. In this study, we conducted a meta-analysis profiling the phenotypic consequences on non-target traits that domestication and GE have had for five globally important crops, and here we discuss the potential causes and implications of our observations.

The nature of any meta-analysis implies a large number of data points or measurements that may correspond to many different individual studies, with different environmental conditions and subject to different sources of error. Given this, it is important to consider carefully both the meaning and treatment of outlier data points, and the implications of the implicit environmental variation for the results. On the one hand, in this study we removed only those outliers that did not correspond to common garden experiments; the outliers we retained were considered biologically relevant, as they may reflect the occurrence of extreme phenotypes or big evolutionary leaps [possible "hopeful monsters" (Goldschmidt, 1940; Gould, 1977)]. Nonetheless, of all the comparisons in our analysis, only 3.2% were identified as outliers and, among these, 1% were "true" outliers (not coming from common garden experiments). Moreover, a very limited number of traits of the phenotype were included in the analyses, which precludes us from making major biological or evolutionary inferences about the identified outliers in the different crops, although we recognize the relevance of a more in-depth investigation of those outliers in the evolution of domesticated plants. On the other hand, and regarding the contribution of environmental variation to our overall results, given that different data points correspond to experiments conducted in different environmental conditions, it is not possible to rule out that the observed variation in phenotypes corresponds (in some proportion) to the variation in environmental conditions, and therefore caution should be taken in attributing the observations solely to the genotypic background of populations.

# Direction and Magnitude of Phenotypic Variation Changes between Wild and Human-Modified Plant Populations

The differentiation of wild and domesticated populations was expected due to the genetic changes that occur in the evolutionary process of domestication. The genetic changes can be the result of genome level modifications (e.g., genetic bottlenecks), but also can be the result of more localized effects associated with genetic linkage of selected regions (e.g., selective sweeps) (Gepts, 2004, 2014; Pozzi et al., 2004). The phenotypic and genetic variation of wild populations represents the pool from which some variants are selected, thus reducing the original variation via domestication and GE of isogenic domesticated lines (Innan and Kim, 2004). This phenomenon of paired phenotypic variation reduction due to genetic bottlenecks has in fact been described in previous studies with the same crops in this study and others (Miller and Tanksley, 1990; Tenaillon et al., 2004; Stupar, 2010). Nonetheless, we found a notable exception in sunflower, in which variation increases from that observed in the wild relatives. This exceptional case could be explained by the large phenotypic and genetic variation found in the continuum of landraces, hybrids, and genetically modified organisms that increases the phenotypic amplitude in the domesticated populations (McAssey et al., 2016).

Moreover, for most cases, we also observe a change in the axis of the variation that could be attributed to the selection of certain variants for the target traits that will then vary in another direction, carrying along linked phenotypic variation in non-target traits. Altogether, the expected reduction of phenotypic variation and the change in the direction of this variation are in accordance with the concept of the domestication syndrome (Doebley et al., 2006; Meyer et al., 2012). However, we did not find consistency in the specific traits that varied among the three categories (wild, domNGE, and domGE). Potential explanations for this lack of shared traits in the differentiation of populations among crops could be, on the one hand, that although some phenotypic and general traits have been linked to the domestication syndrome, there are many others that are particular to each crop which are associated with specific aspects of their biology. For example, one of the most extreme cases of domestication is maize, where the phenotypic similarities between teosinte (wild ancestor) and contemporary maize are very small. The most important traits that define the domestication syndrome in maize are the change in the number and arrangement of ears and the presence of shorter lateral branches (Wills and Burke, 2007). Nevertheless, in many crops difficulties and ambiguities still exist in defining the domestication syndrome. One good example of such difficulties is Asian rice, in which the high levels of introgression between wild relatives and domesticated populations have caused genetic exchange that makes it difficult to identify the phenotypic traits that distinguish wild from domesticated populations (Vaughan et al., 2008). On the other hand, during the domestication process via selection, the phenotypic targets (or traits) are different for different crops. For instance, while in the case of rice the target of selection is the number of grains (seeds) (Vaughan et al., 2008), in the case of pumpkin, it is size and number of fruits (Meyer et al., 2012). In the same sense, GE of crops has different goals, thus different traits are introduced to different crops. For example, for canola a broad range of traits added through GE exists, such as insect resistance (Lepidopteran), herbicide tolerance (glyphosate/glufosinate), and virus resistance, while in pumpkin, the most frequent genetic transformation is focused on mosaic virus resistance (ZYMV) (Supplementary Table S2).

# Unexpected Phenotypic Changes in Human-Modified Plant Populations – Changes in Non-target Traits

As mentioned before, given that the GE constructs start from isogenic lines (represented here as domNGE), and that the modifications are allegedly directed to modify specific phenotypic traits not included in the present analysis, it was expected that the phenotypic variation of the domGE would be a subgroup of that in the domNGE group. This expectation is based on the premise that GE works usually with foreign DNA in order to introduce traits that are not present in the species, and this is performed in isogenic hybrid lines; thus, theoretically, the only differences between a Genetically Modified Organism (GMO) and its isogenic line will be the added trait(s) (Cellini et al., 2004). However, we did not find evidence that supports this expectation, suggesting unintended phenotypic effects of GE modifications. Specifically, we identified the most dramatic cases in rice, pumpkin, and maize, where almost all analyzed traits differ statistically between domNGE and domGE categories.

Generally, the intended effects of a genetic modification refer to a specific phenotype. But the transgene may also impart a range of phenotypes that constitute the unintended effects of the transgene. These new (unintended) phenotypes can appear due to the interaction of the transgenes with other genes (pleiotropic effects) (Rijpkema et al., 2007) or by position effects; thus, these unintended phenotypes are usually unpredictable (Miki et al., 2009). The cases in which the transgene, due to genetic interactions, causes unexpected phenotypes have been seen in canola (Légère, 2005), sunflower (Snow et al., 2003), rice (Chen et al., 2006), and maize (Guadagnuolo et al., 2007) among others. Although we intended to control the data for major environmental variation in the comparisons, we cannot rule out phenotypic plasticity due to G × E interactions that may be introducing a confounding effect in the observations. Moreover, these phenotypic differences between closely related (genetically) organisms can also be associated with other factors that depend on the origin and specific context of domestication that may end up in different phenotypic scenarios, causing phenotypic diversity between organisms of the same species (Gepts, 2001).

In addition to pleiotropy, other phenomena related to the genetic modification can occur, such as position effects that result from non-directed insertions of DNA fragments (i.e., transgenes) in the target genomes (Filipecki and Malepszy, 2006). These position effects can have deleterious consequences on the engineered organisms, but also non-deleterious effects that allow survival of the organisms with no major or apparent phenotypic consequences (Ladics et al., 2015). Nonetheless, in the present study, all the GE crops analyzed were subject to non-directed insertions and we did find significant and unintended phenotypic effects. Currently, GE technology has apparently overcome the problem of position effects through the use of CRISPR/Cas9 (Clustered Regularly Interspaced Short Palindromic Repeats) technologies, which are intended to achieve a more accurate genetic modification through a precise insertion in known locations within the target genomes (Ma et al., 2017). However, the precision of the insertion does not necessarily prevent unintended genetic interactions, such as mutagenesis and pleiotropic effects that could be traced to the phenotypes (Solovieff et al., 2013). The precision of this technology has been recently challenged by research that reveals the presence of high-frequency off-target mutagenesis induced by CRISPR/Cas9 in human and animal cells (Fu et al., 2013; Schaefer et al., 2017).

In maize, phenotypic variation of domGE is reduced and shifted along LD1 compared with domNGE, which is worth noting given the potential implications of introgression (gene flow) with domesticated non-transgenic populations (Quist, 2007) that in turn could decrease the overall variation of domNGE and affect the genetic and cultural repository of diversity that these populations represent. Given the potential of introgression and the risk of affecting the genetic and cultural repositories of biodiversity, the analysis of these unintended effects is extremely valuable for understanding the destiny of hybrids in natural habitats, particularly in the context of environmental biosafety (Arriola and Ellstrand, 1997; Snow et al., 2003).

# Evolutionary Significance of Unintended Phenotypic Changes

In all of the analyzed cases, we can see phenotypic differences among categories (wild, domNGE, and domGE). However, it is worth noting that these differences, although evident through DA and spider charts [**Figure 1**, panels (a) and (b)], are not all statistically significant in the GLM pairwise comparisons between groups [**Figure 1**, (c) panels]. In particular, for sunflower, there is no statistically significant result for the GLM analysis, although differences among populations can be observed in both DA and spider chart analyses. This apparent inconsistency could be rooted in some fundamental reflections about evolutionary processes. For instance, it is known that genetic variation, even if not reflected as statistically significant differences between populations, can be of evolutionary significance (West-Eberhard, 2003). Given that phenotypic variation is directly exposed to natural selection, even small differences that are not statistically significant can have evolutionary consequences for the populations and species. The best example of this is domesticated plants, in which both natural and human selection act on phenotypes, leading to rapid fixation of even rare variants (Zhang et al., 2009; Tang et al., 2010) that statistically could appear as non-significant variation within or among populations. Moreover, this reflection leads to further examination of the finding of these unintended phenotypic effects in the analyzed crops, as it calls attention to the consequences (phenotypic) of genetic introgression events in different economically, culturally, or ecologically relevant crops (domGE→domNGE; domGE→wild). This is particularly important because there is evidence of some of these introgression events [e.g., maize (Quist and Chapela, 2001), rice (Song et al., 2006), and cotton (Gossypium hirsutum) (Wegier et al., 2011)]. 
Although we did not examine consequences of introgressed populations in this study, our results suggest that unintended effects of introgression are possible, and thus need further investigation looking at phenotypic traits that are usually out of scope (such as those associated with fitness), and that can shed more light on the evolution of domesticated (GE and NGE) and wild crop populations.

In addition, the results presented here show how human interventions in plant populations have different consequences in reducing and changing the direction of phenotypic variation to produce food. Historically, these strategies have proven to be effective, but it is worth reflecting on the unintended effects that some interventions can have in these natural resources that might reduce our options to adapt in the future. The reflection on the strategies to follow in this adaptation to future climate change conditions must include a revision of the regulations of crop technologies given the major consequences that this can have in global food security (Smyth and Mchughen, 2008). For example, given the current regulations, it is worth mentioning that the results presented here are in contradiction with the concept of substantial equivalence between NGE and GE crop lines, which argues that given the fact that the lines are isogenic, the resulting lines will only differ in the added trait (Cellini et al., 2004), which can be in fact demonstrated if only the added or target traits are analyzed, but the contrary can happen when looking at non-target traits (Smyth and Mchughen, 2008).

Finally, and in the context of climate change, there is an undeniable urgency to adapt to future uncertain conditions (Wise et al., 2014). However, there is little recognition that some of the current ecosystems (agroecosystems included) may transition to entirely different and unpredictable states, with different goods, services, and natural resources, and that adaptive cycles of decision will be needed in the most ample spectrum of possibilities (Wise et al., 2014). Thus, it is of major importance to preserve options for future decisions, which includes genetic and phenotypic options, in other words biodiversity (Rockstrom et al., 2014).

# CONCLUSION

The results presented show how human interventions in plant populations have different consequences in reducing and changing the direction of phenotypic variation to produce food. In particular, we found that (1) genetic modification (either by selective breeding or GE) can be traced phenotypically when comparing wild relatives with their domesticated ones (GE and NGE), and (2) the existence and magnitude of the phenotypic differences between domGE and domNGE of the same crop suggest consequences of genetic modification beyond the target trait(s). Further studies documenting phenotypic changes in human modified crops must include as many traits as possible, preferably non-target traits, to design interventions that do not compromise the decision spectrum in the face of trade-offs for adaptation to current versus future conditions.

# AUTHOR CONTRIBUTIONS

AH-T: designed the research, did the analysis, and wrote the manuscript. AW: designed the research and wrote the manuscript. MB: designed the research and revised previous versions of the manuscript. RL: designed the research. AE: designed the research and wrote the manuscript.

# FUNDING

This work was financially supported by CONACyT (PN247672) and the Dirección General del Sector Primario y Recursos Naturales Renovables (DGSPRNR) that belongs to the SEMARNAT and CONABIO.

# ACKNOWLEDGMENTS


This work constitutes part of the doctoral research of AH-T, who received a scholarship from the Consejo Nacional de Ciencia y Tecnología (CONACyT, scholarship no. 660255), and extends thanks to the Doctorado en Ciencias Biomédicas, Universidad Nacional Autónoma de México (UNAM). The authors acknowledge the technical assistance of M.Sc. Fidel Serrano Candela and M.Sc. I. Karen Carrasco Espinosa.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2017.02030/full#supplementary-material

DATA SHEET S1 | Full dataset of values for all the phenotypic traits included in the meta-analysis profiling rice, canola, maize, sunflower, and pumpkin.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2017 Hernández-Terán, Wegier, Benítez, Lira and Escalante. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Genetic Resources in the "Calabaza Pipiana" Squash (Cucurbita argyrosperma) in Mexico: Genetic Diversity, Genetic Differentiation and Distribution Models

Guillermo Sánchez-de la Vega<sup>1</sup>, Gabriela Castellanos-Morales<sup>1,2,3</sup>, Niza Gámez<sup>1</sup>, Helena S. Hernández-Rosales<sup>1</sup>, Alejandra Vázquez-Lobo<sup>1,4</sup>, Erika Aguirre-Planter<sup>1</sup>, Juan P. Jaramillo-Correa<sup>1</sup>, Salvador Montes-Hernández<sup>5</sup>, Rafael Lira-Saade<sup>2</sup>\* and Luis E. Eguiarte<sup>1</sup>\*

#### Edited by:

Charles Roland Clement, National Institute of Amazonian Research, Brazil

#### Reviewed by:

Yong Liu, Hunan Academy of Agricultural Sciences (CAAS), China; Maria Isabel Chacon Sanchez, Universidad Nacional de Colombia, Colombia

#### \*Correspondence:

Rafael Lira-Saade, rlira@unam.mx; Luis E. Eguiarte, fruns@unam.mx

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Plant Science

> Received: 17 August 2017 Accepted: 13 March 2018 Published: 29 March 2018

#### Citation:

Sánchez-de la Vega G, Castellanos-Morales G, Gámez N, Hernández-Rosales HS, Vázquez-Lobo A, Aguirre-Planter E, Jaramillo-Correa JP, Montes-Hernández S, Lira-Saade R and Eguiarte LE (2018) Genetic Resources in the "Calabaza Pipiana" Squash (Cucurbita argyrosperma) in Mexico: Genetic Diversity, Genetic Differentiation and Distribution Models. Front. Plant Sci. 9:400. doi: 10.3389/fpls.2018.00400

<sup>1</sup> Departamento de Ecología Evolutiva, Instituto de Ecología, Universidad Nacional Autónoma de México, Mexico, Mexico, <sup>2</sup> Unidad de Biotecnología y Prototipos, Facultad de Estudios Superiores Iztacala, Universidad Nacional Autónoma de México, Mexico, Mexico, <sup>3</sup> Departamento de Conservación de la Biodiversidad, El Colegio de la Frontera Sur, Villahermosa, Mexico, <sup>4</sup> Centro de Investigación en Biodiversidad y Conservación, Universidad Autónoma del Estado de Morelos, Cuernavaca, Mexico, <sup>5</sup> Campo Experimental Bajío, Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Celaya, Mexico

Analyses of genetic variation allow understanding the origin, diversification and genetic resources of cultivated plants. Domesticated taxa and their wild relatives are ideal systems for studying genetic processes of plant domestication, and their joint analysis is important to evaluate the distribution of their genetic resources. Such is the case of the domesticated subspecies C. argyrosperma ssp. argyrosperma, known in Mexico as calabaza pipiana, and its wild relative C. argyrosperma ssp. sororia. The main aim of this study was to use molecular data (microsatellites) to assess the levels of genetic variation and genetic differentiation within and among populations of domesticated argyrosperma across its distribution in Mexico in comparison to its wild relative, sororia, and to identify environmental suitability in previously proposed centers of domestication. We analyzed nine unlinked nuclear microsatellite loci to assess levels of diversity and distribution of genetic variation within and among populations in 440 individuals from 19 populations of cultivated landraces of argyrosperma and from six wild populations of sororia, in order to conduct a first systematic analysis of their genetic resources. We also used species distribution models (SDMs) for sororia to identify changes in this wild subspecies' distribution from the mid-Holocene (∼6,000 years ago) to the present, and to assess the presence of suitable environmental conditions in previously proposed domestication sites. Genetic variation was similar among subspecies (H<sub>E</sub> = 0.428 in sororia, and H<sub>E</sub> = 0.410 in argyrosperma). Nine argyrosperma populations showed significant levels of inbreeding. Both subspecies are well differentiated, and genetic differentiation (FST) among populations within each subspecies ranged from 0.152 to 0.652. Within argyrosperma we found three genetic groups (Northern Mexico; the Yucatan Peninsula, including Michoacan and Veracruz; and the Pacific coast plus Durango). We detected low levels of gene flow among populations at a regional scale (<0.01), except for the Yucatan Peninsula and the northern portion of the Pacific Coast. Our analyses suggested that the Isthmus of Tehuantepec is an effective barrier isolating southern populations. Our SDM results indicate that environmental characteristics in the Balsas-Jalisco region, a potential center of domestication, were suitable for the presence of sororia during the mid-Holocene.

Keywords: Cucurbita, cultivated squash, genetic diversity, genetic structure, nuclear microsatellites, species distribution models

# INTRODUCTION

Domestication is an ideal model to study evolution because it is usually fast and gradual (Purugganan and Fuller, 2011; Meyer and Purugganan, 2013; Gaut, 2015; Gaut et al., 2015). Population genetic studies make it possible to analyze the dynamics of the domestication process and to make inferences about the origins and histories of crops (Meyer and Purugganan, 2013; Aguirre-Liguori et al., 2016).

Sometimes the ancestral wild populations can still be studied along with the domesticated forms and varieties, allowing paired comparisons between populations under different selection processes in the same environment (Aguirre-Liguori et al., 2016). Also, the coexistence and possible hybridization of domesticated taxa with their wild relatives provides a source of genetic variation during domestication, increasing genetic diversity and the presence of alleles of agronomic value (Warschefsky et al., 2014). Nevertheless, the possibility of hybridization raises questions, such as: (1) How do domesticated and wild relatives remain genetically differentiated? and (2) How frequent is introgression among wild and domesticated relatives? It is also important to mention that signals of domestication may be confounded by long-distance human-mediated dispersal and by intermittent crosses between domesticated and wild taxa, sometimes making it difficult to disentangle the history of domestication (Besnard et al., 2013; Meyer and Purugganan, 2013). As domestication and crop improvement involve genetic bottlenecks (Gaut et al., 2015), they can lead to a reduction of genetic diversity and increased inbreeding. During domestication, crops are transported from their center of domestication to new environments, which may lead to new local adaptation that in some cases can be achieved through introgression with their wild relatives or other domesticated relatives (Gaut et al., 2015).

The mechanisms for the maintenance of the genetic differentiation among domesticated populations and their wild relatives have seldom been studied. It has been proposed that in some species, such as Cucurbita argyrosperma and Zea mays, gene flow is asymmetric, being more frequent from the wild to the domesticated taxa (Montes, 2002; Hufford et al., 2013). Moreover, Hufford et al. (2013) found resistance to gene flow from domesticated maize into wild teosinte, which could be explained by low gene flow rates, by the fact that domesticated genes are not advantageous for wild taxa, or strong selection by humans against hybrids. Cruz-Reyes et al. (2015) observed that domesticated-wild hybrids of Cucurbita showed lower reproductive output. Hufford et al. (2013) found that many alleles that characterize domesticated varieties are found at lower frequencies in their wild relatives, suggesting that the attributes associated with domestication are not produced by de novo mutations, but constitute part of the standing genetic variation of wild taxa (Doebley et al., 2006).

Surveys of genetic variation in wild populations and their cultivated relatives are a first step in describing genetic resources: they show how much genetic variation is still found in domesticated taxa compared to their wild relatives, how differentiated the two are, and how much ancestral and ongoing gene flow (hybridization) exists among wild and domesticated taxa (Warschefsky et al., 2014). These topics are relevant for the management of domesticated populations and for the future preservation of genetic resources (Warschefsky et al., 2014; Govindaraj et al., 2015). Moreover, these comparative studies are the first step toward understanding the origin and diversification of domesticated plant taxa. Current molecular tools, along with population genetics and modern phylogeographic approaches, allow understanding the distribution of genetic variation from an evolutionary perspective (Eguiarte et al., 2013; Aguirre-Dugua and González-Rodríguez, 2016; Aguirre-Liguori et al., 2016).

The study of crop origins has traditionally involved identifying geographic areas of high diversity and sampling populations of wild progenitor species (Kraft et al., 2014). Linking genes, crops, and landscapes through a geographical analysis of genetic data is one important way to achieve multilevel integration (Van Etten and Hijmans, 2010; Hufford et al., 2012; Besnard et al., 2013). Furthermore, species distribution modeling, projected into past conditions, offers a view of the potential geographic pattern of taxa during the domestication process (Hufford et al., 2012; Kraft et al., 2014).

The genus Cucurbita (pumpkins, squashes, and gourds), with 20 taxa of perennial or annual plants, is native to the Americas. Mexico is considered its center of origin and diversification (Lira-Saade, 1995; Mapes and Basurto, 2016). Cucurbita represents an interesting system for the study of domestication (Lira et al., 2016b) with five different domesticated species: C. pepo, C. moschata, C. ficifolia, C. maxima, and C. argyrosperma (Wilson et al., 1992; Sanjur et al., 2002; Kocyan et al., 2007; Gong et al., 2013; Zheng et al., 2013; Kates et al., 2017). Cucurbits were some of the first plants domesticated in the Americas, ca. 10,000 years ago (Smith, 1997; Zizumbo-Villarreal et al., 2016). Within the genus Cucurbita, each domestication event occurred independently, sometimes on more than one occasion (Lira et al., 2016b). Today, domesticated cucurbits still have a fundamental role in the diet of people in Mexico, Central and South America, and in many other regions of the world, and they are considered an essential phytogenetic resource (FAO, 2010).

Among domesticated cucurbits, C. argyrosperma, known in Mexico as calabaza pipiana or calabaza mixta, is highly appreciated for its seeds, which are used in Mexican gastronomy. Also, fruits are medicinal, commercial, and food resources (Lira-Saade, 1995; Villanueva, 2007). It is a species with cultural and economic importance both locally and worldwide. The oldest evidence of domestication for this species is ∼8,600 years old from the Xihuatoxtla shelter, in the state of Guerrero (Rannere et al., 2009). This is a highly diverse species in form, color and size of its seeds and fruits (Lira-Saade, 1995; **Figure 1**). C. argyrosperma is currently divided into two subspecies: the domesticated C. argyrosperma ssp. argyrosperma (argyrosperma hereafter) and its wild relative C. argyrosperma ssp. sororia (sororia hereafter; **Figure 1**) (Organization for Economic Cooperation and Development [OECD], 2012; Gong et al., 2013; Zheng et al., 2013; Kates et al., 2017). Both wild and cultivated subspecies can be found in tropical and semi desert regions from the Southeastern United States through Mexico and northern Central America, reaching Nicaragua, from sea level to 1,700 m above sea level (Villanueva, 2007; Lira et al., 2016b). These subspecies have a sympatric distribution in most of their range, except for the Yucatan peninsula, where the wild subspecies is absent (Lira-Saade, 1995; Organization for Economic Cooperation and Development [OECD], 2012; **Figure 2**).

Cucurbita argyrosperma is an important crop in local agriculture systems in Mexico and in other countries in the Americas. It is grown and selected in traditional ways. It is commonly found as a seasonal crop, but irrigation is used in some areas. A large amount of its production is not reported because it is used in subsistence agriculture in Mexico and Central and South America (Montes, 1991, 2002; Villanueva, 2007; Organization for Economic Co-operation and Development [OECD], 2012). In other regions of the world it is not extensively cultivated because of the low quality of its flesh (Lira-Saade, 1995; Organization for Economic Co-operation and Development [OECD], 2012), but there are records of some genetically improved cultivars grown in the United States and Canada. Some improved lines show differences in fruit and seed size, shape, and color, such as "Green Striped Cushaw," "White Cushaw," "Magdalena Striped," "Papago," "Japanese Pie," "Hopi," "Taos," "Parral Cushaw," "Veracruz Pepita," and "Silver Seed Gourd" (Organization for Economic Co-operation and Development [OECD], 2012).

Nevertheless, there are few studies focused on analyzing the genetic resources of cucurbits and covering most of their distributions (Bellon et al., 2009; Lira et al., 2016b). Only a few studies have analyzed the genetic variation of C. argyrosperma, including an analysis at a local scale (a region in the state of Jalisco) using isozymes (Montes-Hernández and Eguiarte, 2002), which found that argyrosperma has less genetic variation (H<sub>E</sub> = 0.35–0.41) than its wild relative (H<sub>E</sub> = 0.433), and low levels of genetic differentiation among populations (FST = 0.077). Two studies, one based on isozymes in commercial cultivars (Decker-Walters et al., 1990), and another with accessions using RAPDs (Cerón et al., 2010), found lower genetic diversity in argyrosperma (H = 0.039 and 0.063 for isozymes and RAPDs, respectively) than in other domesticated taxa of the genus, such as C. moschata (H = 0.052 and 0.11 for isozymes and RAPDs, respectively) and C. pepo (H = 0.068 and 0.104 for isozymes and RAPDs, respectively) (Decker-Walters et al., 1990; Cerón et al., 2010). Recently, Balvino-Olvera et al. (2017) studied wild populations of sororia along the Pacific coast of Mexico with microsatellites, reporting high levels of genetic variation (H<sub>E</sub> = 0.756) and higher genetic diversity and heterogeneity among southern populations in the states of Guerrero and Oaxaca. Clearly, there is a lack of population-based genetic diversity analyses that include both the cultivated and wild C. argyrosperma throughout its range.

The main aim of this study was to use molecular data (microsatellites) to assess the levels of genetic variation within and among populations of domesticated argyrosperma across its distribution in Mexico. We also analyzed populations of its wild relative, sororia, to compare levels of genetic variation and differentiation. Additionally, we estimated the levels of recent gene flow among populations and subspecies, and performed projections of the wild subspecies' distribution area in the mid-Holocene (∼6,000 years ago), in order to identify environmental suitability in previously proposed domestication centers, such as the Balsas-Jalisco region based on archeological records (Sanjur et al., 2002; Piperno et al., 2009; Rannere et al., 2009). We expected to find lower genetic diversity and higher levels of inbreeding in cultivated argyrosperma than in its wild relative sororia, in accordance with a previous study (Montes-Hernández and Eguiarte, 2002). In addition, we expected to find lower genetic differentiation among populations in the same geographic area than in more distant areas, and signals of on-going gene flow among subspecies, as reported by Montes (2002) and Montes-Hernández and Eguiarte (2002).

# MATERIALS AND METHODS

# Sampling and DNA Extraction

We obtained seeds from at least three fruits (one fruit from each different plant) from 19 populations of cultivated landraces of argyrosperma and six wild populations of sororia representative of the species distribution in Mexico (**Figure 2** and Supplementary Table S1). Ten populations were obtained from the germplasm collection of the Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias (INIFAP), Campo Experimental Bajío, in 2014 (BG in Supplementary Table S1). Fruits from 15 additional populations were collected in the field between 2013 and 2015. Sampled wild populations were located close to cultivars of argyrosperma to assess levels of gene flow among subspecies. All the collected fruits were stored in greenhouse conditions at the Institute of Ecology, UNAM, until they became ripe. Between 5 and 20 seeds from each collected fruit were grown in commercial substrate under greenhouse conditions (35°C on average) for 40 days, and young leaves were collected for DNA extraction.

DNA extraction was performed using a modified CTAB protocol (Doyle and Doyle, 1987). For nuclear microsatellite loci, we genotyped a total of 440 individuals, 327 of which were attributed to argyrosperma and 113 to sororia.

# Microsatellite Analyses

We amplified 12 of the nuclear microsatellite loci reported by Gong et al. (2008) for C. pepo (Supplementary Table S2). Loci were selected from different chromosomes to improve genome coverage and to reduce the probability of linkage disequilibrium. For better results, we selected only highly variable dinucleotide loci. We used a multiplex approach for microsatellite amplification in a 15 µl final volume, consisting of 1× Buffer, 1.2 mM MgCl<sub>2</sub>, 0.2 mM of dNTPs, 0.13 µM of each primer (six primers per multiplex; forward primers were labeled with one of the following fluorescent dyes: 6-FAM, HEX, or VIC), 1 µl of Taq polymerase (PROMEGA) and 10 ng of genomic DNA. Amplification reactions were performed in a Veriti 96-well thermal cycler (Applied Biosystems) with the following program: 95°C for 5 min, followed by 35 cycles of 95°C for 40 s, 60°C (Ta) for 40 s, and 75°C for 55 s, and a final step of 72°C for 5 min followed by 4°C. To control for possible contamination, we used blank controls for each reaction. All products were verified in 2% agarose gels and PCR products were sent to the Roy J. Carver Biotechnology Center at the University of Illinois, United States for genotyping<sup>1</sup>. Electropherograms were analyzed with PeakScanner (Applied Biosystems) to build a matrix with the genotypes of each individual.

# Null Alleles and Measures of Genetic Diversity

We conducted a null allele analysis using the method proposed by Chakraborty et al. (1992) implemented in Micro-Checker v2.2.3 (Van Oosterhout et al., 2004). In addition, we performed a Hardy-Weinberg exact test and a linkage disequilibrium test using Arlequin v. 3.0 (Excoffier et al., 2005). We obtained allele frequencies by direct estimation using Arlequin v. 3.0, and determined the number of private alleles for each population and subspecies by direct count from the allele frequencies data. We also obtained descriptive statistics, such as the proportion of polymorphic loci per population (P), allelic richness (A), and the expected (HE) and observed (HO) heterozygosities with the same software, and estimated the inbreeding coefficient (FIS) for each population using Genepop 4.0 (Rousset, 2008). In addition, we obtained the rarefied allelic richness with ADZE 1.0 (Szpiech et al., 2008), standardizing to the smallest population sample of six individuals.

<sup>1</sup>www.biotech.illinois.edu
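To make the diversity statistics above concrete, the following is a minimal sketch of how HO, HE and FIS can be computed for one locus in one population. It is not the code path of Arlequin or Genepop: the function name `heterozygosity_stats` and the toy genotypes are hypothetical, HE uses the common small-sample correction 2n/(2n − 1), and FIS is taken as the simple ratio 1 − HO/HE rather than the estimator those programs implement.

```python
from collections import Counter


def heterozygosity_stats(genotypes):
    """Return (HO, HE, FIS) for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Illustrative only; the published values came from Arlequin and Genepop.
    """
    n = len(genotypes)
    if n < 2:
        raise ValueError("need at least two genotyped individuals")
    # Observed heterozygosity: fraction of individuals with two different alleles.
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    # Allele frequencies estimated from the 2n gene copies.
    counts = Counter(allele for pair in genotypes for allele in pair)
    copies = 2 * n
    homozygosity = sum((c / copies) ** 2 for c in counts.values())
    # Expected heterozygosity with a small-sample correction.
    h_exp = (copies / (copies - 1)) * (1 - homozygosity)
    # Simple inbreeding coefficient; undefined for a monomorphic locus.
    f_is = 1 - h_obs / h_exp if h_exp > 0 else float("nan")
    return h_obs, h_exp, f_is


# Toy example with microsatellite allele sizes (hypothetical data).
toy = [(150, 150), (150, 152), (152, 152), (150, 152)]
h_obs, h_exp, f_is = heterozygosity_stats(toy)
print(round(h_obs, 3), round(h_exp, 3), round(f_is, 3))
```

A positive FIS in this sketch indicates fewer heterozygotes than expected under random mating, which is the pattern the text refers to as inbreeding.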

# Genetic Differentiation and Genetic Structure

To assess the genetic structure among subspecies and among populations we used Structure v 2.3.4 (Pritchard et al., 2000). This program uses Bayesian probability to assign individuals to different genetic clusters (K) based on allele frequencies without considering the population of origin. We performed preliminary runs to assess the best combination of priors to be used for the analysis and the length of the Markov Chain Monte Carlo (MCMC) chains. Accordingly, we performed a final run with admixture and correlated allele frequencies as priors, and without considering the putative population of origin of each individual. We used a burn-in of 500,000 iterations followed by 1,000,000 MCMC iterations, tested values of K from 1 to 10, and ran 10 repetitions for each K. The results were processed with Structure Harvester v 0.6.93 (Earl and vonHoldt, 2012), and the results from the Evanno test (Evanno et al., 2005) were used to determine the value of K that best fit our data. We performed an analysis of molecular variance (AMOVA; Excoffier et al., 1992) considering the genetic clusters obtained with Structure.
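The Evanno test mentioned above selects K from the rate of change of the log-likelihood across replicate runs. The sketch below illustrates the ΔK statistic under stated assumptions: the function name `evanno_delta_k` and the log-likelihood values are hypothetical, and the second difference is taken on the replicate means, which may differ in detail from what Structure Harvester reports.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp_by_k):
    """Compute Delta K = |L(K+1) - 2 L(K) + L(K-1)| / sd(L(K)).

    lnp_by_k: dict mapping K to the list of ln P(D|K) values from replicate runs.
    Delta K is undefined for the smallest and largest K tested.
    """
    avg = {k: mean(values) for k, values in lnp_by_k.items()}
    delta = {}
    for k in sorted(lnp_by_k):
        if k - 1 not in avg or k + 1 not in avg:
            continue  # needs both neighbours
        sd = stdev(lnp_by_k[k])
        if sd == 0:
            continue  # avoid division by zero when replicates are identical
        delta[k] = abs(avg[k + 1] - 2 * avg[k] + avg[k - 1]) / sd
    return delta


# Hypothetical replicate log-likelihoods for K = 1..4 (two runs each).
runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-890.0, -894.0],
    4: [-888.0, -896.0],
}
delta = evanno_delta_k(runs)
best_k = max(delta, key=delta.get)
print(best_k, {k: round(v, 2) for k, v in delta.items()})
```

In this toy input the likelihood rises steeply from K = 1 to K = 2 and then flattens, so ΔK peaks at K = 2.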

As an additional test to identify the number of genetic groups formed by our data, we used the adegenet library to perform discriminant analysis of principal components (DAPC; Jombart et al., 2010). DAPC is a multivariate analysis that summarizes the genetic differentiation between groups. This analysis identifies genetically related individuals by partitioning the within-group and among-group genetic variation (Jombart et al., 2010). We performed two independent DAPC analyses. The first analysis included all individuals to assess the relationship among subspecies. We conducted a cross-validation test to determine the number of PCs to be retained. Accordingly, we retained 40 PCs and two discriminant functions. For the second analysis, we excluded the populations that showed high genetic differentiation in order to depict the relationship among populations within argyrosperma. We retained 25 PCs and two discriminant functions in accordance with the cross-validation test.


The table shows the State and Population ID of collected samples, number of individuals (n), proportion of polymorphic loci (P), allelic richness (A), corrected allelic richness (RA), number of private alleles (Pa), observed heterozygosity (HO), expected heterozygosity (HE), and inbreeding coefficient (FIS); ∗ significant at p < 0.05. Standard deviation (SD) is shown in parentheses. NA: could not be estimated by the program due to the presence of missing data.

We estimated the genetic differentiation among populations through pairwise FST using adegenet (Jombart, 2008; Jombart and Ahmed, 2011) for R v.1.4.2 (R Core Team, 2016). To depict the genetic relationships among populations, we used the pairwise FST matrix to construct a dendrogram with the complete agglomeration method using the hclust function in the package ape (Paradis et al., 2004) for R. To determine the degree of statistical support for internal nodes we built a UPGMA dendrogram with R v.3.2.0, and evaluated 1,000 trees constructed from bootstrap resampling of the loci with this same library.

To test for isolation by distance in each subspecies, we used ade4 (Dray and Dufour, 2007) for R to perform a Mantel test with 999 permutations. For this test, we first used the Geographic Distance Matrix Generator (Ersts, 2011) to transform sample coordinates into a geographic distance matrix. We also performed an AMOVA (Excoffier et al., 1992) testing different scenarios to determine whether subspecies, or genetic clusters provide a better explanation of the genetic variance in the species C. argyrosperma.
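The isolation-by-distance test described above can be sketched as follows. This is an illustrative permutation Mantel test, not the ade4 implementation the authors used: the helper names (`haversine_km`, `mantel`), the Earth radius of 6,371 km, the one-sided p-value, and the toy coordinates are all assumptions made for the example.

```python
import math
import random


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    radius = 6371.0  # mean Earth radius, assumed
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def _pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel(gen, geo, permutations=999, seed=0):
    """Correlate two symmetric distance matrices and test it by permutation.

    Rows and columns of the genetic matrix are permuted together, so the
    dependence among pairwise distances is respected.
    """
    n = len(gen)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    geo_vec = [geo[i][j] for i, j in pairs]
    r_obs = _pearson([gen[i][j] for i, j in pairs], geo_vec)
    rng = random.Random(seed)
    order = list(range(n))
    hits = 0
    for _ in range(permutations):
        rng.shuffle(order)
        r_perm = _pearson([gen[order[i]][order[j]] for i, j in pairs], geo_vec)
        if r_perm >= r_obs:
            hits += 1
    return r_obs, (hits + 1) / (permutations + 1)


# Toy example: five hypothetical sites on one parallel, with genetic distance
# set proportional to geographic distance (perfect isolation by distance).
coords = [(20.0, -100.0 + k) for k in range(5)]
geo = [[haversine_km(*a, *b) for b in coords] for a in coords]
gen = [[0.001 * d for d in row] for row in geo]
r, p = mantel(gen, geo, permutations=199, seed=1)
print(round(r, 3), p)
```

Permuting whole rows and columns, rather than individual matrix cells, is what distinguishes a Mantel test from an ordinary correlation test on the pairwise values.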

# Gene Flow

To obtain estimates of the migration rates among populations and subspecies, we used BayesAss v.3.0.4 (Wilson and Rannala, 2003). This program, based on Bayesian probability, detects immigrant ancestors up to two generations in the past, even if overall genetic differentiation is low. An advantage of this approach is that it does not assume that populations have reached equilibrium (Rannala and Mountain, 1997), which may be the case for species that have undergone rapid demographic expansion, such as domesticated taxa. We performed several runs to determine the best number of MCMC iterations, to tune the priors and to check for convergence. Accordingly, we performed 30,000,000 MCMC iterations, with a burn-in of 3,000,000 and a sampling frequency of 2,000. We set the parameters as follows: deltaA = 0.70 (mixing parameter for allele frequencies), deltaF = 0.90 (mixing parameter for inbreeding coefficient), deltam = 0.05 (mixing parameter for migration rates) to obtain an acceptance rate between 0.2 and 0.6, as suggested by Rannala (2007). We obtained a trace file to check for convergence with Tracer v.1.5<sup>2</sup> (Rambaut and Drummond, 2009).
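As a quick check on the chain settings above, the number of posterior samples left after burn-in and thinning can be worked out directly. The sketch below is plain arithmetic with hypothetical helper names (`retained_samples`, `acceptance_in_range`); the exact count written by BayesAss may differ by one depending on how it handles the first and last iteration.

```python
def retained_samples(iterations, burn_in, interval):
    """Approximate number of samples kept after discarding burn-in and thinning."""
    if burn_in >= iterations:
        raise ValueError("burn-in must be shorter than the chain")
    return (iterations - burn_in) // interval


def acceptance_in_range(rate, low=0.2, high=0.6):
    """True when a proposal acceptance rate falls in the target window."""
    return low <= rate <= high


# Settings quoted in the text: 30,000,000 iterations, 3,000,000 burn-in,
# one sample every 2,000 iterations.
print(retained_samples(30_000_000, 3_000_000, 2_000))
# A hypothetical acceptance rate of 0.35 would fall inside the 0.2 to 0.6 window.
print(acceptance_in_range(0.35))
```

Under these settings roughly 13,500 samples contribute to each posterior estimate.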

To detect barriers to gene flow, we used the Monmonier algorithm (Monmonier, 1973; Manni et al., 2004) implemented in adegenet for R, considering both subspecies and for each subspecies separately. The Monmonier algorithm conducts a heuristic search used to define barriers based on dissimilarity scores. First, the genetic distance between contiguous populations is computed and the two populations with the highest level of differentiation are used to specify the starting boundary of the barrier. Then, the barrier is followed to both ends until either end reaches the edge or a barrier. These steps are repeated until the within-group sum of squares indicates that regional subdivision has progressed considerably (Monmonier, 1973). We used the optimize.monmonier function, which uses different starting points to find the solution that better explains genetic distances among populations based on the largest sum of local distances. We used values of pairwise FST as the distance matrix to perform the analysis, and the number of starting points set to 10 for argyrosperma and to three for sororia (i.e., half of the number of populations).

# Species Distribution Models

To assess environmental suitability in possible areas of domestication we used species distribution model (SDM) projections into the mid-Holocene (6,000 years ago) for the wild relative sororia. We constructed a database with geographic coordinates of collected and known sororia populations. Points from Central America were downloaded from GBIF<sup>3</sup>, and 699 points from Mexico were obtained from Salvador Montes-Hernández, for a total of 720 occurrences. This database was purged to eliminate duplicated pixels. In addition, to ensure that all points fell within the species distribution, we estimated Mahalanobis distances using previously selected environmental variables (see below for selection methodology). The points deviating by two standard deviations or more from the mean were mapped, checked, and discarded if they fell outside the species range. We finally retained 273 occurrence points to perform the SDMs (Supplementary Figure S1).

<sup>2</sup>http://beast.bio.ed.ac.uk

<sup>3</sup>www.gbif.org
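The Mahalanobis screening of occurrence points can be illustrated with a reduced example. The sketch below handles only two environmental variables so that the covariance matrix can be inverted by hand; the study used its selected bioclimatic variables, and the function names, the cutoff of 2.0 (distance in standard-deviation units) and the toy points are assumptions made for illustration. Flagged points would still need to be mapped and checked, as the text describes, rather than dropped automatically.

```python
from statistics import mean


def mahalanobis_2d(points):
    """Mahalanobis distance of each (x, y) point from the sample centroid."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least three points")
    xs, ys = zip(*points)
    mx, my = mean(xs), mean(ys)
    # Sample variances and covariance (n - 1 denominator).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    det = sxx * syy - sxy ** 2
    if det <= 0:
        raise ValueError("covariance matrix is singular")
    distances = []
    for x, y in points:
        dx, dy = x - mx, y - my
        # Quadratic form d' S^-1 d written out for the 2 x 2 case.
        d2 = (syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det
        distances.append(max(d2, 0.0) ** 0.5)
    return distances


def flag_outliers(points, cutoff=2.0):
    """Indices of points whose distance from the centroid is at least `cutoff`."""
    return [i for i, d in enumerate(mahalanobis_2d(points)) if d >= cutoff]


# Toy data: a 3 x 3 cluster of environmental values plus one distant record.
records = [(x, y) for x in range(3) for y in range(3)] + [(10, 10)]
print(flag_outliers(records))
```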

To reduce the uncertainty associated with SDMs, it is necessary to select only the more informative and uncorrelated climatic variables (Hirzel et al., 2002). To do so, we downloaded the set of 19 bioclimatic variables taken from the worldwide temperature and rainfall data within the WorldClim 1.4 dataset (Hijmans et al., 2005). To determine which climatic variables to use, we analyzed the 273 occurrence points in a principal component analysis (PCA) and a Spearman correlation matrix. For the PCA, we considered as informative the components that, taken together, represented 87% of the variance associated with the data. For the Spearman correlation matrix, we defined an uncorrelated model by using a threshold of r < 0.85 (Booth et al., 1994). Nine bioclimatic layers were selected: Mean Temperature of Warmest Quarter, Mean Temperature of Coldest Quarter, Isothermality [(BIO2/BIO7) × 100], Maximum Temperature of Warmest Month, Precipitation of Wettest Month, Precipitation of Driest Month, Precipitation Seasonality, Precipitation of Warmest Quarter, and Precipitation of Coldest Quarter.
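The correlation filter described above can be sketched as a greedy pass over candidate variables that keeps a variable only if its absolute Spearman correlation with every variable already kept is below the threshold. This is one plausible reading of the procedure, not the authors' script: the greedy order, the function names and the toy values are assumptions, and the PCA step is not reproduced here.

```python
def _ranks(values):
    """Average ranks (1-based), with ties sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = _ranks(x), _ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    return sxy / (sxx * syy) ** 0.5


def select_uncorrelated(variables, threshold=0.85):
    """Keep variables, in input order, whose |rho| with all kept ones is < threshold."""
    kept = []
    for name, values in variables.items():
        if all(abs(spearman(values, variables[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical values of three candidate layers at six occurrence points.
layers = {
    "var_a": [1, 2, 3, 4, 5, 6],
    "var_b": [2, 4, 5, 8, 10, 13],  # monotone in var_a, so rho = 1
    "var_c": [3, 1, 4, 1, 5, 9],
}
print(select_uncorrelated(layers))
```

Because the pass is greedy, the outcome depends on the order in which variables are offered; ordering them by ecological relevance or by PCA loading is a design choice left to the analyst.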

We generated SDMs for current and past climate conditions with MaxEnt 3.3.3 (Phillips et al., 2004, 2006), using the 273 occurrence points and nine bioclimatic variables. We limited the analysis by cropping all climate layers to the distribution of sororia (9.51003°N to 34.18357°N and −116.1351°W to −70.80147°W). MaxEnt was executed using a 20% random test rate, 30 replicates, replicated bootstraps, 1000 maximum iterations and a convergence threshold of 0.00001, with extrapolation and clamping turned off. The distribution model was derived from the average model and evaluated using the score of the area under the curve (AUC; Elith et al., 2006).
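As an illustration of the evaluation metric, the AUC can be computed as the probability that a randomly chosen presence record receives a higher suitability score than a randomly chosen background point, with ties counted as one half. The sketch below is a generic rank-based calculation with made-up scores; it is not MaxEnt's internal routine.

```python
def auc(presence_scores, background_scores):
    """Fraction of (presence, background) pairs ranked correctly; ties count 0.5."""
    wins = 0.0
    for p in presence_scores:
        for b in background_scores:
            if p > b:
                wins += 1.0
            elif p == b:
                wins += 0.5
    return wins / (len(presence_scores) * len(background_scores))

# Hypothetical suitability scores at test presences and background points.
print(auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.2, 0.1]))
```

An AUC of 0.5 corresponds to a model that ranks presences no better than chance, and values close to 1 indicate strong discrimination.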

For SDM projection to mid-Holocene climate conditions, we downloaded the layers corresponding to atmosphere-ocean general circulation models (AOGCM) based on the Community Climate System Model CCSM4 (Collins et al., 2006), which incorporates dynamics of atmospheric processes, including radiation, convection, condensation and evaporation. This AOGCM has already been used in the reconstruction of past distributional models in the region (Waltari et al., 2007; Peterson and Nyári, 2008; Waltari and Guralnick, 2009; Holmgren et al., 2010; Gámez et al., 2014; Scheinvar et al., 2017). All environmental analyses were performed at a resolution of 30 arcsec (∼1 km<sup>2</sup>).

In order to create a presence/absence map, we used the 95th percentile value of observed sample points as a threshold for the logistic model. This value assumes that up to 5% of the records used for generating the model are subject to error. For current and mid-Holocene times, we generated presence/absence maps for sororia. To identify areas with suitable environmental conditions for the species under current and past climate conditions, we summed the SDMs, thus highlighting the areas with potential stability conditions from the mid-Holocene to the present.
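The thresholding and summation steps can be sketched as below. The sketch interprets the threshold as the suitability value that retains 95% of the occurrence records (that is, it omits the lowest-scoring 5%), which is an assumption about how the percentile rule was applied; the grids and scores are hypothetical.

```python
import math

def omission_threshold(scores_at_records, omission=0.05):
    """Suitability value below which at most `omission` of the records fall."""
    ordered = sorted(scores_at_records)
    n_dropped = math.floor(omission * len(ordered))
    return ordered[n_dropped]

def binarize(grid, threshold):
    """Presence (1) / absence (0) map from a grid of suitability values."""
    return [[1 if v >= threshold else 0 for v in row] for row in grid]

def stability(current, past):
    """Cell-wise sum of two binary maps: 2 = suitable in both periods."""
    return [[c + p for c, p in zip(rc, rp)] for rc, rp in zip(current, past)]

# Hypothetical suitability (in percent) at 30 occurrence records.
records = list(range(10, 40))
t = omission_threshold(records)          # drops the single lowest record
current = binarize([[5, 11], [40, 10]], t)
past = binarize([[20, 30], [3, 9]], t)
print(t, stability(current, past))
```

Cells with a value of 2 in the summed map are those predicted as suitable under both current and mid-Holocene conditions.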

# RESULTS

# Genetic Diversity

Three microsatellites (CMTp175, CMTm187, and CMTm144) showed evidence of null alleles and were therefore excluded from further analyses. As expected, there was no evidence of linkage disequilibrium among loci.

All loci showed significant deviations from Hardy-Weinberg equilibrium (HWE) in at least one population (Supplementary Table S3). Nevertheless, we performed multiple comparisons, which show that loci with significant deviations from HWE are different among populations.

We obtained a total of 84 alleles for the nine analyzed loci. At least one locus was monomorphic in each population (**Table 1**). We found higher levels of polymorphism per population (p = 0.02) in the cultivated populations of argyrosperma (P = 0.775 ± 0.39 SD; **Table 1**) than in the wild sororia (P = 0.641 ± 0.11 SD; **Table 1**). For argyrosperma, the proportion of polymorphic loci ranged between 0.44 and 0.88. For sororia, the proportion of polymorphic loci ranged between 0.55 and 0.77.

Forty alleles were private to argyrosperma and eleven were found only in sororia. Mean number of private alleles per population was 1.21 in argyrosperma and 1.33 in sororia. Within argyrosperma, populations Tih and Teh (full name of geographic locations are shown in **Table 1**) showed the highest number of private alleles. Within sororia, populations Soax and SoSin showed the highest number of private alleles. Overall, the populations from Oaxaca showed the highest proportion of private alleles in both subspecies. Allelic richness was similar (p = 0.23) in cultivated argyrosperma (A = 2.786 ± 0.76 SD) and in wild sororia (A = 2.93 ± 1.17 SD). For argyrosperma, rarefied allelic richness ranged from 1.9 in Aut to 1.36 in Dgo (**Table 1**). For sororia, rarefied allelic richness ranged from 1.8 in four populations to 1.6 in Aut (**Table 1**).
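For clarity, a private allele is one observed in a single group (subspecies or population) and in no other. A minimal sketch of that tally, using hypothetical locus names and allele sizes, is:

```python
def private_alleles(alleles_by_group):
    """Map each group to the set of (locus, allele) pairs found in no other group."""
    result = {}
    for group, alleles in alleles_by_group.items():
        others = set()
        for other_group, other_alleles in alleles_by_group.items():
            if other_group != group:
                others |= other_alleles
        result[group] = alleles - others
    return result

# Hypothetical (locus, allele size) observations per subspecies.
observed = {
    "argyrosperma": {("L1", 150), ("L1", 152), ("L2", 200)},
    "sororia": {("L1", 150), ("L2", 204)},
}
for group, private in private_alleles(observed).items():
    print(group, sorted(private))
```

The same function applies at the population level by passing one allele set per population.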

Mean observed and expected heterozygosity were similar among subspecies (H<sup>O</sup> = 0.388 and H<sup>E</sup> = 0.428 in sororia, and H<sup>O</sup> = 0.36 and H<sup>E</sup> = 0.410 in argyrosperma; H<sup>O</sup> p = 0.484; H<sup>E</sup> p = 0.656). For argyrosperma, genetic diversity (HE) ranged from 0.588 in SinalP to 0.25 in Ek. For sororia, mean genetic diversity (HE) was 0.428, with the highest value in Soax (0.502) and the lowest in Sgro (0.3) (**Table 1**).

Both subspecies showed similar overall inbreeding coefficients (FIS = 0.033 ± 0.069 SD, p = 0.34 in argyrosperma and FIS = 0.077 ± 0.16 SD, p = 0.33 in sororia), which were not statistically different (p = 0.656). Within argyrosperma, five populations exhibited heterozygosity excess, while nine showed heterozygote deficiency and the rest were in HWE (**Table 1**). The highest values for heterozygosity deficiency were found in Chan and SinalP (FIS = 0.39) and Yec (FIS = 0.32), and for heterozygosity excess the lowest values were found in Ek and Ome (FIS = −0.253 and −0.26, respectively; **Table 1**). Within sororia, two populations exhibited heterozygosity deficiency, while the rest were in HWE (**Table 1**). The highest FIS value was found in Soax (0.303) (**Table 1**).
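The relationship between these statistics can be made explicit with a small sketch for a single locus: observed heterozygosity is the fraction of heterozygous individuals, expected heterozygosity is one minus the sum of squared allele frequencies, and FIS = 1 − HO/HE. The sketch uses the uncorrected estimator of HE and invented genotypes; the software used in the study may apply a small-sample correction.

```python
from collections import Counter

def locus_stats(genotypes):
    """Return (H_obs, H_exp, F_IS) for diploid genotypes at one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per individual.
    """
    n = len(genotypes)
    h_obs = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for g in genotypes for allele in g)
    total = 2 * n
    h_exp = 1.0 - sum((c / total) ** 2 for c in counts.values())
    # F_IS > 0 indicates heterozygote deficiency, F_IS < 0 heterozygote excess.
    f_is = 1.0 - h_obs / h_exp if h_exp > 0 else float("nan")
    return h_obs, h_exp, f_is

# Hypothetical populations with the same allele frequencies (p = q = 0.5).
print(locus_stats([(1, 1), (1, 1), (2, 2), (2, 2)]))   # deficiency
print(locus_stats([(1, 2), (1, 2), (1, 2), (1, 2)]))   # excess
```

Positive values, as in Chan, SinalP and Yec, correspond to fewer heterozygotes than expected under HWE, and negative values, as in Ek and Ome, to more.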

# Population Structure

The analysis performed with Structure suggested a value of K = 2, followed by K = 4 (**Figure 3**). For K = 2 there was a clear genetic differentiation between subspecies (**Figure 3A**), except for the sororia populations SoSin and SoSon, that were more similar to argyrosperma. The Structure barplot for K = 3 shows that within argyrosperma, the populations from the Yucatan peninsula, Chiapas, Veracruz, Michoacán and CCC constituted a cluster, while the populations from northern and central Mexico formed another cluster (**Figure 3B**). Finally, for K = 4 the clusters largely corresponded to subspecies' geographic distributions (**Figures 3**, **4**). The first cluster consisted of four sororia populations: Schis, Soax, Sgro, Sjal (black in **Figure 4**). The other two sororia populations (SoSin and SoSon) were assigned to a second cluster with argyrosperma populations CCC, and SinalP located in northern Mexico (pink in **Figure 4**). The third cluster consisted of argyrosperma populations from Ek, Mot, Chan, Champ and Pal, from the Yucatan Peninsula and populations from Michoacán (Sah) and Veracruz (Tih) (blue in **Figure 4**). The fourth cluster was constituted by populations Teh, Mix, Ome, Tla, Aut, and Yec, from the Pacific coast and Durango (green in **Figure 4**). The results from this analysis showed some degree of admixture among populations, particularly within argyrosperma (populations Tan, SJI and Mtp in **Figure 3C**), but the populations from sororia had low levels of admixture with the domesticated subspecies.

The results from the DAPC analysis were consistent with the results from Structure (**Figure 5**). Four sororia populations were clearly differentiated from argyrosperma, while two populations clustered within argyrosperma. All argyrosperma populations were grouped together, except for Tih from the state of Veracruz, which seemed in this analysis to be well differentiated from the other populations (**Figure 5**). In this DAPC 97.9% of the variance is explained by 40 PCs. A DAPC analysis considering only argyrosperma populations, except Tih (**Figure 6**), showed that Mtp and Tehuantepec from the states of Guerrero and Oaxaca, respectively, were well differentiated. Some populations formed very cohesive clusters, i.e., populations from the Yucatan Peninsula and populations from the Pacific Coast. In this second DAPC 94.6% of the variance is explained by 25 PCs.

Overall genetic structure was higher for wild sororia (FST = 0.492, RST = 0.610) than for domesticated argyrosperma (FST = 0.264, RST = 0.4). Genetic differentiation among populations (pairwise FST) of argyrosperma was moderate to high (FST = 0.031–0.515), while genetic differentiation among populations of sororia was higher in general (FST = 0.171–0.639). Genetic differentiation among populations of the different subspecies was medium to high (FST = 0.152–0.652; **Figure 7** and Supplementary Table S5). When we estimated genetic differentiation among populations of wild sororia without escaped populations (SoSin and SoSon), values were moderate (FST = 0.181–0.352).
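To make the meaning of FST concrete, the sketch below computes it for one locus as (HT − HS)/HT, where HS is the mean expected heterozygosity within populations and HT is the expected heterozygosity of the pooled allele frequencies. This is the simple heterozygosity-based definition with equal population weights and hypothetical frequencies; it is given only as an illustration and is not necessarily the estimator used to obtain the values reported here.

```python
def fst_one_locus(pop_freqs):
    """F_ST = (H_T - H_S) / H_T from per-population allele frequencies.

    pop_freqs: list of dicts mapping allele -> frequency, one dict per population.
    """
    k = len(pop_freqs)
    # Mean within-population expected heterozygosity.
    h_s = sum(1.0 - sum(p * p for p in f.values()) for f in pop_freqs) / k
    # Expected heterozygosity of the unweighted mean allele frequencies.
    alleles = set().union(*pop_freqs)
    mean = {a: sum(f.get(a, 0.0) for f in pop_freqs) / k for a in alleles}
    h_t = 1.0 - sum(p * p for p in mean.values())
    return (h_t - h_s) / h_t

# Hypothetical two-population examples.
print(fst_one_locus([{"A": 0.5, "B": 0.5}, {"A": 0.5, "B": 0.5}]))      # no structure
print(fst_one_locus([{"A": 0.75, "B": 0.25}, {"A": 0.25, "B": 0.75}]))  # moderate
print(fst_one_locus([{"A": 1.0}, {"B": 1.0}]))                          # fixed differences
```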

The dendrogram built using pairwise FST values (**Figure 7**) showed two well-defined groups: one group including only sororia populations from different states along the Pacific coast in Mexico (Oaxaca, Guerrero, Chiapas and Jalisco), and another group of argyrosperma populations, including the two sororia populations (SoSin and SoSon) mentioned above. Bootstrap values are in general low, as is usually found in intraspecific studies (due to both gene flow and recent common ancestry). Low bootstrap values within argyrosperma could also be due to homoplasy and care should be taken with their interpretation. Nevertheless, the two groups had higher support values (above 50%) (**Figure 7**).

An AMOVA that grouped populations by subspecies assigned 20.05% of the genetic variance to differences between subspecies; most of the variance was found within individuals (50.3%), followed by among populations within subspecies (26.2%), and only 3% was found among individuals within populations. Given that two putatively wild populations (SoSin and SoSon) were identified as belonging to argyrosperma in all genetic structure analyses, and the seeds show an intermediate morphology for size and color (**Figure 1**), we also performed an AMOVA considering these populations as argyrosperma. This analysis allocated a higher percentage (30.3%) of the genetic variance between subspecies; most variance was still found within individuals (46.1%), less among populations within subspecies (21.4%), and 2.3% among individuals within populations. On the other hand, an AMOVA considering the partition suggested by the Structure analysis (K = 4) assigned 24.5% of the genetic variance among clusters and 30.8% among populations within clusters, and most of the variance was found within populations (44.6%).

Mantel tests were significant for both subspecies, indicating spatial structure due to isolation by distance. We found that geographically closer populations are genetically more similar than expected by chance (**Figure 8**).
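The logic of a Mantel test can be sketched as a permutation test on the correlation between two distance matrices: population labels of one matrix are shuffled (rows and columns together) and the observed correlation is compared with the permuted ones. The function names, the number of permutations and the toy matrices below are assumptions for illustration, not the settings used in the study.

```python
import random

def _upper(matrix, order):
    """Upper-triangle entries after relabeling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]

def _pearson(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def mantel(genetic, geographic, permutations=999, seed=0):
    """Return (r, one-sided p) for the matrix correlation of two distance matrices."""
    identity = list(range(len(genetic)))
    geo = _upper(geographic, identity)
    observed = _pearson(_upper(genetic, identity), geo)
    rng = random.Random(seed)
    hits = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        if _pearson(_upper(genetic, perm), geo) >= observed:
            hits += 1
    return observed, hits / (permutations + 1)

# Hypothetical populations along a transect; genetic distance grows with distance.
coords = [0, 1, 3, 6, 10, 15]
geo = [[abs(a - b) for b in coords] for a in coords]
gen = [[0.01 * d for d in row] for row in geo]
r, p = mantel(gen, geo)
print(round(r, 3), p)
```

A significant positive correlation, as reported for both subspecies, is the signature expected under isolation by distance.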

# Estimates of Gene Flow

Estimates of recent gene flow suggest that the total proportion of migrants for each population ranged from 17 to 33% (Supplementary Table S4). Nevertheless, in general the proportion of migrants among pairs of populations was low (≤ 0.01); the exceptions were between some populations in the Yucatan Peninsula and in Chiapas, the Pacific coast, and in the northern portion of the Pacific coast, for argyrosperma; and in the southern-central portion of the Pacific coast for sororia (Supplementary Table S4). The only case where gene flow between cultivated argyrosperma and wild sororia populations was detected involved the two northern sororia populations (SoSin and SoSon) and a northern population in San Luis Potosi state, Tan; other analyses strongly suggest that SoSin and SoSon are argyrosperma populations escaped from cultivation. This suggests that gene flow between cultivated and truly wild populations is low.

The Monmonier analysis indicated that for argyrosperma (**Figures 9A,D**), the northern part of the Sierra Madre Occidental may function as an effective barrier to gene flow, isolating the SinalP and Yec populations. This contrasts with results from the DAPC and Structure analyses, where these populations do not seem to be isolated. The discrepancy may be due to differences in methodology, since the Monmonier analysis takes spatial distances into account. For sororia, the southern portion of the Sierra Madre Occidental also functions as an effective barrier to gene flow, and isolates population Sjal (**Figure 9B**). When both subspecies were analyzed together, we observed that the main barrier is located in the region of the Isthmus of Tehuantepec, isolating the populations from the Yucatan Peninsula (**Figure 9C**).

# Species Distribution Models

The SDM for the wild subspecies, sororia (**Figure 10**), showed stability and good support (AUC: 0.96). The SDM projection to the mid-Holocene (∼6,000 years ago) suggests that the distribution area of sororia has been more or less stable since domestication (**Figure 10**). Nevertheless, the analysis also suggests that sororia may have been present in the Yucatan Peninsula during the mid-Holocene and its distribution in the regions of Oaxaca and Guerrero, where the most ancient archeological remains have been found, may have been more continuous than today (**Figure 10**). In addition, the distribution of sororia in Central America may have been wider and more continuous from Guatemala and Honduras to the northern area of Nicaragua.

sororia populations. Four sororia populations are well differentiated from the argyrosperma populations.

# DISCUSSION

The present study represents the first wide-range analysis of the genetic variation, genetic structure and gene flow of C. argyrosperma, covering the cultivated argyrosperma distribution in Mexico, and including populations of the wild sororia distribution in the Pacific Coast from Northern Mexico (Sonora) to Southern Mexico (Chiapas). Our analyses show similar levels of genetic variation in the cultivated populations and in its wild ancestors. Genetic differentiation is higher in wild sororia (FST = 0.492) than in domesticated argyrosperma (FST = 0.264), but this estimate is probably the product of including two escaped populations (SoSin and SoSon) that were misclassified and analyzed as sororia. When we removed these populations, differentiation in wild sororia (FST = 0.243) became even lower than the differentiation found in cultivated argyrosperma. Gene flow at a regional level is associated with movement of pollen by Cucurbita pollinators and with human cultural practices, such as seed exchange among populations (Montes-Hernández et al., 2005; Organization for Economic Cooperation and Development [OECD], 2012). Some patterns of gene flow detected in argyrosperma may be the result of these seed exchanges, but these hypotheses should be tested with ethnobotanical data in future analyses.

# Genetic Variation and Inbreeding

Priori et al. (2013) reported an 85% transferability for microsatellites designed for C. pepo to cultivated C. argyrosperma. Accordingly, nine of twelve microsatellite loci used in this study were adequate for C. argyrosperma, while we discarded the additional three because of a high number of null alleles.

Cultivated species often show low levels of genetic variation (Gaut et al., 2015). Surprisingly, both subspecies showed similar levels of genetic diversity (**Table 1**). Comparable values of polymorphic loci, allelic richness and genetic diversity among argyrosperma and sororia suggest that the subspecies have had similar effective population sizes, and that the theoretical bottleneck associated with domestication was mild and/or of short duration, followed by a rapid population expansion (Hedrick, 2011).

FIGURE 6 | Discriminant analysis of principal components (DAPC) for only 18 argyrosperma populations (excluding Tih). Mtp and Teh from the states of Guerrero and Oaxaca, respectively, are well differentiated. Population ID are as shown in Table 1.

Genetic variation was similar to what has been reported for other annual plants in microsatellite studies (H<sup>E</sup> = 0.46; Nybom, 2004), but lower when compared to other outcrossing species (H<sup>E</sup> = 0.65; Nybom, 2004). Also, the mean allele number in C. argyrosperma was lower than those reported for C. pepo using nuclear microsatellite loci (A = 3.2–5.6; Formisano et al., 2012; Gong et al., 2012, 2013; Priori et al., 2013; Ntuli et al., 2015) and lower than those reported by Balvino-Olvera et al. (2017) in wild sororia populations along the West coast of Mexico (A = 12.3). Low levels of allelic richness and genetic diversity in sororia may suggest that this species has undergone one or several bottlenecks due to ecological shifts during the Pleistocene, followed by rapid population expansion, as suggested by Kistler et al. (2016). Nevertheless, these comparisons should be taken with caution because analyses were performed with different sets of microsatellite loci, and these hypotheses should be investigated in future studies.

Certain aspects of agricultural management, such as seed exchange, may also affect the levels of genetic variation in C. argyrosperma (Montes-Hernández et al., 2005) and in C. pepo (Enríquez Cotton, 2017; Enríquez et al., 2017). In particular, the milpa system, which predominates in the central and southern portions of Mexico (Lira et al., 2016a), is a form of polyculture (i.e., growing several Cucurbita species in the same area) and seed exchange, that can reduce inbreeding at the local level. Nevertheless, it is advisable to perform similar analyses in other wild and domesticated cucurbits to gain further insight into the amount of genetic variation present in Cucurbita.

For argyrosperma, populations located in the extremes of its distribution (SinalP in Sinaloa and Chan in Quintana Roo) showed the highest levels of genetic variation, while the populations from the Yucatan Peninsula (except Chan) showed the lowest levels of genetic variation. These results, together with the barriers analysis, suggest that the cultivated populations from the Yucatan Peninsula are isolated genetically (Zizumbo-Villarreal and Colunga-GarcíaMarín, 2010; Moreno-Estrada et al., 2014). Moreover, the wild subspecies, sororia, is not distributed in the Yucatan Peninsula, thus affecting the potential gene flow among subspecies in this area. In subspecies sororia we did not find a geographic pattern for the distribution of its genetic diversity, and the population from Oaxaca (Soax) showed the highest genetic variation.

sororia populations. Populations of the wild subspecies (sororia) are indicated (w). Dot colors agree with the pie graphs corresponding to the proportion of individuals per population assigned to each genetic cluster obtained with STRUCTURE for K = 4 shown in Figure 6. Population ID is shown in Table 1. Node values represent bootstrap support.

Cultivated species often show high levels of inbreeding (Gaut et al., 2015). Estimates of inbreeding coefficients (FIS) in argyrosperma were highly variable (**Table 1**), with some populations showing heterozygote deficiency (9 populations), as may be expected in a domesticated species, and other populations showing heterozygote excess (5 populations), as previously reported by Montes-Hernández and Eguiarte (2002). This variation may be related to the type of agriculture and management (Cerón et al., 2010), as well as pollinator availability and home range. Heterozygote deficiency could be the result of the short flight capacity of the bees that pollinate these species (Montes, 2002), or of the fact that in traditional subsistence agriculture only a few fruits are selected to plant the next generation (thus, within a field all individuals are highly related; Montes, 2002), while in northern populations the use of improved inbred genetic lines (Servicio de Información Agroalimentaria y Pesquera [SIAP], 2016) could be the cause of heterozygosity deficiency. Negative FIS values found in some cases suggest that seed exchange is frequent at a local level (i.e., among neighboring populations), promoting outbreeding, but that at a regional level (i.e., among extremes of the distribution) gene flow is low. It will be important to conduct detailed ethnobotanic studies in different regions of the country, along with genetic analysis, to test the effect of agricultural management on the genetic variation of this crop (Montes-Hernández et al., 2005; Bellon et al., 2009).

# Genetic Structure

When populations are isolated, genetic drift promotes the random fixation of alleles, thus the number of private alleles among populations can be used as a reference for population connectivity (Hedrick, 2011). We found a high number of private alleles among subspecies (40 in argyrosperma and 11 in sororia), while the mean number of private alleles per population was similar within subspecies (1.31 in sororia and 1.21 in argyrosperma).

The number of private alleles found in each subspecies suggests that overall levels of gene flow among subspecies have been low, thus promoting their divergence since the domestication of argyrosperma ∼ 8,600 years ago (Rannere et al., 2009). A coalescent based approach such as those implemented in Approximate Bayesian Computation (ABC) analyses, together with a genome-wide approach (thousands of SNPs) will be conducted in the future to test whether these patterns relate to incomplete lineage sorting, ancestral introgression or current introgression.

In argyrosperma, we found a high number of private alleles in the populations from the states of Veracruz (Tih) and Oaxaca (Teh). Tih is geographically distant from other sampled populations, and its private alleles may be present in other populations from the Gulf of Mexico; thus, it is advisable to include more populations from this area in further analyses. The population Teh from Oaxaca is located in the area of the Isthmus of Tehuantepec that has been previously identified as an important barrier for the Mexican biota (Ornelas et al., 2013). Moreover, for sororia, the population with highest number of private alleles is located in the same area (Soax), further supporting the Isthmus of Tehuantepec as an important biogeographical barrier. Furthermore, seed morphology is distinctive in the populations of argyrosperma of Southeastern Mexico, where seeds show clear gray margins in contrast to the golden color found in northern populations (**Figure 1**). People in southeastern Mexico have a strong preference for local varieties (G. Sánchez de la Vega, personal observation). In addition, this may suggest that the high number of private alleles in this area may be related to strong selection pressures associated with seed morphology, as has been reported in C. pepo commercial varieties (Formisano et al., 2012). Selection for morphological characters promotes selective sweeps that result in allele fixation in neutral sites of the genome (Meyer and Purugganan, 2013). Alternatively, a high number of private alleles could relate to isolation of these populations. Therefore, we need to conduct genomic and morphologically detailed analyses to test these hypotheses.

The results from the Structure and DAPC analyses show clear genetic differentiation among subspecies (**Figures 3**, **5**). Within argyrosperma, Structure analyses (**Figure 3**) show geographically associated groups: (1) a northern group; (2) Yucatan Peninsula; and (3) Pacific coast (**Figure 4**). It is interesting that these genetic groups roughly correspond to the genetic groups reported by Moreno-Estrada et al. (2014) for human Native American populations, which suggests that cultural aspects may be important in determining the genetic structure of this crop. Further analyses should test for the correlations between genetic clusters in domesticated taxa and human Native American populations.

Values of genetic differentiation (FST) were variable among populations of both subspecies. Sororia showed similar levels of genetic differentiation (FST = 0.243, RST = 0.3) as argyrosperma (FST = 0.264, RST = 0.4). In addition, our Mantel test results show that geographically close populations of both subspecies are genetically more similar than expected by chance. Both subspecies have wide distributions (**Figure 2**) that cover a distance of over 1,000 km, thus promoting genetic differentiation among extreme populations. Genetic differentiation in sororia could also be related to its patchy distribution and the limited movement (∼0.7 km) and local low densities of its main pollinators from the genera Peponapis and Xenoglossa (Kohn and Casper, 1992; Montes, 2002; Enríquez et al., 2015). Levels of genetic differentiation among argyrosperma populations are similar to those reported for other outcrossing plants (FST = 0.22; Nybom, 2004). It is advisable to include more sororia populations in future analyses to determine fine patterns of genetic differentiation.

FST pairwise values are directly related to the degree of phenotypical resemblance among populations and provide insights into their demographic history (Holsinger and Weir, 2009). Our pairwise FST analyses, along with all our results of genetic differentiation, suggest that both subspecies are genetically well differentiated.

Also, it is worth noticing that all the analyses of genetic differentiation consistently suggest that the sororia populations SoSon and SoSin from the states of Sonora and Sinaloa are more like subspecies argyrosperma than sororia, including some morphological characteristics of the seeds (**Figure 1**). Our results suggest that these may be escaped populations of argyrosperma, as suggested by Merrick and Bates (1989) and Villanueva (2007), who mentioned that individuals from argyrosperma are capable of surviving without agricultural management, but that these individuals show a reduction in seed size. Reports suggest that cultivars of C. pepo and C. moschata in Tamaulipas, Mexico are also capable of surviving and producing fruits in semi-wild or in extreme environmental conditions (Hanselka, 2010).

Gene flow is frequent among subspecies of Cucurbita and with other taxa at local levels (Montes-Hernández and Eguiarte, 2002; Lira et al., 2016b). Our analysis suggests that gene flow is less frequent at the regional level, with a few exceptions: (1) in the Yucatan Peninsula, and Chiapas, people mentioned that they usually exchange seeds among neighbors and family members and sell seeds for cultivation, which is consistent with estimated levels of gene flow in this area, and (2) the northern portion of the Pacific Coast, that apparently acts as a genetic corridor that has been previously reported for other crops (Zizumbo-Villarreal and Colunga-GarcíaMarín, 2010).

The results from the barrier analysis are consistent with this idea, where the northern portion of the Sierra Madre Occidental isolates the populations located along the Pacific coast. A pattern of isolation of populations located in Jalisco has also been reported for Zea mays ssp. parviglumis, the wild relative of maize (Aguirre-Liguori et al., 2017). Finally, when both subspecies were included in the analysis, the Isthmus of Tehuantepec appears as an important barrier, as has been previously reported for many wild taxa (Ornelas et al., 2013).

# Species Distribution Models

Species distribution models of wild relatives of domesticated taxa are a useful tool to corroborate hypotheses of possible domestication sites and environmental suitability for the presence of the wild species (Hufford et al., 2012; Besnard et al., 2013). The SDM for sororia suggests that its range has been more or less stable since the mid-Holocene, with possible presence in the Yucatan Peninsula and a more continuous range in Oaxaca and Guerrero during the mid-Holocene (∼6,000 years ago). Many domestication events occurred during this time because of environmental changes and vegetation transitions associated with the end of the Last Glacial Maximum-Holocene, and with the impact of anthropogenic activities (Flannery, 1986; Piperno et al., 2007; Aguirre-Liguori et al., 2016). For the region of Guerrero, Piperno et al. (2007) and Rannere et al. (2009) proposed that the end of the Pleistocene was cold, and that as the Holocene advanced this area became warmer, promoting a transition from temperate arboreal elements to tropical forests, environmental conditions associated with the principal events of domestication in the Balsas basin.

Previous genetic, molecular, biogeographic, and archeological analyses suggest that argyrosperma was domesticated in the Balsas-Jalisco region, approximately 9,000 years ago (Sanjur et al., 2002; Piperno et al., 2009; Rannere et al., 2009; Lira et al., 2016b). The oldest archeological remains are from caves located in the Balsas region (Guerrero) from 6,100 to 8,500 years ago. Initial domestication (before 7,000 years ago) was followed by early diversification (Lira et al., 2016b). Our SDM results indicate that environmental characteristics were suitable for the presence of sororia in this area during this period.

# CONCLUSION

Our analyses describe broad patterns of genetic variation, genetic differentiation and gene flow among domesticated and wild C. argyrosperma. The levels of genetic variation and genetic differentiation were similar for sororia and argyrosperma. These could relate to their demographic histories, but further analyses should be conducted to test different demographic hypotheses. Isolation by distance and gene flow analyses suggest that gene flow is more common at a local scale than at a regional scale, perhaps because of pollen movement by specialized pollinators and human cultural practices, such as seed exchange among populations, but these hypotheses should be tested with ethnobotanical data in future analyses. Sororia's distribution has been relatively stable since the mid-Holocene, and the models suggest the presence of this subspecies in previously described domestication centers based on archeological records. Future analyses should gather information about agricultural management, morphological variation and the behavior of pollinators, along with a wider sampling of the wild populations and the use of massive sequencing data to expand our knowledge of squash domestication.

# AUTHOR CONTRIBUTIONS

GS-dlV and GC-M contributed to fieldwork, lab work, molecular and population genetics analysis, drafting the manuscript, and final approval of the version to be published. NG contributed to analysis of species distribution models (SDM), drafting the manuscript, and final approval of the version to be published. HH-R contributed to fieldwork, lab work, molecular and data analyses, and final approval of the version to be published. AV-L contributed to laboratory work, logistics, correcting the manuscript, and final approval of the version to be published. EA-P contributed to laboratory work, logistics and molecular analysis, correcting the manuscript, and final approval of the version to be published. JJ-C contributed to correcting the manuscript and final approval of the version to be published. SM-H project leader, contributed to fieldwork, germplasm collection, generated database for project design, and final approval of the version to be published. RL-S project leader, contributed to logistics and final approval of the version to be published. LE project leader, designed and coordinated the project, logistics, drafted and corrected the manuscript, and final approval of the version to be published.

# FUNDING

This work was funded by CONABIO KE 004, Diversidad genética de las especies de Cucurbita en México e hibridación entre plantas genéticamente modificadas y especies silvestres de Cucurbita, by CONACYT Investigación Científica Básica 2011.167826 (clave de identificación oficial CB2011/167826), Genómica de poblaciones: estudios en el maíz silvestre, el teosinte (Zea mays ssp. parviglumis y Zea mays ssp. mexicana) and by CONACYT Problemas Nacionales through the grant number 247730. GS-dlV was supported by grant number 292164 of the Consejo Nacional de Ciencia y Tecnología (CONACYT). The sabbatical leave of LEE at the Department of Plant and Microbial Biology, University of Minnesota, was supported by the program PASPA-DGAPA, UNAM.

# ACKNOWLEDGMENTS

This manuscript is presented in partial fulfillment of the requirements to obtain a Ph.D. degree by GS-dlV in the Posgrado en Ciencias Biológicas, Universidad Nacional Autónoma de México. We acknowledge the Posgrado en Ciencias Biológicas for the support provided during the development of this project. Special thanks to Dr. Valeria Souza and Dr. Daniel Piñero for supporting this research. We thank the Laboratorio de Evolución Molecular y Experimental, the Instituto de Ecología, and Facultad de Estudios Superiores Iztacala, Universidad Nacional Autónoma de México, and Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Campo Experimental Bajío. For help in the laboratory, with computer analyses, and with fieldwork, we also thank Enrique Scheinvar, Laura Espinosa Asuar, Gabriel Manuel Rosas, Leslie Mariel Paredes, Paulina Hernández, Josué Barrera, Dulce C. Hernández, Silvia Barrientos, Karen Y. Ruíz Mondragón, Jonás A. Aguirre-Liguori, Talitha E. Legaspi, and the technicians at Instituto Nacional de Investigaciones Forestales, Agrícolas y Pecuarias, Campo Experimental Bajío, José Manuel Escutia Ponce and Miguel Ángel Mora Martínez. This paper was written during a sabbatical leave of LEE in the Department of Plant and Microbial Biology, University of Minnesota in Dr. Peter Tiffin's laboratory.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2018.00400/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Sánchez-de la Vega, Castellanos-Morales, Gámez, Hernández-Rosales, Vázquez-Lobo, Aguirre-Planter, Jaramillo-Correa, Montes-Hernández, Lira-Saade and Eguiarte. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# An Initiative for the Study and Use of Genetic Diversity of Domesticated Plants and Their Wild Relatives

Alicia Mastretta-Yanes<sup>1</sup> \*, Francisca Acevedo Gasman<sup>2</sup> , Caroline Burgeff<sup>2</sup> , Margarita Cano Ramírez<sup>2</sup> , Daniel Piñero<sup>3</sup> and José Sarukhán<sup>4</sup>

<sup>1</sup> CONACYT – Comisión Nacional para el Conocimiento y Uso de la Biodiversidad, Mexico City, Mexico, <sup>2</sup> Comisión Nacional para el Conocimiento y Uso de la Biodiversidad, Mexico City, Mexico, <sup>3</sup> Instituto de Ecología, Universidad Nacional Autónoma de México, Mexico City, Mexico, <sup>4</sup> Instituto de Ecología, Comisión Nacional para el Conocimiento y Uso de la Biodiversidad, Universidad Nacional Autónoma de México, Mexico City, Mexico

#### Edited by:

Alejandro Casas, IIES – Universidad Nacional Autónoma de México, Mexico

#### Reviewed by:

Ilias Travlos, Agricultural University of Athens, Greece

Umesh K. Reddy, West Virginia State University, United States

\*Correspondence: Alicia Mastretta-Yanes amastretta@conabio.gob.mx

#### Specialty section:

This article was submitted to Agroecology and Land Use Systems, a section of the journal Frontiers in Plant Science

Received: 03 August 2017 Accepted: 05 February 2018 Published: 20 February 2018

#### Citation:

Mastretta-Yanes A, Acevedo Gasman F, Burgeff C, Cano Ramírez M, Piñero D and Sarukhán J (2018) An Initiative for the Study and Use of Genetic Diversity of Domesticated Plants and Their Wild Relatives. Front. Plant Sci. 9:209. doi: 10.3389/fpls.2018.00209

Domestication has been influenced by formal plant breeding since the onset of intensive agriculture and the Green Revolution. Despite providing food security for some regions, intensive agriculture has had substantial detrimental consequences for the environment and does not fulfill smallholders' needs under the conditions of most developing countries. Therefore, it is necessary to look for alternative plant production techniques that are effective for each environmental, socio-cultural, and economic condition. This is particularly relevant for countries that are megadiverse and major centers of plant domestication and diversification. In this white paper, a Mexico-centered initiative is proposed, with two main objectives: (1) to study, understand, conserve, and sustainably use the genetic diversity of domesticated plants and their wild relatives, as well as the ongoing evolutionary processes that generate and maintain it; and (2) to strengthen food and forestry production in a socially fair and environmentally friendly way. To fulfill these objectives, the initiative focuses on the source of variability available for domestication (genetic diversity and functional genomics), the context in which domestication acts (breeding and production) and one of its main challenges (environmental change). Research on these components can be framed to target and connect the theoretical understanding of the evolutionary processes with the practical aspects of conservation and of food and forestry production. The target, main challenges, problems to be faced and key research questions are presented for each component, followed by a roadmap for the consolidation of this proposal as a national initiative.

Keywords: Mexico, food security, food sovereignty, milpa, forestry, agroecology, conservation genetics

# EVOLUTION UNDER DOMESTICATION FACES MODERN CHALLENGES

Species' domestication is an evolutionary process in which humans, by means of artificial selection, take advantage of the genetic diversity of a wild species and modify it to our needs (Darwin, 1859; Casas et al., 2016). The domestication of plants started around 10,000 years ago, first for food production and then for forestry. Domesticating plants led to the independent invention of agriculture by several cultures around the globe and the emergence of 'agrobiodiversity' (Diamond, 2012). The process of domestication is ongoing, and today occurs in a wide range of systems that span traditional farming to industrialized large-scale agriculture. These, and all forms of farming, have a common challenge: how to feed humankind in the future within a context of food sovereignty and climate change, while conserving people's biocultural legacy and the remaining natural ecosystems of Earth.

Domestication was particularly modified by formal plant breeding after World War II, with the onset of intensive agriculture (Harlan, 1975). This type of agriculture was then introduced to developing countries during the 'Green Revolution' (1960–1990). In this period, food security was treated as an issue of increasing production through breeding elite cultivars under conditions of high inputs (e.g., fertilizers and pesticides), and selecting for higher yields, wide (instead of local) adaptation, and adaptability to mechanical harvest technologies (Sonnenfeld, 1992; Crow, 1998; Baranski, 2015). Together, these changes considerably increased total yields of a small number of grain species, which allowed for dramatic increases in food production and lower global food prices. However, high-input agriculture, promoted by the Green Revolution, also had important detrimental consequences and limitations (Tilman, 2001; Evenson and Gollin, 2003). First, elite cultivars work well only in high quality soil, with high water availability and intensive use of inorganic fertilizers (Duvick, 2005; Ceccarelli, 2009). Second, it promotes species and genetic homogeneity, making cultivars vulnerable to pests and diseases (Ullstrup, 1972), and thus necessitating the use of pesticides in an "arms race" (Després et al., 2007). The heavy use of these fertilizers and pesticides is also harmful to the wider environment. Examples of this damage are the vast marine "dead zones" that exist at every coastline where rivers coming from intensive agriculture areas meet the ocean (Rabalais et al., 2010). Third, breeding for intensive agriculture shifted the domestication process from farmers to researchers and commercial seed companies (Troyer, 2009; Ceccarelli, 2015).
As a result, many local species and varieties were abandoned (Ceccarelli, 2009); the capacity to keep, generate, and apply traditional agroknowledge started to disappear (Gómez-Baggethun and Reyes-García, 2013); and the food security of several areas became dependent on a decreasing number of crops (Khoury et al., 2014), that are increasingly controlled by a few agroindustrial companies (Howard, 2009). Lastly, focusing artificial selection only on yield led to deficiencies in micronutrients (FAO, 2010).

These unintended consequences cannot be ignored, especially in developing countries that still hold important remnants of natural ecosystems and native agrobiodiversity. Also, given the diversity of environments and social conditions where agriculture occurs in such ecologically diverse countries, it is highly unlikely that any single agricultural system will solve their problems of food and fiber production (Kahane et al., 2013). Therefore, we should re-think the path we have been taking in relation to modern-day domestication, and look for alternatives that are effective for each environmental, socio-cultural and economic context.

# THE INITIATIVE

In this white paper we describe a Mexico-centered initiative with two objectives: (1) to study, understand and conserve the genetic diversity of native crops and their wild relatives, and preserve the ongoing domestication processes that generate and maintain this diversity; and (2) to use this diversity to strengthen food and forestry production in a socially fair and environmentally friendly way. These objectives relate to biodiversity conservation because the fate of the remaining ecosystems of Earth depends on how we undertake agricultural and forestry production over the following decades (Tscharntke et al., 2012); moreover, they are relevant to food sovereignty because the rights of peoples to healthy and culturally appropriate food depend on our ability to conserve and effectively use domesticated species. These objectives are relevant to Mexico, one of the Vavilov Centers for plant domestication (Vavilov, 1951), because it is a megadiverse country where smallholders of a variety of cultural groups practice agriculture and forestry in a diversity of agricultural systems and environments. However, the core of this initiative should be useful for similar countries.

This initiative proposes going back to two core elements of domestication: (1) the genetic diversity of domesticated species, their wild relatives and associated microbiome, and (2) the evolutionary potential implied by having millions of smallholders cultivating extensive areas of diverse crops in different environments. These are core elements for the following reasons.

First, genetic diversity provides options to grow diverse and nutritious food with fewer resources, to adapt crops to harsher environments, and to make cultivars less susceptible to pests and diseases. Proof of this is that cultivars can already be grown in a wide variety of environments, including marginal conditions where commercial lines do not perform well (Ceccarelli, 2009; Dwivedi et al., 2016). Similarly, crops' wild relatives tend to have higher genetic diversity in terms of drought, pest, and disease resistance than their cultivated counterparts (Maxted et al., 2013). To that diversity, we can add an even larger set of microorganisms that have co-evolved with these species and their environment. This microbiome can greatly influence plant performance, but further applied research is needed (Sessitsch and Mitter, 2015; Gopal and Gupta, 2016; Qin et al., 2016; Benitez et al., 2017).

Second, to take full advantage of and maintain the evolutionary processes that generated this diversity, we should change our vision to vindicate the role of smallholder farmers (<5 ha) not only as a productive force, but also as an irreplaceable engine for crop evolution under diverse and challenging environments. Crop genetic diversity is not useful by itself: it needs to be related to production practices that can make the most out of the traits given by genetic diversity. The people who use this genetic diversity and possess its associated traditional knowledge are the heirs of the domestication processes that indigenous groups started thousands of years ago (Boege, 2008). They tend to be smallholder campesinos who keep their own seed and devote part of their production to self-sufficiency. Although they are commonly seen as 'unproductive,' they are the backbone of food security (FAO, 2014). Similarly, the generation and maintenance of crop genetic diversity depends on millions of smallholders cultivating under different environmental conditions and cultural preferences which, from an evolutionary perspective, represent the best way to maintain and generate genetic diversity (Enjalbert et al., 2011; Perales, 2016; Comisión Nacional para el Conocimiento y Uso de la Biodiversidad [CONABIO], 2017).

# COMPONENTS

The initiative focuses on five components that span the source of variability available for domestication (genetic diversity and functional genomics), the context in which domestication acts (breeding and production) and one of the main challenges it faces (environmental change). The genetic diversity within crops' wild relatives, cultivated species, and their associated microbiome is the basis of domestication. Functional genomics represents a second layer of information that allows mapping genetic diversity to useful traits, both in human and environmental terms. Genetic drift, linkage disequilibrium, and epigenetics also play an important role in shaping diversity among populations and within genomes. These types of data can help to understand and monitor domestication and environmental adaptation (Gepts, 2014; Lasky et al., 2015). However, to conserve and use this diversity to adapt crops to environmental change, it is also necessary to consider the context in which the evolutionary forces of domestication act (Enjalbert et al., 2011). This context includes the biocultural and environmental factors that are given by breeding and production.

The Mexican context and the initiative's aim by component are summarized below, and Supplementary Table S1 shows challenges and key research questions.

# Genetic Diversity

Mexico has ∼280 native plant species with forestry potential (FAO, 2011) and more than 130 that are used as food sources. Among the latter are maize, beans, pumpkins, chili, amaranths, vanilla, and 20 more that are of high economic importance worldwide (Acevedo et al., 2009). Mexico is the center of domestication or diversification of these and several more crop species (Acevedo et al., 2009). There are also potentially thousands of wild species related to them, as shown by the existence of ∼270 wild relatives belonging to the gene pool of the 12 main Mexican crops (CONABIO and UICN, 2016). Within each domesticated species there are dozens of varieties (e.g., 59 maize landraces, and 60 chili types; Aguilar-Rincón et al., 2010; Comisión Nacional para el Conocimiento y Uso de la Biodiversidad [CONABIO], 2011). The microbiome of these species has only just started to be explored, but it is likely very large. This large diversity of species, cultivars, microbes, and genes is not static: it is still evolving in a complex context of environmental conditions that range from sea level to cold highlands, and under the continuous domestication of 68 indigenous groups (Comisión Nacional para el Desarrollo de los Pueblos Indígenas [CDI], 2014) and campesino farmers.

There is a considerable body of research on key Mexican domesticated species like maize (e.g., Arteaga et al., 2016; Romero Navarro et al., 2017), but limited work for most of the rest (Bellon et al., 2009; Piñero et al., 2009; **Figure 1**). Similarly, the distribution and domestication history of maize landraces have been widely analyzed (Kato et al., 2009; Comisión Nacional para el Conocimiento y Uso de la Biodiversidad [CONABIO], 2011), but this information remains unknown or has not been systematized for the rest of the species. This initiative aims to study, conserve, evaluate, safeguard, analyze, and sustainably use this genetic diversity.
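As an illustration of how the genetic diversity discussed in this section is often quantified, the following minimal sketch computes expected heterozygosity (Nei's gene diversity, H = 1 - Σ p_i²) at a single locus. The measure is a common choice in population genetics but is not one prescribed by the initiative, and the allele samples and the function name `expected_heterozygosity` are hypothetical, invented only for this example.

```python
from collections import Counter
from typing import Iterable


def expected_heterozygosity(alleles: Iterable[str]) -> float:
    """Nei's gene diversity at one locus: H = 1 - sum(p_i ** 2)."""
    counts = Counter(alleles)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no alleles supplied")
    # p_i is the frequency of allele i in the sample.
    return 1.0 - sum((n / total) ** 2 for n in counts.values())


# Hypothetical allele samples, invented purely for illustration;
# they are not data from the article or from any real population.
landrace_sample = ["A", "A", "B", "B", "C", "C", "D", "D"]
elite_line_sample = ["A", "A", "A", "A", "A", "A", "A", "B"]

h_landrace = expected_heterozygosity(landrace_sample)
h_elite = expected_heterozygosity(elite_line_sample)

print(f"landrace sample:   H = {h_landrace:.3f}")
print(f"elite-line sample: H = {h_elite:.3f}")
```

In this toy case the sample with four equally frequent alleles scores higher than the sample dominated by one allele, which is the kind of contrast that diversity monitoring would aim to track across landraces, improved lines, and wild relatives.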

# Functional Genomics

From studies in functional genomics we know that the phenotypes obtained during the early stages of domestication are governed by relatively few genomic loci, which, in general, are different from the loci involved in later phenotypic diversification, and from the loci subjected to natural selection (Meyer and Purugganan, 2013). Research on these topics has mostly been conducted abroad, but Mexican research institutions devoted to these areas were recently created. In developing this research, it is important to consider native varieties and molecular markers representative of this variation (Ganal et al., 2011; Caldu-Primo et al., 2017).

This component aims to have well-annotated genomes and diversity panels for the main agriculture and forestry species of importance to Mexico, along with their wild relatives. This should allow a better understanding of the molecular basis of domestication and the application of this information to breeding, conservation, and monitoring.

# Production

In Mexico, most agricultural and forestry land belongs to campesino or indigenous communities. Their agriculture tends to be performed in blocks <5 ha, which characterizes them as smallholders. Although they are often perceived as 'unproductive,' the campesinos' aggregated production is the backbone of Mexico's food security (Bellon et al., under review). However, since the 1980s, the programs directed at smallholder agriculture were drastically reduced, or redirected to provide farmers with Green Revolution packages (Turrent-Fernández and Cortés-Flores, 2005b). Remarkably, many smallholders continue to use traditional varieties (Eakin et al., 2014), because these are adequate and competitive when grown under their local conditions (Muñoz et al., 1976), and because smallholders not only focus on yield, but also look to fulfill cultural preferences (Brush and Perales, 2007; Bellon and Hellin, 2011).

Importantly, smallholder farmers tend to be able to obtain some degree of usable yield in underperforming environments without aid (Bellon et al., under review). This has two main implications. First, the productivity gap of many of these farmers could be closed with minor agronomic improvements and breeding, with important consequences. Second, millions of farmers, spread across a wide variety of environments and cultural preferences, represent the ideal scenario for evolution under domestication to continue to occur at the scale and range needed to effectively maintain and generate new genetic diversity.

This component aims to support agricultural and forestry production so that it can be increased in a sustainable and sovereign way, while also fulfilling Mexican needs regarding desired cultivars, quantity, quality, and local cultural preferences. For this, it is crucial to protect smallholders' production and target it with adequate programs that consider management, markets, and education.

**FIGURE 1** | (caption truncated) ... knowledge for each species. Differences in line color are only for visual purposes.

# Plant Breeding

Formal plant breeding was introduced into Mexico as a replication of the United States model. However, in Mexico most production does not use seeds generated by 'formal' breeding (Turrent-Fernández and Cortés-Flores, 2005a,b; Donnet et al., 2012). This could be interpreted as a failure of formal breeding, but also as the success of smallholders' practices of domestication.

Commercial lines have not been widely adopted for several reasons. One is that the formal breeding lines tend not to be useful to the smallholders' production systems and environments. This is a consequence of the way breeding is performed and targeted. The environmental conditions characteristic of the areas where the vast majority of smallholders are located are poorly represented by the Mexican public research stations where improved lines have been developed (Bellon et al., 2005). Also, the selection goals of the smallholders can vary depending on the particular characteristics of the local environment and the producer's interests, which do not necessarily focus on maximizing yield. For these reasons, in the areas and conditions where local varieties are traditionally grown, they tend to perform better than the improved lines in terms of yield, nutritional value, forage quality, locally appreciated taste, or precocity (Muñoz et al., 1976; Perales et al., 1998; Sociedad Mexicana de Fitogenética [SOMEFI], 2007, 2009).

Given the range of conditions under which smallholders conduct agriculture and the limited success of formal breeding over the last 70 years in Mexico, it is unlikely that commercial-type breeding would become a successful strategy. Instead, Mexico should recognize, and incorporate into breeding, the diversity of traditional and local knowledge on shaping and adapting cultivars. To that end, public policies should be shifted to better support informal breeding, and to focus formal breeding on the smallholders' needs and adaptation to local conditions. In particular, landraces should be incorporated as the base material of breeding programs, instead of only as donors for elite materials (Comisión Nacional para el Conocimiento y Uso de la Biodiversidad [CONABIO], 2017).

The aim for this component is to have breeding programs for a range of native species, using alternative tools to accelerate and improve the breeding process, where the objective would be breeding for the smallholders' needs under present and future, social and environmental conditions. To accomplish this, alternative tools need to be explored, including documenting and sharing campesino-to-campesino experiences, participatory breeding, genomic selection and evolutionary breeding.

# Environmental Change

Mexico's agriculture and forestry production are facing environmental change in the form of soil degradation, pollution, invasive species and climate change. It is difficult to generalize how environmental change would affect a particular crop or wild species; for instance, the effect of climate change on maize depends on the plant's genotype, local environment, and management (Mercer and Perales, 2010). Nonetheless, we know that environmental change has important economic impacts. For example, ∼10% of Mexican agricultural land is eroded, which translates into ∼50% of PROCAMPO aid costs (Sánchez-Colón et al., 2009; Ávalos et al., 2011). Given the large diversity of environments where Mexican crop and wild species occur, it is likely that useful variation to cope with new sources of stress already exists. What is needed is to make that diversity available to producers by enhancing seed-exchange networks, breeding and access to seeds, and environmental information at local and national scales. Therefore, this component aims to accelerate the adaptation of cultivated plants to environmental change and look for effective mechanisms to conserve the capacity of wild populations to adapt.

# ROADMAP

Research on the previous components could target and connect both the theoretical understanding of the evolutionary processes and the practical aspects of applying this knowledge to conservation and production. However, for this to happen at a national scale, it is necessary to systematize data and make them available to both academia and the wider public, and to influence public policy. CONABIO is a Mexican inter-ministerial commission in charge of this type of activity regarding biodiversity. We therefore envision that CONABIO would help to develop the initiative in the following early stages:


Examples of the first steps are already ongoing. For instance, in the case of wild relatives, a systematic conservation planning analysis is being performed incorporating both the distribution of genetic diversity and social variables<sup>1</sup>. For the cultivated forms, maize is the species with the most data available on the distribution of native races and their genetic characterization. What is needed next is to integrate its management practices, uses, and environmental and microbiome data, and then make this information available and accessible to farmers, breeders and a wider audience. The target of this should be to strengthen seed exchange networks, resources for participatory breeding and campesino-to-campesino experience sharing. Molecular tools would help to accelerate breeding, especially at the stage of cross design and genetic diversity monitoring. We estimate that with minimal breeding support and agronomic improvements, it would be possible to increase the average yield of 4 million ha from 1.3 to 2.3 ton/ha. This would be enough to cover the maize needs of ca. 88.5 million people (Comisión Nacional para el Conocimiento y Uso de la Biodiversidad [CONABIO], 2017) without implementing intensive agriculture systems.
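The yield estimate above can be checked with simple arithmetic. The sketch below uses only the figures quoted in the text; the per-person value rests on the assumption (not stated in the source) that the 88.5 million figure refers to the whole projected harvest rather than only the added production, and the variable and function names are ours.

```python
# Figures quoted in the text (CONABIO, 2017).
AREA_HA = 4_000_000          # hectares considered
YIELD_NOW_T_HA = 1.3         # current average yield, t/ha
YIELD_TARGET_T_HA = 2.3      # projected average yield, t/ha
PEOPLE_COVERED = 88_500_000  # people whose maize needs would be covered


def production_tonnes(area_ha: float, yield_t_ha: float) -> float:
    """Total production in tonnes for a given area and average yield."""
    return area_ha * yield_t_ha


current = production_tonnes(AREA_HA, YIELD_NOW_T_HA)
projected = production_tonnes(AREA_HA, YIELD_TARGET_T_HA)
gain = projected - current

# Assumption (ours): the whole projected harvest is shared among the
# 88.5 million people; 1 tonne = 1000 kg.
kg_per_person = projected * 1000 / PEOPLE_COVERED

print(f"current production:   {current / 1e6:.1f} million t")
print(f"projected production: {projected / 1e6:.1f} million t")
print(f"gain:                 {gain / 1e6:.1f} million t")
print(f"implied supply:       {kg_per_person:.0f} kg per person per year")
```

Under these figures, production on the 4 million ha would rise from 5.2 to 9.2 million tonnes, a gain of 4.0 million tonnes. If the whole projected harvest is counted, that corresponds to roughly 104 kg of maize per person per year for 88.5 million people.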

# AUTHOR CONTRIBUTIONS

AM-Y, MCR, and CB performed literature reviews and contributed to the discussion. AM-Y, DP, FAG, and JS wrote the manuscript. All authors conceived and designed the manuscript.

# FUNDING

This work was supported by Secretaría de Medio Ambiente y Recursos Naturales (SEMARNAT) through the grant "Contribución de la Biodiversidad para el Cambio Climático" to CONABIO and by Consejo Nacional de Ciencia y Tecnología through the grant 247730 to DP.

# ACKNOWLEDGMENTS

This document represents a summary of CONABIO's ideas and aims, which we are constructing in collaboration with external researchers, organizations, and areas within CONABIO. We are thankful to all these colleagues for helping to conceive the initiative, to Mauricio Bellon and Nancy Arizpe for improving it regarding social aspects, to Patricia Koleff, Elleli Huerta, Raúl Jiménez, Rafael Obregón, and Jorge Larson for their comments to this document, to María Andrea Orjuela, Oswaldo Oliveros, and Alejandro Ponce for the technical contributions, and to Karl Philips for editing the English of the manuscript.

<sup>1</sup>http://www.psmesoamerica.org/en/


# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2018.00209/full#supplementary-material



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling Editor declared a shared affiliation, though no other collaboration, with the authors.

Copyright © 2018 Mastretta-Yanes, Acevedo Gasman, Burgeff, Cano Ramírez, Piñero and Sarukhán. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Mating System of the Wild-to-Domesticated Complex of Gossypium hirsutum L. Is Mixed

Rebeca Velázquez-López<sup>1</sup> \* † , Ana Wegier<sup>1</sup> \* † , Valeria Alavez<sup>1</sup> , Javier Pérez-López<sup>1</sup> , Valeria Vázquez-Barrios<sup>1</sup> , Denise Arroyo-Lambaer<sup>1</sup> , Alejandro Ponce-Mendoza<sup>2</sup> and William E. Kunin<sup>3</sup>

<sup>1</sup> Laboratorio de Genética de la Conservación, Jardín Botánico, Instituto de Biología, Universidad Nacional Autónoma de México, Mexico City, Mexico, <sup>2</sup> Comisión Nacional para el Conocimiento y Uso de la Biodiversidad, Mexico City, Mexico, <sup>3</sup> Department of Ecology and Evolution, Faculty of Biological Sciences, University of Leeds, Leeds, United Kingdom

#### Edited by:

Charles Roland Clement, National Institute of Amazonian Research, Brazil

#### Reviewed by:

Sevan Suni, Harvard University, United States

Alexandre Magno Sebbenn, Instituto Florestal, Brazil

Fernanda Amato Gaiotto, Universidade Estadual de Santa Cruz, Brazil

#### \*Correspondence:

Rebeca Velázquez-López rebecavelazquezl@gmail.com

Ana Wegier awegier@ib.unam.mx; awegier@gmail.com

†These authors have contributed equally to this work.

#### Specialty section:

This article was submitted to Evolutionary and Population Genetics, a section of the journal Frontiers in Plant Science

Received: 18 August 2017 Accepted: 13 April 2018 Published: 09 May 2018

#### Citation:

Velázquez-López R, Wegier A, Alavez V, Pérez-López J, Vázquez-Barrios V, Arroyo-Lambaer D, Ponce-Mendoza A and Kunin WE (2018) The Mating System of the Wild-to-Domesticated Complex of Gossypium hirsutum L. Is Mixed. Front. Plant Sci. 9:574. doi: 10.3389/fpls.2018.00574

The domestication syndrome of many plants includes changes in their mating systems. The evolution of the latter is shaped by ecological and genetic factors that are particular to an area. Thus, the reproductive biology of wild relatives must be studied in their natural distribution to understand the mating system of a crop species as a whole. Gossypium hirsutum (upland cotton) includes both domesticated varieties and wild populations of the same species. Most studies on mating systems describe cultivated cotton as self-pollinated, while studies on pollen dispersal report outcrossing; however, the mating system of upland cotton has not been described as mixed and little is known about its wild relatives. In this study we selected two wild metapopulations for comparison with domesticated plants and one metapopulation with evidence of recent gene flow between wild relatives and the crop to evaluate the mating system of cotton's wild-to-domesticated complex. Using classic reproductive biology methods, our data demonstrate that upland cotton presents a mixed mating system throughout the complex. Given cotton's capacity for outcrossing, differences caused by the domestication process in cultivated individuals can have consequences for its wild relatives. This characterization of the diversity of the wild relatives in their natural distribution, as well as their interactions with the crop, will be useful to design and implement adequate strategies for conservation and biosecurity.

Keywords: cotton, crop wild relatives, mating system, reproductive success, xenogamy, autogamy, domestication process, introgression

# INTRODUCTION

Plant domestication is a complex and continuing process (Casas et al., 2007; Vaughan et al., 2007). For 10,000 years, humans have selected attributes of interest in a range of economically valuable plants through their management and utilization (Gepts, 2004); consequently, different techniques, trait preferences, environments, and selection intensities have shaped the degree of domestication of each species (Meyer and Purugganan, 2013). Today we can find: (1) crop populations that are highly domesticated and depend on human intervention for survival; (2) semi-domesticated populations with recognizable traits of the domestication syndrome, but able to survive in the wild if human intervention ceases; (3) incipiently domesticated populations whose selected traits have not yet diverged markedly from those found in wild populations; (4) incidentally co-evolved populations that adapt to human disturbed environments, but without direct human selection; (5) feral populations derived from 2, 3, or 4; and (6) wild relatives (Clement, 1999). Given these diverse scenarios, the biological diversity contained in wild-to-domesticated complexes should be considered in studies about crop ecology and evolution (Warwick and Stewart, 2005; Casas et al., 2007).

In plants, one of the key life history traits is the mating system (Vaughan et al., 2007). This feature helps determine the genetic composition of populations and, therefore, has a crucial role in the evolution of species (Charlesworth, 2006); additionally, it explains who is mating with whom, which is a fundamental issue for conservation biology (Barrett and Harder, 1996). The mating system often changes during domestication (Meyer et al., 2012), and wild relatives contain the plesiomorphic state of this trait (Ellstrand et al., 1999; Doebley et al., 2006; Andersson and de Vicente, 2010). A shift from the ancestral system toward a new one can be selected until fixation; for instance, there are some crops that are unable to reproduce without human intervention (Ellstrand et al., 1999), such as vegetatively propagated sycamore fig and other fruit trees (Zohary and Spiegel-Roy, 1975). Some crops have multiple mating systems, such as papaya (Carica papaya), in which the domesticated 'Maradol' is hermaphroditic, while native varieties and wild papayas are dioecious (Carvalho and Renner, 2012). Importantly, the characterization of the mating system of many plant species has been biased toward the domesticated counterparts, because only a sub-sample of the wild-to-domesticated complex was used [e.g., Carica papaya (Damasceno et al., 2009), Persea americana (Ish-Am et al., 1999), Piper nigrum (Thangaselvabal et al., 2008)]. This bias may have profound consequences for the conservation of conspecific wild relatives, especially because conclusions drawn from studies with domesticated varieties are extrapolated to the whole species, failing to consider the genotypic and phenotypic diversity that wild relatives possess.
The conservation of this diversity is fundamental, because it is a genetic reservoir that includes a wider range of adaptive traits that may be of additional agricultural relevance, such as resistance to pests and pathogens and tolerance to abiotic stresses (Warschefsky et al., 2014).

Upland cotton, Gossypium hirsutum, is an economically important plant species, particularly known for being the leading source of natural fiber. Worldwide, over 90% of cotton production comes from cultivars of G. hirsutum and in 2014 the species ranked eighth in the world's harvested area, reaching almost 35 million hectares (Crop production, FAOSTAT, 2017). Given the economic importance of the species, its mating system has been the focus of several studies since 1903 (Simpson, 1954); however, the majority of them concentrated on domesticated cotton and have described it as predominantly autogamous and self-pollinated (see Supplementary Material 1). On the other hand, studies on pollen dispersal of G. hirsutum, from the beginning of its modern breeding as a crop to the present day, refer to cotton's ability to produce offspring by crossing (1944–2016; see Supplementary Material 1 for a review); however, the mating system is not described as mixed (Loden and Richmond, 1951; Richmond, 1951; Simpson, 1954; Imam and Allard, 1965; Meredith and Bridge, 1973). A specific study on the mating system of wild populations in their natural distribution is lacking.

In Mesoamerica, G. hirsutum exists as a complex of wild to domesticated forms (Brubaker and Wendel, 1994); hence, it is an ideal region to characterize the mating system of this upland cotton complex, identify possible differences, and integrate this information into regional management plans. In Mexico, its center of origin, diversity, and domestication (Ulloa et al., 2005; Burgeff et al., 2014; Pérez-Mendoza et al., 2016), the complex includes cultivated and highly improved varieties, genetically modified varieties, traditionally managed landraces, feral, and wild populations. All of them belong to the primary gene pool of the species (Andersson and de Vicente, 2010) and gene flow among them occurs, even over long distances (Wegier et al., 2011). Moreover, eight wild G. hirsutum metapopulations have been recognized, based on geographic, ecologic, and genetic differences (Wegier et al., 2011; Bauer-Panskus et al., 2013). Wegier et al. (2011) demonstrated that recent gene flow, followed by introgressive hybridization, occurs between a number of wild populations distributed in the north and south of Mexico, and commercial cotton cultivars in the northern states of the country. Our study provides the first data on the mating system of wild G. hirsutum in situ within the natural distribution of the species in Mexico. To this end, we evaluated the capacity of domesticated cotton, wild cotton, and wild cotton with evidence of introgression to produce offspring by either xenogamy (cross-pollination between different genets) or autogamy (self-fertilization).

# MATERIALS AND METHODS

# Study System

Upland cotton, G. hirsutum L., is a species with wild, feral, and semi-domesticated populations (Brubaker and Wendel, 1994). All cultivated forms, including the highly improved varieties or genetically engineered varieties, cannot be considered as fully domesticated, because they are able to survive even if human intervention stops.

Wild G. hirsutum flowers all year round. Flowers are white, hermaphrodite, cup shaped, with a single central style surrounded at the bottom by stamens (Meade, 1918; Smith and Cothren, 1999). Some plants exhibit flowers with a colored disk inside of the base of the cup that ranges from deep red to light yellow (Tan et al., 2013). Flowers remain open between 8 and 11 h; at the start of the day, they are all white and when they close the sepals start turning pink at the base (Smith and Cothren, 1999). Anthesis takes place in the morning, as soon as the flower completely opens, and the stamens start to release pollen soon afterward (Smith and Cothren, 1999). Flowers produce both pollen and nectar as a reward for visitors (Wäckers and Bonifay, 2004).

# Study Sites

Sampling was performed in coastal dunes and dry forests of Mexico, in three of the eight wild cotton metapopulations defined genetically, geographically, and ecologically by Wegier et al. (2011), namely: Central Pacific Metapopulation (CPM), Yucatan Peninsula Metapopulation (YPM), and South Pacific Metapopulation (SPM). SPM is of particular interest due to evidence of recent introgressive hybridization with domesticated plants (Wegier et al., 2011). Given the distinctive extinction-colonization dynamic observed in cotton metapopulations (Wegier, 2013), the full extent of CPM, YPM, and SPM was surveyed to find G. hirsutum patches with enough flowers (**Figure 1**). Hand-pollination treatments (Tate and Simpson, 2004; Machado and Sazima, 2008; Hernández-Montero and Sosa, 2016) were carried out during the dry season, between November 2012 and May 2013: CPM in November 2012, YPM in December 2012, and SPM in February 2013. Sites were revisited for fruit collection after 86 days, on average. In addition, domesticated cotton plants, bought in local markets, were kept under greenhouse conditions in Mexico City to maintain a suitable temperature (**Figure 1**).

# Mating System

In order to execute the hand-pollination treatments to test for different mating systems, a search was conducted for flower buds before anthesis (Tate and Simpson, 2004; Machado and Sazima, 2008; Hernández-Montero and Sosa, 2016). In each metapopulation, 40 replicates of the five pollination treatments were set up (**Table 1**), anticipating the risk of collecting too few fruits afterward: assisted self-pollination, automatic self-pollination, assisted cross-pollination (Kearns and Inouye, 1993), emasculated control (to avoid automatic self-pollination), and control (open-pollination). Multiple treatments were placed on the same plant where possible to control for individual variation; however, due to the variability of the number of flowers, not all of the plants held the same type or number of treatments. Moreover, when flowers were scarce, treatments were placed daily within each study site, in up to four patches per metapopulation, until the 40 replicates per treatment were completed. Special care was taken to avoid changing or altering the environment (i.e., without introducing new genotypes or changing plant abundances or distributions). The same experimental design was applied for domesticated plants in a greenhouse that allowed the entry of local insects. Some treatments required mesh bags to exclude any pollinator access that could alter the results (**Table 1**). The treatments that did not include bagging before anthesis were bagged after flower closure to help control for mechanical damage from bagging.

# Reproductive Success

Fruit-set was calculated as the percentage of recorded fruits produced by each treatment in each metapopulation (Dafni, 1992). In each study site, 20 flowers not involved in the pollination treatments were collected and brought back to the laboratory in separate sealed containers with 70% alcohol. Each flower was dissected to count the number of ovules present. An average number of ovules was calculated for each wild metapopulation and for domesticated plants. Afterward, seed-set (Schoper et al., 1987; Burd, 1994) was calculated as the percentage of seeds obtained from each fruit for each pollination treatment in relation to the average number of ovules of the study population to which these fruits belonged. Additionally, all seeds were weighed individually to estimate the seed weight per treatment in each study site. Later, all the seeds were germinated individually. Each seed was washed with 2% Captan (PESTANAL®, Merck) solution and covered with a damp cotton swab; tissue culture lids were used. Seeds were checked daily until all reached emergence of the radicle. While some studies on seed germination consider only a set of seeds (Schemske, 1983; Gil and López, 2015; Raphael et al., 2017; Farooq et al., 2018), we took into account all of the collected seeds for the analysis.

# Outcrossing Rate

The outcrossing rate (Te) was calculated for each study site following Barrett et al. (1996):

$$Te = 1 - S \tag{1}$$

where S is the selfing rate, estimated with the fruit-set results from our selfing ($\omega_s$) and outcrossing ($\omega_x$) treatments, i.e., automatic self-pollination and emasculated control, respectively. For CPM, $\omega_x$ was obtained with the fruit-set from the assisted cross-pollination treatment, because none of the emasculated control flowers were found when revisiting the metapopulation for fruit collection.

$$S = \frac{\omega_s}{\omega_s + \omega_x} \tag{2}$$
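As a worked illustration of Equations (1) and (2), the following minimal Python sketch computes the selfing and outcrossing rates from two fruit-set values. It is not the authors' script (their analyses were run in R); the function names and the example fruit-set percentages are hypothetical and are not taken from Table 2.

```python
def selfing_rate(w_self: float, w_cross: float) -> float:
    """Equation (2): S = w_s / (w_s + w_x), with fruit-set as input."""
    total = w_self + w_cross
    if total <= 0:
        raise ValueError("at least one treatment must have produced fruit")
    return w_self / total


def outcrossing_rate(w_self: float, w_cross: float) -> float:
    """Equation (1): Te = 1 - S."""
    return 1.0 - selfing_rate(w_self, w_cross)


# Hypothetical fruit-set percentages, for illustration only.
example_self = 30.0   # automatic self-pollination treatment
example_cross = 70.0  # emasculated control treatment

s = selfing_rate(example_self, example_cross)
te = outcrossing_rate(example_self, example_cross)
print(f"S = {s:.2f}, Te = {te:.2f}")
```

Because only the ratio of the two fruit-set values matters, the inputs can be given as percentages or as proportions, provided both use the same scale.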

# Statistical Analyses

To test if there were significant differences in seed-set and seed weight among treatments, a Generalized Linear Mixed Model (GLMM; Zuur et al., 2009) was used, considering the plant as a random factor, because the pollination treatments were not equally represented in each plant (as explained in the Mating System section). For GLMM analyses, a Quasi-Poisson distribution was considered for seed-set and a Gaussian distribution for seed weight (Cayuela, 2009). Afterward, a Tukey post hoc test was performed to evaluate the significance of the results. To compare germination frequencies and percentage of fruit-set, a chi-square test was used with the post hoc standardized residual test for each one. Outliers were identified using the method described by Viechtbauer and Cheung (2010); to summarize, a multivariate detection method (Cook's distance) was used to calculate the distance among all data points, and those that were not included in the general model were identified as "influential data points" or outlier values. Germination was calculated as the number of germinated seeds in relation to the total number of seeds (Gómez, 2004; Gil and López, 2015). All tests were carried out with the lme4, multcomp, stats, and ggplot2 packages of R version 3.4.3 (R Core Team, 2017). The scripts utilized for the analyses are available online at https://github.com/conservationgenetics/BiologiaReproductiva.git.
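The chi-square comparison with a post hoc residual test can be illustrated with a short, dependency-free Python sketch. This is an assumption-laden illustration rather than the authors' R code: the function name and the contingency table (treatments as rows, germinated and non-germinated seeds as columns) are hypothetical, and cells are flagged with the |residual| > 2.0 criterion mentioned in the table footnotes.

```python
from math import sqrt


def chi_square_with_residuals(table):
    """Return (chi2, dof, adjusted standardized residuals) for a
    contingency table given as a list of rows of observed counts.
    Assumes every row and column total is greater than zero."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    residuals = []
    for i, row in enumerate(table):
        res_row = []
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
            # Adjusted residual: (O - E) / sqrt(E * (1 - r_i/n) * (1 - c_j/n))
            variance = expected * (1 - row_totals[i] / n) * (1 - col_totals[j] / n)
            res_row.append((observed - expected) / sqrt(variance))
        residuals.append(res_row)
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof, residuals


# Hypothetical counts: three treatments x (germinated, not germinated).
counts = [[30, 10], [20, 20], [10, 30]]
chi2, dof, residuals = chi_square_with_residuals(counts)
flagged = [[abs(r) > 2.0 for r in row] for row in residuals]
print(f"chi2 = {chi2:.2f}, df = {dof}")
print(flagged)
```

The sketch returns the statistic and degrees of freedom only; obtaining a P-value would additionally require the chi-square distribution function from a statistics library.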

# RESULTS

# Fruit-Set, Seed-Set, and Seed Weight

All treatments produced fruits regardless of the metapopulation (**Table 2**). The CPM open-pollinated control showed the highest value of all treatments among all groups; YPM showed the highest fruit-set produced by outcrossing, and the lowest by automatic self-pollination and control treatment. On the other hand, the highest fruit-set was observed for all the treatments in SPM, with the exception of the open-pollination control.

FIGURE 1 | Approximate distributions of the Central Pacific (blue), Southern Pacific (green) and Yucatan Peninsula (orange) upland cotton metapopulations studied in Mexico, with locations of field study sites (leaves) and the greenhouse in Mexico City where cultivated upland cotton was grown out for this study. The Central Pacific and Yucatan Peninsula metapopulations are wild and the Southern Pacific metapopulation contains populations with evidence of recent introgression with cultivated upland cotton.

TABLE 1 | Characteristics of each of the pollination treatments applied to G. hirsutum flowers, following Dafni (1992), in three field wild metapopulations and cultivated cotton in the greenhouse.

Seeds were produced both by outcrossing and selfing treatments in all metapopulations (**Figure 2**). The average number of seeds per fruit was 15.9 in SPM, 15.4 in domesticated, 12.3 in CPM, and 10.0 in YPM, while the average number of ovules was 16.3 in SPM, 28.6 in domesticated, 15.6 in CPM, and 13.4 in YPM. Regarding seed-set, the control treatment of domesticated cotton was lower than that of wild and introgressed plants [P(χ²) = 4.13 × 10⁻⁴, df = 3] (**Figure 3**). When evaluating the results of each metapopulation individually, CPM presented seed-set differences between the control and the rest of the treatments [P(χ²) = 0.1 × 10⁻⁴, df = 3]; in SPM the differences were found between cross-pollination and all treatments, except emasculated control [P(χ²) = 0.001, df = 4]; while in YPM and the domesticated there were no significant differences among treatments (**Figure 2**). On the other hand, seed weight only presented differences in SPM, between cross-pollination and both assisted and automatic self-pollination [P(χ²) = 0.026, df = 4] (**Figure 2**). In YPM, CPM, and the domesticated cotton, no significant differences among treatments were observed (**Figure 2**).

<sup>∗</sup>Categories significantly different using adjusted standardized residuals greater than 2.0 or less than −2.0 (χ² test). ND: Fruit-set could not be determined because none of the treated flowers were found when revisiting the metapopulation for fruit collection. NA: the chi-square test could not be determined because one of the treatments was not collected.
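To make the relation between seeds per fruit and ovule number concrete, the sketch below applies the seed-set definition from the Methods (seeds per fruit relative to the average ovule number of the population) to the group averages reported in this section. It is only an approximation for illustration: the reported averages pool all treatments, whereas the study computed seed-set per fruit and per treatment. The function and variable names are hypothetical.

```python
def seed_set_percent(seeds_per_fruit: float, mean_ovules: float) -> float:
    """Seed-set as the percentage of ovules that became seeds."""
    if mean_ovules <= 0:
        raise ValueError("mean ovule number must be positive")
    return 100.0 * seeds_per_fruit / mean_ovules


# (average seeds per fruit, average ovules per flower) as reported in the text.
reported_means = {
    "SPM": (15.9, 16.3),
    "domesticated": (15.4, 28.6),
    "CPM": (12.3, 15.6),
    "YPM": (10.0, 13.4),
}

approx_seed_set = {
    group: seed_set_percent(seeds, ovules)
    for group, (seeds, ovules) in reported_means.items()
}

for group, value in approx_seed_set.items():
    print(f"{group}: {value:.1f}%")
```

Under these pooled averages, domesticated fruits convert roughly half of their ovules into seeds, while the wild and introgressed groups convert about three quarters or more.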

# Germination

Less than 20% of the seeds from wild metapopulations CPM and YPM germinated. Regarding the domesticated group, 40–63% of the seeds germinated, except for the seeds produced by cross-pollination that only reached 28%. The seeds from the five treatments assessed at SPM showed germination percentages above 83%, with cross-pollination reaching the highest value of 96% (**Table 3**).

With regard to the germination rate of the seeds that germinated (**Figure 4**), the slope of the curve suggests that wild upland cotton presents some kind of inhibition to the completion of germination, whereas domesticated populations do not display this behavior. As shown in **Figure 4A**, domesticated seeds germinated faster, within the first 6 days, whereas seeds presenting evidence of introgressive hybridization (SPM) reached 95% of germination within the first 7 days and continued germinating for 48 days. Unlike domesticated and SPM seeds, the seeds of wild plants germinated over the course of 73 days (**Figure 4A**). Concerning the pollination treatments from all study sites, 50% of the seeds of all treatments germinated within the first 5 days; however, after the 5th day the difference in germination rate is evident between autogamy and the rest of the treatments (**Figure 4B**).
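Statements such as "50% of the seeds germinated within the first 5 days" can be derived from daily germination records. The following minimal Python sketch shows one way to do this, assuming a list of newly germinated seeds per day; the daily counts and function names are hypothetical and do not reproduce the data behind Figure 4.

```python
from itertools import accumulate


def cumulative_fraction(daily_counts):
    """Cumulative fraction of all germinated seeds, day by day (day 1 first)."""
    total = sum(daily_counts)
    if total == 0:
        raise ValueError("no seeds germinated")
    return [running / total for running in accumulate(daily_counts)]


def days_to_reach(daily_counts, fraction):
    """First day on which the cumulative fraction reaches `fraction`,
    or None if it is never reached."""
    for day, reached in enumerate(cumulative_fraction(daily_counts), start=1):
        if reached >= fraction:
            return day
    return None


# Hypothetical record: seeds newly germinated on days 1 to 6.
daily = [2, 3, 5, 6, 3, 1]
print(cumulative_fraction(daily))
print("days to 50%:", days_to_reach(daily, 0.5))
print("days to 90%:", days_to_reach(daily, 0.9))
```

Here the fractions are relative to the seeds that germinated, matching the way this paragraph describes the germination rate of "the seeds that germinated".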

# Outcrossing Rate

All study groups presented outcrossing rates different from 0 and 1 (i.e., 1 > Te > 0), which is indicative of a mixed mating system. Wild metapopulations (YPM and CPM) recorded a higher outcrossing rate (0.72 and 0.71, respectively) than SPM (0.40) and domesticated (0.65) (Supplementary Material 1).

# DISCUSSION

# Mating System of Upland Cotton's Wild-Domesticated Complex

Richards (1997) defined autogamy as within-flower or self-pollination, and allogamy as the pollination between pollen and ovules of different flowers; he further divided allogamy into geitonogamy (i.e., pollination between different flowers on the same genet) and xenogamy (i.e., pollination between pollen and ovules of different genets). Our results show that wild and domesticated cotton produce offspring in all pollination treatments (**Figures 2**, **4B** and **Tables 2**, **3**); thus, the analyzed plants have the capacity to produce progeny by both autogamy and xenogamy. To rule out geitonogamy, it is necessary to perform a molecular genetic analysis of paternity. However, since autogamy is common in our system, there is no need to rule out this type of allogamy. Furthermore, previous studies (see Supplementary Material 1), together with our own, indicate that the G. hirsutum wild-domesticated complex has a mixed mating system. This result is particularly relevant in upland cotton's center of origin, because of its significance for strategies for long-term conservation of genetic diversity in the event of gene flow between wild and domesticated relatives (Ellstrand, 1992).

Barrett and Eckert (1990) and Barrett et al. (1996) described the outcrossing rate (Te), for which a value of 0.5 indicates that the mating system is equally balanced between self- and cross-pollination. Any value different from 0 (completely self-pollinated) or 1 (completely cross-pollinated) implies a mixed mating system; when Te > 0.5, the system is predominantly allogamous-xenogamous, whereas when Te < 0.5, the system is predominantly autogamous. Our observed rates vary from Te > 0.5, e.g., 0.71 (CPM), 0.72 (YPM) and 0.65 (domesticated), to Te < 0.5, e.g., 0.40 (SPM). Domesticated plants, and wild CPM and YPM, have a greater contribution of seeds from cross-pollination in the next generation, although the contribution of self-pollination is high and important, and it contributes to the maintenance of genetic structure. The high contribution of self-pollinated seeds in SPM is striking, far from being similar or intermediate between wild and domesticated; local factors may be affecting the result and should be addressed in a future study.
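The interpretation thresholds described above can be written as a small decision rule. The Python sketch below applies it to the outcrossing rates reported in this study; the function name and the category labels are illustrative wording of ours, not terminology fixed by Barrett et al. (1996).

```python
def classify_mating_system(te: float) -> str:
    """Interpret an outcrossing rate Te using the thresholds in the text."""
    if not 0.0 <= te <= 1.0:
        raise ValueError("Te must lie between 0 and 1")
    if te == 0.0:
        return "completely self-pollinated"
    if te == 1.0:
        return "completely cross-pollinated"
    if te > 0.5:
        return "mixed, predominantly xenogamous"
    if te < 0.5:
        return "mixed, predominantly autogamous"
    return "mixed, balanced"


# Outcrossing rates reported for each study group.
reported_te = {"CPM": 0.71, "YPM": 0.72, "domesticated": 0.65, "SPM": 0.40}

classified = {group: classify_mating_system(te) for group, te in reported_te.items()}
for group, label in classified.items():
    print(f"{group}: Te = {reported_te[group]:.2f} -> {label}")
```

All four groups fall in a mixed category; only SPM lies on the predominantly autogamous side of the 0.5 threshold.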

To further explore the mixed mating system of the species, we compared the germination rate of seeds produced by different pollination treatments. We found that within the first 5 days the seeds for all treatments reach 50% of germination. After the 10th day, a notable difference in germination rate (<15%) between autogamy and the other treatments is observed. Such discrepancy is due to the difference in number of seeds produced in each treatment (**Figure 4B**). As suggested theoretically, when germination does not differ among treatments, self-pollination is not the cause of inbreeding depression (Charlesworth and Willis, 2009). The mating system described in our study coincides with Baker's law of reproductive assurance (Pannell and Spencer, 1998), where species that migrate long distances colonize or recolonize patches initially by self-fertilization; then, because of their perennial nature, generations overlap in the same area and plants are pollinated by close relatives or by themselves in the absence of pollinators (Kalisz et al., 2004). The information described here agrees with the ecological and genetic evidence that describes the metapopulation dynamics of G. hirsutum, along with the ability to migrate long distances, historically and currently (Wegier et al., 2011).

In addition, Wegier et al. (2011) reported high values of gene flow among metapopulations in the same study area, which could homogenize genetic variation, but their data exhibit population structure (k = 8) and high FST. Self-pollination and cross-pollination seem to maintain the genetic diversity of the species in the wild, although crossings with domesticated members of the complex (Wendel et al., 1992; Wegier et al., 2011) or even domesticated plants of Gossypium barbadense (Brubaker et al., 1993; Brubaker and Wendel, 1994; Ellstrand et al., 1999; Ellstrand, 2014) might be contributing to these results. In addition, gene flow with feral cotton can also take place (Rache Cardenal et al., 2013; de Menezes et al., 2015).

Finally, one of the fitness components measured in plants is seed weight (Primack and Kang, 1989), due to the fact that larger seeds perform better because of the higher amount of resources they possess (Armstrong and Westoby, 1993; Westoby et al., 1996). In our research, seed weight showed no significant difference between treatments within metapopulations (**Figure 2**).

# Differences of Reproductive Traits Within G. hirsutum's Wild to Domesticated Complex

Our analyses show differences in characters linked to some of the reproductive structures of upland cotton, which can be associated with the domestication syndrome and will be discussed below.

### Ovule Number

There are significant differences in ovule number [P(χ²) = 0.001, df = 2], which was initially estimated to obtain the seed-set in each population (Supplementary Material 2). Wild plants produce on average 14.5 ovules per flower, while cultivated plants produce twice as many. Several authors have described a change in ovule number as a consequence of evolutionary processes. For instance, Pasquet (1998) found that ovule number supports the physiological division of cultivated cowpeas [Vigna unguiculata (L.) Walp.] into two different groups: cultivars able to flower early under inductive conditions, with ovule number lower than 17 (Biflora and Melanophthalmus) and cultivars not able to do so, with ovule number higher than 17 (Unguiculata and Sesquipedalis). Moreover, Andargie et al. (2014) reported a pair of quantitative trait loci (QTLs; qon1 and qon3) that regulate ovule number in cowpea; the alleles from the wild parent increase this trait as opposed to the cultivated, which reveals a feature of cowpea's domestication syndrome. In the case of climbing common bean (Phaseolus vulgaris L.), among the changes that occurred during the domestication process is the modification of the number of ovules, which changed from 5–8 to 2–9 ovules (Gepts and Debouck, 1991).

### Seed-Set

As shown in **Figure 3**, there are significant differences in the seed-set of wild and domesticated populations. From a much larger number of ovules, domesticated plants (open pollination controls) produce, proportionally, a lower quantity of seeds, which implies that they are not efficiently using the resources invested in ovule production (Cilas et al., 2010). Variation in seed number per boll is produced by the interplay of the plant genetics and the environment, which in turn generates either the lack of seed fertilization or completion of embryo growth post-fertilization (Davidonis et al., 1996); therefore, our results are influenced by the experimental design and, in the future, a common garden experiment will provide insight into the effect of the environment. In comparison, wild plants are more efficient, producing seeds from nearly all of their ovules, although the net number of seeds is smaller than that produced by domesticated fruits. Many features associated with domestication are not advantageous in terms of reproduction and survival of following generations lacking human intervention (Gepts, 2004), because the selective pressures by which they have evolved are determined by humans (see categories 1–4 of the classification proposed by Clement, 1999). As a result, gene flow between wild relatives and cultivated plants could have negative consequences (Andersson and de Vicente, 2010); however, it could also give rise to in situ reservoirs of domesticated genes for the future (Ellstrand, 2018). Each domesticated cotton plant develops 50% more descendant plants than the wild plants do within their natural distribution, so the ecological-evolutionary consequences of this result will depend on the evolutionary process and the agro-ecological or ecosystem context in which plants are developed.

TABLE 3 | Germination of seeds obtained from each upland cotton metapopulation and treatment.

<sup>+</sup>All the repetitions of the emasculated control were placed, but none of them were found when revisiting the metapopulation for fruit collection. <sup>∗</sup>Categories significantly different using adjusted standardized residuals greater than 2.0 or less than −2.0 (χ² test).

### Germination

One of the traits selected for during domestication is rapid germination (Frary and Doganlar, 2003), as this helps crops to start to grow at the same time and contributes to synchronous fruiting. Over time, this trait contributes to harvesting efforts and, therefore, unconsciously selects for loss of dormancy. In natural habitats, conditions are less predictable, and dormancy will contribute to different seeds germinating in different environmental conditions (Long et al., 2015). Our results on seeds that reached germination agree with what has been described for other domesticated plants that have undergone similar evolutionary processes (Fuller and Allaby, 2009; Abbo et al., 2014; Hernández et al., 2017): domesticated seeds germinate faster and practically simultaneously, whereas their wild relatives display dormancy (**Figure 4A**).

### Distinctive Traits of SPM

With respect to SPM (selected for study because of evidence of recent introgression with domesticated plants; Wegier et al., 2011), the reproductive system is mixed, as it is in wild populations without introgression and in domesticated populations. However, some of the traits that determine reproductive success are unique to this population: the variability in seed-set values is markedly different (**Figures 2**, **3**); its fruits produce more seeds than the other populations (similar to domesticated fruits, but from half the number of ovules, which makes them very efficient) (Supplementary Material 2); and these seeds have a higher percentage of germination than the other populations (**Table 3**). These characteristics can have demographic consequences in the short term, unless there are other factors that regulate this growth. On the other hand, contrary to what was expected for SPM, the germination of its seeds resembles that of domesticated seeds more than that of wild ones. The behavior is also dissimilar for introgressed seeds, which took longer to complete germination than domesticated and wild seeds (**Figure 4A**). This last phase displays a very slow response in SPM, probably associated with the loss of physiological responses, resembling domesticated plants. It is important that these analyses are repeated in subsequent years, to confirm whether there is an eco-genetic trend (Price and Waser, 1979; Edmands, 2007; Ellstrand and Rieseberg, 2016), or whether it was the result of local conditions.

# Conservation and Biosecurity Implications

Many nations want to defend the rights of the next generations to enjoy and decide about biodiversity and its services, aware that decisions made today will have an impact on the natural resources available in the future (Bennett et al., 2015; Steffen et al., 2015; Morales et al., 2017). Upland cotton is a remarkably important plant for humanity, not only due to the versatile uses of its fiber, but for many other applications (Wegier et al., 2016). It follows that cotton's wild-to-domesticated complex and its environment should be a conservation priority. Mesoamerican dry forests and coastal dunes contain the ecosystems and evolutionary processes that gave rise to, shape, and maintain wild cotton diversity and its interactions. These evolutionary services (Faith et al., 2010; Bailey, 2011; Rudman et al., 2017) are essential for species conservation, because preserving this genetic diversity maintains the capacity to adapt to environmental changes (Ellstrand, 1992; Hartl, 2000). However, the factors that shape each part of the wild-to-domesticated complex are different; for example, the conservation of native traditional varieties depends to a great extent on the communities that cultivate them, their management techniques, and interests (Zhang et al., 2007). Hence, the parts of the complex that could be used for crop improvement will depend on the objectives of the new processes of domestication and breeding (Ellstrand, 2018; Mastretta-Yanes et al., 2018).

Gene flow between crops and wild relatives should be examined on a case-by-case basis (Stewart et al., 2003), especially when genetically modified organisms (GMO) are involved, because the consequences depend on the nature of the transferred genes and their regulatory mechanisms (Ellstrand, 2003). For instance, a recent study has demonstrated that genetic modifications can affect fitness traits in the long term (Hernández-Terán et al., 2017). An important issue to keep in mind is that for gene flow to occur, a crop must be within pollination distance of a compatible population (Ellstrand and Hoffman, 1990), but in the case of domesticated plants the distances can be shortened by human activities (Dyer et al., 2009; Wegier et al., 2011). Several studies have documented hybridization events between crops and their wild relatives; for instance, in the United Kingdom, one-third of the 36 species analyzed by Raybould and Gray (1993) hybridize with at least one element of the local flora; in the Netherlands, a quarter of 42 species do (de Vries et al., 1992); and all but one of the 13 crops reviewed by Ellstrand et al. (1999) hybridize naturally with their wild relatives in some part of their agricultural distribution (including G. hirsutum and other species of subgenus Karpas). These hybridization events could lead to a decline in wild genetic diversity, as opposed to native semi-domesticated varieties in traditional Mesoamerican systems where there is evidence that domesticated genomes have formed not only by selection under domestication, but also by gene flow with other closely related populations and species (Rendón-Anaya et al., 2017). For this reason, the wild-to-domesticated dynamics in terms of genetic diversity, reproductive biology, and gene flow should be well understood in the natural distribution of the species of interest, because extrapolating conclusions based on external or incomplete information about species complexes is inconsistent with the objectives of conservation and biosafety (Beebe et al., 1997; Acevedo et al., 2016).

In this study we found that the reproductive capacity of introgressed cotton is greater than that of wild and domesticated plants. This reveals a scenario that de Wet (1968), de Wet and Harlan (1975) and Keeler et al. (1996) had already described, where wild relatives of some introgressed crops can become weeds that are difficult to control. The wide genetic diversity of G. hirsutum, along with factors modified by traditional genetic improvement and modern genetic engineering, will be problematic for agroecosystems (Altieri, 2000) and ecosystem conservation if they increase cotton's weediness or invasiveness (Schafer et al., 2011). Cotton has already been reported to persist in a few tropical regions, such as the north of Australia, Vietnam, Mexico, the continental United States and Hawaii (Hawkins et al., 2005; Andersson and de Vicente, 2010; USDA, 2018), so it will be necessary to monitor these changes in wild populations given the species' great capacity for long-distance migration by natural and anthropogenic means.

Finally, local conditions can influence the results of reproductive biology studies (Ellstrand and Foster, 1983; Hucl, 1996; Murray et al., 2002); hence, it was essential to assess the mating system of G. hirsutum within its natural distribution. Some of the factors that have an effect on the results can be associated with the environment (pollen viability, nectar production, and pollinator activity due to environmental conditions; Ahrent and Caviness, 1994; Ibarra-Pérez et al., 1997; Chaves-Barrantes et al., 2014), ecological interactions (foraging rate, floral consistency, efficiency of pollen deposition, interactions with arthropod fauna, and composition of pollinator species; Rudgers, 2004; Kessler et al., 2012; Johnson et al., 2015), as well as the landscape (species abundance and surrounding species distributions; Murray et al., 2002). In this study, the results of automatic self-pollination and the emasculated control provide evidence that autogamy and allogamy occur naturally in upland cotton's natural distribution. The occurrence of the latter highlights the importance of native pollinators for the reproductive biology of G. hirsutum and, consequently, conservation strategies should take this key interaction into consideration.

# CONCLUSION

This study found that upland cotton's wild-to-domesticated complex presents a mixed mating system. This information is new for wild, domesticated, and introgressed G. hirsutum in its natural distribution, but it is in agreement with previous studies in populations of domesticated cotton (Supplementary Material 3). Consequently, G. hirsutum should be considered as having a mixed reproductive strategy throughout its whole complex, rather than being primarily autogamous. Management strategies and policies meant to conserve the diversity of cotton's wild-to-domesticated complex must take this into account.

Furthermore, physiological differences were found between cultivated cotton and its wild relatives, especially in traits such as the number of ovules per flower, number of viable seeds per fruit, and their germination behavior. Given the evidence of gene flow and introgression, these traits should be monitored systematically in wild populations and agroecosystems of interest for conservation, as well as the impact on ecological interactions, such as pollination. On the other hand, the diversity contained in the wild-to-domesticated complex must be included in long-term conservation strategies, so that future generations can have access to genetic resources with greater chances of surviving changing environments.

# AUTHOR CONTRIBUTIONS

AW, VA, and RV-L designed the research, participated in fieldwork, performed the analyses, and wrote the manuscript. AW coordinated the study. WK designed the fieldwork and revised the analyses. AP-M participated in fieldwork and performed the analyses. JP-L, VV-B, and DA-L conducted the analyses. All authors analyzed the results and wrote the manuscript.

# FUNDING

This work was financially supported by the project "Program for the conservation of wild populations of Gossypium hirsutum in Mexico", DGAP003/WN003/18 funded by the DGSPRNR (Dirección General del Sector Primario y Recursos Naturales Renovables) that belongs to SEMARNAT and CONABIO; complemented with the support of CONACYT scholarships (213557, 609346) and projects CONACYT-PN247672, UNAM PAPIIT No. IV200117.

# ACKNOWLEDGMENTS

We are grateful for the support of all the people in the communities where our work was done. We express our gratitude to the reviewers, whose comments and suggestions greatly improved our manuscript, and especially thank Charles R. Clement for his recommendations and advice. Also, we wish to thank Mariana Benítez, Alejandro Correa-Metrio, Ana Elena Escalante, Ulises Rosas, and Kristy Walker for their valuable comments on our work. In addition, we would like to acknowledge the support of the colleagues who assisted with sampling and laboratory work: Néstor Chavarría, José Luis Caldú, Erick Tovar, Atsiry López, Adriana Uscanga, Cirene Gutiérrez, Luis Barba, Adriana Calderón, Lislie Solis, Florencia García-Campusano, Haven López-Sánchez, Brian Urbano, and especially Diana Peña. Pamela Rodríguez collaborated with the historical analysis. We would also like to thank Professor Zenón Cano for his teachings that have contributed to our professional development through generations. DA-L is carrying out a postdoctorate funded by CONACYT-PN247672. JP-L and VV-B thank CONACYT for grants 609346 and 477713, respectively as well as the Maestría en Ciencias Biológicas, UNAM. Finally, we are especially grateful to Nancy Corona and the CARB-CONABIO team, who have provided support and wisdom through our entire journey.

# SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2018.00574/full#supplementary-material

# REFERENCES

of ovules per ovary, number of seeds per pod, and seed weight. Tree Genet. Genomes 6, 219–226. doi: 10.1007/s11295-009-0242-9




Zuur, A. F., Ieno, E. N., Walker, N. J., Saveliev, A. A., and Smith, G. M. (eds). (2009). "GLMM and GAMM," in Mixed Effects Models and Extensions in Ecology with R (New York, NY: Springer), 323–341.

**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Velázquez-López, Wegier, Alavez, Pérez-López, Vázquez-Barrios, Arroyo-Lambaer, Ponce-Mendoza and Kunin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.