# GENETICS AND GENOMICS OF PLANT REPRODUCTION FOR CROP BREEDING

EDITED BY: Gianni Barcaccia, Luciana Baldoni, Marta Adelina Mendes, Emidio Albertini, Fulvio Pupilli, Andrea Mazzucato, Sara Zenoni, Silvia Vieira Coimbra, Antonio Granell and Dabing Zhang

PUBLISHED IN: Frontiers in Plant Science

#### Frontiers eBook Copyright Statement

The copyright in the text of individual articles in this eBook is the property of their respective authors or their respective institutions or funders. The copyright in graphics and images within each article may be subject to copyright of other parties. In both cases this is subject to a license granted to Frontiers. The compilation of articles constituting this eBook is the property of Frontiers.

Each article within this eBook, and the eBook itself, are published under the most recent version of the Creative Commons CC-BY licence. The version current at the date of publication of this eBook is CC-BY 4.0. If the CC-BY licence is updated, the licence granted by Frontiers is automatically updated to the new version.

When exercising any right under the CC-BY licence, Frontiers must be attributed as the original publisher of the article or eBook, as applicable.

Authors have the responsibility of ensuring that any graphics or other materials which are the property of others may be included in the CC-BY licence, but this should be checked before relying on the CC-BY licence to reproduce those materials. Any copyright notices relating to those materials must be complied with.

Copyright and source acknowledgement notices may not be removed and must be displayed in any copy, derivative work or partial copy which includes the elements in question.

All copyright, and all rights therein, are protected by national and international copyright laws. The above represents a summary only. For further information please read Frontiers' Conditions for Website Use and Copyright Statement, and the applicable CC-BY licence.

ISSN 1664-8714 ISBN 978-2-88963-887-1 DOI 10.3389/978-2-88963-887-1


Topic Editors:

Gianni Barcaccia, University of Padua, Italy; Luciana Baldoni, Institute of Bioscience and Bioresources, Italy; Marta Adelina Mendes, University of Milan, Italy; Emidio Albertini, University of Perugia, Italy; Fulvio Pupilli, National Research Council (CNR), Italy; Andrea Mazzucato, University of Tuscia, Italy; Sara Zenoni, University of Verona, Italy; Silvia Vieira Coimbra, University of Porto, Portugal; Antonio Granell, Consejo Superior de Investigaciones Científicas (CSIC), Spain; Dabing Zhang, Shanghai Jiao Tong University, China

Citation: Barcaccia, G., Baldoni, L., Mendes, M. A., Albertini, E., Pupilli, F., Mazzucato, A., Zenoni, S., Coimbra, S. V., Granell, A., Zhang, D., eds. (2020). Genetics and Genomics of Plant Reproduction for Crop Breeding. Lausanne: Frontiers Media SA. doi: 10.3389/978-2-88963-887-1

# Table of Contents


Jérôme Grimplet, Sergio Ibáñez, Elisa Baroja, Javier Tello and Javier Ibáñez

*The Occurrence of Seedlessness in Higher Plants; Insights on Roles and Mechanisms of Parthenocarpy*

Maurizio E. Picarella and Andrea Mazzucato


Zhiliang Xiao, Fengqing Han, Yang Hu, Yuqian Xue, Zhiyuan Fang, Limei Yang, Yangyong Zhang, Yumei Liu, Zhansheng Li, Yong Wang, Mu Zhuang and Honghao Lv


Pankaj Kaushal, Krishna K. Dwivedi, Auji Radhakrishna, Manoj K. Srivastava, Vinay Kumar, Ajoy Kumar Roy and Devendra R. Malaviya

*Construction of the First SNP-Based Linkage Map Using Genotyping-by-Sequencing and Mapping of the Male-Sterility Gene in Leaf Chicory*

Fabio Palumbo, Peng Qi, Vitor Batista Pinto, Katrien M. Devos and Gianni Barcaccia

*Genomics of Flower Identity in Grapevine (*Vitis vinifera* L.)*

Fabio Palumbo, Alessandro Vannozzi, Gabriele Magon, Margherita Lucchin and Gianni Barcaccia


Michel Hernould, Christian Chevalier, Hiroshi Ezura and Tohru Ariizumi

*Finding a Compatible Partner: Self-Incompatibility in European Pear (*Pyrus communis*); Molecular Control, Genetic Determination, and Impact on Fertilization and Fruit Set*

Hanne Claessen, Wannes Keulemans, Bram Van de Poel and Nico De Storme


F. Alagna, M. E. Caceres, S. Pandolfi, S. Collani, S. Mousavi, R. Mariotti, N. G. M. Cultrera, L. Baldoni and G. Barcaccia

*A High-Density Linkage Map of the Forage Grass* Eragrostis curvula *and Localization of the Diplospory Locus*

Diego Zappacosta, Jimena Gallardo, José Carballo, Mauro Meier, Juan Manuel Rodrigo, Cristian A. Gallo, Juan Pablo Selva, Juliana Stein, Juan Pablo A. Ortiz, Emidio Albertini and Viviana Echenique


Francesca Caselli, Veronica Maria Beretta, Otho Mantegazza, Rosanna Petrella, Giulia Leo, Andrea Guazzotti, Humberto Herrera-Ubaldo, Stefan de Folter, Marta Adelina Mendes, Martin M. Kater and Veronica Gregis


Carlos A. Acuña, Eric J. Martínez, Alex L. Zilli, Elsa A. Brugnoli, Francisco Espinoza, Florencia Marcón, Mario H. Urbani and Camilo L. Quarin

*Genetic Mapping of the Incompatibility Locus in Olive and Development of a Linked Sequence-Tagged Site Marker*

Roberto Mariotti, Alice Fornasiero, Soraya Mousavi, Nicolò G.M. Cultrera, Federico Brizioli, Saverio Pandolfi, Valentina Passeri, Martina Rossi, Gabriele Magris, Simone Scalabrin, Davide Scaglione, Gabriele Di Gaspero, Pierre Saumitou-Laprade, Philippe Vernet, Fiammetta Alagna, Michele Morgante and Luciana Baldoni

*Ploidy-Dependent Effects of Light Stress on the Mode of Reproduction in the* Ranunculus auricomus *Complex (Ranunculaceae)*

Fuad Bahrul Ulum, Camila Costa Castro and Elvira Hörandl

# Establishment of Apomixis in Diploid F<sub>2</sub> Hybrids and Inheritance of Apospory From F<sub>1</sub> to F<sub>2</sub> Hybrids of the Ranunculus auricomus Complex

#### Birthe H. Barke\*, Mareike Daubert and Elvira Hörandl

Department of Systematics, Biodiversity and Evolution of Plants, Albrecht-von-Haller Institute for Plant Sciences, University of Göttingen, Göttingen, Germany

Hybridization and polyploidization play important roles in plant evolution, but it is still not fully clarified how these evolutionary forces contribute to the establishment of apomicts. Apomixis, the asexual reproduction via seed formation, comprises several essential alterations in development compared to the sexual pathway. Furthermore, most natural apomicts were found to be polyploids and/or hybrids. The Ranunculus auricomus complex comprises diploid sexual and polyploid apomictic species and represents an excellent model system to gain knowledge on the origin and evolution of apomixis in natural plant populations. In this study, the second generation of synthetically produced homoploid (2x) and heteroploid (3x) hybrids derived from sexual R. auricomus species was analyzed for aposporous initial cell formation by DIC microscopy. Complete manifestation of apomixis was determined by measuring single mature seeds by flow cytometric seed screen. Microscopic analysis of female gametophyte formation indicated spontaneous occurrence of aposporous initial cells and several developmental irregularities. The frequency of apospory was found to depend on dosage effects, since a significant increase in apospory was observed when both F<sub>1</sub> parents, rather than just one, were aposporous. Unlike the F<sub>1</sub> generation, diploid Ranunculus F<sub>2</sub> hybrids formed B<sub>III</sub> seeds and fully apomictic seeds. The results indicate that hybridization rather than polyploidization seems to be the functional activator of apomictic reproduction in the synthetic Ranunculus hybrids. In turn, at least two hybrid generations are required to establish apomictic seed formation.

#### Edited by:

Emidio Albertini, University of Perugia, Italy

#### Reviewed by:

Qiang Fan, Sun Yat-sen University, China; Ross Bicknell, The New Zealand Institute for Plant & Food Research Ltd, New Zealand; Joann Acciai Conner, University of Georgia, United States

\*Correspondence:

Birthe H. Barke, birthe-hilkka.barke@biologie.uni-goettingen.de

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 23 May 2018; Accepted: 10 July 2018; Published: 03 August 2018

#### Citation:

Barke BH, Daubert M and Hörandl E (2018) Establishment of Apomixis in Diploid F<sub>2</sub> Hybrids and Inheritance of Apospory From F<sub>1</sub> to F<sub>2</sub> Hybrids of the Ranunculus auricomus Complex. Front. Plant Sci. 9:1111. doi: 10.3389/fpls.2018.01111

Keywords: apospory, developmental biology, endosperm balance, FCSS, gametophytic apomixis, hybrid, polyploidy, Ranunculus

### INTRODUCTION

Apomixis in angiosperms is, by definition, seed formation via asexual reproduction, resulting in clonal, maternal offspring (Nogler, 1984a). Gametophytic apomixis, which is the focus of our study, combines two steps: (1) apomeiosis, i.e., the formation of an unreduced embryo sac, and (2) parthenogenesis, i.e., the development of an unfertilized egg cell into an embryo. Almost all apomictic plants are polyploids and/or hybrids, but the role of these processes in the establishment of apomixis is still not well understood. There is evidence that the functional establishment of apomixis is not exclusively ploidy-dependent, but that polyploidy is an important factor in increasing and optimizing related gene expression (Quarin et al., 2001; Bicknell and Koltunow, 2004; Comai, 2005). The importance of polyploidy in apomictic plants may be explained by gene dosage effects: haploid gametophytes are thought to abort due to recessive lethal effects of apomixis-controlling genetic factors (Nogler, 1982, 1984a,b). This assumption is supported by the rarity of diploid apomicts; the few exceptions include Scandinavian Potentilla argentea biotypes, diplosporous Boechera species (Müntzing, 1928; Böcher, 1951; Sharbel et al., 2009), and Paspalum and Ranunculus kuepferi individuals (Ortiz et al., 2013; Schinkel et al., 2016, 2017). However, the emergence of apomixis is not only related to ploidy but could also be an effect of hybridization (Asker and Jerling, 1992). Hybridization of sexual plants often leads to severe disturbances of genetic and epigenetic composition or of meiotic cell division, which can result in progeny with reduced fitness (Carman, 1997; Rieseberg et al., 1999; Comai, 2005). These disturbances are thought to be attenuated by allopolyploidization, which in turn might lead to asynchronous gene expression due to stabilization and inheritance of genomic changes (Mogie, 1992; Carman, 1997). One possible escape from hybrid sterility is the switch to apomictic reproduction, as hypothesized by Darlington (1939).

This switch is still not well understood, but many hypotheses have been developed, which involve different molecular scenarios such as genetic control mechanisms or epigenetic regulation. One popular hypothesis claims that heterochronic expression of sexual reproduction genes, caused by hybridization, is the trigger for apomictic seed formation (Carman, 1997; Sharbel et al., 2009, 2010). This idea is supported by the findings of Hojsgaard et al. (2014), who discovered severe changes in the timing of megagametogenesis in synthetic Ranunculus auricomus F<sub>1</sub> hybrids. In early studies, it was assumed that apomixis is inherited as a single dominant trait, possibly controlled by only one gene (e.g., Nogler, 1984a; Savidan, 1992). More recent studies have shown that important apomictic characteristics such as apomeiosis, parthenogenesis and fertilization-independent endosperm formation seem to be controlled by several independent loci (e.g., Schallau et al., 2010; Ogawa et al., 2013). The developmental pathways of Hieracium apomicts support these findings because mutant plants lacking the apospory locus were able to return to sexuality (Catanach et al., 2006; Koltunow et al., 2013). Although gene expression studies were carried out, no connection between apomixis and specific gene clusters was identified, but it was determined that apomixis often co-segregates with a block of gene-poor heterochromatin (Huo et al., 2009; Ochogavía et al., 2011; Grimanelli, 2012). Apomictic reproduction in angiosperms is a heritable and facultative process, probably regulated by differentially expressed genes responsible for controlling sexual development, or it might be the result of reversible epigenetic silencing (Hand and Koltunow, 2014). Among others, Carman (1997) proposed that the switch to asexual seed formation is triggered by gene duplication followed by changes in epigenetic gene expression (e.g., Koltunow, 1993).
Today, it is established that hybridization and polyploidization can result in altered epigenetic regulation as well as genetic changes in plants (Comai, 2005). Epigenetic modifications such as DNA methylation or RNA interference are heritable and do not affect DNA sequences (Jaenisch and Bird, 2003), but such dosage effects might be the activator of apomictic development after hybridization or polyploidization events (Ozias-Akins and van Dijk, 2007). Thus, epigenetic regulation and reprogramming of plant development can be important factors for apomixis activation (Grimanelli, 2012). Identification of apomixis loci is difficult because recombination is often suppressed in these regions, which might be caused by allelic divergence (Hand and Koltunow, 2014).

The R. auricomus complex consists mainly of apomictic polyploid species, but a few di- and tetraploid obligate sexual species (R. carpaticola, R. cassubicifolius, and R. notabilis) are also known (Hörandl and Gutermann, 1998; Paun et al., 2006; Hörandl et al., 2009; Hojsgaard et al., 2014). Sexually reproducing species were found to be self-incompatible, while the apomicts, like typical allopolyploids, were characterized as self-fertile (Hörandl, 2008). In R. auricomus plants, gametophytic apomixis was already described by Nogler (1984a,b). It starts with the aposporous formation of an unreduced embryo sac from a somatic nucellar cell in close proximity to a meiotically developed megaspore tetrad or embryo sac, which subsequently aborts. The embryo is formed parthenogenetically, whereas successful endosperm development usually requires fertilization of the polar nuclei (pseudogamy; Koltunow and Grossniklaus, 2003; Koltunow et al., 2011). Asexual Ranunculus taxa are not obligate apomicts because they retain, to some extent, the capacity to reproduce sexually (Nogler, 1984a,b; Hojsgaard et al., 2014; Klatt et al., 2016).

Although apomixis has been studied for more than 100 years, it is still unclear how the effective switch toward apomixis in natural plant populations is achieved. More specifically, the effects of hybridity vs. polyploidy on developmental pathways are unclear and difficult to disentangle in natural allopolyploid apomicts. This study aims to shed light on the developmental events immediately following hybridization vs. polyploidization in synthetic F<sub>2</sub> plants of the R. auricomus complex as a potential cause of apomixis. Hojsgaard et al. (2014) analyzed the parental F<sub>1</sub> hybrid generation of the plants used in this study and described first evidence of spontaneous apospory and developmental asynchrony in diploid and triploid hybrid Ranunculus gametophytes. However, functional apomictic seeds were only produced in polyploids, at very low frequencies. Here, we investigate F<sub>2</sub> hybrid plants generated by manual crossing, where either both parents or one parent had previously shown apospory (Hojsgaard et al., 2014). Since hybridization is often connected to allopolyploidization, which was also shown for natural hybrids of the R. auricomus complex (Paun et al., 2006; Pellino et al., 2013), potential ploidy shifts in the F<sub>2</sub> plants were checked by flow cytometry. According to the theory of Carman (1997), we expected that allopolyploid F<sub>2</sub> hybrids would have higher frequencies of apospory and apomictic seed formation than diploid ones due to asynchrony of gene expression. We expected an increase of apospory not only from the first hybrid generation to the next, but also higher frequencies in F<sub>2</sub> plants descending from two aposporous parents, due to (epi)allelic dosage effects (Nogler, 1984b). Apomictic reproduction can be passed on to the next plant generation via pollen (Nogler, 1984a; Van Dijk et al., 1999), which led to the assumption that aposporous maternal plants pollinated by an aposporous paternal plant will result in an accumulation of apomictic dosage effects in the offspring. Furthermore, we carefully analyzed female development to test whether severe alterations and temporal irregularities occur during gametogenesis, similar to those previously observed by Hojsgaard et al. (2014). To gain insights into abortion rates during seed development, we analyzed the seed set of the F<sub>2</sub> plants. To test the hypothesis that diploid hybrid plants are also capable of producing apomictic seeds, the well-developed seeds were analyzed by flow cytometric seed screening. This step depends on successful coupling of apospory to parthenogenesis and proper endosperm formation. Finally, by generating manual crosses of the F<sub>2</sub> plants and by raising F<sub>3</sub> seedlings, we experimentally demonstrated the viability of the next hybrid generation.

**Abbreviations:** AIC, aposporous initial cell; DIC, differential interference contrast microscopy; ES, embryo sac; FCSS, flow cytometric seed screen; FM, functional megaspore; SSR, simple sequence repeat.

### MATERIALS AND METHODS

#### Plant Materials

Two hundred synthetic F<sub>2</sub> hybrid plants were generated by crossing diploid F and J plants and triploid G plants that had shown apospory before (**Table S1**, Hojsgaard et al., 2014). Since the F<sub>1</sub> had formed almost no apomictic seed (Hojsgaard et al., 2014), the F<sub>2</sub> is expected to have maternal and paternal genome contributions. All plants were grown under equal outdoor conditions in the old botanical garden of the Albrecht-von-Haller Institute for Plant Sciences at the University of Göttingen, Germany. First flowering of the plants occurred after 2–3 years of cultivation.

#### Ploidy Determination

The ploidy of the F<sub>2</sub> plants was determined by analyzing leaf material by flow cytometry (Matzk et al., 2000; Hojsgaard et al., 2014). Small silica gel-dried leaf pieces of ∼5 mm² were chopped in 200 µl Otto I buffer (Otto, 1990) with a razor blade and filtered through a CellTrics® filter (30 µm mesh, Sysmex Partec GmbH, Görlitz, Germany) into a flow cytometry sample tube (3.5 ml, 55 × 12 mm, Sarstedt, Nümbrecht, Germany). DNA in the filtrate was stained by adding 800 µl DAPI-containing Otto II buffer (Otto, 1990). The fluorescence intensity of stained leaf nuclei was measured with a CyFlow® Space flow cytometer (Sysmex Partec GmbH) at a gain of 416 nm. Di- and polyploid F<sub>1</sub> hybrid plants analyzed by Hojsgaard et al. (2014) were used as ploidy references. For all samples, a minimum of 3,000 nuclei was counted, and data analyses were done with the FloMax software version 2.81 (Sysmex Partec GmbH).
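The comparison against a reference plant of known ploidy can be sketched as a simple peak ratio. This is only an illustration under stated assumptions: the function name `estimate_ploidy`, the peak values and the assumption that fluorescence scales linearly with DNA content at identical instrument settings are ours and are not taken from the study.

```python
def estimate_ploidy(sample_peak, reference_peak, reference_ploidy):
    """Scale the reference ploidy by the ratio of G1 peak means.

    Assumes linear fluorescence and identical instrument settings
    for sample and reference (an assumption, not a stated protocol).
    """
    if reference_peak <= 0:
        raise ValueError("reference peak mean must be positive")
    return reference_ploidy * sample_peak / reference_peak


# Hypothetical G1 peak means (arbitrary fluorescence units), NOT study data.
reference_peak = 100.0  # diploid (2x) reference plant
sample_peaks = {"plant_1": 98.5, "plant_2": 151.0}

for name, peak in sample_peaks.items():
    estimate = estimate_ploidy(peak, reference_peak, 2)
    print(f"{name}: {estimate:.2f}x, nearest ploidy level {round(estimate)}x")
```

Rounding to the nearest integer level is a simplification; aneuploid plants would need a finer interpretation of the ratio.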

#### Genotyping of F<sub>2</sub> Plants

A simple sequence repeat (SSR) genotyping approach was conducted to verify the parentage of plants. In order to exclude spontaneous selfing, unintended cross-contamination during manual pollination, and a clonal, apomictic origin of the F<sub>2</sub> Ranunculus generation, six loci (**Table 1**) were used to verify the hybrid origin of the plants following the genotyping protocol of Klatt et al. (2016). Genomic DNA was extracted from dried leaf samples using the DNeasy Plant Mini Kit (Qiagen) according to the manufacturer's protocol. Polymerase chain reactions (PCR) were performed with a final sample volume of 25 µl, containing 1 µl template DNA, 12.5 µl BIOMIX (Eurofins Genomics, Ebersberg, Germany), 0.2 µl forward primer (10 µM), 1.0 µl reverse primer (10 µM), and 1 µl CAG-primer (FAM or HEX labeled). PCRs were run in a BioRad T100™ Thermal Cycler with the following parameters: 94°C for 10 min, then 14 × (denaturation at 94°C for 60 s, annealing at 62°C + 0.5°C per cycle for 90 s, extension at 72°C for 60 s), followed by 35 × (denaturation at 94°C for 30 s, annealing at 55°C for 30 s and extension at 72°C for 30 s), a last extension step at 72°C for 60 s and final storage at 4°C. PCR sample concentrations were adjusted before 85 µl formamide (HiDi) were added. This mixture was run in an automatic capillary sequencer Genetic Analyzer 3130 (Applied Biosystems, Foster City, CA, USA) using GeneScan 500 Rox (Applied Biosystems) as size standard after a denaturing pretreatment for 3 min at 92°C. Scoring of the electropherograms was done using GeneMarker V2.4.2 (SoftGenetics LLC, State College, PA, USA). A binary presence/absence matrix of alleles was exported for genotype characterization because of the presence of several "null" alleles, which may be due to the hybrid origin of the parent plants.
The SSR profiles were analyzed in FAMD by applying the Jaccard similarity index and generating neighbor joining trees (Schlueter and Harris, 2006). The trees were visualized in FigTree v1.4.2 (Rambaut, 2009; **Figures S4**–**S14**). The data confirmed non-maternal offspring and parental combinations in the F<sub>2</sub> generation (**Tables S2**–**S12**).
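The Jaccard similarity index applied to binary allele profiles can be illustrated with a short sketch. The analysis itself was run in FAMD; the function name `jaccard_similarity` and the allele vectors below are hypothetical and serve only to show how the index treats shared presences and ignores shared absences.

```python
def jaccard_similarity(a, b):
    """Jaccard index of two equal-length binary allele vectors (1 = allele present).

    Shared absences (0/0) are ignored, which suits data with "null" alleles.
    """
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    shared = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Convention chosen here: two empty profiles get similarity 0.0.
    return shared / either if either else 0.0


# Hypothetical presence/absence profiles (NOT data from the study).
mother = [1, 0, 1, 1, 0, 0]
father = [0, 1, 1, 0, 1, 0]
offspring = [1, 1, 1, 0, 0, 0]

print("mother vs. offspring:", jaccard_similarity(mother, offspring))
print("father vs. offspring:", jaccard_similarity(father, offspring))
```

In this toy case the offspring shares alleles with both parents and is identical to neither, which is the pattern expected for non-maternal, biparental offspring.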

#### Female Development

To evaluate the frequency of aposporous initial cell formation in contrast to the occurrence of sexually derived functional megaspores in ovules of R. auricomus hybrids, differential interference contrast (DIC) microscopy was applied (Hojsgaard et al., 2014).

Ranunculus flower buds with a minimal diameter of 5 mm were harvested and directly fixed in FAA solution (formaldehyde: acetic acid: ethanol: dH<sub>2</sub>O; 2:1:10:3.5) for 48 h at room temperature. The fixative was replaced with 70% ethanol, in which samples were stored until further treatment. Thereafter, plant tissue was dehydrated in 95% and 100% ethanol for 30 min each, before the flower buds were cleared in an increasing dilution series of methyl salicylate (25; 50; 85; 100%; Carl Roth GmbH + Co. KG, Karlsruhe, Germany) in ethanol (Young et al., 1979; Hojsgaard et al., 2014). Complete ovaries were dissected from the flower buds and mounted in pure methyl salicylate on object slides. DIC microscopy analysis was performed with a Leica DM5500B microscope equipped with a DFC 450 C camera and LAS V41 software (Leica Microsystems, Wetzlar, Germany).

Discrimination of sexual and aposporous cells was accomplished by evaluating the location of the two cell types. While sexual megaspores usually occurred at the chalazal side of the degraded germ line, asexual initial cells were found close to the sexual megaspores but clearly in somatic ovule tissue. In some ovules, a temporal coexistence of a functional megaspore and a potential aposporous initial cell was observed (**Figure 1**). Percentages of sexual functional megaspores (FMs), functional aposporous initial cells (AICs) and aborted ovules are given in **Table 2**.

TABLE 1 | Characteristics of the six SSR markers used for F<sub>2</sub> hybrid genotyping. T<sub>a</sub>, annealing temperature.

Statistical analyses and tests for significant differences between the two groups (one parent vs. both parents aposporous) were done by applying an arcsine transformation followed by a one-way ANOVA using IBM SPSS Statistics 24 (IBM Deutschland GmbH, Ehningen, Germany).
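The transformation and test can be sketched in plain Python. The analysis itself was run in SPSS; this sketch assumes the common arcsine square-root variant of the transformation (the exact variant is not specified), and the proportions below are hypothetical, not data from the study.

```python
import math


def arcsine_sqrt(p):
    """Arcsine square-root transform of a proportion in [0, 1] (assumed variant)."""
    return math.asin(math.sqrt(p))


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric groups."""
    all_values = [v for g in groups for v in g]
    grand_mean = sum(all_values) / len(all_values)
    means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of observations around their own group mean.
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical per-plant apospory proportions (NOT data from the study).
one_parent = [0.05, 0.10, 0.12, 0.20, 0.08]
both_parents = [0.15, 0.22, 0.30, 0.18, 0.25]

groups = [[arcsine_sqrt(p) for p in g] for g in (one_parent, both_parents)]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The P value would then be read from the F distribution with the reported degrees of freedom, which a statistics package provides directly.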

#### Seed Set

To determine the reproductive fitness of the Ranunculus F<sub>2</sub> hybrids by seed formation, the plants were transferred from the botanical garden to a YORK® climate chamber (18°C, humidity of 60%, day: night regime of 16 h: 8 h; Johnson Controls, Milwaukee, WI, USA) to prevent unwanted pollination events, e.g., by bees or other insects. At least three flowers per plant were manually cross-pollinated and subsequently packed in plastic Crispac bags (2 mm Ø holes, Baumann Saatzuchtbedarf, Waldenburg, Germany) to collect ripe seeds. Harvested Ranunculus seeds were visually assessed, and mature, brown achenes were counted and separated from aborted, yellow ones. Furthermore, full endosperm development was tested by briefly applying thumb pressure to each achene (Klatt et al., 2016). Based on these numbers, the seed set was calculated for single collective fruits, for individual plants and for each hybrid cross after Hörandl (2008). Seeds were stored at 4°C until use.
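A minimal sketch of the seed set calculation is given below. It assumes that seed set is the proportion of mature achenes among all achenes of a collective fruit, and that the plant-level value pools the counts of all fruits; both are our assumptions, since the exact formula of Hörandl (2008) is not restated here. The counts are hypothetical.

```python
def seed_set(mature, aborted):
    """Proportion of mature achenes among all achenes (assumed definition)."""
    total = mature + aborted
    if total == 0:
        raise ValueError("no achenes counted")
    return mature / total


# Hypothetical (mature, aborted) achene counts for three collective
# fruits of one plant (NOT data from the study).
fruits = [(12, 8), (5, 15), (9, 11)]

per_fruit = [seed_set(m, a) for m, a in fruits]
# Plant-level value from pooled counts (an assumption; a mean of the
# per-fruit values would be an alternative).
per_plant = seed_set(sum(m for m, _ in fruits), sum(a for _, a in fruits))

print("per fruit:", [round(s, 2) for s in per_fruit])
print("per plant:", round(per_plant, 2))
```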

#### Flow Cytometric Seed Screen (FCSS)

The developmental pathway of each single Ranunculus hybrid seed was inferred from flow cytometric measurements and data analysis (Matzk et al., 2000). Single seeds were ground by two small steel beads (4 mm Ø, Qiagen, Hilden, Germany) in a 2 ml SafeSeal micro tube (Sarstedt) using a TissueLyser II (Qiagen) for 7 s at 30 Hz s<sup>−1</sup>. DNA extraction started by inverting the seed powder for 30 s after adding 200 µl of Otto I buffer (Otto, 1990). Subsequent procedures such as sample filtration, nuclei staining and sample measurements were identical to the ploidy determination protocol (incl. gain settings). In FCSS, the ploidy of endosperm and embryo in seeds (C values) was determined by calculating the mean DNA content for each peak using the FloMax software. Based on these data, the "peak index" (PI) was calculated (mean peak value of endosperm/mean peak value of embryo DNA content), which allowed, together with the peak positions, the identification of the specific reproductive pathway of every single seed (**Table 3**, after Klatt et al., 2016, modified).

Earlier R. auricomus studies (e.g., Hojsgaard et al., 2014; Klatt et al., 2016) had revealed an eight-nucleate Polygonum-type embryo sac, and hence a peak index of 1.5 is characteristic for sexually formed seeds. These develop from a reduced embryo sac, in which fertilization of the egg cell by one sperm nucleus results in a zygotic embryo (n + n), while the two fused reduced polar nuclei of the central cell are fertilized by the other reduced male gamete (2n + n) (**Table 3**, pathway A). A classical apomictic R. auricomus seed is considered to exhibit peak indices of 2.0–4.0, which is due to the unreduced embryo sac nuclei, the parthenogenetic development of the embryo (2n + 0) and either autonomous endosperm formation (4n + 0; peak index 2.0; **Table 3**, pathway D) or pseudogamous formation of the endosperm through central cell fertilization by two unreduced pollen nuclei (4n + 4n; peak index 4.0; **Table 3**, pathway E). We regard an interpretation of the pathway D endosperm peak as a G2 peak of the embryo as unlikely, because Ranunculus seeds always form a rapidly growing endosperm, with endosperm peaks usually being higher than embryo peaks (while G2 peaks are always much smaller). Pathway E could also result from endosperm endopolyploidy following pathway D, but is nevertheless a case of an asexual seed. The other cases, central cell fertilization by one reduced pollen nucleus (4n + n; peak index 2.5) or by two reduced pollen nuclei (4n + 2n; peak index 3.0), which are typical for established Ranunculus apomicts (Hojsgaard et al., 2014; Klatt et al., 2016), were not detected in our study. An intermediate case between sex and apomixis is the occurrence of so-called maternal B<sub>III</sub> hybrids (**Table 3**, pathway C). Here, an asexually formed, unreduced egg cell is fertilized by a reduced male gamete, and the endosperm likewise develops after fertilization of the central cell by a reduced pollen nucleus. This combination results in a ploidy shift of the embryo (2n + n) and endosperm (4n + n) peaks and a unique peak index of 1.7. One single case of a paternal B<sub>III</sub> hybrid was found. Here, the egg cell and the central cell of a reduced embryo sac were each fertilized by one unreduced pollen nucleus, forming a triploid embryo (n + 2n) and tetraploid endosperm (2n + 2n; peak index 1.3; **Table 3**, pathway B).

FIGURE 1 | Asexual embryo sac development in an ovule of a diploid Ranunculus F<sub>2</sub> hybrid. (a) Ovule during functional megaspore formation. The germ line with the four meiotic products is visible, of which three cells are aborted and only the one near the chalazal pole survived and developed into a functional megaspore. (b) Same ovule as in (a), but this image displays one cell layer above the germ line, showing an aposporous initial cell. Plant individual: J10 × J30 (12). FM, functional megaspore; AIC, aposporous initial cell; ii, inner integuments; \*, micropylar pole; •, chalazal pole. Scale bar: 50 µm.

TABLE 2 | Analysis of female development in diploid Ranunculus F<sub>2</sub> hybrid ovules at the end of sporogenesis and beginning of gametogenesis.

Mean percentages of sexual functional megaspore (FM) formation, aposporous initial cell (AIC) formation and ovule abortion were determined by DIC microscopy. "Type" designates whether only the maternal (m) or the paternal (p) parent or both (mp) of the hybrid class was aposporous (see Table S1).
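The peak indices quoted for pathways A to E follow directly from the genome contributions given in the text, as the following sketch shows. The function name `peak_index` and the pathway labels are ours; the numbers (in units of n) are those stated above for egg cell, central cell and the fertilizing pollen nuclei.

```python
from fractions import Fraction


def peak_index(egg, sperm_to_egg, central_cell, sperm_to_central):
    """Peak index = endosperm ploidy / embryo ploidy, all inputs in units of n."""
    embryo = egg + sperm_to_egg
    endosperm = central_cell + sperm_to_central
    return Fraction(endosperm, embryo)


# (egg, pollen contribution to egg, central cell, pollen contribution to central cell)
PATHWAYS = {
    "A sexual": (1, 1, 2, 1),                         # embryo n + n, endosperm 2n + n
    "B paternal BIII": (1, 2, 2, 2),                  # embryo n + 2n, endosperm 2n + 2n
    "C maternal BIII": (2, 1, 4, 1),                  # embryo 2n + n, endosperm 4n + n
    "D apomictic, autonomous endosperm": (2, 0, 4, 0),
    "E apomictic, pseudogamous endosperm": (2, 0, 4, 4),
}

for name, contributions in PATHWAYS.items():
    pi = peak_index(*contributions)
    print(f"{name}: PI = {pi} (about {float(pi):.2f})")
```

Exact fractions make the rounding visible: pathway C is 5/3 and pathway B is 4/3, which the text reports as 1.7 and 1.3.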

#### Germination Rates

In order to determine the viability of seeds formed by F<sub>2</sub> plants, up to ten seeds per plant from all 13 genotypes (**Table 5**) were sown onto sterilized Fruhstorfer soil (type P mixed with 1/3 sand), covered with quartz gravel and incubated in a YORK® climate chamber (16°C, humidity of 60%, day: night regime of 16 h: 8 h; Johnson Controls) for 10 weeks. In spring, the pots were transferred to the old botanical garden (University of Göttingen) to ensure natural sprouting conditions. Germination was checked weekly. The final germination rates were calculated after 23 weeks.

TABLE 3 | Reproductive pathways of seed development of F3 hybrid seeds of the Ranunculus auricomus complex identified by Flow Cytometric Seed Screen (FCSS).


Nomenclature of pathways B and C was adapted from Doležel et al. (2007). PI, peak index = endosperm/embryo ploidy; m, maternal; p, paternal.

#### RESULTS

### Ploidy Determination and Genotyping of F<sup>2</sup> Hybrids

All but one of the Ranunculus F<sup>2</sup> plants from diploid parents (F, J crosses) were found to be diploid; the exception was a newly formed triploid plant. The individuals with a "G" in the name descended from crosses of R. cassubicifolius (4x) with R. notabilis (2x) and were previously determined as triploid. As expected from the aneuploidy of the 3x parent plants, the F<sup>2</sup> offspring were determined as 3x, 4x, and 6x (**Table S1**).

#### Female Development

About 4,900 ovules from ten different synthetic Ranunculus F<sup>2</sup> hybrid crosses, corresponding to 79 plant individuals, were examined for the mode of female development. All analyzed ovules belonged exclusively to diploid Ranunculus plants because polyploid individuals in general formed only a very small number of flower buds. The fraction of these buds that showed the informative stage of development was too small to be analyzed statistically. The same was true for the crosses J9A × J20A, F10 × J33 and F7A × J9. Altogether, 4,811 ovules from 79 diploid plants were interpretable. Ovules showed disturbed megasporogenesis, indicated by persistence of meiotic germ cell proliferation at various time points. The normal, sexual trait, in which the germ line cell located closest to the chalazal pole developed further into a functional megaspore, was observed in 63.08% of ovules (mean of all Ranunculus samples, **Table 2**; **Figure S3a**). An overall mean of 16.08% of all analyzed hybrid ovules developed aposporously (**Figure S1**; **Table 2**). Apospory was indicated by the occurrence of an AIC in close proximity to a sexual functional megaspore, for instance one cell layer below (**Figure 1**). AICs occurred in two hybrid classes with an aposporous father, in five with an aposporous mother, and in three classes where both parents were aposporous (**Table 2**). The proportion of apospory in hybrids derived from two aposporous parents (mean 21.18% ± 11.83 STD, median 19%) was higher than in F<sup>2</sup> plants originating from parents of which only one formed aposporous embryo sacs (mean 13.98% ± 13.94 STD, median 11%). The difference was weakly statistically significant (P = 0.012, **Figure 2**).

#### Seed Set, Flow Cytometric Seed Screen, Germination Rates

The R. auricomus F<sup>2</sup> hybrids were used to produce seed by hand pollination between individuals in 2016. This seed set comprised a mean of 22.49% well-developed, mature Ranunculus seeds, while the remaining 77.51% were identified as aborted (**Figure S2**; **Table 3**). None of the polyploid F<sup>2</sup> plants was able to form mature, living seeds for analysis.

The FCSS analysis took into account only plants that produced at least three mature, viable seeds displaying both an embryo and an endosperm peak in FCSS histograms. Overall, 600 mature F<sup>3</sup> hybrid seeds were analyzed by single-seed FCSS to elucidate their individual mode of development. The measurements showed that seven out of twelve Ranunculus crosses had exclusively formed sexual seeds, while the others developed BIII and apomictic seeds as well (**Figure 3**; **Tables 3**, **4**). In total, fourteen non-sexual seeds were detected, which equals 2.33% of the 600 investigated seeds. Eleven (78.57%) of these seeds were classified as maternal BIII hybrids, one as a paternal BIII hybrid, and the other two, apomictic seeds, developed either as shown in pathway "D" or as in pathway "E" (**Figure 3**; **Table 3**).
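The proportions reported for the FCSS screen can be re-derived from the seed counts given in the text. The following is only a bookkeeping sketch; the variable and function names are illustrative:

```python
# Seed counts as reported in the text
total_seeds = 600
non_sexual = 14
maternal_biii = 11
paternal_biii = 1
apomictic = 2


def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


# The three non-sexual classes must add up to the non-sexual total.
assert maternal_biii + paternal_biii + apomictic == non_sexual

print("non-sexual share of all seeds:", percent(non_sexual, total_seeds))
print("maternal BIII share of non-sexual seeds:", percent(maternal_biii, non_sexual))
```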

In total, nearly 280 Ranunculus seeds were sown in February 2017 and cultivated first in a climate chamber and afterwards outside in the botanical garden under natural conditions. The overall germination rate across all 13 tested genotypes was determined after 23 weeks and averaged 36.96% (**Table 5**).

#### DISCUSSION

Gametophytic apomixis is a long-studied topic in developmental and evolutionary botany (e.g., Winkler, 1908; Gustafsson, 1946; Nogler, 1984a). Its functional causes, however, are still unclear and intensely debated, with many hypotheses proposed to explain the phenomenon. The most important potential natural triggers are hybridization (Ernst, 1918; Mogie, 1992), polyploidization (e.g., Sober, 1984) or a combination of both (e.g., Bierzychudek, 1985; Asker and Jerling, 1992). This specific type of reproduction demands three synchronized and balanced phases to ensure the development of viable, apomictic fruits (Grimanelli et al., 2001). First, the effective circumvention of meiotic cell division (e.g., via apospory), then



TABLE 5 | Germination rate of mature Ranunculus seeds derived from F2 plants after 23 weeks of cultivation.


FIGURE 2 | Boxplots of percentages of aposporous ovules for diploid F2 hybrids. Hybrid plants descending from parents that had both shown apospory before (left) formed significantly more aposporous ovules than the plants with only an aposporous mother plant (P = 0.012). Outliers are marked as stars and open circles, the box represents the interquartile range, and the median is shown within each box.

to individuals with only one aposporous parent (13.98% ± 13.94 STD, P = 0.012). Since all plants were kept under equal conditions, we can rule out a differential influence of stress on the frequencies of apospory (Klatt et al., 2016; Rodrigo et al., 2017). The influence of genomic dosage of control factors on frequencies of aposporous ovule formation was shown previously in crossing experiments with polyploid Ranunculus by Nogler (1984b). However, contrary to the assumption of Nogler (1984a), our results suggest that haploid male and female gametes can also carry heritable apospory-controlling factors.

Mean peak indices (PI) of the respective developmental pathways.

the parthenogenetic establishment of an embryo and finally, the successful endosperm development. In the present study, synthetic R. auricomus hybrid plants of the second generation were analyzed. Sexual Ranunculus plants are known to follow the Polygonum type of female development (Nogler, 1973), but evidently this important process was heavily altered, as indicated by persistence or abortion of embryo sac formation. Similar but more severe developmental disturbances have been described by Hojsgaard et al. (2014), who analyzed the parental generation of the plants in focus here.

#### Frequencies and Genomic Dosage Effects on Apospory

All analyzed Ranunculus plants of the second hybrid generation were invariably identified as diploid, non-maternal genotypes, which means that these plants were sexually formed without any spontaneous ploidy shift. On average, 16.08% of the investigated F<sup>2</sup> ovules showed aposporous initial cell formation, while only a mean of 11% of the diploid F<sup>1</sup> hybrid ovules had aposporous development (Hojsgaard et al., 2014). In addition, apospory in F<sup>2</sup> hybrids seems to depend on the dosage of heritable genetic control factors, because plants that originated from parents that had both shown apomeiotic embryo sac development displayed a significantly enhanced percentage of asexual ovules (21.18% ± 11.83 STD) compared

but both the embryo and the central cell were each fertilized by one reduced male gamete (pathway C). (C) Asexual seed with a diploid embryo and a tetraploid endosperm (pathway D). The embryo, derived from an unreduced embryo sac, as well as the endosperm developed without fertilization. (D) Asexual Ranunculus seed with a diploid embryo and a near octoploid endosperm (pathway E). From the unreduced embryo sac, the embryo developed parthenogenetically into a diploid embryo and both unreduced polar nuclei were fertilized by two unreduced pollen nuclei. Genotypes: (A,D) J30 × J18 (01) X J10 × J30 (04), (B) F10 × J33 (12) X F10 × J33 (07), (C) J20 × J2 (14) X J20 × J2 (18).

Ranunculus F<sup>2</sup> hybrids illustrate developmental disturbances during megasporogenesis and -gametogenesis that are often thought to result in either whole ovule abortion or in reduced fertility due to failures during megagametophyte formation (**Figures S3b–d**). Similar, but more drastic irregularities were observed by Hojsgaard et al. (2014) when analyzing the temporal and developmental processes during female development of the F<sup>1</sup> generation. In contrast, natural Ranunculus hybrids show milder discrepancies in embryo sac and seed formation (Nogler, 1971, 1972; Hojsgaard et al., 2014).

However, it is still unresolved which apomeiosis-provoking factor triggers reprogramming of a somatic nucellar cell. Since an effect of polyploidy can be ruled out in our F<sup>2</sup> plants, it is assumed that all these alterations are due to previous hybridization, whose consequences are known to be strongest and most perceptible in the first few hybrid generations, especially in diploid plants (Barton, 2001). Hybridization is a powerful driving force in plant speciation and evolution that can result in genomic shocks (Rieseberg et al., 2003). Hybridization can cause dramatic chromosomal rearrangements, which were shown to be associated with apomixis in diploid, diplosporous Boechera (Kantama et al., 2007). It is further supposed that hybridization events disrupt the timing and pattern of expression of genes controlling sexual reproduction by changing their genomic constitution or epigenetic regulation (Koltunow, 1993; Carman, 1997; Hand and Koltunow, 2014; Shah et al., 2016). Epigenetic regulation is altered upon hybridization in plants, and reversible changes such as DNA methylation or RNA interference are thought to be able to cause apomixis without affecting the plants' genome sequence (Comai, 2005; Ozias-Akins and van Dijk, 2007; Grimanelli, 2012). Ha et al. (2009) speculated that genomic shocks can be prevented by specific small RNAs formed during hybridization or polyploidization, providing improved genome stability to hybrid plants. Furthermore, it was shown that the onset of reproductive actions in mutant Arabidopsis ovules was mainly caused by small RNA silencing pathways involving the AGO9 protein (Olmedo-Monfil et al., 2010). Cell-to-cell signaling is a feature of double-stranded small RNAs, which tend to silence their target genes (Molnar et al., 2010). The assumption of cell-to-cell signaling is reasonable because aposporous initials always emerge in the direct neighborhood of the megaspore tetrad (e.g., **Figure 1**). Small RNAs commonly interact with proteins of the ARGONAUTE family, together forming the RNA-induced silencing complex (RISC), which is an essential component of transcriptional and posttranscriptional gene silencing (Bourc'his and Voinnet, 2010; Feng et al., 2010; Mallory and Vaucheret, 2010). Thus, it seems reasonable that heritable epigenetic processes are responsible for functional silencing of the sexual reproduction pathway in favor of apomixis (Grimanelli, 2012; Hand and Koltunow, 2014).

Our results confirm that apospory is a facultative mechanism in which sexual and apomictic development exist in parallel, finally resulting in a mixture of sexual and asexual seeds (Nogler, 1984a). The facultative character of gametophytic apomixis could be the result of the ability to maintain the epigenetically unsilenced genomic state (Hand and Koltunow, 2014).

### Diploid Hybrids Are Able to Reproduce via Apomictic Seed Formation

During seed formation, further developmental processes come into play and influence the proportions of sexually vs. apomictically formed seed. Successful seed formation in sexual Ranunculus species is highly dependent on fertilization of the egg and the central cell nuclei. In pseudogamous apomicts, fertilization of the central cell and endosperm development are important for seed formation as well. In angiosperms, the optimal ratio of maternal (m) to paternal (p) genome contributions in the endosperm was determined to be 2:1 due to genomic imprinting. Deviations from this ratio have deleterious effects, leading to severe disturbances or even to seed abortion (Spielman et al., 2003; Vinkenoog et al., 2003). Ranunculus species were characterized as very sensitive to endosperm imbalances (Hörandl and Temsch, 2009). Failure of endosperm development likely explains the high seed abortion rates of our F<sup>2</sup> hybrids. More than three-fourths of all seeds harvested from diploid Ranunculus F<sup>2</sup> hybrids were found to be dead, either due to abortion at early stages of development or due to maldeveloped endosperm tissue (**Figure S2**). Only a mean of 22.49% of achenes were intact. Conversely, polyploid F<sup>2</sup> hybrids failed completely to produce mature seeds. Even well-formed achenes showed no embryo peak in FCSS analyses. Therefore, no evaluation of the reproductive mode of polyploid plants could be made. The extreme seed abortion rates in the diploids appeared to be at the expense of apomictic development, as in the end almost only functional sexual seeds were formed. This differs fundamentally from natural Ranunculus apomicts, which produce higher proportions of apomictic than sexual seeds (Hojsgaard et al., 2014; Klatt et al., 2016).

In other apomictic plant genera, the assertiveness of the sexual pathway seems to depend mainly on the survival rate of functional meiotic cells, which can possibly be influenced by pollination timing (Espinoza et al., 2002; Hojsgaard et al., 2013). Natural apomicts have found various strategies to circumvent seed failure, e.g., by sustaining the optimal conditions or by tolerating variations in the paternal contribution to the endosperm (e.g., Savidan, 2007; Dobeš et al., 2013). In our synthetic F<sup>2</sup> plants, seed formation followed various paths. Most of the diploid plants developed sexual seeds and maintained the favored 2m:1p ratio, while the apomictic seeds showed several types of genomic imbalance based on an unreduced embryo sac. BIII hybrid seeds revealed a different endosperm contribution of 4m:1p or 2m:2p (**Table 3**). The seed that developed fully autonomously without any fertilization showed an extreme change to 4m:0p (**Table 3**), and the seed that followed reproductive pathway E showed a modified genomic imbalance in the endosperm of 4m:4p (**Table 3**). A comparable tolerance was also previously observed in polyploid F<sup>1</sup> plants, which showed alterations in genome dosage as well (Hojsgaard et al., 2014). Autonomous endosperm development was previously reported for apomictic R. auricomus (Klatt et al., 2016), and here analysis of the FCSS histogram verifies this rare observation (**Figure 3C**). In apomictic R. kuepferi, too, autonomous endosperm occurred very rarely (Schinkel et al., 2016).
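The endosperm genome contributions discussed above can be compared against the 2m:1p optimum in a few lines. This is a minimal sketch: the category labels and the wording of the classification are illustrative assumptions, and only the m:p values themselves are taken from the text.

```python
from fractions import Fraction

OPTIMAL = Fraction(2, 1)  # maternal:paternal genome ratio in the endosperm


def endosperm_balance(maternal, paternal):
    """Classify an endosperm by its maternal:paternal genome ratio."""
    if paternal == 0:
        return "no paternal genome (autonomous endosperm)"
    ratio = Fraction(maternal, paternal)
    if ratio == OPTIMAL:
        return "balanced (2m:1p)"
    return "maternal excess" if ratio > OPTIMAL else "paternal excess"


# (maternal, paternal) genome copies in the endosperm, as given in the text
cases = {
    "sexual seed": (2, 1),
    "maternal BIII": (4, 1),
    "paternal BIII": (2, 2),
    "autonomous endosperm": (4, 0),
    "unreduced central cell, unreduced pollen": (4, 4),
}

for label, (m, p) in cases.items():
    print(f"{label}: {m}m:{p}p -> {endosperm_balance(m, p)}")
```

The zero-paternal case is handled separately because a ratio with no paternal contribution is undefined rather than merely extreme.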

However, the two most common modes of seed formation in natural apomictic Ranunculus were not detected in this study. Typical apomictic Ranunculus seeds are composed of an unreduced embryo sac that parthenogenetically developed into an embryo plus a pseudogamously developed endosperm formed by fertilization of one unreduced or two reduced male gametes (PI = 2.5 and 3.0 respectively, e.g., Klatt et al., 2016). Notably, these cases restore the optimal 2m:1p ratio in the endosperm and usually represent the most frequent case of functional apomictic seeds in Ranunculus (Hojsgaard et al., 2014). It is likely that none of these "standard" cases was found in the F<sup>2</sup> hybrids for various reasons: (1) in many cases, premature embryo sac formation during the bud stage was observed, before pollen is available (**Figures S3e,f**); (2) the pollen quality of hybrid plants could have been low, as is typical for Ranunculus apomicts (Izmailow, 1996; Hörandl et al., 1997; Schinkel et al., 2017); in both cases, failure of endosperm development could have caused seed abortion. Alternatively, several crosses (F × F and J × J genotypes) could have resulted in sibling cross incompatibility.

The formation of several BIII hybrids by diploid F<sup>2</sup> hybrids indicates an insufficient coupling of apospory and parthenogenesis, which are both essential for the functional establishment of gametophytic apomixis. This circumstance could be due to the early onset of apospory in F<sup>1</sup> hybrids as described by Hojsgaard et al. (2014) and again verified in the F<sup>2</sup> plants by the appearance of fully mature, seven-nucleic embryo sacs already in the flower bud stage (**Figures S3e,f**). It is assumed that apospory is connected to extremely long periods of egg cell receptivity, which reduces the proportion of parthenogenetically formed embryos and in turn increases the number of BIII seeds as found in the F<sup>2</sup> hybrids (Martinez et al., 1994; Nogler, 1995). Apospory and parthenogenesis are under different genetic control mechanisms (Nogler, 1984a; Ozias-Akins and van Dijk, 2007). The coupling of these processes is obviously not yet established in the majority of F<sup>2</sup> hybrids studied here, except for two apomictically formed seeds.

#### The Role of Polyploidy for Expression of Apomixis

Our results shed new light on the role of polyploidy. In almost all plants, natural apomictic reproduction occurs together with polyploidization, which led to the conclusion that polyploidy is a necessity for apomixis rather than an option (e.g., Bierzychudek, 1985; Carman, 1997; Koltunow and Grossniklaus, 2003). Nonetheless, a few reports on natural diploid apomictic plants are known, e.g., in Boechera (Dobeš et al., 2006; Aliyu et al., 2010), Paspalum (Siena et al., 2008), and R. kuepferi (Schinkel et al., 2016). Thus, the switch to gametophytic apomixis in the F<sup>2</sup> generation analyzed here is maintained by the hybrid character of the plants and not by polyploidization. Nevertheless, the identified BIII hybrid seeds, derived from diploid parents, would result in triploid neopolyploids with a high potential for apospory because of maternal gene dosage effects. Fertilization of the unreduced triploid egg cell by an aposporous pollen donor will result in tetraploids with increased dosages for apospory. So-called female triploid bridges have been described as mediating the formation of polyploid apomicts (e.g., Schinkel et al., 2017). In this respect, our results support the hypothesis by Schinkel et al. (2017) that apomixis would be a cause rather than a consequence of polyploidy.

Flow cytometric observations of apomictically developed diploid Ranunculus seeds do not support the hypothesis by Nogler (1984a) that inheritance of apospory-controlling factors would require unreduced gametes because of recessive lethal effects in the haploid genome. Mature diploid, asexual Ranunculus seeds show that they do not suffer from recessive lethal effects during seed formation. Since hybridization events are known to cause disturbances in epigenetic regulation, the observed irregularities and temporal alterations are likely to be due to changes of epigenetic genome modulation (e.g., Grimanelli, 2012). In order to get a complete picture of the establishment of apomixis in Ranunculus hybrid plants, the viability of harvested seeds, derived from synthetic F<sup>2</sup> hybrids, was analyzed by determination of their germination rate. Sexual Ranunculus species are known to have a higher fitness than apomictic species in terms of seed set (Izmailow, 1996; Lohwasser, 2001; Hörandl, 2008). However, germination rates between diploid F<sup>1</sup> hybrids, hexaploid apomicts, and diploid sexual species did not differ significantly from each other (Hörandl, 2008). The observed mean germination rate of c. 37% in the F<sup>2</sup> hybrids studied here falls within the range of means of c. 35–54% of the previous study. The germination process is obviously not significantly disturbed in F<sup>2</sup> hybrids, which means that further hybrid generations can be formed.

## CONCLUSION

The success of apospory in diploid Ranunculus F<sup>2</sup> hybrids was found to be based on irregularities during female development, triggered by interspecific hybridization that strongly interfered with the temporal and developmental course of action. The frequency of unreduced embryo sac formation depended on the dosage of genetic control factors passed on by the parent generation. However, the connection of apospory, parthenogenesis and pseudogamous endosperm formation is not yet reliably established in synthetic F<sup>2</sup> hybrids, which is indicated by a high rate of aborted seeds, probably suffering from unbalanced maternal:paternal genome contributions to the endosperm. Nevertheless, a small but not negligible number of apomictic and BIII seeds was obtained, meaning that the establishment of apomixis in Ranunculus hybrids can potentially continue in further generations. On the one hand, BIII hybrid plants are assumed to be the next step toward stabilization and extension of apomictic potential in a polyploid background. On the other hand, polyploid F<sup>2</sup> hybrids in this study formed only a small number of mature but not analyzable seeds without detectable embryo tissue. Thus, for the next plant generation several triploid individuals are expected that would be highly aposporous but mostly seed-sterile. On a larger, evolutionary timescale, rare successful polyploid apomictic seed formation would be favored by natural selection and increase in frequency. This process could result in the establishment of a functional apomictic, polyploid new Ranunculus lineage.

## AUTHOR CONTRIBUTIONS

BB performed research, analyzed and interpreted data. BB and EH wrote the manuscript. EH designed the research. MD performed some FCSS experiments.

### FUNDING

This project was funded by the German Research Fund DFG (project Ho 4395/4-1) to EH.

### ACKNOWLEDGMENTS

We thank the reviewers for valuable comments on the manuscript. We also thank Silvia Friedrichs and Sabine Schmidt for taking good care of the plants, Jennifer Krüger for SSR lab work, and Simone Klatt, Ladislav Hodač, and Diego Hojsgaard for methodical advice.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2018.01111/full#supplementary-material

#### REFERENCES


after sexual hybridisation and polyploidisation. New Phytol. 204, 1000–1012. doi: 10.1111/nph.12954


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2018 Barke, Daubert and Hörandl. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

## The MAP3K-Coding QUI-GON JINN (QGJ) Gene Is Essential to the Formation of Unreduced Embryo Sacs in Paspalum

#### Edited by:

Andrea Mazzucato, Università degli Studi della Tuscia, Italy

#### Reviewed by:

Takashi Okada, University of Adelaide, Australia Serena Varotto, Università degli Studi di Padova, Italy Ross Bicknell, The New Zealand Institute for Plant & Food Research Ltd., New Zealand

> \*Correspondence: Silvina C. Pessino pessino@arnet.com.ar

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 02 July 2018 Accepted: 03 October 2018 Published: 24 October 2018

#### Citation:

Mancini M, Permingeat H, Colono C, Siena L, Pupilli F, Azzaro C, de Alencar Dusi DM, de Campos Carneiro VT, Podio M, Seijo JG, González AM, Felitti SA, Ortiz JPA, Leblanc O and Pessino SC (2018) The MAP3K-Coding QUI-GON JINN (QGJ) Gene Is Essential to the Formation of Unreduced Embryo Sacs in Paspalum. Front. Plant Sci. 9:1547. doi: 10.3389/fpls.2018.01547

Micaela Mancini<sup>1</sup>, Hugo Permingeat<sup>1</sup>, Carolina Colono<sup>1</sup>, Lorena Siena<sup>1</sup>, Fulvio Pupilli<sup>2</sup>, Celeste Azzaro<sup>1</sup>, Diva Maria de Alencar Dusi<sup>3</sup>, Vera Tavares de Campos Carneiro<sup>3</sup>, Maricel Podio<sup>1</sup>, José Guillermo Seijo<sup>4,5</sup>, Ana María González<sup>4,6</sup>, Silvina A. Felitti<sup>1</sup>, Juan Pablo A. Ortiz<sup>1</sup>, Olivier Leblanc<sup>7</sup> and Silvina C. Pessino<sup>1</sup>\*

<sup>1</sup> Instituto de Investigaciones en Ciencias Agrarias de Rosario, CONICET-UNR, Facultad de Ciencias Agrarias, Universidad Nacional de Rosario, Zavalla, Argentina, <sup>2</sup> Istituto di Bioscienze e BioRisorse, Consiglio Nazionale delle Ricerche, Perugia, Italy, <sup>3</sup> Parque Estação Biológica, Embrapa Recursos Genéticos e Biotecnologia, Brasília, Brazil, <sup>4</sup> Instituto de Botánica del Nordeste, CONICET-UNNE, Corrientes, Argentina, <sup>5</sup> Facultad de Ciencias Exactas y Naturales y Agrimensura, Universidad Nacional del Nordeste, Corrientes, Argentina, <sup>6</sup> Facultad de Ciencias Agrarias, Universidad Nacional del Nordeste, Corrientes, Argentina, <sup>7</sup> DIADE, Univ Montpellier, IRD, Montpellier, France

Apomixis is a clonal mode of reproduction via seeds, which results from the failure of meiosis and fertilization in the sexual female reproductive pathway. In previous transcriptomic surveys, we identified a mitogen-activated protein kinase kinase kinase (N46) displaying differential representation in florets of sexual and apomictic Paspalum notatum genotypes. Here, we retrieved and characterized the N46 full cDNA sequence from sexual and apomictic floral transcriptomes. Phylogenetic analyses showed that N46 is a member of the YODA family, and the gene was renamed QUI-GON JINN (QGJ). Differential expression in florets of sexual and apomictic plants was confirmed by qPCR. In situ hybridization experiments revealed expression in the nucellus of aposporous plants' ovules, which was absent in sexual plants. RNAi inhibition of QGJ expression in two apomictic genotypes resulted in significantly reduced rates of aposporous embryo sac formation, with respect to the level detected in wild type aposporous plants and transformation controls. The QGJ locus segregated independently of apospory. However, a probe derived from a related long non-coding RNA sequence (PN\_LNC\_QGJ) revealed RFLP bands cosegregating with the Paspalum apospory-controlling region (ACR). PN\_LNC\_QGJ is expressed in florets of apomictic plants only. Our results indicate that the activity of QGJ in the nucellus of apomictic plants is necessary to form non-reduced embryo sacs and that a long non-coding sequence with regulatory potential is similar to sequences located within the ACR.

Keywords: apomixis, apospory, LNC-QGJ, MAP3K, Paspalum notatum, plant reproduction, QGJ, QUI-GON

### INTRODUCTION


Asexual reproduction can naturally occur in ovules of several flowering plant taxa through apomixis, an alternative route to sexuality, which allows the formation of maternal embryos within seeds (Nogler, 1984; Carman, 1997). This atypical trait relies on developmental alterations which cause unreduced cells within the ovule to acquire a reproductive fate. Although mechanistically diverse, apomictic pathways are usually classified into two major classes (i.e., sporophytic and gametophytic), depending on the origin of maternal embryos (Hand and Koltunow, 2014). During sporophytic apomixis, embryogenesis occurs spontaneously in somatic cells of the ovule, leading to the formation of seeds that harbor supernumerary maternal embryos. In contrast, gametophytic apomixis involves the differentiation of functional, unreduced embryo sacs (2n-ES) within the ovule, followed by egg cell parthenogenetic development into embryos (Bicknell and Koltunow, 2004). Depending on the origin of 2n-ESs, gametophytic apomixis can be further subcategorized into: (1) diplospory, when the megaspore mother cell (MMC) fails meiosis and enters into gametogenesis; or (2) apospory, when one or several nucellar or integumental cells, which are usually somatic companions of the MMC, acquire a gametic fate. In all gametophytic apomicts the embryo develops autonomously, while the formation of the endosperm can be either autonomous or fertilization-dependent (pseudogamous) (Bicknell and Koltunow, 2004).

In the last decade, transcriptomic surveys have allowed the identification of hundreds of candidate genes allegedly associated with apomixis in di- and monocotyledonous plants. However, predictions for most of these candidates revealed they belong to a few functional categories, including signal transduction, cell-cycle control, protein turnover, intercellular signaling, transposon activity and transcriptional regulation (Pessino et al., 2001; Rodrigues et al., 2003; Albertini et al., 2004; Cervigni et al., 2008; Laspina et al., 2008; Sharbel et al., 2009, 2010; Yamada-Akiyama et al., 2009; Garcia-Aguilar et al., 2010; Polegri et al., 2010; Okada et al., 2013). Particularly, a transcript fragment (N46) displaying homology with mitogen-activated protein kinase kinase kinases (MAP3K/MAPKKK/MEKK) and similarity to the Arabidopsis gene At1g53570 (MAPKKK3) was identified in Paspalum notatum, an aposporous sub-tropical grass (Laspina et al., 2008). Interestingly, another transcript showing homology with the same Arabidopsis gene (A-148-3) mapped to the apospory controlling region (ACR) of Paspalum simplex, a single, non-recombinant, dominant superlocus, which confers nearly 100% apospory (Polegri et al., 2010), epigenetically controlled parthenogenesis (Podio et al., 2014a) and the capacity to form endosperm with unbalanced parental genome contributions (Ortiz et al., 2013).

Based on these results and considering the essential roles of MAPKs in plant development (Musielak and Bayer, 2014; Xu and Zhang, 2015), we reasoned that the P. notatum At1g53570 homolog N46 might be involved in the switch from sexuality to aposporous apomixis in this species. The central biological question of our work was the following: is the Paspalum At1g53570 ortholog (N46) involved in the developmental molecular cascade controlling apospory, either as trigger or participant? To address this question, we first mined transcriptomic resources available for P. notatum to complete the characterization of N46 sequences. We also conducted spatio-temporal expression analyses in sexual and apomictic genotypes of P. notatum, and used Brachiaria brizantha, a related aposporous species, as a validation control. We then performed functional analyses in P. notatum by producing RNA interference (RNAi) lines. Moreover, we mapped N46 onto the P. notatum genome to explore the occurrence of genetic linkage with apomixis. Finally, we determined that the At1g53570-like transcript previously identified by Polegri et al. (2010) (see above) was not a protein-coding ortholog of At1g53570 and N46, but a lncRNA showing only partial similarity with these genes.

#### MATERIALS AND METHODS

### Plant Material

The P. notatum genotypes used in this work belong to the IBONE's germplasm collection (Instituto de Botánica del Nordeste, UNNE-CONICET, Corrientes, Argentina) and are listed below: (i) natural apomictic tetraploid genotype Q4117 (2n = 4x = 40) (Ortiz et al., 1997); (ii) experimentally obtained sexual tetraploid genotype Q4188 (2n = 4x = 40) (Quarin et al., 2003); (iii) colchicine-treated sexual double-diploid genotype C4-4x (2n = 4x = 40) (Quarin et al., 2001); and (iv) 55 F<sup>1</sup> hybrids derived from a cross between Q4188 and Q4117, 29 of them sexual and 26 apomictic (Stein et al., 2007). The Brachiaria brizantha genotypes used here were: (i) sexual diploid genotype BRA 002747 (B105) (2n = 2x = 18); and (ii) cultivar Marandu BRA 00591 (B30), a facultative tetraploid apomictic (2n = 4x = 36). Both genotypes belong to the Embrapa's germplasm collection and are maintained at Embrapa Genetic Resources and Biotechnology, Brasilia Federal District, Brazil.

#### Sequence Analysis

The N46 full-length sequences were retrieved from 454/Roche FLX+ floral transcriptome databases generated in prior work (Ortiz et al., 2017) and available at DDBJ/ENA/GenBank under the accessions GFMI00000000 and GFNR00000000, versions GFMI02000000 and GFNR01000000, respectively. Analysis of DNA similarity was done by using the BLASTN and BLASTX packages at the NCBI<sup>1</sup>, the Arabidopsis Information Resource<sup>2</sup> and the Gramene<sup>3</sup> websites, as well as exploring the Oryza Repeats Database<sup>4</sup>. For open reading frame (ORF) detection, the NCBI ORF Finder tool was used<sup>5</sup>. Gene schemes were constructed with the WormWeb Exon-Intron graphic maker<sup>6</sup>. Alignments and phylogenetic analyses were done with the ClustalW2 (Larkin et al., 2007) and MEGA6 (Tamura et al., 2013) software packages, respectively. The evolutionary history was inferred using the UPGMA method (Sneath and Sokal, 1973). Evolutionary distances were computed using the Poisson correction method (Zuckerkandl and Pauling, 1965) (units: number of amino acid substitutions per site). The lncRNA similarity survey was run against the plant lncRNA database GreeNC (Paytuví Gallart et al., 2016).

<sup>1</sup>http://www.ncbi.nlm.nih.gov/BLAST/

<sup>2</sup>https://www.arabidopsis.org/Blast/index.jsp

<sup>3</sup>www.gramene.org

<sup>4</sup>http://rice.plantbiology.msu.edu/annotation\_oryza.shtml

<sup>5</sup>http://www.ncbi.nlm.nih.gov/gorf/gorf.html

<sup>6</sup>http://wormweb.org/exonintron
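The Poisson correction named above converts the observed proportion of differing amino acid sites, p, into an estimated number of substitutions per site, d = −ln(1 − p). The following is a minimal sketch of that formula only. The function name, the gap handling (sites with a gap in either sequence are skipped) and the toy sequences are assumptions made for illustration, and the code is not the MEGA6 implementation.

```python
import math


def poisson_distance(seq_a: str, seq_b: str) -> float:
    """Poisson-corrected distance d = -ln(1 - p) for two aligned protein sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    # Assumption: alignment columns with a gap ('-') in either sequence are ignored.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    p = sum(a != b for a, b in pairs) / len(pairs)  # proportion of differing sites
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


# Hypothetical toy alignment: 2 of 4 sites differ, so p = 0.5 and d = ln 2.
toy_distance = poisson_distance("MKVL", "MKAA")
print(toy_distance)
```

A matrix of such pairwise distances is what a clustering method such as UPGMA would take as input.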

#### PCR Amplifications

Genomic DNA was extracted from 200 mg of leaves by using CTAB (Paterson et al., 1993). To reveal the presence of the 76-nt intron in the genome, amplification reactions were carried out with a primer pair complementary to the intron flanking regions: FIP upper (5′-ATTTGCAAGGACCAACATCC-3′, Tm: 59.80°C) and FIP lower (5′-ATGGCAAGCAACTTCGATTC-3′, Tm: 60.22°C). To amplify the entire QGJ sequence we used the following primers: N46full upper (5′-GCGTGTACGCCTCTCTCTCT-3′, Tm: 59.78°C) and N46full lower (5′-CTGCATCCTGGGTGAAAAAT-3′, Tm: 59.93°C). Reactions (final volume 25 µL) included 1X Real mix qPCR (BIODYNAMICS), 2 mM MgCl<sub>2</sub>, 200 µM dNTPs, 200 nM gene-specific primers and 60 ng genomic DNA. Amplifications were performed in a BIO-RAD thermocycler, programmed as follows: 1 min at 94°C, 35 cycles of 1 min at 94°C, 2 min at 57°C and 2 min at 72°C, and a final elongation step of 5 min at 72°C.

To evaluate the representation of different splice variants by semiquantitative RT-PCR, total RNA was extracted from leaves and/or spikelets at premeiosis/meiosis and reverse transcribed using Superscript II (INVITROGEN). Amplifications were conducted with: (1) the same primer pair flanking the 76-nt intron described above; or (2) a primer pair located inside the intron: IP upper (5′-AAACAGCATGGTGCAGTCAA-3′, Tm: 60.31°C) and IP lower (5′-TCAGGTGGACAATTGATGAGA-3′, Tm: 59.07°C). Each reaction (final volume 25 µL) included the same components used for genomic amplifications and was run in the same thermocycler, but 20 ng of cDNA was used as template, and the cycling was 5 min at 94°C, 25 cycles of 30 s at 94°C, 30 s at 57°C and 45 s at 72°C, and a final elongation step of 5 min at 72°C.

To quantitate QGJ expression in reproductive organs, the following samples were collected: (1) spikelets of apomictic and sexual P. notatum and B. brizantha plants (P. notatum: Q4117 and C4-4x genotypes; B. brizantha: B30 and B105 genotypes) at premeiosis, meiosis and postmeiosis; (2) ovaries of apomictic and sexual B. brizantha plants (B30 and B105 genotypes) at the same above-mentioned stages; (3) spikelets at meiosis/young leaves of wild type (Q4117 genotype, apomictic) and transformant (RNAi1, RNAi2, TC1, TC2) P. notatum plants. Total RNA was extracted with the SV Total RNA Isolation Kit (PROMEGA), which includes a DNAse treatment step. cDNAs were synthesized with Superscript II (INVITROGEN). All qPCR reactions (final volume: 25 µL) included 200 nM gene-specific primers, 1X Real mix qPCR (BIODYNAMICS) and 20 ng of cDNA. Three biological replicates were each processed as three technical replicates. Replicates with templates produced in the absence of Superscript II (INVITROGEN) and without templates were included (negative controls). Amplifications were performed in a Rotor-Gene Q thermocycler (QIAGEN), programmed as follows: 2 min at 94°C, 45 cycles of 15 s at 94°C, 30 s at 62°C and 40 s at 72°C, and a final elongation step of 5 min at 72°C. QGJ-specific primers were: (1) N46N upper (5′-GGCCCTGCATCTCCTACTTCAT-3′, Tm: 68°C) and N46N lower (5′-TGCCCAAACGTCCCACTGC-3′, Tm: 62°C), which amplified QGJ in all allelic contexts (used for chronological expression analysis); (2) N46S upper (5′-AATCGAAGTTGCTTGCCATC-3′, Tm: 60°C) and N46S lower (5′-GCTCTGTTAGACCGCTGCTT-3′, Tm: 59°C), which were located outside the N46 segment cloned into the pBS86-N46 vector (used for analysis of expression in transgenic plants). Non-template reactions were included as controls. β-tubulin was used as an internal reference gene, as recommended by Felitti et al. (2011), Ochogavía et al. (2011), and Podio et al. (2014b), who worked in the same plant model. Relative quantitative expression levels were calculated by using REST-RG (Relative Expression Software Tool V 2.0.7 for Rotor Gene, Corbett Life Sciences) considering take-off and amplification efficiency values for each particular reaction.
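To make the idea of an efficiency-corrected relative expression value concrete, the sketch below implements the general ratio model in which the change in the target gene is normalised to the change in a reference gene (here β-tubulin would play that role). It is an illustration under stated assumptions, not the REST-RG algorithm: the function name and all numeric values are hypothetical, and efficiencies are expressed as fold increase per cycle (2.0 means perfect doubling).

```python
def relative_expression(e_target: float, ct_target_control: float, ct_target_sample: float,
                        e_ref: float, ct_ref_control: float, ct_ref_sample: float) -> float:
    """Efficiency-corrected expression ratio of sample vs. control, normalised to a reference gene."""
    # Fold change of the target gene: a later take-off (higher Ct) in the sample means less template.
    target_change = e_target ** (ct_target_control - ct_target_sample)
    # Fold change of the reference gene, used to correct for differences in input cDNA.
    ref_change = e_ref ** (ct_ref_control - ct_ref_sample)
    return target_change / ref_change


# Hypothetical values: the target takes off one cycle later in the sample,
# while the reference gene is unchanged.
ratio = relative_expression(2.0, 24.0, 25.0, 2.0, 18.0, 18.0)
print(ratio)
```

With these made-up inputs the sample carries half as much target transcript as the control after normalisation; real analyses would use measured take-off and efficiency values for each reaction.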

### In situ Hybridization (ISH) Analyses

Spikelets of P. notatum (genotypes Q4117 and C4-4x) and B. brizantha (genotypes B30 and B105) were collected at premeiosis/meiosis. Flowers were dissected, fixed in 4% paraformaldehyde/0.25% glutaraldehyde/0.01 M phosphate buffer pH 7.2, dehydrated in an ethanol series and embedded in paraffin (for P. notatum) or butyl-methyl-methacrylate (BMM) (for B. brizantha). Specimens were cut into sections of 10 µm (Paspalum) or 3.5 µm (Brachiaria) and placed onto slides treated with 100 µg/mL poly-L-lysine. The paraffin or BMM was removed with xylene or acetone series, respectively. Prior to hybridization, control sections were stained with acridine orange and examined under UV light to verify RNA integrity. A plasmid including the original N46 fragment isolated by Laspina et al. (2008) was linearized using restriction enzymes NcoI or SalI (Promega). Sense and anti-sense probes were labeled with the Roche Dig RNA Labeling kit (SP6/T7), following the manufacturer's instructions, and hydrolyzed to 150–200 bp fragments. Prehybridization was carried out in 0.05 M Tris–HCl pH 7.5 buffer containing 1 µg/mL proteinase K in a humid chamber at 37°C for 10 min. Hybridization was carried out overnight in a humid chamber at 42°C, in 10 mM Tris–HCl pH 7.5 buffer containing 300 mM NaCl, 50% formamide (deionized), 1 mM EDTA pH 8, 1X Denhardt's solution, 10% dextran sulfate, 600 ng/mL tRNA and 600 ng/mL of probe. Detection was performed following the instructions of the Roche Dig Detection kit, using anti-DIG-AP and NBT/BCIP as substrates. Sections were mounted in 50% glycerol and observed under Leica DMRX (Paspalum experiments) or Zeiss-Axiophot (Brachiaria experiments) light microscopes.

#### Plant Transformation

A vector containing an N46 hairpin (pBS86-N46) (Pact1D:rfan46-s:rga2i:rfa-n46-as:T35s/Pubi:bar:Tnos) was constructed by cloning the complete N46 fragment (451 bp) reported by Laspina et al. (2008) into the selector, bar-containing plasmid pBS86 (Thompson et al., 1987), which includes two insertion sites in opposite orientation (cgf-s and cgf-as). The rice act1 promoter was considered suitable, because it drives expression in male and female reproductive tissues in rice (Zhang et al., 1991) and P. notatum (Mancini et al., 2014). Briefly, attB1 and attB2 Gateway sequences were included in the 5′ and 3′ ends of N46-specific PCR primers (Forward primer: GGGGACAAGTTTGTACAAAAAAGCAGGCTTCCCCTCCTCCCCTGTGCCGAC; Reverse primer: GGGGACCACTTTGTACAAGAAAGCTGGGTTAAGCCTCCCCAAACGGACCAT). An amplicon was generated from a pGemTeasy N46 clone (Laspina et al., 2008), purified by using a Qiagen column and mixed with a Gateway donor vector and BP Clonase enzymes. The recombination mix was used to transform DH5α competent cells (INVITROGEN). The entry clone was then transferred into the Gateway Destination vector pBS86 using LR clonase. The insertion was validated by sequencing at the Plant Biotechnology Centre, Melbourne, VIC, Australia. The pBS86-N46 vector, together with the reporter plasmid pact1-gfbsd2 (Ochiai-Fukuda et al., 2006) carrying the eGFP gene (encoding an enhanced green fluorescent protein), was used to co-transform P. notatum plants (Q4117 genotype) with a protocol previously developed in our laboratory (Mancini et al., 2014). Transformation events were identified by PCR amplification of the transgenes from genomic DNA using the following primers: eGFPF 5′-GGGGACAGCTTTCTTGTACAAAGTGGGGATGGTGAGCAAGGGCGAGGAGCT-3′ (Tm: 65.4°C)/eGFPR 5′-GGGGACAACTTTGTATAAAGTTGGTTACTTGTACAGCTCGTCCATGCC-3′ (Tm: 66.1°C) (used to detect eGFP within pact1-gfbsd2) and BARXLF 5′-CCGGCGGTCTGCACCATCGT-3′ (Tm: 66°C)/BARXR 5′-ATCTCGGTGACGGGCAGGAC-3′ (Tm: 66°C) (used to detect BAR within pBS86-N46). Reactions of 25 µL final volume included 1x Taq polymerase buffer (PROMEGA), 0.2 mM forward and reverse primers, 2 mM MgCl<sub>2</sub>, 0.2 µM dNTPs, 50 ng genomic DNA and 1 U Taq Polymerase (PROMEGA). Positive (with 20 ng of pact1-gfbsd2 or pBS86-N46) and negative (non-template) controls were run in parallel. Cycling consisted of 5 min at 94°C, 35 cycles of 30 s at 94°C, 1 min at the annealing temperature (Ta) and 30 s at 72°C, and a final 10 min extension at 72°C. The Ta was set 2°C below the lower predicted Tm of each primer pair. Transient transformation of calli and Pact1-directed eGFP expression in reproductive tissues were followed using an Eclipse E200 fluorescence microscope (Nikon, Tokyo, Japan) with a standard filter cube (excitation 470/40 nm; emission 535/50 nm). Transgenic plants were grown in controlled chambers at IICAR, CONICET-Facultad de Ciencias Agrarias, Universidad Nacional de Rosario, Argentina, under a 14 h photoperiod (150–200 µE·m<sup>−2</sup>·s<sup>−1</sup>) at 26 ± 2°C.
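The annealing rule stated above (Ta set 2°C below the lower predicted Tm of a primer pair) can be written as a one-line calculation. The sketch below applies it to the Tm values reported for the eGFP and BAR primer pairs; the function and dictionary names are illustrative assumptions.

```python
def annealing_temperature(tm_forward: float, tm_reverse: float, offset: float = 2.0) -> float:
    """Return the annealing temperature: the lower of the two primer Tm values minus an offset (in °C)."""
    return min(tm_forward, tm_reverse) - offset


# Predicted Tm values (°C) as reported in the text for each transgene-specific primer pair.
PRIMER_TM = {
    "eGFP (eGFPF/eGFPR)": (65.4, 66.1),
    "BAR (BARXLF/BARXR)": (66.0, 66.0),
}

for pair, (tm_f, tm_r) in PRIMER_TM.items():
    print(pair, round(annealing_temperature(tm_f, tm_r), 1))
```

Under this rule the two pairs would be annealed at about 63.4°C and 64°C, respectively.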

### Cytoembryological Observations and Pollen Viability Tests

Spikelets at anthesis were fixed in FAA (70% ethanol:formaldehyde:acetic acid 18:1:1) for 24–48 h. Ovaries were dissected and placed in 70% ethanol for at least 24 h, treated with 3% H<sub>2</sub>O<sub>2</sub> for 2 h and dehydrated in an ethanol series (50%, 70%, 95% and twice 100%; 30 min each step). Next, they were cleared using a series of methyl salicylate/ethanol (v:v) solutions (1:1, 3:1, 5.6:1; 30 min for each step). Finally, ovaries were incubated in methyl salicylate for at least 12 h and examined using a Leica DM2500 microscope equipped with DIC optics. Pollen viability was estimated by staining with Alexander's reagent (Alexander, 1980). Purple-stained grains were considered to be viable whereas lack of staining (i.e., pale-green/non-colored grains) indicated sterility. Observations were carried out with a Nikon Eclipse E200 microscope.

#### Statistical Methods

The average number of aposporous and meiotic embryo sacs per ovule was compared among four independent transgenic events and the control genotype Q4117. A modified Shapiro–Wilk test was used to test the normal distribution of the variables (Shapiro and Wilk, 1965). Because the variables were not normally distributed, they were compared using the non-parametric tests of Wilcoxon (Wilcoxon, 1945) and Kruskal–Wallis (Kruskal and Wallis, 1952). Confidence intervals for observed proportions were calculated following the method described by Newcombe (1998), derived from a procedure outlined by Wilson (1927) with a correction for continuity<sup>7</sup>. Chi-square tests for homogeneity were calculated with the R software<sup>8</sup>.
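As an illustration of the confidence interval described above, the sketch below computes the Wilson score interval with continuity correction for a single proportion. The function name and the example counts are assumptions introduced here; the code is a stand-alone rendering of the published formula, not the calculator the authors used.

```python
from math import sqrt
from statistics import NormalDist


def wilson_cc_interval(successes: int, n: int, confidence: float = 0.95) -> tuple:
    """Wilson score interval with continuity correction for the proportion successes/n."""
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided critical value
    p = successes / n
    q = 1 - p
    denom = 2 * (n + z ** 2)
    if successes == 0:
        lower = 0.0  # the lower limit is defined as 0 when no successes are observed
    else:
        lower = (2 * n * p + z ** 2 - 1
                 - z * sqrt(z ** 2 - 2 - 1 / n + 4 * p * (n * q + 1))) / denom
    if successes == n:
        upper = 1.0  # the upper limit is defined as 1 when every trial is a success
    else:
        upper = (2 * n * p + z ** 2 + 1
                 + z * sqrt(z ** 2 + 2 - 1 / n + 4 * p * (n * q - 1))) / denom
    return max(0.0, lower), min(1.0, upper)


# Hypothetical counts: 81 positive observations out of 263.
low, high = wilson_cc_interval(81, 263)
print(round(low, 4), round(high, 4))
```

The same function could be applied to any observed proportion, for example the fraction of ovules carrying a given embryo sac type in one line.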

#### Linkage Analyses

An F<sub>1</sub> population of 55 individuals, derived from a cross between sexual Q4188 as pistillate parent and apomictic Q4117 as male progenitor and characterized for reproductive modes (29 sex: 26 apo) (Stein et al., 2007), was used for linkage analyses. For N46 bulked segregant analysis (BSA), 30 µg of genomic DNA from the two parental lines and two equitable bulks of 10 sexual and 10 apomictic F<sub>1</sub> hybrid progenies were digested with EcoRI, HindIII, and PstI. Samples were loaded in 1% agarose gel (1x TAE), electrophoresed at 40 mA and blotted onto nylon membranes (Hybond N, Amersham) using 10x SSC buffer. DNA was fixed at 80°C for 2 h. DIG probe labeling (the same N46 fragment used for ISH analyses), hybridization and detection were performed as described by Ortiz et al. (1997). For A-148-3 linkage analysis, 5 µg of genomic DNA from the two parental lines and the 55 F<sub>1</sub> plants were digested using EcoRI, electrophoresed and finally blotted onto nylon membranes. A-148-3 was converted into an RFLP probe according to Polegri et al. (2010). Probe <sup>32</sup>P labeling, blot hybridization and exposure to X-ray films were performed according to Pupilli et al. (2001).

### LncRNA Expression Analysis

PCR amplifications were conducted from 50 ng of cDNA produced from total RNA extracted from leaves or flowers, with upper primer LNCU: 5′-AATTGTGCGAAATCCAATCA-3′ and lower primer LNCL: 5′-TTCACCATTACTGCCCACAA-3′. The cycling program included 1 cycle of 1 min at 94°C, 30 cycles of 1 min at 94°C, 2 min at 57°C and 2 min at 72°C, and a final elongation cycle of 5 min at 72°C.

<sup>7</sup>http://vassarstats.net/prop1.html

<sup>8</sup>http://cran.r-project.org/doc/contrib/Owen-TheRGuide.pdf

### RESULTS

#### Characterization of N46 Full-Length Sequence

Laspina et al. (2008) reported similarity between a mRNA fragment differentially expressed in florets of sexual and apomictic P. notatum plants (N46) and a full-length cDNA transcribed from the maize gene GRMZM6G513881 (see footnote 3; NCBI Reference Sequence NM\_001137220.1), which encodes a MAP3K protein. Here, we took advantage of Paspalum 454/Roche FLX+ floral transcriptomes recently developed in our laboratory to recover the N46 full cDNA sequences from apomictic (Q4117) and sexual (C4-4x) genotypes and carry out molecular phylogenetic analysis. In the apomictic floral transcriptome library, we detected one isogroup (apoisogroup 00379) represented by four homologous isotigs, namely apoisotig 03083 (GFMI02003139.1), apoisotig 03084 (GFMI02003140.1), apoisotig 03085 (GFMI02003141.1) and apoisotig 03086 (GFMI02003142.1). In the sexual floral transcriptome library, we also found one isogroup (sexisogroup 02509), but it contained a single isotig (sexisotig 08547; GFNR01008571.1). ClustalW nucleotide (nt) sequence alignments revealed that apoisotigs 03085 and 03086 and sexisotig 08547 were highly similar, differing only by a few polymorphisms (SNPs and INDELs) (**Supplementary Figure S1**). Apoisotigs 03083 and 03084 were identical to apoisotigs 03085 and 03086, respectively, except for a 76-nt insertion with the canonical donor-acceptor sites of GU-AG-type introns, which corresponds to a partially conserved intron in maize GRMZM6G513881. These results suggest that P. notatum N46-like floral sequences are genetic and splice variants of a single locus with at least two different alleles in the apomictic genotype (apoisotigs 03085/03083 and 03086/03084) and a third allele detected in the sexual genotype (sexisotig 08547). The intron-like insertion (located between positions 1857–1934 and 1833–1908 in apoisotigs 03083 and 03084, respectively) modifies the reading frame to produce a protein with a variable C-terminal end (**Figure 1** and **Supplementary Figure S2**). Genomic amplification with flanking primers showed that the intron-like region is present in both apomictic and sexual plants (**Supplementary Figure S3**). Although the non-processed form had been sequenced only from the apomictic samples in the 454/Roche FLX+ transcriptome libraries, semi-quantitative RT-PCR experiments showed that both variants (processed and non-processed) were represented in flowers of apomictic and sexual plants (**Supplementary Figure S3**). Moreover, qPCR of cDNAs originated from a mix of flowers at different developmental stages (from premeiosis to anthesis) with primers located inside the intron revealed no significant differential representation between reproductive modes (not shown). BLASTX searches using N46 full-length sequences as queries identified homology to MAP3K genes belonging to the YODA family (best annotated match: Oryza sativa mitogen-activated protein kinase kinase kinase YODA isoform X1 XP\_015617106.1; 79% identity; E-value: 0.0; query coverage: 74%; alignment length: 1,688). A phylogenetic tree inferred using 22 homologous protein sequences from different species showed that all Paspalum sequences grouped into a single cluster within the Poaceae clade, supporting the conclusion that they are allelic isoforms or, alternatively, expressed from gene copies that diverged recently (**Supplementary Figure S4**). Finally, the whole QGJ sequence was amplified by using primers located at the borders (see "Materials and Methods"), to confirm its existence without the need for computational assembly. Based on its identity as a member of the YODA family, we named the N46 locus QUI-GON JINN (QGJ), after another character of the Star Wars saga.
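The effect of a retained 76-nt intron on the reading frame follows from simple arithmetic: 76 is not a multiple of 3, so every codon downstream of the insertion is read in a shifted frame. The toy example below illustrates this with the standard genetic code. The sequences are invented for the demonstration (they are not N46 or QGJ sequences), and the names `SPLICED`, `INTRON` and `RETAINED` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate a DNA sequence from its first base until the first stop codon."""
    protein = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# Invented coding sequence and an invented 76-nt intron with GT...AG borders.
SPLICED = "ATGGCTGAAGGTAAATAA"
INTRON = "GT" + "A" * 72 + "AG"
RETAINED = SPLICED[:6] + INTRON + SPLICED[6:]  # intron left in after the second codon

print(len(INTRON), len(INTRON) % 3)  # 76 nt leaves a remainder of 1, so the frame shifts
print(translate(SPLICED))
print(translate(RETAINED))
```

In this toy case the residues encoded after the insertion point differ between the two transcripts, which mirrors the variable C-terminal end described for the non-processed isotigs.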

### QGJ Quantitative Expression in Reproductive Organs

QGJ expression was quantified in spikelets of sexual and apomictic P. notatum genotypes at different developmental stages (1: premeiosis; 2: late premeiosis/meiosis; 3: postmeiosis) by using real-time PCR. The primer pair used for amplification was complementary to all known QGJ variants (see "Materials and Methods," qPCR experiments). At stage 1 (premeiosis), QGJ transcripts were equally represented in both P. notatum reproductive types (**Figure 2A**). Later, during late premeiosis/meiosis, the expression in the sexual plant was significantly higher (**Figure 2A**). In contrast, the opposite pattern was observed at postmeiosis (**Figure 2A**). In addition, we took advantage of the EMBRAPA (Brasilia, Brazil) collection of Brachiaria brizantha plants, another well-characterized aposporous pseudogamous system (Pagliarini et al., 2012), to validate the results. Brachiaria brizantha (syn. Urochloa brizantha) is, like P. notatum, a rhizomatous perennial grass (Poaceae), which reproduces through aposporous pseudogamous apomixis. Like P. notatum plants, aposporous Brachiaria genotypes form supernumerary non-reduced embryo sacs lacking antipodals from nucellar somatic cells surrounding the MMC. The B. brizantha genome also includes a single ACR lacking recombination, which may be evolutionarily related to the Paspalum one, since it is located in a chromosomal background displaying partial synteny to rice chromosome 2 (Pessino et al., 1997, 1998). In Brachiaria spikelets a similar expression profile was detected, but at premeiosis overexpression was detected in the apomictic genotype (**Figure 2B**). However, a more detailed quantification of QGJ in RNA samples extracted from isolated Brachiaria ovaries revealed overexpression in apomictic plants at late premeiosis/meiosis (**Figure 2C**), which suggests the occurrence of contrasting representation patterns in different tissues.

#### In situ Expression Pattern

The site of expression was then examined through in situ hybridization in developing ovaries and anthers of P. notatum, using the original N46 clone to produce the sense and antisense probes (**Figure 3**). N46 is complementary to a region conserved among all QGJ variants, so the experiment has no potential to differentiate them. In premeiotic flowers of sexual plants, the antisense probe showed weak signal in the ovule nucellus and integuments and moderate to strong signal in the MMC, the anther tapetum and pollen mother cells (**Figure 3A**). Meanwhile, in apomictic plants the same probe revealed strong signal in the ovule nucellus and MMCs, and a diminished signal in anthers (**Figure 3B**). The sense probe showed weak signal in ovules of both sexual and apomictic plants (**Figures 3C,D**). We validated the observed in situ differential expression of QGJ genes in B. brizantha aposporous ovaries, by using the same N46 probe (**Figure 4**). In the Brachiaria experiments, thinner microtome sections, together with a microscope with a higher resolving power, were used (see "Materials and Methods"), allowing a more accurate detection of the hybridization pattern. In premeiotic ovules of sexual plants, a weak to moderate signal was detected in the ovule nucellus, while a moderate to strong signal appeared in the MMCs (**Figure 4A**). After meiosis I, the signal became mainly restricted to the micropylar cell of female dyads, yet some signal could also be observed in the nucellus (**Figure 4B** and **Supplementary Figure S5**). In tetrads, a strong signal was detected in the non-functional (micropylar) megaspores, while the functional one (located close to the chalazal end of the ovule) showed low signal (**Figure 4C**). In premeiotic ovules of apomictic plants, a weak to moderate signal was detected in the ovule nucellus, while a moderate to strong signal appeared in the MMCs (**Figure 4D**). During aposporous initials (AI) differentiation, moderate to strong signal was detected in the ovule nucellus, except for the cell layer surrounding the MMC (the AI onset site) (**Figures 4E,F**). Note that at this stage the MMC has enlarged and formed a meiocyte instead of entering meiosis I (meiosis frequently fails in obligate aposporous plants). Apospory initials originate from this proximal cell layer lacking signal (**Figure 4F**).
While strong signal was detected in pollen mother cells of the sexual plant (**Figure 4G**), a moderate to weak signal was observed in pollen mother cells of the aposporous genotype (**Figure 4H**). Finally, hybridizations using sense probes detected weak signals in the nucellus of both plant types (**Figures 4I,J**). Our results indicate that, in sexual plants, QGJ is weakly expressed in nucellar tissues during meiosis. At this stage, its expression is restricted to the non-functional (micropylar) megaspores, which are adjacent to the functional (chalazal) megaspore. In contrast, and in agreement with our previous results from Paspalum, a strong expression of QGJ was observed in nucellar cells of apomictic plants. However, the proximal layer of cells originating the AI lacked signal, suggesting that QGJ is expressed in cells adjacent to the functional germ cells, i.e., non-functional reduced megaspores or nucellar cells located beside the non-reduced megaspores in sexual and apomictic plants, respectively.

### A Decrease of QUI-GON JINN Expression Impairs the Formation of Aposporous Embryo Sacs (AES)

Next, we decided to investigate if a diminished expression of QGJ in an apomictic background gives rise to altered reproductive phenotypes. Firstly, a plant transformation vector including an N46 hairpin (pBS86-N46) was constructed by cloning the complete N46 fragment in sense and antisense orientation within plasmid pBS86 (see "Materials and Methods"). Then, QGJ RNAi lines were obtained by Q4117 biolistic co-transformation with plasmids pact1-gfbsd2 (which expresses an enhanced green fluorescent protein gene eGFP under the rice ACT1 promoter) and pBS86-N46 (see "Materials and Methods"). From 41 positive transgenic events (for pact1-gfbsd2, pBS86-N46 or both), two groups of lines were selected, which had, respectively, been transformed with: (1) the reporter plasmid pact1-gfbsd2; and (2) the RNAi plasmid pBS86-N46 and the reporter plasmid pact1-gfbsd2 (**Figures 5A–H,K,L**). Plants belonging to the first group were classified as transformation control lines (since they allow evaluation of reproductive phenotypes in plants subjected to in vitro culture and transformation procedures), while those corresponding to the second group were labeled as RNAi lines. These plants were classified as T0, since they were regenerated from bombarded calli. Prior to the molecular and cytoembryological analysis, two rounds of small rhizomes subculture were conducted (P. notatum is perennial and reproduces by forming rhizomes). Only four plants flowered in the isolated GMO chamber under controlled conditions (transformation control lines #TC1/#TC2 and RNAi lines #RNAi1/#RNAi2), together with a wild type control. Fluorescence analysis in transgenic lines carrying the pact1-gfbsd2 vector (both RNAi and TC) confirmed that the rice Act1 promoter drives expression in male and female reproductive tissues of P. notatum (**Figures 5I,J**).

Quantitative RT-PCR analyses revealed that QGJ expression was significantly attenuated in floral tissues of both RNAi lines compared to the apomictic wild type genotype (relative expression levels ranging from 0.478 to 0.656; **Supplementary Figure S6A**). Lower expression levels were detected in leaves in comparison to flowers of Q4117 and no significant reduction in expression was detected in leaves of RNAi lines (**Supplementary Figure S6B**). Rates of viable pollen were slightly reduced in the two RNAi lines (#RNAi1 and #RNAi2) with respect to both the control lines (#TC1 and #TC2) and the wild type (Q4117) (**Table 1**). However, even if statistically significant, this minor alteration might not imply physiological consequences. Female reproductive development was examined by observing cleared ovules at anthesis and determining the type and number of ES per ovule. Relatively high proportions of aborted ovaries, i.e., containing no ES, were detected for lines #RNAi1 and #RNAi2 compared to both wild type and control plants (20–30% vs. <10%; **Table 2**). Defects in both initiation and completion of AES formation affected female reproductive development in RNAi lines (**Table 2** and **Figure 6**). A lower number of AES per ovule was detected for both #RNAi1 and #RNAi2 (**Table 2**) and most of them exhibited weak/abortive phenotypes such as small size, ragged borders, and no detectable polar nuclei (**Figure 6**). Conversely, in control lines (#TC1 and #TC2), the average number of AES per ovule was similar to that of Q4117 (**Table 2**). Finally, the proportion of ovules containing meiotic ES (MES) showed no statistical difference among Q4117, control lines and RNAi lines (**Table 2**). We concluded that the significant reduction of QGJ transcripts after introducing a QGJ hairpin construct in the apomictic genotype Q4117 impaired the formation of AESs. Our data suggest that expression of QGJ in nucellar cells is necessary for aposporous development in P. notatum.

### Genetic Linkage Analysis Between the QGJ Locus and the ACR

Possible co-segregation of the QGJ locus with apospory was examined in P. notatum by using bulked-segregant analysis (**Figure 7**). The complete N46 original fragment was hybridized onto genomic DNA samples originated from two parental plants (Q4188, sexual female parent and Q4117, apomictic pollen donor) and genomic DNA bulks made from 10 sexual and 10 apomictic F<sub>1</sub> plants derived from the Q4188 × Q4117 cross. Segregation in F<sub>1</sub> is expected due to ACR hemizygosity and the heterozygous nature of a high number of parental genomic loci. Although genomic DNA digestion with three different restriction enzymes (EcoRI, HindIII, and PstI) produced several polymorphic bands between parental plants, none of them was polymorphic between F<sub>1</sub> bulks. The bulked results show that the polymorphisms of the sexual parent are not specific to sexual F<sub>1</sub> plants. However, the apomictic band present in the parental PstI digest disappeared in the apomictic F<sub>1</sub> plants. Since PstI is sensitive to cytosine methylation in certain contexts (CpNpG sites), this observation might suggest a methylation change occurring during hybridization. In silico mapping onto the Gramene website revealed that the putative rice ortholog to QGJ is located in chromosome 11 (Os11g0207200, E-value 0.0), in a genomic region showing no synteny with the ACR of P. notatum. However, other high score hits (Os02g0555900, Os02t0666300-01, Os12g0577700, E-values 5.7E-86, 3E-58, and 7.6E-15) are protein kinase genes located in a rice genome region (chromosome 2 long arm, positions: 21,002,439–21,008,209, 27,048,920–27,061,955 and chromosome 12 long arm, positions: 23,885,845–23,888,835, respectively) syntenic to the P. notatum ACR. Our results showed no evidence of a genetic link between QUI-GON JINN and the ACR, but suggested that sequences showing significant similarity to this gene might be located within the genomic region controlling apospory.

### The Paspalum ACR Transcribes a Long Non-coding Sequence Showing Partial Similarity to QGJ

In a previous work, Polegri et al. (2010) reported full linkage between the Paspalum simplex transcriptome fragment A-148-3 and apospory, and determined that this candidate was also homologous to At1g53570. A search for the complete sequence of A-148-3 in the Paspalum Roche 454 floral libraries revealed that it was not a PN\_QGJ ortholog, since it showed the best hit of similarity with apoisotig 04689 (GFMI02004742.1) (query cover 100%; E-value: 6e−143, identities 90%; position on apoisotig 04689: 5588–5955). Interestingly, apoisotig 04689 is a 6835-nt sequence with no coding potential and specific to the apomictic P. notatum floral transcriptome libraries (reads in the apomictic library: 133; reads in the sexual library: 0). A search in the GreeNC database revealed similarity with six predicted plant lncRNAs at E-values ≤ 1e−10 (best match: Zmays\_GRMZM2G024551\_T01; E-value: 1.00411e−27; alignment length: 226 nt, involving the segment flanked by positions 4304–4530 within apoisotig 04689; positive: 177). Given its partial similarity with QGJ, its lack of protein-coding potential, its similarity to sequences included in the GreeNC database and its detected expression in the apomictic floral transcriptome, we inferred that isotig 04689 is a long non-coding RNA (lncRNA) related to QGJ, and was renamed accordingly as PN\_LNC\_QGJ (after P. notatum long non-coding QGJ). The complete QGJ functional sequence (apoisotig 03085, 2377 nt) has similarity to the PN\_LNC\_QGJ transcript in positions ranging 1033–1326/1352–1475 (matching positions 5493–5800/6255–6387 in the PN\_LNC\_QGJ sequence) (**Figure 1**). Meanwhile, the original N46 fragment spanned positions 733–1174 in the QGJ sequence (apoisotig 03085) (**Figure 1**).

Sense probe. (A) Megaspore mother cell (mmc). (B) Meiosis (dyad stage). (C) Meiosis (tetrad stage showing two megaspores). (D) Archesporial cell stage. (E,F) Meiosis (mmc entering meiosis = meiocyte) and AI formation stage. (G,H) Pollen mother cell stage. (I,J) Meiosis stage. Bars: 10 µm. References: mmc, megaspore mother cell; dy, dyad; te, tetrads; lp, layer of proximal cells surrounding the MMC; ai, apospory initial.

Mapping of the A-148-3 original transcript in a P. notatum population (55 F<sub>1</sub> plants) revealed one polymorphic band strictly cosegregating with apomixis (**Figure 8**). Another band showed partial linkage, confirming association of the sequence with proximal regions. In addition, two monomorphic/non-segregating bands were detected, which could be related to the same locus or, alternatively, to genomic regions located elsewhere (**Figure 8**). Furthermore, RT-PCR experiments with PN\_LNC\_QGJ-specific primers conducted in several apomictic and sexual P. notatum individuals showed that PN\_LNC\_QGJ is expressed only in apomictic plants (**Supplementary Figure S7**).
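Strict cosegregation in a population of this size can be expressed numerically. The sketch below counts recombinants between a dominant marker band and the reproductive phenotype and, when none are found, reports a one-sided upper bound for the recombination fraction. It rests on assumptions that are not part of the study: each F<sub>1</sub> plant is treated as one independent informative meiosis, the bound 1 − α^(1/n) is a generic binomial bound rather than the authors' method, and the score lists are reconstructed only from the reported 26 apomictic : 29 sexual classification.

```python
def recombination_summary(marker_present, is_apomictic, alpha=0.05):
    """Count marker/phenotype mismatches and bound the recombination fraction when none occur."""
    if len(marker_present) != len(is_apomictic) or not marker_present:
        raise ValueError("score lists must be non-empty and of equal length")
    n = len(marker_present)
    # A plant is scored as recombinant when band presence and apomixis disagree.
    recombinants = sum(m != a for m, a in zip(marker_present, is_apomictic))
    fraction = recombinants / n
    # With zero recombinants, r must satisfy (1 - r)**n >= alpha, so r <= 1 - alpha**(1/n).
    upper_bound = 1 - alpha ** (1 / n) if recombinants == 0 else None
    return recombinants, fraction, upper_bound


# Hypothetical score sheet: 26 apomictic plants carrying the band, 29 sexual plants lacking it.
phenotypes = [True] * 26 + [False] * 29
bands = list(phenotypes)

n_rec, r_hat, r_upper = recombination_summary(bands, phenotypes)
print(n_rec, r_hat, round(r_upper, 3))
```

Read this way, zero recombinants among 55 plants is compatible with tight linkage but cannot by itself exclude a small recombination fraction.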

Based on these observations, we concluded that the original transcript fragment A-148-3 is part of a long non-coding RNA (namely PN\_LNC\_QGJ) differentially expressed in apomictic and sexual plants. In addition, a sequence similar to the A-148-3 probe is located in the P. notatum ACR. However, further experiments should be conducted to determine if the copy of the gene located at the ACR is producing the differentially expressed lncRNA (considering the existence of monomorphic bands, which could represent copies located in other parts of the genome). Moreover, the existence of a functional link between the altered QGJ expression and a possible regulatory activity of PN\_LNC\_QGJ has yet to be investigated through functional analysis. The alignments between the Z. mays reference genomic sequence GRMZM6G513881 and the sequences of all PN\_QGJ and PN\_LNC\_QGJ isotigs are displayed in **Supplementary Figure S8**.

#### DISCUSSION

The extracellular signal-regulated kinase 1/2 (ERK1/2) cascade is a central signaling pathway that modulates a wide variety of cellular processes, including proliferation, differentiation, survival, apoptosis, and stress response (Wortzel and Seger, 2011). The intracellular communication between membrane receptors and their nuclear or cytoplasmic targets upon stimulation is mediated by a limited number of signaling pathways, including a group of mitogen-activated protein kinase (MAPK) cascades (Wortzel and Seger, 2011). MAPK signal transduction cascades consist of three sequentially activated kinases. Upstream signals activate MAPK kinase kinases (MAPKKKs), which in turn phosphorylate MAPK kinases (MKKs); subsequently, MKKs activate specific MAPKs. The downstream targets of MAPKs can be either transcription factors or cytoskeletal proteins. Phosphorylation and activation of a MAPK can lead to changes in its subcellular localization and its activity on transcriptional effectors, thereby reprogramming gene expression (Fiil et al., 2009). In particular, the Arabidopsis genome encodes 60 putative MAP3Ks (including the MEKK, RAF, and ZIK subfamilies), 10 MAPKKs, and 20 MAPKs, involved in a plethora of different responses to specific ligands (Ichimura et al., 2002). Here, we characterized the expression and function of QUI-GON JINN (QGJ), the putative ortholog of Arabidopsis At1g53570/AtMEKK3, in reproductive organs of sexual and apomictic P. notatum plants. AtMEKK3 belongs to the MEKK subfamily related to budding yeast Ste11p (Lukowitz et al., 2004). This subfamily comprises 12 members (Ichimura et al., 2002), including critical regulators of plant cell division and differentiation during reproduction, i.e., YODA/AtMAPKKK4 (early embryogenesis) (Lukowitz et al., 2004), AtMEKK20 (male gamete differentiation) (Borg et al., 2011) and ScFRK/AtMEKK19/20 (female gametophyte development) (Daigle and Matton, 2015).

TABLE 1 | Pollen viability of transformed and control plants.

H, QUI-GON hairpin; E, eGFP; PN, total number of counted pollen grains; NVP, number of non-viable pollen grains; VP, number of viable pollen grains; PVP, proportion of viable pollen. 95% CI (confidence intervals) were calculated by using the Newcombe method (Newcombe, 1998), with a correction for continuity (http://vassarstats.net/prop1.html), as described in the section "Materials and Methods."
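The notes to Table 1 cite the Newcombe (1998) interval with continuity correction for the proportion of viable pollen. As a minimal sketch of how such an interval can be computed, assuming it corresponds to the Wilson score interval with continuity correction, the function below takes the number of viable grains and the total number counted. The function name, argument names and the example counts in the usage note are illustrative assumptions, not values from the study.

```python
import math


def wilson_cc_interval(successes: int, n: int, z: float = 1.959964) -> tuple[float, float]:
    """Wilson score interval with continuity correction for a proportion.

    Illustrative sketch only: `successes` would be the number of viable
    pollen grains (VP) and `n` the total number counted (PN).
    z = 1.959964 gives a two-sided 95% interval.
    """
    if n <= 0 or not 0 <= successes <= n:
        raise ValueError("need n > 0 and 0 <= successes <= n")

    p = successes / n
    q = 1.0 - p
    z2 = z * z
    denom = 2.0 * (n + z2)

    # Square-root terms of the continuity-corrected lower and upper limits.
    lower_root = math.sqrt(z2 - 2.0 - 1.0 / n + 4.0 * p * (n * q + 1.0))
    upper_root = math.sqrt(z2 + 2.0 - 1.0 / n + 4.0 * p * (n * q - 1.0))

    # By convention the limits are set to 0 or 1 at the boundaries.
    lower = 0.0 if successes == 0 else max(0.0, (2.0 * n * p + z2 - 1.0 - z * lower_root) / denom)
    upper = 1.0 if successes == n else min(1.0, (2.0 * n * p + z2 + 1.0 + z * upper_root) / denom)
    return lower, upper
```

With hypothetical counts, `wilson_cc_interval(450, 500)` would return the lower and upper 95% limits for a line in which 450 of 500 counted grains were viable.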

Our results suggest that QGJ plays a role in promoting the acquisition of a gametophytic cell fate by AIs, a critical step in the establishment of the aposporous pathway, or alternatively affects the development of the embryo sacs. The fact that QGJ is not expressed in the cell layer originating AIs, but in the adjacent ones, suggests that a non-cell-autonomous signaling mechanism might be operating. Such mechanisms are commonly used for intercellular communication during plant development (Van Norman et al., 2011). In contrast, pollen viability and meiotic embryo sac formation seem poorly affected in RNAi plants, indicating that partial QGJ silencing does not influence male/female meiosis in an obligate apomictic background. However, since the meiotic pathway is per se diminished in obligate apomictic plants (the Q4117 apomictic genotype naturally shows a rate of meiotic embryo sac development of around 3–4%) (Ortiz et al., 1997), it is difficult to evaluate the consequences of QGJ post-transcriptional attenuation on the sexual developmental pathway when using the genetic background transformed here. Such analyses should be conducted in a mutant/transformant line derived from a sexual plant, an experiment that we plan to complete in the near future. Moreover, the slight decrease in pollen viability detected in the Paspalum transgenic lines might reflect a mild response to the introduction of the hairpin in a tissue where the gene is naturally down-regulated with respect to sexual plants, as revealed by in situ hybridization. Podio et al. (2012) analyzed anaphase I configurations and pollen viability in aposporous and sexual tetraploid cytotypes of P. notatum and found reduced pollen viability in the aposporous genotypes, including Q4117. A reduced activity of QGJ in pollen mother cells of aposporous plants (**Figures 3**, **4**) could also explain the diminished expression detected in apomictic plants in comparative qPCR experiments conducted on florets (**Figure 2**), since anthers represent a high proportion of the floret tissue at this stage. Based on all these observations, we hypothesized that, similarly to YODA (Lukowitz et al., 2004), QGJ mediates a signaling pathway acting as a key regulator to define cell lineage during plant reproduction. While YODA is required for normal development of the zygote and the cells of the basal lineage originating the suspensor, QGJ might play a role during the sporophytic-to-gametophytic transition phase. However, although our results indicate that QGJ activity is essential for non-reduced embryo sac formation, overexpression experiments under ovule-specific promoters will be necessary to assess whether expression in the nucellus of sexual plants is fully responsible for apomictic development. An alternative hypothesis is that this candidate is involved later, after the fate decision has been made, in either sexual or aposporous embryo formation.

FIGURE 6 | Cytoembryological analysis of the reproductive behavior in QGJ RNAi lines. Mature ovules of wild type genotype Q4117 (A,B), transformation control lines #TC1 (C) and #TC2 (D) and QGJ RNAi lines #RNAi1 (E) and #RNAi2 (F). Aposporous embryo sacs (A,C,D). Meiotic embryo sac (with a proliferating mass of antipodals) (B). Aborted embryo sacs (E,F). References: pn, polar nuclei; a, antipodals; ab, aborted. Bars: 100 µm.

TABLE 2 | Cytoembryological analysis of transformed and control plants.

H, QUI-GON hairpin; E, eGFP; N, number of ovules analyzed; Ab, aborted ovules (without embryo sacs); V, potentially viable ovules (with sacs); AES, total number of aposporous embryo sacs; AES/N, average number of aposporous embryo sacs per ovule; KW, Kruskal–Wallis test comparing the AES per ovule variable; MES, total number of meiotic embryo sacs; MES/N, average number of meiotic embryo sacs per ovule; 95% CI, 95% confidence intervals around MES/N proportions were calculated by using the Newcombe method (Newcombe, 1998) with a correction for continuity (http://vassarstats.net/prop1.html), as described in the section "Materials and Methods." Aposporous embryo sacs (AES) can be readily distinguished from meiotic ones (MES) since they lack antipodal cells.

Polegri et al. (2010) reported the isolation of A-148-3, a P. simplex transcript homologous to the predicted Arabidopsis QGJ ortholog (At1g53570), showing constitutive expression in apomictic genotypes during all reproductive developmental stages and linkage to the P. simplex ACR. However, full sequence analysis revealed that A-148-3 is not a QGJ ortholog, but an lncRNA with partial similarity to QGJ. Genetic linkage analyses in a Q4188 (sexual) × Q4117 (apomictic) F<sub>1</sub> population confirmed that a sequence showing similarity with A-148-3 is located within the P. notatum ACR. On the contrary, we found no evidence that N46 and the ACR are genetically linked. Note that the A-148-3 probe has some potential to hybridize with QGJ (the original 354-bp A-148-3 sequence includes a 147-nt insertion with 78% similarity to P. notatum QGJ). Moreover, of the 451 nt covered by the original fragment N46, a 112-nt segment retains partial similarity (72%) with PN\_LNC\_QGJ. Although these similarities are limited, involve only a portion of the probes (40% of A-148-3 and 25% of N46), and the experimental conditions were strict enough to ensure specific detection, the possibility of some cross-hybridization cannot be fully ruled out, especially for A-148-3. However, the detection of contrasting genomic hybridization patterns when using N46 and A-148-3 as alternative probes (**Figures 7**, **8**) suggests that each of them has capacity for specific detection. The ACR is a genomic region specific to apomictic genotypes, which is highly heterochromatic and harbors almost intact exonic sequences interlaced within highly repetitive sequences (Podio et al., 2014a). PN\_LNC\_QGJ is expressed in floral tissues of aposporous plants only, and it includes large, non-coding stretches similar to retrotransposons and two short exonic QGJ regions (439 nt in total out of 6835). Long non-coding RNAs have recently emerged as critical regulators of gene expression in many eukaryotes, including plants (Ariel et al., 2015; Chekanova, 2015; Liu et al., 2015). Therefore, considering the sequence relationship between QGJ and PN\_LNC\_QGJ, it is tempting to speculate that PN\_LNC\_QGJ could mediate QGJ modulation in reproductive tissues of Paspalum apomicts. Among many putative mechanisms, our results point to at least two: (1) a change in splicing leading to the formation of variants; and (2) the induction of nucellar expression via miRNA hijacking. However, no claim of a functional relationship between QGJ and PN\_LNC\_QGJ can currently be made, since it is not supported by functional analysis data. Moreover, expression of PN\_LNC\_QGJ from the ACR should be further confirmed. In the near future, we will focus on determining whether the misexpression of QGJ is caused by a transcriptional regulatory event or is alternatively influenced by PN\_LNC\_QGJ, and whether PN\_LNC\_QGJ is expressed from the ACR. Interestingly, non-coding transcripts carrying exonic sequences were proposed to regulate PN\_SERK and PS\_ORC3, two genes putatively involved in apomixis in P. notatum and P. simplex, respectively (Podio et al., 2014b; Siena et al., 2016).

In the past decade, numerous candidate genes for apomixis were identified (Hand and Koltunow, 2014; Ronceret and Vielle-Calzada, 2015), but how the underlying networks integrate into sexual reproduction and alter expression patterns remains largely unknown. Our work posits that a MAP3K signaling pathway of an ERK1/2 cascade is pivotal to aposporous embryo sac differentiation. However, the rest of the members of the ERK cascade and their interactions with this kinase remain unknown. Interestingly, besides N46 (QGJ), Laspina et al. (2008) reported the differential expression of several other genes involved in ERK cascades in comparisons between sexual and apomictic plants: an LRR family protein (N79), a GPI-anchored protein (N20), phosphatidylinositol 4K (N23), a Ser/Thr phosphatase (N102), the PRIP-interacting protein (N69) and a kinesin (N114). Of these, only N20 and N69 have been further characterized (Felitti et al., 2011; Siena et al., 2014). N20 (later renamed N20GAP-1) is orthologous to genes At4g26466 (LORELEI, encoding a GPI-anchored protein) and/or At5g56170 (LORELEI-like), shows partial cosegregation with apospory and is increasingly overexpressed in apomictic plants from premeiosis to anthesis (Felitti et al., 2011). N69 is orthologous to gene AT1G45231 (TGS1, encoding a trimethylguanosine synthase which has a dual role in splicing and transcription) and, in contrast, is increasingly overexpressed in sexual plants from premeiosis to anthesis (Siena et al., 2014). Moreover, there is evidence that TGS1 Ser<sup>298</sup> phosphorylation is promoted by an ERK cascade to activate transcriptional activity at some promoters (Kapadia et al., 2013). The availability of Paspalum RNAi lines for N20 (LORELEI), N69 (TGS1) and N46 (QGJ MAP3K, reported here) would allow a detailed investigation of a possible biological link among these molecules. Though functional approaches are challenging in polyploid, highly heterozygous apomictic species like P. notatum, the development of reference genomes, transformation protocols and advanced microscopy tools will likely accelerate the discovery of the central mechanisms underlying the switch from sexuality to apomixis.

#### AUTHOR CONTRIBUTIONS

MM: transformation experiments and the genotypic/phenotypic analysis. HP: transformation experiments supervision. CC and CA: RNA extractions, real time experiments, and bioinformatic analysis. LS: developmental stage classification and RNA extraction. FP: PN\_LNC\_QGJ mapping. DdAD, VdCC, and MP: Brachiaria in situ hybridization. MP: PN\_QGJ genomic hybridization. JS and AG: Paspalum in situ hybridization. SF: transformation vector construction. JO: phylogenetic analysis and experimental design. OL: experimental design, sequence analysis, and manuscript writing. SP: experimental design, in silico analysis, qPCR analysis, and manuscript writing.

### FUNDING

Thanks are due to the European Union's Horizon 2020 Research and Innovation Programme under the Marie Skłodowska-Curie Grant Agreement No. 645674; Agencia Nacional de Promoción Científica y Tecnológica (ANPCyT), Argentina, Projects PICT 2007-00476, PICT 2011-1269, PICT-2014-1080; Consejo Nacional de Investigaciones Científicas y Técnicas (CONICET), Argentina, Project: PIP 11220090100613; CNPq/PROSUL project number 490749/2008-9, Brazil and the Ufficio Relazioni Europee e Internazionali del Consiglio Nazionale delle Ricerche, Italy (Laboratori Congiunti Bilaterali Internazionali CNR, Prot. 0005651).

#### ACKNOWLEDGMENTS

MM, CC, CA, and MP received fellowships from CONICET. MP, SF, LS, JS, AG, JO, and SP are research staff members of CONICET. Thanks are due to Dr. María Sartor for cytoembryology/ISH experiments technical support and discussion, Ana Cristina M. M. Gomes for help with the Brachiaria ovules sectioning and Dr. Celina Beltrán for collaborating with statistical analyses.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2018.01547/full#supplementary-material

#### REFERENCES




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling Editor is currently organizing a Research Topic with one of the authors, FP, and confirms the absence of any other collaboration.

Copyright © 2018 Mancini, Permingeat, Colono, Siena, Pupilli, Azzaro, de Alencar Dusi, de Campos Carneiro, Podio, Seijo, González, Felitti, Ortiz, Leblanc and Pessino. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Phenotypic, Hormonal, and Genomic Variation Among Vitis vinifera Clones With Different Cluster Compactness and Reproductive Performance

Edited by: Sara Zenoni, University of Verona, Italy

#### Reviewed by:

Zhanwu Dai, INRA Centre Bordeaux-Aquitaine, France

Chiara Pastore, University of Bologna, Italy

#### \*Correspondence:

Jérôme Grimplet jgrimplet@cita-aragon.es; jerome.grimplet@icvv.es

#### †Present address:

Jérôme Grimplet, Unidad de Hortofruticultura, Centro de Investigación y Tecnología Agroalimentaria de Aragón, Instituto Agroalimentario de Aragón-IA2 (CITA-Universidad de Zaragoza), Zaragoza, Spain

Javier Tello, UMR AGAP, INRA-Supagro, Montpellier, France

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 15 June 2018 Accepted: 10 December 2018 Published: 07 January 2019

#### Citation:

Grimplet J, Ibáñez S, Baroja E, Tello J and Ibáñez J (2019) Phenotypic, Hormonal, and Genomic Variation Among Vitis vinifera Clones With Different Cluster Compactness and Reproductive Performance. Front. Plant Sci. 9:1917. doi: 10.3389/fpls.2018.01917

Jérôme Grimplet\*†, Sergio Ibáñez, Elisa Baroja, Javier Tello† and Javier Ibáñez

Instituto de Ciencias de la Vid y del Vino (CSIC, Universidad de La Rioja, Gobierno de La Rioja), Logroño, Spain

Previous studies showed that the number of berries is a major component of the compactness level of grapevine clusters. Variation in the number of fruits is regulated by events occurring at fruitset, but also earlier, during flower formation and pollination, through factors like the initial number of flowers or gametic viability. Therefore, the identification of the genetic bases of this variation would provide invaluable knowledge of grapevine reproductive development and useful tools for managing yield and cluster compactness. We phenotyped four clones (two compact and two loose clones) of the Tempranillo cultivar with reproducibly different levels of cluster compactness over seasons. Measures of reproductive performance included flower number per inflorescence, berry number per cluster, fruitset, coulure, and millerandage indices. In addition, their levels of several hormones during inflorescence and flower development were determined, and their transcriptomes were evaluated at critical time points (just before the start and at the end of flowering). For some key reproductive traits, like number of berries per cluster and number of seeds per berry, clones bearing loose clusters showed differences with the compact clones and also differed from each other, indicating that each one follows a different path to produce loose clusters. Variation between clones was observed for abscisic acid and gibberellin levels at particular development stages, and differences in GAs could be related to phenotypic differences. Likewise, various changes between clones were found at the transcriptomic level, mostly just before the start of flowering. Several of the differentially expressed genes between one of the loose clones and the compact clones are known to be over-expressed in pollen, and many of them were related to cell wall modification processes or to phenylpropanoid metabolism. 
We also found polymorphisms between clones in candidate genes that could be directly involved in the variation of the compactness level.

#### Keywords: reproductive performance, grape cluster compactness, fruitset, hormones, phenotyping, somatic variation, transcriptomics

**Abbreviations:** ABA, abscisic acid; GA, gibberellin; GAac, gibberellic acid; IAA, indole-3-acetic acid; JA, jasmonic acid; SA, salicylic acid; SNP, single-nucleotide polymorphism.

## INTRODUCTION


Cluster compactness, the density of the berries in the cluster, is a primary aspect of grapevine (Vitis vinifera L.) selection programs. Berries inside the cluster are organized as a thyrse, which is unique to grapevine among economically important crops. The thyrse, etymologically derived from Dionysus's staff, is a type of mixed compound inflorescence. It is a raceme inflorescence type where flowers are replaced by a cyme (Hickey and King, 2000; Gerrath et al., 2017), also described as a panicle (Pratt, 1971). Grapevine cluster compactness is an economically important trait since it affects several major components of fruit quality (reviewed by Tello and Ibáñez, 2018). Foremost, compact clusters are more susceptible to pests and diseases. Several reasons have been put forward to explain this: first, a compact structure favors the propagation of pathogens within the cluster, and the lack of aeration among the berries creates a suitable environment for their proliferation; in addition, berries in close contact with each other show a reduction of the protective cuticular wax (Marois et al., 1986; Rosenquist and Morrison, 1989). Thus, all treatments that reduce cluster compactness are expected to lead to a lower predisposition of clusters to early and severe incidence of pests and diseases (Molitor et al., 2012). Compactness also affects ripening homogeneity: shaded berries tend to receive less solar radiation, which affects berry composition and maturation dynamics (Pieri et al., 2016; Silvestre et al., 2017). Finally, a loose cluster is a favorable factor for table grape appearance.

The compactness level of a cluster results from the combined variation of directly or indirectly related traits, linked to the rachis architecture or to the berry size. It is difficult to quantify, although indexes for cluster compactness calculated from quantifiable parameters already exist (Tello and Ibañez, 2014) as an alternative to the visual assessment recommended by the OIV (International Organisation of Vine [OIV], 2007). Recent studies in our group showed that, in a multi-cultivar frame, the number of berries and rachis dimensions are key components in the determination of cluster compactness, followed by berry dimensions (Tello et al., 2015). The final number of berries in the cluster is a consequence of the number of flowers in the inflorescence and their rate of conversion into berries (fruit set rate). These two traits are mainly responsible for the reproductive performance of a vine. All these traits are genetically determined, although some of them may be strongly influenced by environmental factors, leading to seasonal or individual variation (Dry et al., 2010). In grapevine, cultivars are vegetatively propagated, producing plants with the same genotype, but many of these cultivars have been maintained for several centuries, especially those used for winemaking. For that reason, somatic mutations have accumulated in many plants of these varieties, allowing clonal selection, where a single plant of the cultivar is multiplied to constitute a clone within the cultivar. Clones may differ in many traits, including cluster compactness and related traits, and constitute interesting material for genetic studies. In a previous work, clones from the Garnacha cultivar showing variation in cluster compactness and other traits were compared at the gene expression level (Grimplet et al., 2017). 
Flowers from clones differing for the number of berries showed extensive differences in their transcriptome in the Garnacha cultivar at E-L 26 (cap-fall complete) and allowed identifying gene networks and genes potentially related to the phenotypical variation.

These previous results indicated that studying grapevine reproductive performance was necessary for the identification of important genetic factors affecting cluster compactness. Specifically, we aim to determine the role of factors such as the initial number of flowers and fruit set rate, processes of grapevine reproductive development that are under the control of plant hormones (Giacomelli et al., 2013). GAs mediate the formation of the inflorescence axis. Later, cytokinins regulate the differentiation into flowers and are specifically needed for the growth of the pistil (Pool, 1975), and flowering timing is controlled through the GA:cytokinin balance (Srinivasan and Mullins, 1981).

The goal of this work was to identify genetic and molecular processes behind the phenotypic differences between clones of the Tempranillo cultivar differing in their reproductive performance and cluster compactness. To reach that goal, we characterized clones at the phenotypic, hormonal, and transcriptomic levels and performed global analyses of these data. The final aim was to identify candidate genes and polymorphisms involved in the determination of flower number and fruit set rate in relation to cluster compactness.

### MATERIALS AND METHODS

### Plant Material

Plant material was collected at Viveros Provedo's plot in Logroño (La Rioja, Spain), where all the vines were treated in the same way. The four clones used in the analysis originated from a Vitis vinifera cv. Tempranillo clone collection grafted on Richter-110 rootstocks (Provedo Eguía et al., 2007). Two of the clones are described as producing compact clusters ("compact" clones: RJ51 and VP2) and two produce loose clusters ("loose" clones: VP25 and VP11) (Tello et al., 2015). Sampling was performed in 2012, 2014, 2015, 2016, and 2017 for phenotyping and in 2015 for hormone and RNAseq analyses, which were done on the same samples. These latter samplings were performed at E-L 13–14 (April 30), E-L 16–17 (May 14), E-L 18–19 (May 28), and E-L 26 (June 8) [developmental stages according to the modified E-L system for grapevine growth stages (Coombe, 1995)]. Flower samples were collected and immediately frozen in liquid nitrogen; once in the lab, samples were kept at −80°C until use.

#### Phenotyping

Each clone was phenotyped for 2–5 years using morphological variables of the cluster and the berry related to the compactness trait, as described in Tello et al. (2015). In addition, the clones were characterized in 2016 and 2017 using new variables related to their reproductive performance, for which 10 inflorescences from different plants were tagged and bagged before flowering. The bags were removed once all the calyptras (fused petals, one calyptra per flower) had fallen inside.

Calyptras were scanned and counted to estimate the number of flowers. The same tagged inflorescences, by then developed into clusters, were collected at harvest for phenotyping. The new variables consisted of counts of the initial number of flowers in the inflorescence and of their derived organs in the ripe cluster: seeded (normal) berries, seedless berries, and live green ovaries (LGOs), also known as "hens," "chickens," and "shot berries," respectively (Collins and Dry, 2009). These variables were used to estimate the fruitset rate (proportion of flowers that develop into berries, either seeded or seedless), and the abnormal conditions named coulure (excessive proportion of desiccated or dropped flowers) and millerandage (excessive proportion of post-flowering organs that develop into either seedless berries or LGOs) (Dry et al., 2010). Cluster compactness index CI-12 was calculated according to Tello and Ibañez (2014). Statistical comparisons between clones were done using SPSS v.24 (IBM, Chicago, IL, United States). PCA was performed with the R function prcomp and visualized with the fviz\_pca function of the factoextra package.
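The counting variables described above can be turned into simple proportions, as in the sketch below. It is only an illustration under stated assumptions: the class and function names are hypothetical, the fruitset rate follows the verbal definition given in the text, and the two other quantities are plain fractions that convey the idea behind coulure and millerandage; they are not claimed to reproduce the index scales published by Dry et al. (2010).

```python
from dataclasses import dataclass


@dataclass
class ClusterCounts:
    """Counts for one tagged inflorescence/cluster (hypothetical container)."""
    flowers: int            # calyptras counted at flowering
    seeded_berries: int     # normal berries at harvest
    seedless_berries: int   # seedless berries at harvest
    lgos: int               # live green ovaries at harvest


def fruitset_rate(c: ClusterCounts) -> float:
    """Proportion of flowers that developed into berries (seeded or seedless)."""
    if c.flowers <= 0:
        raise ValueError("flower count must be positive")
    return (c.seeded_berries + c.seedless_berries) / c.flowers


def dropped_flower_fraction(c: ClusterCounts) -> float:
    """Fraction of flowers that left no organ in the ripe cluster.

    Assumption: used here as a simple proxy for the coulure condition.
    """
    if c.flowers <= 0:
        raise ValueError("flower count must be positive")
    retained = c.seeded_berries + c.seedless_berries + c.lgos
    return (c.flowers - retained) / c.flowers


def abnormal_organ_fraction(c: ClusterCounts) -> float:
    """Fraction of post-flowering organs that are seedless berries or LGOs.

    Assumption: used here as a simple proxy for the millerandage condition.
    """
    retained = c.seeded_berries + c.seedless_berries + c.lgos
    if retained <= 0:
        raise ValueError("no post-flowering organs were counted")
    return (c.seedless_berries + c.lgos) / retained
```

Keeping the three quantities as separate functions makes it easy to swap in the published index formulas once their exact definitions are at hand.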

#### Hormones Analysis

Hormones were analyzed at the Servicio de Cuantificación de Hormonas Vegetales in the IBMCP in Valencia, Spain with a UHPLC-mass spectrometer (Q-Exactive, ThermoFisher Scientific) from at least 100 mg of flower and inflorescence material. Hormones included IAA, ABA, JA, SA, and the GAs GA51, GA4, GA1, GA29, and GA8. Hormones were analyzed at E-L 13–14, E-L 16–17, E-L 18–19, and E-L 26 stages.

#### RNA Extraction and RNAseq Analysis

Total RNA was extracted from samples using the Spectrum plant total RNA kit (Sigma<sup>1</sup>) as recommended by the manufacturer. DNase I digestion was carried out with the RNase-free DNase Set (QIAGEN). RNA integrity and quantity were assessed with a Nanodrop 2000 spectrophotometer (Thermo Scientific) and an Agilent Bioanalyzer 2100. RNA samples were processed to build strand-specific cDNA libraries (one per biological sample) using the Illumina TruSeq RNA Library Preparation Kit (Illumina, San Diego, CA, United States). Sequencing of all 24 libraries (3 replicates × 4 clones × 2 stages) was conducted on two sequencing lanes using the Illumina HiSeq 2500 v4 platform (Illumina, San Diego, CA, United States) to produce 19–24 million strand-specific 125 bp paired-end reads per library. Sequencing was performed at the Center for Genomic Regulation (Barcelona, Spain).

<sup>1</sup>www.sigmaaldrich.com

Sequence analyses were performed using the Galaxy tool (Afgan et al., 2016) to streamline the process on the 24 samples. Reads were mapped to the reference (12X.V2) grapevine genome using TopHat 2.1.0 (Kim et al., 2013), allowing only unique mapping, up to three mismatches per mapped read, and a minimum quality of 20. The alignment was performed using the Grapevine reference annotation V.3 (Canaguier et al., 2017). Read counts were generated using featureCounts from Subread 1.5.1 (Liao et al., 2013). Analysis of differential gene expression was performed using EdgeR (Robinson et al., 2010) between each pair of clones at the two time points. Gene expression clustering was performed using the Quality threshold (QT) clustering method (Heyer et al., 1999) complemented by hierarchical clustering (HCL) with a maximum distance threshold of 0.2. Clustering was performed with the TMEV software (Saeed et al., 2003).
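To make the size of the experimental design explicit, the sketch below enumerates the sequencing libraries and the pairwise clone comparisons implied by the description above. The clone names come from the text; the stage labels `stage_1` and `stage_2` are placeholders (the two RNAseq stages are not named in this section), and the function name is hypothetical.

```python
from itertools import combinations, product

CLONES = ["RJ51", "VP2", "VP25", "VP11"]   # clone names from the text
STAGES = ["stage_1", "stage_2"]            # placeholder labels (assumption)


def build_design(clones, stages, n_replicates):
    """Return (libraries, contrasts) for a fully crossed design.

    libraries: one (clone, stage, replicate) tuple per sequenced library.
    contrasts: one (clone_a, clone_b, stage) tuple per pairwise comparison,
    i.e. every pair of clones compared within each stage.
    """
    libraries = [
        (clone, stage, rep)
        for clone, stage in product(clones, stages)
        for rep in range(1, n_replicates + 1)
    ]
    contrasts = [
        (clone_a, clone_b, stage)
        for stage in stages
        for clone_a, clone_b in combinations(clones, 2)
    ]
    return libraries, contrasts


libraries, contrasts = build_design(CLONES, STAGES, n_replicates=3)
print(len(libraries), "libraries;", len(contrasts), "pairwise contrasts")
```

Four clones give six clone pairs, so two stages yield twelve pairwise contrasts on top of the 24 libraries stated in the text.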

#### Sequence Polymorphisms Analysis

Detection of polymorphisms [SNPs and insertions/deletions (indels)] between clones was performed using the RNA-seq alignment bam files. Variant calling vcf files were obtained with the variant caller utility implemented in the SAMtools package v1.2 (mpileup, bcftools) (Li et al., 2009). The vcf files were filtered for a quality >40 using vcffilter from the vcflib toolkit<sup>2</sup>. Other file handling operations were performed with vcftools (Danecek et al., 2011). A total of 605 polymorphisms between clones were detected automatically with bcftools call, with alleles appearing consistently in the three replicates per clone. For the purpose of this work, establishment of differences between the Tempranillo clones regarding their homozygosity/heterozygosity status followed strict rules. Only polymorphisms based on a minimum depth of 50 reads were considered. One clone was considered heterozygous for a given SNP when the number of reads of the minor allele represented at least 30% of all the reads for that locus and clone. The other clones were considered homozygous when they contained at most one read for the minor allele, representing less than 2% of the reads. However, we only considered polymorphic sites where no read was detected for the minor allele in at least one of the homozygous clones. All the remaining cases were left non-assigned. Only polymorphisms confirmed in all replicates after individual checking in IGV software (Thorvaldsdottir et al., 2013) were considered. The effect of detected polymorphisms considering the grapevine 12X.V3 gene prediction was estimated using SnpEff v.2.0.3 (Cingolani et al., 2012).
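The read-count rules above lend themselves to a small decision function. The following sketch is one possible reading of those rules, not the authors' script: the thresholds are taken from the text, while the function names, the input layout (minor-allele reads and total depth per clone) and the requirement that a retained site has at least one heterozygous clone are assumptions made for illustration.

```python
MIN_DEPTH = 50          # minimum read depth per clone (from the text)
HET_MIN_FRACTION = 0.30  # minor allele >= 30% of reads -> heterozygous
HOM_MAX_READS = 1        # homozygous: at most one minor-allele read ...
HOM_MAX_FRACTION = 0.02  # ... representing less than 2% of the reads


def classify_clone(minor_reads: int, depth: int) -> str:
    """Classify one clone at one site from its minor-allele read count."""
    if depth < MIN_DEPTH:
        return "unassigned"
    fraction = minor_reads / depth
    if fraction >= HET_MIN_FRACTION:
        return "heterozygous"
    if minor_reads <= HOM_MAX_READS and fraction < HOM_MAX_FRACTION:
        return "homozygous"
    return "unassigned"


def evaluate_site(counts: dict[str, tuple[int, int]]) -> tuple[dict[str, str], bool]:
    """Classify every clone at a site and decide whether to keep the site.

    counts maps clone name -> (minor_reads, depth).
    A site is kept when at least one clone is heterozygous (assumption) and
    at least one homozygous clone shows no minor-allele read at all.
    """
    calls = {clone: classify_clone(m, d) for clone, (m, d) in counts.items()}
    has_het = any(call == "heterozygous" for call in calls.values())
    has_clean_hom = any(
        calls[clone] == "homozygous" and minor == 0
        for clone, (minor, _depth) in counts.items()
    )
    return calls, has_het and has_clean_hom
```

In practice the per-clone counts would be pulled from the allele depth fields of the filtered vcf files, and any site kept by this function would still need the manual check in IGV described in the text.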

#### Functional Categories Analysis

To identify the biological functions over-represented within selected probe sets, functional enrichment analyses were performed using the Cytoscape plugin BiNGO (p < 0.05) (Maere et al., 2005), adapted to the manually annotated functional categories described in Grimplet et al. (2012) and updated for the differentially expressed genes absent from the previous annotation (v1).

#### RESULTS

### Phenotyping and Comparison of the Clones

Phenotypic analyses showed a large amount of variability within Tempranillo cultivar for many traits related to the reproductive development (**Table 1**). In several cases, the differences between clones are stable and robust as to be statistically significant after up to 5 years of data, with very different climatic conditions. Thus, the four clones displayed a consistent difference for the

<sup>2</sup>https://github.com/vcflib/vcflib



The number of seasons and the total number of clusters (N) used for each variable are indicated. Within a row, a different letter after the mean value indicates significant differences at α = 0.05. OIV 204: Cluster compactness defined according to the "Organisation Internationale de la Vigne et du Vin" (OIV) descriptor number 204. LGO: Live Green Ovary.

visually assessed cluster compactness between the two compact clones (RJ51 and VP2) and the two loose clones (VP25 and VP11). The compactness index CI-12 even showed significant differences between the four clones, with the same trend observed for the cluster weight, one of the index components, and for the number of seeds per berry, the other single variable for which all the four clones showed significant differences. It is generally accepted that seed content relates to berry size, but here only the compact clone RJ51 showed significantly larger berries. Rachis architecture does not seem to have a main role in the differences of cluster compactness between these clones, as key variables like the lengths of the cluster, of the first branch and of the second branch of the rachis do not differ significantly between the clones, while they were important in a multi-cultivar study (Tello et al., 2015). Only the cluster width, rachis weight, and number of nodes showed a differential behavior between compact and loose clones.

Finally, several variables related to the reproductive performance of the vine seem to relate to compactness. Based on 5-year data, the number of normal (seeded) berries in the cluster was similar in the two compact clones, and significantly lower in the loose clones, where there is still a difference between VP25 and VP11, the clone with the lowest number. This reflects the fruitset rates, with similar values for VP2 and RJ51, followed by VP25 and, with the lowest value, VP11. An inverse trend is observed for the number of flowers and coulure, where VP11 showed the highest value, although the differences between the other clones are not significant. Other variables showed no significant differences between clones (like millerandage), or no relation with the compactness trait (like the length of the pedicel).

In order to evaluate the relation between each measured parameter and cluster compactness, we performed a principal component analysis (PCA) based on the phenotypic data (**Figure 1**). The PCA axis 1 separates factors correlating and anticorrelating with compactness, and separates the two compact clones, at the left, from the two loose clones, at the right. The factors correlating the most with compactness were the number of seeds and cluster weight, on both components. Number of seeded berries, fruitset, rachis weight, number of rachis nodes, and cluster width and length also correlated with compactness. Results were in line with those of Tello and Ibañez (2014) and Tello et al. (2015), including the length of the first branch, which anti-correlated with compactness. The other negatively correlating factors were the number of seedless berries and its linked parameters, millerandage and coulure index, as well as the initial number of flowers, the only parameter that was not measured at the same stage as the compactness (at harvest). The second component of the PCA separates RJ51 from VP2 (compact) and VP11 from VP25 (loose) and distinguishes variables less related to cluster compactness in these clones, such as berry dimensions.

#### Hormones Analyses

Hormone profiles were obtained from samples taken every 2 weeks, at four different stages of floral development, from E-L 13–14, when inflorescences started to be clearly distinguishable, to E-L 26 at the end of flowering. The levels of hormones showed distinctive patterns along flowering. Overall, ABA was significantly (p < 0.05) more abundant at E-L 18–19 (just before flowering) than at the three other stages and significantly less abundant at E-L 13–14 than at the three other stages (**Figure 2**). For individual clones, this pattern held for RJ51, VP25, and VP11; only VP2 showed no significant differences between the three later stages, because at E-L 18–19 ABA was significantly less abundant in VP2 than in the three other clones. Besides, VP11 showed ABA levels significantly lower than the three other clones at E-L 13–14, while in VP25 ABA was significantly more abundant than in the three other clones at E-L 26.

Jasmonic acid profile was similar to that found for ABA: JA global levels were significantly (p < 0.05) higher in E-L 18–19 versus the three other stages, which showed no differences between them (**Figure 2**). The clones individually followed the same pattern, although in VP2 the difference between E-L 16–17 and E-L 18–19 was not significant. At E-L 13–14, JA levels were significantly more abundant in VP11 than in VP2 and VP25. At E-L 16–17, JA was significantly more abundant in VP2 and VP11 than in RJ51 and VP25.

Auxin (IAA) global levels were significantly higher at E-L 26, with no differences between the other stages (**Figure 2**). The clones individually mostly followed the same pattern; however, RJ51 did not show significant differences between E-L 16–17, E-L 18–19, and E-L 26, VP2 showed differences between E-L 13–14 and E-L 16–17, and VP11 did not show differences between E-L 13–14 and E-L 26. In the comparisons between the clones, RJ51 was the most different: at E-L 13–14 IAA was significantly less abundant in RJ51 than in VP2 and VP25; at E-L16–17, IAA levels were significantly higher in RJ51 vs. VP2, and at E-L 18–19, IAA in RJ51 was significantly more abundant than in VP2 and VP11. Besides, at E-L 26, VP2 levels were significantly more abundant than VP11.

Salicylic acid (SA) global levels showed no significant differences between the four stages, and the same occurred for the individual clones, except for RJ51 (**Figure 2**), where SA levels were significantly higher at E-L 13–14 and E-L 26 than at E-L 16–17 and E-L 18–19. No significant differences were observed between clones.

Among the GAs analyzed, GA<sub>51</sub> and GA<sub>4</sub> were not detected at E-L 18–19. For active GAs (**Figure 3**), GA<sub>1</sub> global levels were significantly more abundant at E-L 13–14 than at E-L 16–17, but with great disparity between clones. In VP2, GA<sub>1</sub> levels were significantly higher at E-L 26 than at E-L 13–14 and E-L 18–19. The opposite was observed in VP11, where GA<sub>1</sub> was significantly more abundant at E-L 13–14 and E-L 18–19 than at E-L 16–17 and E-L 26. Comparing between the clones, GA<sub>1</sub> levels at E-L 18–19 were higher in VP11 than in the other three clones, while at E-L 26, levels were significantly higher in VP25 than in RJ51 and VP11 and in VP2 vs. VP11.

Global levels of GA<sub>4</sub> were significantly higher at E-L 13–14 and E-L 26 than at E-L 16–17. This pattern can be observed in all the clones. However, differences were significant only in RJ51 and VP25 between E-L 26 and E-L 16–17. Between clones, at E-L 16–17, GA<sub>4</sub> levels were significantly higher in RJ51 than in VP2 and at E-L 26, in RJ51 vs. VP11.

For inactive GAs, GA<sub>51</sub> showed the opposite pattern to GA<sub>4</sub>, with global levels significantly less abundant at E-L 13–14 and E-L 26 than at E-L 16–17. This pattern was observed in all the clones, although differences were significant only in VP2 and VP25. GA<sub>29</sub> levels were significantly higher in the latest stages studied than in the earliest ones. This pattern can be observed in all the clones but significant results were only occasionally observed. GA<sub>8</sub> is the inactivation product of GA<sub>1</sub> and its global levels showed a steady increase over time, with significant differences between all stages except between E-L 16–17 and E-L 18–19. RJ51, VP25, and VP11 followed this pattern; only the differences between E-L 13–14 and the two intermediate stages in VP25, and between E-L 16–17 and E-L 26 in VP2, were not significant. GA<sub>8</sub> levels were higher at E-L 16–17 in VP2 than in the other clones, while at E-L 18–19 they were more abundant in RJ51 and VP11 than in VP2. At E-L 26, GA<sub>8</sub> was again more abundant in VP11 than in the compact clones VP2 and RJ51.

In summary, considering all the clones together, significant differences between stages were found for all the hormones studied except SA, indicating their possible role during inflorescence growth and flowering. No differences (p < 0.01) were found in comparisons of compact vs. loose clones, supporting the hypothesis that the mechanisms for loosening the clusters may be different in VP11 and VP25, and must be studied separately. The stages E-L 18–19 and E-L 26 were chosen for gene expression analysis because these two stages showed the most relevant differences between clones. Most remarkably, the higher abundance of GA<sub>1</sub> in the loose clone VP11 at E-L 18–19 discriminated it from both compact clones. VP11 also showed the lowest levels of the two active GAs at E-L 26, and the highest of GA<sub>8</sub>. VP25 only showed differential hormone levels with respect to the compact clones (and VP11) for ABA at E-L 26.

### Global View on Gene Expression Within the Clones

Principal component analysis (**Supplementary File S1**) showed that the replicates from each clone grouped together and shared similar expression profiles. At both stages, the first dimension of the analysis grouped all the samples together (it represented 93.7% of the variance for E-L 18–19 and 85.4% for E-L 26). We performed the gene expression comparison on samples from only one cultivar in one organ at the same stage of development. Therefore, we expected the expression of most genes to be identical between conditions. As this first component presented little information, components 2 and 3 were used for the plots. At E-L 18–19 (**Supplementary File S1A**), the second component of the PCA (method svd) discriminated compact clone RJ51 on one side and loose clone VP11 on the other, explaining 3.2% of the variance; we did not observe differences between VP2 and VP25 on this axis. The third component (1.1% of variance) separated VP25 from the three other clones, including VP2, contrary to the second component. At E-L 26, the second component (6.8% of variance) discriminated RJ51 from the three other clones (**Supplementary File S1B**). However, no component allowed discrimination in relation to compactness, as the loosest clone, VP11, was the closest to RJ51. The third component grouped together all the replicates of each clone and clearly separated VP2 from VP25.

#### Gene Expression Profiles

After the visual evaluation of global expression, we compared the expression of each gene among the clones in order to identify genetic evidence related to the phenotypic differences. Differential gene expression was detected by pair-wise comparisons of clones at both developmental stages. A total of 1490 genes were differentially expressed between at least two clones at E-L 18–19 and only 168 at E-L 26 (**Supplementary File S2**). Of them, 28 genes maintained the same differential expression between clones in both stages, while 26 other genes were differentially expressed in both stages but not with the same differences between clones. More than half of the differentially expressed genes were detected at E-L 18–19 in the comparison between the extreme clones in terms of compactness and reproductive performance parameters, RJ51 and VP11 (**Table 2**), in accordance with the PCA results (**Supplementary File S1**). At E-L 26, the number of differentially expressed genes was more evenly distributed among the comparisons, with small overall differences between clones (a maximum of 77 genes showed differential expression in any pair-wise comparison).

To identify the genes most likely related to the loose cluster phenotype, genes differentially expressed in each loose clone against the two compact clones together were detected (**Table 2**). In most of the comparisons, a small set of genes was found: five for VP11 vs. compact at E-L 26, eight for VP25 vs. compact at E-L 18–19, and six for VP25 vs. compact at E-L 26. Only the comparison of compact vs. VP11 at E-L 18–19 yielded a larger number of differentially expressed genes (324), which allowed a functional categories enrichment analysis to be performed.
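One way to picture this selection is as an intersection of pair-wise results. The sketch below is illustrative and rests on an assumption: that a gene counts as differential "against the two compact clones together" when it is significant, with the same direction of change, in both pair-wise comparisons. The study may instead have used a joint contrast. The function name, data layout, and gene labels are hypothetical.

```python
from typing import Dict, Tuple

# pairwise[(a, b)] maps gene ID -> log2 fold change (a vs. b),
# restricted to genes already called significant in that comparison.
PairwiseResults = Dict[Tuple[str, str], Dict[str, float]]


def loose_specific_genes(pairwise: PairwiseResults, loose: str,
                         compact: Tuple[str, ...] = ("RJ51", "VP2")) -> Dict[str, str]:
    """Genes significant, with the same sign, in `loose` vs. every compact clone."""
    first, *rest = (pairwise[(loose, c)] for c in compact)
    shared = {}
    for gene, lfc in first.items():
        others = [table.get(gene) for table in rest]
        if all(o is not None and (o > 0) == (lfc > 0) for o in others):
            shared[gene] = "up" if lfc > 0 else "down"
    return shared


# Hypothetical toy input with placeholder gene labels.
toy = {
    ("VP11", "RJ51"): {"geneA": 2.1, "geneB": -1.5, "geneC": 1.2},
    ("VP11", "VP2"): {"geneA": 1.8, "geneB": 1.1},
}
candidates = loose_specific_genes(toy, "VP11")
```

Requiring agreement in direction excludes genes that differ from each compact clone in opposite ways, which would not be informative about the loose phenotype.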

Different types of gene expression profiles were found at both stages and were clustered as shown in **Supplementary Files S3** and **S4**. A larger number of clusters was found at E-L 18–19 than at E-L 26.

### Genetic Variation and Possible Effects on Gene Expression

We analyzed the RNAseq data focusing on three kinds of genes/differences that could be relevant for the study: transcripts with some polymorphism visible in the mRNA sequence, irrespective of their expression level; genes that were only expressed or not expressed at all in one loose clone, in either of the two stages; and genes that seemed to constitutively show the same differences between clones in the two stages.

#### Sequence Polymorphisms in the RNAseq Data in Tempranillo Clones

Forty-seven genes showed some polymorphism among the four clones (**Supplementary File S5**) after the application of strict parameters for validation of polymorphisms and genotypes (homozygous/heterozygous). Additionally, these polymorphisms were validated on independent genome sequencing data for RJ51, VP11, and VP25 (data not shown). In a previous study (Royo et al., 2015), all the SNPs fulfilling similar criteria were validated by PCR. Four of the polymorphisms were predicted to have a high putative impact (**Table 3**), likely leading to a non-functional allele in one of the clones. The overall expression of each of these four genes did not differ between the clones. They all code for proteins related to primary metabolism and cellular processes.

#### Genes Absent or Present in One Loose Clone

The promoter regions of the genes are not visible through RNA sequencing, but expression patterns can give indications of the integrity of the promoter sequence. The most likely candidates for alteration of the promoter region in a specific loose clone are the genes that never exhibited expression in one loose clone while being expressed in the others at the same stage, and those that were expressed only in one loose clone (**Table 4**). Since these genes had raw expression ratios that could tend to infinity (division by values close to zero reads), the EdgeR-corrected fold changes were high. Therefore, the 10 genes in the list had a fold change of at least 8.
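The effect of near-zero read counts on expression ratios can be illustrated with a simple pseudocount. The sketch below is not the edgeR procedure: it uses a constant prior count as a stand-in (an assumption), and it assumes the read counts are already normalized for library size. The function name is hypothetical.

```python
from math import log2


def shrunk_log2_fc(reads_a: float, reads_b: float, prior_count: float = 1.0) -> float:
    """log2 ratio of a vs. b after adding a constant prior count to both.

    Without the prior, reads_b == 0 would make the ratio undefined or
    infinite; with it, the ratio stays finite and is pulled toward zero.
    """
    if prior_count <= 0:
        raise ValueError("prior_count must be positive")
    return log2((reads_a + prior_count) / (reads_b + prior_count))


# A gene with 7 normalized reads in one clone and none in another:
lfc = shrunk_log2_fc(7, 0)  # log2(8 / 1)
```

On the log2 scale, a fold change of 8 corresponds to a value of 3, so a gene with a handful of reads in one clone and none in another already reaches that threshold under this simple scheme.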

#### Genes With Constant Expression Along Flower Development

In addition to the genes showing no expression, or expression only in one loose clone, we identified the genes that presented a stable differential expression between clones over


TABLE 3 | Detected SNPs and indels with a predicted high impact on the protein structure.


TABLE 4 | List of genes only expressed or never expressed in one loose clone in at least one stage (E-L 18–19 or E-L 26) in RNAseq analysis.


Numbers show gene expression as log2 ratio compared to the median value using EdgeR, in bold font when reads were detected in that clone, and in normal font when no reads were detected in that clone. Bold Gene ID, expression was consistent over stages.

TABLE 5 | List of genes with differential expression stable over stages between one loose vs. the other in RNAseq analysis.


Expression levels are log2 ratio compared to the median value.

both studied stages (**Table 5**) and with differences between a loose clone and the compact clones. Besides the four genes already reported in **Table 4** (Gene ID in bold), we identified six other genes. These genes presented the same expression profile in both stages, with similar pair-wise differences between clones. The underlying hypothesis was that if a modification had occurred in the promoter sequence of a constitutively expressed gene, it would be visible in both stages. Most of the detected genes were more abundant in loose clones. For many, their putative function revealed little information.

#### Functional Analysis of the Differentially Expressed Genes

The previous analyses were directed to the identification of candidate genes for potential sources of variation on compactness-related traits; however, many other genes were differentially over-expressed in the study. Functional categories enrichment analysis was performed in order to identify the main mechanisms impacted in cluster compactness and their related traits at E-L 18–19 between VP11 and the two compact clones.

Most noticeably, cell wall-related functional categories were enriched in the compact clones, in particular in the processes related to pectin modification (**Figure 4**). In addition, several transporter categories, such as proton transporter, monovalent cation–proton antiporter, TIP aquaporin, and synaptosomal vesicle fusion pores, as well as protein kinases, were enriched. One category was significantly under-represented in the list of genes more expressed in compact clones: the genes coding for proteins related to protein synthesis. This indicates that this process was remarkably stable between clones, allowing us to discard higher protein synthesis activity as a factor in compactness.

The VitisNet representation of the events occurring in cell wall metabolism (**Figure 5**) also highlighted numerous changes specifically in the pectin metabolism-related genes, with many isogenes under-expressed in VP11 vs. both compact clones and even more against RJ51 only.

Concerning the functional categories over-represented in the VP11 clone vs. the compact clones, the categories related to flavonoid biosynthesis, oxidative stress response, and oxidase-dependent iron transporter (**Figure 6**) showed significant results.

The VitisNet representation of the networks related to polyphenols (**Figure 7**) showed clear over-expression of most of the genes involved in the biosynthesis of anthocyanins from phenylalanine in VP11 vs. the two compact clones, and in VP11 vs. RJ51 only.

## DISCUSSION

The aim of this work was to identify genetic changes that affected genes involved in cluster compactness variation. For that, four different clones of the cultivar Tempranillo were studied, two of them (RJ51 and VP2) presenting compact clusters, as expected for the variety, and two presenting loose clusters (VP25 and VP11). Our hypothesis is that each of these two clones presents loose clusters due to a genetic mutation originally produced in a single Tempranillo plant, which was vegetatively propagated. This mutation makes it differ from the normal plants with compact clusters, but in an essentially identical genetic background. The contrast in compactness was reproducible over the years and the clones were grown in the same conditions and parcel, a few meters from each other. Therefore, we expected that the differences between clones in phenotypic traits, hormones, and gene expression levels had a genetic origin that could be isolated by monitoring gene expression and polymorphisms with little "noise" in the gene expression. Besides, in this work VP25 and VP11 have shown different phenotypic, hormonal, and gene expression characteristics, indicating that their loose phenotypes result from distinct mechanisms and should be studied independently.

### Phenotypical Differences and Hormone Levels Between Clones

Our previous findings showed that the two main components affecting cluster compactness in a multi-cultivar frame were the cluster architecture and the number of berries (Tello and Ibañez, 2014; Tello et al., 2015; Grimplet et al., 2017). Within a single cultivar, Garnacha Tinta, only the number of berries showed stable differences between compact and loose clones (Grimplet et al., 2017). Therefore, in this work we undertook the cluster phenotyping with a special emphasis on traits affecting the reproductive performance of the selected clones and found significant differences in many of them. Some of these differential traits are possibly related to cluster compactness, such as the fruitset rate, the number of seeded berries per cluster, and the number of seeds per berry, which followed the trend compact clones > VP25 > VP11. The same tendency was observed for pollen viability in these clones (Tello et al., 2018) and can explain these results, as limited pollen viability may compromise pollination and fertilization, lowering the number of seeds per berry, the fruitset, and the number of seeded berries in the cluster.

To target the most appropriate phenological stages for the transcriptome evaluation, we first analyzed the evolution of hormones during inflorescence and flower development to pinpoint key stages with dramatic changes.

This work presents the first detailed study of the evolution of hormone levels during the formation of inflorescences and flowers. Several studies in grapevine have addressed hormone levels from blooming onward but, to our knowledge, they were not measured in earlier stages. Giacomelli et al. (2013) analyzed the GA quantities in flowers between the beginning and the end of flowering. They detected lower abundance at the end (E-L 26) than at the beginning of flowering (around E-L 19) for GA<sub>1</sub>, GA<sub>4</sub>, and GA<sub>8</sub>, which we only observed for GA<sub>1</sub> in VP11. We could not relate this observation to the loose phenotype of VP11 because the authors performed the experiment on Pinot Gris clone R6, a somewhat compact clone. Interestingly, GAac applied on 1 cm inflorescences (around early E-L 13) induced an increase in flower number and branching (Khurshid et al., 1992), and we observed a higher number of flowers in VP11, although with fewer nodes.

Antolín et al. (2003) measured ABA from the beginning of flowering, which corresponded to the peak value they observed. It gradually decreased in later stages, which was also observed by Jia et al. (2013). For three of the four clones (all but VP2) we obtained similar results, but additionally we observed that ABA showed a gradual increase earlier in inflorescence development up to this point.

For IAA we also obtained similar results to those of Jia et al. (2013) on the overlapping studied stages, but the higher amount of IAA at E-L 26 than at E-L 18–19 was not observed in such proportions. Before E-L 18–19, we observed no evolution of IAA.

Jasmonic acid in Arabidopsis is more abundant just before flowers open (Nagpal et al., 2005), as we observed in grapevine. JA participates in the initiation of flowering, which is consistent with our observation that it peaked at that stage.

### Potential Mutated Candidates for the Differential Phenotypes

Following our first goal, besides monitoring the global differences found in the molecular mechanisms affected within clones, we conducted analyses to identify the genes with variation between clones. The expression study through RNAseq allowed us to highlight three types of potential candidates: (i) Using the RNA sequences from the sequencing data, we could identify transcripts with SNPs leading to modification of the protein integrity, such as the inclusion of a stop codon. (ii) Some genes showed a complete impairment of expression in at least one clone (no more than three detected reads), with possibly severe direct damage to their ability to be transcribed due to variation in their regulatory sequence. (iii) Other differentially expressed genes showed identical expression differences between clones in both stages; their expression might not be influenced by environmental or physiological factors and the difference might be constitutive. We identified several genes fulfilling these criteria. However, functional analysis revealed that many of these three types of genes have an unclear or unknown function. This might mean that mechanisms important for compactness and related traits are yet to be studied and deciphered, but for some genes we gathered evidence of potential functional roles that can be discussed.

#### Possible Role of the Genes Containing Polymorphisms With High Impact on Protein

Among the 52 genes bearing variation between clones, four showed polymorphisms predicted to cause a high impact on the protein sequence. Vitvi14g00503 clearly has two polymorphic SNPs in VP2, one leading to a non-functional allele with a stop gain. This gene codes for a phosphoribosylaminoimidazole carboxylase, but no evidence of a specific involvement in plant phenotype has been described in the literature.

Vitvi17g00268 has two polymorphic SNPs in VP25 (one non-functional with a stop gain). It corresponds to a protein phosphatase 2C that presented high expression in almost all tissues in the grapevine atlas in cv. Corvina (Fasoli et al., 2012), specifically in buds, and was strongly and exclusively down-regulated in pollen. The protein sequence contains an N-terminal sequence of 70 aa that has not appeared in any other known plant protein. It does not seem to share specific homology with the phosphatase 2C proteins described as involved in ABA signaling (Rodriguez, 1998). Vitvi06g00376 has a stop codon loss in one allele in VP25. It corresponds to a DNA-directed RNA polymerase III subunit C1. No potential involvement in reproductive mechanisms was described in the literature. These two genes showed two different alleles in VP25, including one that may not be functional. The two proteins affected are involved in general cellular processes that, if disrupted, could affect compactness; however, their relation to compactness needs further investigation.

Vitvi08g02244 is homologous to a high mobility group protein B1. One of the alleles in RJ51 has a splice-site donor variant leading to a different 5′-UTR. Homologous genes have a described impact on the phenotype in other species. In Arabidopsis, mutants lacking HMGB1 had a slightly delayed and reduced germination rate, reduced root length, and enhanced sensitivity to methyl methane sulfonate (MMS) (Lildballe et al., 2008). However, we found two elements that might minimize its role in our study. The expression of the functional allele alone, in terms of read count, was equivalent to the expression in other clones, so the quantities of functional transcripts should be similar. In addition, this gene is not among the potential direct orthologs in grapevine of the Arabidopsis HMGB1 gene responsible for the mutant phenotype. Those were not differentially expressed here.

#### Genes Absent or Present Only in One Loose Clone

We used two strategies to highlight genes potentially presenting polymorphisms in their regulatory sequence. The first one was the identification of the genes only expressed or not expressed at all (less than three reads) in one loose clone in any stage. We hypothesized that the promoter regions of those genes would be altered so as to reverse their expression in the affected clone.

Few of these genes presented a clear function (**Table 4**), and none among those differentially affected in VP25. Vitvi14g02553 corresponds to a Germin, and its expression was absent in both compact clones, but was highly variable in the replicates of the loose clones. In the atlas, it was not present in flowers and was seed-specific, so one may wonder whether its expression in VP11 from the beginning to the end of flowering could disturb the normal pollination and fertilization processes. Germin-like proteins (GLPs) are involved in basal host resistance against powdery mildew (GER3 and GER4) in rice (Davidson et al., 2010). Pathogen tolerance is not directly related to compactness traits, but it is well described that loose clusters show a reduced incidence of pests and diseases, attributed to their physical and physiological properties (reviewed in Tello and Ibáñez, 2018). No disease resistance genes correlated with that Germin. Another Germin is co-expressed with this gene in CL02 (**Supplementary File S3**), indicating the possibility that a common Germin regulator controls its expression, rather than it being constitutively differentially expressed. Vitvi02g01439 was only expressed at E-L 18–19, except in VP11, for which no expression was observed. This gene codes for a 4-kDa proline-rich protein DC2.15, and it has been detected in a list of genes transiently expressed during early embryogenesis in carrot (Aleith and Richter, 1991). Its expression was also induced by the removal of auxins in carrot cell culture, but this cannot be correlated here, as there are no differences between VP11 and the other clones at E-L 18–19. Homologs of this protein were found in many species and it contains a plant lipid transfer protein/seed storage protein/trypsin-alpha amylase inhibitor domain, but its role in plants needs to be addressed. This gene belongs to the co-expression cluster CL19 (**Supplementary File S3**), together with three genes related to glycosyltransferases [one of cytokinins described later (Vitvi08g02412) and two of anthocyanin].

#### Genes With Constant Expression Along Flower Development

Most of the genes with differential expression between clones but not presenting variation between the two stages had an undefined or unclear function (**Table 5**). There were only three genes with homologs with a function described in other organisms. The most interesting for the study of cluster compactness is beta-amyrin synthase (Vitvi10g01862), which is constitutively more abundant in VP11. In the atlas, this gene was specific to buds, with low expression in the other organs. Beta-amyrin synthase is a key enzyme in the biosynthesis of oleanolic acid, which has anti-microbial activity (Liu et al., 2016). In addition to this gene and the above-mentioned gene coding for a GLP, many genes related to pathogen resistance were detected as more abundant in loose clones in this study as well as in the previous one (Grimplet et al., 2017). These results indicate that lower pest and disease incidence in loose clusters might be related not only to physical features (Tello and Ibáñez, 2018) but also to molecular mechanisms favoring tolerance. The previous study was performed on a different cultivar (Garnacha Tinta) with the same observation. It would be useful to study the effect of a lower degree of compactness on the expression of defense-related genes in different clusters of a single plant and at later stages.

### Cellular Mechanisms Working Differentially in the Clones

The genes described in the previous section exhibited dramatic differences in expression or sequence polymorphisms. Many other genes that were differentially expressed at a lower magnitude give valuable information on the behavior of networks of genes involved in molecular mechanisms differentially affected in the studied clones. More specifically, we identified major changes in hormone biosynthesis and signaling, in mechanisms related to differences in cell wall structure, and in the biosynthesis of flavonoids.

#### Hormones Metabolism and Signaling Play a Critical Role in Phenotypic Differences

Two of the analyzed hormones, JA and SA, will not be discussed further, as their quantities were stable between clones and we did not find any element in the expression analysis that would involve them.

#### **Gibberellins**

Gibberellins promote flowering through the activation of genes encoding the floral integrators in long-day plants such as Arabidopsis (Boss et al., 2004; Mutasa-Göttgens and Hedden, 2009). In grapevine, GAs inhibit flowering (Boss and Thomas, 2002) and GA treatments performed at bloom tend to favor loosening and aeration of the clusters by reducing fruitset (Dokoozlian and Peacock, 2001) and berry number (Lynn and Jensen, 1966). Giacomelli et al. (2013) measured the abundance of different endogenous GAs in Pinot Noir flower development, including only one time point matching our study: E-L 26. At this stage, our data showed levels one order of magnitude lower than in that study, but all the ratios between the different GAs were conserved. Between clones, we observed differences in GA levels, notably a higher abundance of GA<sub>1</sub> at E-L 18–19 in VP11, the clone with the lowest number of seeds; Cheng et al. (2015) reported that a pre-bloom GAac treatment in a grapevine cultivar induced seed abortion. As mentioned before, GAs are known to play a role in flowering initiation, but there was no difference in expression for any of the known genes involved in the flowering pathway. We also never observed any difference between clones in flowering timing, which allowed us to discard the flowering regulatory network as responsible for the differences observed in cluster compactness or fruitset between clones.

Although the molecular mechanisms specific to grape responses to GAs are not fully known, progress is being made: it was recently shown that grape flower abscission mechanisms triggered by GAac application differ from those promoted by other stimuli, such as shading (C-starvation) (Domingos et al., 2015). In addition, Cheng et al. (2015) recently established the list of genes responding to a GAac treatment in the grapevine flower. We clearly identified changes in expression for genes involved in GA metabolism and for genes known to be regulated by GAs, in particular genes related to cell wall mechanisms. Two isoforms (Vitvi09g00452 and Vitvi04g00435) of the enzyme catalyzing the last step of the active GA biosynthesis pathway, the GA 3-beta dioxygenase, were more abundant in the compact clone RJ51 than in the two loose clones at E-L 18–19. Levels were also consistently higher in VP2, but the difference was only significant versus VP11 for Vitvi09g00452. Cheng et al. (2015) also documented an increase of GA 3-beta dioxygenase related to a higher GA content, in that case after treatment with exogenous GAac.

Hormone analysis at E-L 18–19 showed that, of the two active GAs, GA<sub>1</sub> was more abundant in the VP11 clone, while GA<sub>4</sub> could only be detected in VP2. This might be explained by a higher turnover and rate of degradation. Three isoforms (Vitvi06g00659, Vitvi05g00163, and Vitvi19g02230) of the enzyme degrading active GAs into inactive GAs, the GA 2-beta dioxygenase, were differentially expressed. Cheng et al. (2015) found two GA 2-beta dioxygenases upregulated after GAac treatment, but they were other isoforms, located on chromosomes 7 and 10. Vitvi06g00659 expression fitted the profiles of GA<sub>1</sub> across clones and stages. At E-L 18–19, it was more abundant in loose clones (only significant between VP11 and RJ51), while a drop in expression was observed at E-L 26 in VP11 only. At E-L 18–19, Vitvi05g00163 was also more abundant in the loose clone VP11 but less abundant in VP2 than in RJ51. Vitvi19g02230 was specifically down-regulated in VP11 at E-L 18–19. These two genes did not show differences between clones at E-L 26.

One hypothesis is that these expression changes might be part of an autoregulatory process induced by GA<sub>1</sub>. Such a process has been hypothesized before in maize and Arabidopsis (Phillips et al., 1995): GA-induced down-regulation of GA biosynthetic enzymes. Here it might also involve GA-induced up-regulation of the GA catabolic enzyme. Higher GA<sub>1</sub> content correlates with a lower transcript level of the biosynthetic GA3ox and a higher transcript level of the catabolic GA2ox in VP11, possibly translating later into the lower GA<sub>1</sub> content in that clone at E-L 26.

The substrate specificity of some of these enzymes has been studied in grapevine by Giacomelli et al. (2013) for isoforms potentially involved in the catabolism of any active GA, but no enzyme showed specificity for GA<sub>1</sub>. Vitvi05g00163, which was studied as (Vv)GA2ox4, was able to convert in vitro all 13-hydroxylated and non-13-hydroxylated GA substrates (GA<sub>1</sub>, GA<sub>4</sub>, GA<sub>9</sub>, and GA<sub>20</sub>). Vitvi19g02230 (GA2ox3) was able to convert the active GA<sub>1</sub> and GA<sub>4</sub> (but not GA<sub>9</sub> and GA<sub>20</sub>), as were all the GA2ox studied.

Differences in GA content may influence several mechanisms for which we observed differential expression of the transcripts involved. Exogenous GAac significantly increased wall extensibility in wheat non-mutant controls but had no effect on the near-isogenic GA-insensitive genotypes (Keyes et al., 1990). We observed large changes in the expression of many transcripts related to the cell wall, which are discussed in a dedicated section below. Among the principal genes related to the cell wall and known to be regulated by GA, some were not differentially expressed in our study. GA overproduction also promotes sucrose synthase expression and secondary cell wall deposition in cotton fibers (Bai et al., 2014). Vitvi09g00452, a sucrose synthase, showed a higher expression in VP11, fitting the profile of GA<sub>1</sub> levels in this clone. The molecular basis for the long-known role of GAs in regulating cell expansion remains less clear, but recent work revealed that DELLA proteins, which are destabilized by GAs, interfere with the function of tubulin chaperones that are required for proper microtubule dynamics and consequently for cell expansion (Locascio et al., 2013); however, no DELLA-coding genes showed differential expression here.

#### **ABA**

We observed one main significant difference between clones in ABA levels, but it could not be related to cluster compactness and related traits: ABA was clearly less abundant in VP2 at E-L 18–19 than in the other clones, and levels were similar for all clones at E-L 26. ABA acts antagonistically to GAs in flower development by participating in maintaining the female organ in a dormant state before pollination in tomato (Vriezen et al., 2008). Consistently, after a steady increase, we observed a sharp decrease at the end of bloom, once flowers are pollinated. An increase of ABA during flower development was also observed in rose petals (Sood and Nagar, 2003), and a decrease in later stages of development was observed in grapevine (Owen et al., 2009). VP2, which presents a high number of flowers and berries, did not show this increase, so this possible disturbance of ABA control over ovary dormancy does not seem to negatively affect flowering or fruitset. Unfortunately, little molecular evidence was gathered to correlate hormone levels and gene expression. One gene (Vitvi02g01286) related to the carotenoid metabolism involved in ABA biosynthesis showed expression differences; it corresponds to VviCCD4a as described by Grimplet et al. (2014). It did not show differences between VP2 and the loose clones, and its expression was more abundant in RJ51 compared to the three other clones. Some genes known to be regulated by ABA were down-regulated, but none showed differences between VP2 and the loose clones. One ABA receptor (Vitvi02g00695) was down-regulated in VP11, significantly only versus VP2, but values in RJ51 and VP25 were similar to those in VP2.

#### **Auxin**

Auxin plays a critical role in flowering: it is necessary for the initiation of floral primordia and flower formation (Cheng and Zhao, 2007). The auxin IAA was more abundant in the clone RJ51 at E-L 16–17 and E-L 18–19, important stages for flower development. In our previous study, we highlighted differences between clones in the expression of transcripts involved in auxin transport; here, none of these genes were differentially expressed. Still, we observed changes of expression in genes known to be regulated by auxin, although many of them could be regulated by other hormones and might not be directly related to variation in auxin content between clones (Ren and Gray, 2015). The ARF6-like gene (Vitvi15g01767), highlighted in our previous study because it was the most differentially expressed, was not differentially expressed here. Instead, we observed an increased expression of nine SAUR genes localized in the same region of chromosome 3. Vitvi04g01261 is a YUCCA flavin monooxygenase involved in the biosynthesis of auxin (Zhao et al., 2001) that was more highly expressed in RJ51 at E-L 18–19 compared with the three other clones. No relationship between cluster compactness phenotype and auxin levels or auxin-related gene expression could be established with the data gathered in this work.

#### **Cytokinins**

The role of cytokinins in flower development is unclear. Cytokinin content was not evaluated here, and only one gene involved in cytokinin metabolism was differentially expressed, but it might have a significant impact. Vitvi08g02412, underexpressed in VP11 at E-L 18–19, is the closest grapevine homolog of the Phaseolus gene ZOG\_PHALU, which is the only described trans-zeatin O-beta-D-glucosyltransferase in plants. This gene may regulate active vs. storage forms of cytokinins and was shown to have an impact on cell division and seed growth (Martin et al., 1999). Little is known about the disruption of the balance between trans-zeatin O-beta-D-glucoside and trans-zeatin, but it may have an impact on the aforementioned processes too.

#### Cell Wall Metabolism and Flavonoids Metabolism Pathways Genes Expression Are Vastly Affected in VP11

#### **Cell wall**

Many genes related to cell wall metabolism were differentially expressed between clones at E-L 18–19. At the later stage E-L 26, differences of expression were attenuated, since only two genes from this network were differentially expressed. This indicates that cell wall modifications could be observed until the beginning of flowering but key events might occur only early. Again, this timing supports the hypothesis that flowering itself probably has a minor or no role in the phenotypic differences observed. It also dismisses the possibility that specific mutations in late mechanisms promote higher flower abscission in loose clones. On the other hand, several elements described below indicate that key differences between VP11 and the compact clones might be related to gamete formation, specifically pollen. Pollen viability in these clones was evaluated by Tello et al. (2018) on plants from the same plot during the same season (2015), and also in 2017. Results were consistent: VP11 showed the lowest pollen viability (around 60%), followed by VP25, while the two compact clones showed a pollen viability close to 100%.

The broad differential expression (24 genes under-expressed in VP11 vs. both compact clones, plus 23 only against RJ51) indicates a lower activity in cell wall formation in VP11. It is possibly related to disruptions in cell multiplication in VP11, which is also supported by the higher expression of transcripts related to cytoskeleton regulation in compact clones. However, unlike at E-L 26 in Garnacha (Grimplet et al., 2017), the genes related to the cell cycle were not differentially expressed. Many genes related to pectin metabolism showed differential expression. Ten polygalacturonases were differentially regulated, as well as 12 pectinesterases and 6 pectate lyases. For xyloglucan degradation, 16 xyloglucan endotransglucosylases (XET) were significantly more abundant in the compact clones. Cell wall metabolism in flowers has been studied principally in the context of pollen formation, showing that disturbance of the pectin network integrity caused lower pollen viability in potato (Cankar et al., 2014) or cotton (Wu et al., 2015), which is consistent with our results. The other main topic related to the cell wall is pollen tube formation, but the studied stages were not relevant in that case, since pollination occurs between E-L 19 and E-L 26. The main hypothesis is that the observed changes are related to pollen, since one of the main differences between the two stages is the presence (E-L 18–19) or absence (E-L 26) of substantial amounts of pollen, as flowers sampled at E-L 26 had lost their anthers. Moreover, all the differentially expressed genes are described in the grapevine atlas as very abundant in pollen and flower (but not in petal and carpel), and most of them are even specific to flower and pollen (Fasoli et al., 2012). Polygalacturonases were shown to be abundant in pollen of corn and other grasses (Pressey and Reger, 1989), and they showed increased expression during pollen tube growth (Dearnaley and Daggard, 2001).

A high number of transcripts involved in the regulation of the actin cytoskeleton were also up-regulated in compact clones at E-L 18–19 (none at E-L 26). This regulation also plays an important role in pollen tube growth, but little is known about its variation during earlier events in the flower. A transcriptome analysis in Arabidopsis revealed that both cell wall metabolism and cytoskeleton were strikingly overrepresented in pollen in preparation for the progamic phase, the pollen tube growth through the pistil (Honys and Twell, 2004). As for cell wall metabolism, the differentially expressed genes were detected as pollen specific in the atlas. As an example, cofilin is an actin depolymerizing factor (ADF), which was essential for pollen viability in Physcomitrella patens (Augustine et al., 2008). Two isogenes (Vitvi15g01148 and Vitvi16g01026) were over-expressed in compact RJ51 vs. loose VP11.

#### **Flavonoids**

At least one isoform of every gene in the phenylpropanoid metabolism from phenylalanine to anthocyanin was found differentially expressed and more abundant at E-L 18–19 in VP11 than in RJ51, and many of them also vs. VP2 and even vs. VP25 (**Figure 7**). Interestingly, GAac treatment of grapevine flowers led to significantly higher polyphenol and anthocyanin content in wine (Teszlák et al., 2005). The much higher anthocyanin content also correlated with the lower Botrytis infection grade in GAac-treated grapes in the same work. As VP11 showed higher GA<sub>1</sub> content at E-L 18–19, it is reasonable to hypothesize that this increase in polyphenol metabolism gene expression might be related to GAs. Control of anthocyanin content by GAs has been documented but showed antagonistic effects in different species and organs (Loreti et al., 2008). GAs were shown to be required for the induction of anthocyanin gene transcription in flowers of Petunia hybrida (Weiss, 2000), while in immature apple fruit, anthocyanin formation appears to be repressed by endogenous GA (Saure, 1990). Cheng et al. (2015) also detected enrichment of the phenylpropanoid metabolism category after GAac treatment of grape flowers. Lower disease incidence in loose clusters, when looseness is related to higher GA content as for VP11 (but not VP25), might be favored as a side effect by an overproduction of phenylpropanoids and/or anthocyanins, likely from the flowering stages onward. Studies of the polyphenol content in musts produced from these clones did not show the same tendency, and VP11 showed the lowest values of anthocyanins and total polyphenols [(Provedo Eguía et al., 2007) and personal communication].
So, the protective effect against pests and diseases through the activation of the phenylpropanoid network would occur during early flowering, perhaps limiting infections or infestations at this time, and would contribute later, together with other physical and physiological attributes, to lowering the incidence of diseases in loose clusters.

### CONCLUSION

We described here for the first time in grapevine intracultivar differences at three levels (phenotypic, hormonal, and transcriptional) among four clones, two producing compact clusters and two producing loose clusters. The evolution of hormonal levels during inflorescence development has been shown in Tempranillo, with clear differences between hormones and few differences between the clones. Considering all the analyses, the loose clone VP25 presented few differences with the compact clones, giving no clues about general mechanisms or gene networks involved in its loose phenotype. Nevertheless, several genes that were differentially expressed or polymorphic versus the compact clones might be involved in looseness, even though their role is currently unknown. On the contrary, clone VP11 showed large differences with the compact clones in several aspects, mainly phenotypic and in gene expression. Stage E-L 18–19, corresponding to the start of flowering, was the most informative stage, and gave some indications based on the coincidence of higher levels of active GA<sub>1</sub> and reduced expression of genes involved in cell wall metabolism. As in VP25, a number of differentially expressed genes could also play a role in the phenotypic differences and are worth further investigation.

#### AUTHOR CONTRIBUTIONS

JI and JG designed the study and drafted the manuscript. JG, SI, JT, EB, and JI performed sampling and phenotyping. JI performed phenotypic analysis. JG performed the hormonal and gene expression analysis and interpretation. All authors read and approved the final manuscript.

#### FUNDING

This work was financially supported by the projects AGL2014-59171R (co-funded by FEDER) and AGL2010-15694, and by the Ramón y Cajal grant RYC-2011-07791, all from the Spanish MINECO.

#### ACKNOWLEDGMENTS

We acknowledge Viveros Provedo for the collection and maintenance of the clones, and R. Aguirrezábal, S. Hernáiz, B. Larreina, E. Vaquero, and M. I. Montemayor for their technical assistance. We also acknowledge support of the publication fee by the CSIC Open Access Publication Support Initiative through its Unit of Information Resources for Research (URICI).

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2018.01917/full#supplementary-material

FILE S1 | A) Principal component analysis at E-L 18–19 with the differentially expressed genes of all replicates of the four Tempranillo clones. B) Principal component analysis at E-L 26 with the differentially expressed genes of all replicates of the four Tempranillo clones.

FILE S2 | List of significantly differentially expressed genes (twofold ratio, p-value < 0.05) between at least two clones at E-L 18–19 or E-L 26.

FILE S3 | Clustering of the differentially expressed genes at E-L 18–19 using QT+HCL method with a threshold of 0.2. Red background: lower expression in VP11, green higher expression in VP11, blue: higher expression in VP25 vs. compact clones. Cluster 30 corresponds to leftover genes that did not fit any of the profiles.

FILE S4 | Clustering of the differentially expressed genes at E-L 26 using QT+HCL method with a threshold of 0.2. Red background: lower expression in VP11, blue: higher expression in VP25 vs. compact clones. Cluster 14 corresponds to leftover genes that did not fit any of the profiles.

FILE S5 | List of variant polymorphisms between clones. Dark color: homozygous SNP, light color: heterozygous SNP. White: unclear polymorphism.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Grimplet, Ibáñez, Baroja, Tello and Ibáñez. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Occurrence of Seedlessness in Higher Plants; Insights on Roles and Mechanisms of Parthenocarpy

Maurizio E. Picarella and Andrea Mazzucato\*

Laboratory of Biotechnologies of Vegetable Crops, Department of Agriculture and Forest Sciences, University of Tuscia, Viterbo, Italy

Parthenocarpy in a broad sense includes those processes that allow the production of seedless fruits. Such fruits are favorable to growers, because they are set independently of successful pollination, and to processors and consumers, because they are easier to deal with and to eat. Seedless fruits however represent a biological paradox because they do not contribute to offspring production. In this work, the occurrence of parthenocarpy in Angiosperms was investigated by conducting a bibliographic survey. We distinguished monospermic (single seeded) from plurispermic (multiseeded) species and wild from cultivated taxa. Out of 96 seedless taxa, 66% belonged to plurispermic species. Of these, cultivated species were represented six times higher than wild species, suggesting a selective pressure for parthenocarpy during domestication and breeding. In monospermic taxa, wild and cultivated species were similarly represented. The occurrence of parthenocarpy in wild species suggests that seedlessness may have an adaptive role. In monospermic species, seedless fruits are proposed to reduce seed predation through deceptive mechanisms. In plurispermic fruit species, parthenocarpy may exert an adaptive advantage under suboptimal pollination regimes, when too few embryos are formed to support fruit growth. In this situation, parthenocarpy offers the opportunity to accomplish the production and dispersal of few seeds, thus representing a selective advantage. Approximately 20 sources of seedlessness have been described in tomato. Excluding the EMS induced mutation parthenocarpic fruit (pat), the parthenocarpic phenotype always emerged in biparental populations derived from wide crosses between cultivated tomato and wild relatives. 
Following a theory postulated for apomictic species, we argue that wide hybridization could also be the force driving parthenocarpy, following the disruption of synchrony in time and space of reproductive developmental events, from sporogenesis to fruit development. The high occurrence of polyploidy among parthenocarpic species supported this suggestion. Other commonalities between apomixis and parthenocarpy emerged from genetic and molecular studies of the two phenomena. Such insights may improve the understanding of the mechanisms underlying these two reproductive variants of great importance to modern breeding.

Keywords: adaptation, apomixis, inventory, parthenocarpy, Solanum lycopersicum, tomato

#### Edited by:

Jose I. Hormaza, Instituto de Hortofruticultura Subtropical y Mediterránea La Mayora (IHSM), Spain

#### Reviewed by:

Jaime Prohens, Universitat Politècnica de València, Spain Rihito Takisawa, Kyoto University, Japan

> \*Correspondence: Andrea Mazzucato mazz@unitus.it

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 23 October 2018 Accepted: 24 December 2018 Published: 18 January 2019

#### Citation:

Picarella ME and Mazzucato A (2019) The Occurrence of Seedlessness in Higher Plants; Insights on Roles and Mechanisms of Parthenocarpy. Front. Plant Sci. 9:1997. doi: 10.3389/fpls.2018.01997

## INTRODUCTION

"That some plants produce fruits without seeds is a fact observed and recorded by the ancients, according to Sturtevant in 1890" is the introductory statement reported in Gustafson's comprehensive work regarding the subject of parthenocarpy (Gustafson, 1942). The reasons for such an interest are soon after explained "because seedless fruits were thought to be better and also because many varieties are self-sterile, necessitating the planting of more than one variety in an orchard to insure a profitable crop" (Gustafson, 1942).

The production of seedless fruits (apireny or parthenocarpy sensu lato) has long attracted farmers, because such fruits are set independently of successful pollination. In addition, seedless fruits are favorable to processors, because they are easier to handle, and to consumers, because they are more pleasant to eat. Seedless fruits can occur when the ovary develops directly without fertilization (parthenocarpy sensu stricto) or when pollination and fertilization trigger ovary development, but the ovule/embryo aborts without producing mature seed (stenospermocarpy). The term parthenocarpy is hereby used in its broad sense to indicate both forms of apireny. Parthenocarpy is generally driven by genetic factors; nonetheless, seedlessness can also be induced by the application of various hormones to young inflorescences (Nitsch, 1952; Schwabe and Mills, 1981). Sources of genetic parthenocarpy are either obligate or facultative. In sexually propagated species, parthenocarpic genotypes should be facultative so that they can be multiplied under conditions where the expressivity of the trait is lower. In contrast, obligate parthenocarpy can be adopted in vegetatively propagated crops (Gorguet et al., 2005). From the adaptive point of view, the production of seedless fruits is an intriguing phenomenon, because empty fruits are costly to the mother plant and do not contribute to the production of offspring. When seed set fails, the abscission of the flower is the standard pathway to avoid the waste of resources in growing structures not fulfilling a biological purpose. The occurrence and permanence of parthenocarpy in plant populations is largely the effect of human selection that harnessed seedlessness as a commodity in fruit crops (Varoquaux et al., 2000). However, parthenocarpic genotypes are also found in wild species or in crops where the main product is not the fruit (non-fruit crops), indicating the possibility of adaptive reasons underlying empty fruit formation in higher plants.

In parallel with parthenocarpy, which involves carpel development independent of pollination, the term parthenogenesis is used to indicate the development of an embryo in the absence of a male contribution. Parthenogenesis is part of the process called apomixis, a modified mode of reproduction resulting in seed production by asexual means (agamospermy, Nogler, 1984). Seeds of apomictic origin replicate the exact genome of the mother plant as they result from the parthenogenetic development of unreduced (apomeiotic) egg cells. In gametophytic apomixis, the apomeiotic egg cell is differentiated within an unreduced female gametophyte developing when a somatic nucellar cell acquires the developmental program of a megaspore (apospory) or when the meiocyte bypasses meiosis and proceeds directly with the gametophytic development (diplospory). In all cases, apomixis opens the possibility of cloning genotypes by seed. Consequently, harnessing apomixis is an exciting perspective for plant breeders, and efforts to decipher its genetic control have been strongly pursued in the last decades (Albertini et al., 2010).

In this work, we present a bibliographic investigation of the occurrence of seedlessness within flowering plants and review hypotheses on the possible "adaptive" roles of parthenocarpy. As a case study, an inventory of the sources of parthenocarpy reported in tomato indicated that wide hybridization is involved in the majority of lines showing seedlessness in this species. Parallels with studies on apomicts offered novel cues into the mechanisms controlling parthenocarpy in angiosperms.

### METHODS

The search for species in which the occurrence of parthenocarpy has been described was carried out through the available literature. The main bibliographic source was the comprehensive report by Gustafson (1942), which was combined with other publications. For each species, the phylogenetic position was listed, according to the flowering plant classification of the Angiosperm Phylogeny Group III (APG III, 2009). The ovules/ovary (seeds/fruit) ratio was considered to distinguish the species in seed categories as "monospermic" (a single seed per fruit) and "plurispermic" (more than one seed per fruit). In addition, species were distinguished according to their occurrence as wild or cultivated, and among the latter between fruit and non-fruit crops (species predominantly grown for the consumption of vegetative parts or for ornamental purposes). Finally, parthenocarpic species were classified according to their life form, fruit type, sex distribution and occurrence of polyploidy. Differences between the distribution of diploid and polyploid species within classes of seed category, life form, sex distribution and status as crop or wild were estimated by χ<sup>2</sup> test of 2 × 2 or 2 × 3 contingency tables.
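The contingency-table comparison described above can be illustrated with a minimal sketch. The counts are hypothetical placeholders, not data from Table S1, and the function names are illustrative. The text does not state which software or continuity correction was used for these tests, so the plain Pearson statistic is assumed here.

```python
import math


def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column classifications were independent.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, dof


def chi_square_p_value(chi2, dof):
    """Upper-tail probability, using the closed forms for 1 and 2 degrees of freedom."""
    if dof == 1:  # 2 x 2 table
        return math.erfc(math.sqrt(chi2 / 2.0))
    if dof == 2:  # 2 x 3 table
        return math.exp(-chi2 / 2.0)
    raise ValueError("only 1 or 2 degrees of freedom are handled in this sketch")


# HYPOTHETICAL counts: rows = diploid, polyploid; columns = monospermic, plurispermic.
ploidy_by_seed_category = [[10, 20], [20, 10]]
chi2, dof = chi_square_independence(ploidy_by_seed_category)
print(f"chi2 = {chi2:.2f}, dof = {dof}, P = {chi_square_p_value(chi2, dof):.4f}")
```

A 2 × 3 table (for example, ploidy against wild, non-fruit crop and fruit crop status) is passed in the same way and yields two degrees of freedom.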

The inventory of the sources of parthenocarpy described in tomato was carried out using a similar procedure. A first screening was based on the most comprehensive reviews (Philouze, 1983; George et al., 1984; Lukyanenko, 1991), and further details were found in additional publications, newsletters and bulletins.

To compare gene expression patterns, genes involved in fruit set in tomato were selected from the analysis reported on the pat mutant (Ruiu et al., 2015). We addressed those genes that are up-regulated after anthesis in the WT but not in the mutant (referred to as Pollination-dependent, PD group in Ruiu et al., 2015) and those that are up-regulated after anthesis in the mutant but not in the WT (referred to as Fruit growth-related, FG group). PD and FG gene lists were used to retrieve gene expression at anthesis and a few days after in cultivated tomato (Solanum lycopersicum L., formerly Lycopersicon esculentum Miller; cv M82) and in S. pimpinellifolium L. (formerly L. pimpinellifolium Miller) using the Tomato Expression Atlas (TEA) at the Sol Genomics Network website (sgn, https://solgenomics.net; Shinozaki et al., 2018). Comparable data were available for tissue-specific analysis on pericarp, placenta and septum. The ratio of expression after anthesis to that at anthesis was calculated and expressed as logFC. Genes showing no expression after anthesis in TEA databases were discarded from the analysis, whereas genes showing no expression before anthesis were assigned the arbitrary value of 0.01 in order to allow a logFC value to be calculated. For the two groups of genes and the three tissues, the correlation coefficients between the logFC in Chico III and M82 and between Chico III and S. pimpinellifolium were calculated using the SAS software package (SAS® University Edition).
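The logFC rules stated above can be sketched as follows. This is only an illustration under stated assumptions: the base of the logarithm (base 2 here) and the type of correlation coefficient (Pearson here) are not specified in the text, the gene names and expression values are hypothetical, and the analysis itself was run in SAS rather than Python.

```python
import math

PSEUDO_EXPRESSION = 0.01  # arbitrary value given in the text for zero expression at anthesis


def log_fc(at_anthesis, after_anthesis):
    """log2(after / at); returns None for genes that the text says are discarded."""
    if after_anthesis == 0:
        return None  # no expression after anthesis: gene dropped from the analysis
    if at_anthesis == 0:
        at_anthesis = PSEUDO_EXPRESSION
    return math.log2(after_anthesis / at_anthesis)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlate_log_fc(expr_a, expr_b):
    """Correlate logFC between two genotypes over the genes retained in both (assumed rule)."""
    pairs = []
    for gene in expr_a.keys() & expr_b.keys():
        fc_a, fc_b = log_fc(*expr_a[gene]), log_fc(*expr_b[gene])
        if fc_a is not None and fc_b is not None:
            pairs.append((fc_a, fc_b))
    return pearson([a for a, _ in pairs], [b for _, b in pairs])


# HYPOTHETICAL (at anthesis, after anthesis) expression values per gene.
genotype_a = {"gene1": (1.0, 4.0), "gene2": (2.0, 1.0), "gene3": (0.0, 0.5), "gene4": (3.0, 0.0)}
genotype_b = {"gene1": (2.0, 6.0), "gene2": (4.0, 3.0), "gene3": (0.0, 0.2), "gene4": (1.0, 2.0)}
print(round(correlate_log_fc(genotype_a, genotype_b), 3))
```

In this toy input, gene3 receives the 0.01 placeholder in both genotypes, and gene4 is dropped because it shows no expression after anthesis in the first genotype.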

### RESULTS AND DISCUSSION

#### Distribution of Parthenocarpy in Flowering Plants

After our search, parthenocarpy was found to be reported in 96 Angiosperm taxa, 60 of which were listed in Gustafson (1942), while the others were added from more recent sources (**Table S1**). Exactly one third of the species were classified as monospermic and the rest as plurispermic (**Table S1**). The most represented taxonomic group was the Rosidae (49.8%, **Figure 1A**), with a higher contribution of Anacardiaceae and Rutaceae (eight species each), Rosaceae (six species), and Moraceae (four species). The Asteridae contributed 13.8% of the species, with the strong prevalence of Solanaceae (nine species). Monocots were also present in the list (11.6%; **Figure 1A**). Notably, 17.9% of the listed species belonged to Basal Eudicots, where the Cucurbitaceae were prevalent (four species; **Table S1**).

About one half of the parthenocarpic species were trees; the rest were equally distributed among annual herbs, perennial herbs and shrubs (**Figure 1B**). More than one half of the species were hermaphrodite, whereas about 40% showed a form of sex separation (**Figure 1C**). Among species with monospermic fruits, the majority had a drupe as fruit type; among the plurispermics the most common fruit was the berry (**Figure 1D**). In monospermics, about 38% of species were wild and 50% were fruit-crops (**Figure 1E**). In plurispermic species, fruit-crops were predominant (71%), but still about 8% were wild and 21% were species cultivated, but not for products of the reproductive system (**Figure 1F**). About half of the parthenocarpic species (52.1%) were polyploid or showed instances of polyploidy or evidence of hybrid origin (**Table S1**). This frequency was significantly higher (χ<sup>2</sup> = 13.1, P ≤ 0.01) than the general incidence of polyploidy in angiosperms (34.5%, Wood et al., 2009). Considering the above described classification of parthenocarpic species, polyploidy was unevenly distributed between monospermics and plurispermics, with a higher frequency in plurispermic species (**Figure 2A**). In contrast, diploid and polyploid species were evenly distributed among classes related to life form, to sex distribution or to the status as wild, non-fruit and fruit crops (**Figures 2B–D**).
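The comparison with the general incidence of polyploidy can be checked with a short worked calculation. The sketch assumes that 52.1% corresponds to 50 of the 96 taxa and that the comparison was a one-sample χ<sup>2</sup> test with one degree of freedom against the 34.5% reference rate; neither detail is stated explicitly in the text, although under these assumptions the calculation gives the reported statistic.

```python
import math


def chi_square_vs_reference(observed_yes, total, reference_rate):
    """One-sample chi-square (1 degree of freedom) against a reference proportion."""
    expected_yes = total * reference_rate
    expected_no = total - expected_yes
    observed_no = total - observed_yes
    chi2 = ((observed_yes - expected_yes) ** 2 / expected_yes
            + (observed_no - expected_no) ** 2 / expected_no)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail for 1 degree of freedom
    return chi2, p_value


# ASSUMPTION: 50 polyploid taxa out of 96 (50/96 = 52.1%); reference rate 34.5% from the text.
chi2, p_value = chi_square_vs_reference(50, 96, 0.345)
print(round(chi2, 1), p_value < 0.01)  # prints: 13.1 True
```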

The survey confirmed the previous observation that parthenocarpy is taxonomically widespread, being "not uncommon" (Gustafson, 1942) in species producing fruits with several to many seeds while representing a less frequent event in species having monospermic fruits (**Figure 3**; Roth, 1977). In addition, parthenocarpy was observed mostly among dicot taxa in both the wild and cultivated categories (**Table S1**). Polyploidy occurred with high frequency among parthenocarpic species. The wide occurrence of parthenocarpy in fruit-crops (65%) is likely the result of a selective pressure for seedlessness during their domestication and breeding. Reasons for such a selection can be several: parthenocarpy (i) releases fruit set from environmental constraints, (ii) may be advantageous for fruit processing, (iii) may improve fruit quality, or (iv) simply represents a feature appreciated by consumers. Selected varieties of watermelons, grapes, Citrus, pineapples and bananas are clear examples of fruit-crops where seedlessness is frequent (Varoquaux et al., 2000).

### The Mission of a Seedless Fruit: An Adaptive Role of Parthenocarpy

If parthenocarpy in fruit crops evidently benefited from human selection, the production of seedless fruits in wild or non-fruit crop species (**Table S1**; **Figures 1E,F**) represents an apparent biological paradox because they do not directly contribute to the production of offspring. The persistence of parthenocarpy in such species suggests the possibility of adaptive reasons for retaining empty fruits. In monospermic plants, such a role has been based on different mechanisms by which parthenocarpic fruits would reduce seed predation. In this sense, a functional role of seedless fruits has been proposed for wild parsnip (Pastinaca sativa L.; **Figure 3A**), where their occurrence has been related to a defensive value against the parsnip webworms (Zangerl et al., 1991). Given a choice between parthenocarpic and normal fruits, the webworm prefers seedless fruits because of the lower concentration of the deterrent furanocoumarins they contain. In terebinth (Pistacia terebinthus L.), parthenocarpy appears to reduce seed predation because predators cannot discriminate between seeded and seedless (deceptive) fruits, as ovaries are not yet enlarged at the time of oviposition; the larvae soon die because in the parthenocarpic fruits there is no endosperm available for feeding (Traveset, 1993). A similar hypothesis has been extended to Pistacia lentiscus L. (Verdù and García-Fayos, 1998) and Bursera morelensis L. (Ramos-Ordoñez et al., 2008) and even to explain the occurrence of empty seeds in the gymnosperm Juniperus osteosperma (Torr.) Little (Fuentes and Schupp, 1998).

The reason given for an adaptive role of parthenocarpy in monospermic species applies less readily to plurispermic taxa, where wild species and non-fruit crops with parthenocarpic fruits have nevertheless been reported. It is thought that plants have evolved flowers with a great number of ovules as a response to habitats where pollination is more uncertain (Verdù and García-Fayos, 1998). In these cases, a plant with many ovules per flower often experiences a very variable seed/ovule ratio (Burd, 1994). It is well-known that seeds supply the ovary with the hormones necessary for triggering fruit set and development (Sotelo-Silveira et al., 2014); in fact, fruits grow more in portions where seeds are developing (Crane, 1964). Differences in the number of seeds per fruit alter the cost of the fruit for the mother plant: plants invest fewer resources per seed in multi-seeded fruits than in few-seeded fruits (Obeso, 2002). Accordingly, as a strategy to optimize resources, mother plants avoid the development of plurispermic fruits with few seeds (Obeso, 2002). In the case of low pollination rates, the few seeds set are presumably not enough to support fruit growth, thus causing abscission. Under these circumstances, parthenocarpic capacities could offer the opportunity to accomplish fruit development and the production and dispersal of the few seeds that otherwise would be lost. The high incidence of parthenocarpy in plants with separate sexes (**Figures 3E,F**), which experience low pollination rates more often than hermaphrodites, supports this suggestion. In this scenario, genes for (facultative) parthenocarpy provide an adaptive advantage which would lead to a boost of seed development-related hormones (auxin and gibberellins) and the chance to produce few seeds for the next generation. In a parallel example, it was demonstrated that reduced reproductive output resulting from early flowering offers an advantage in the adaptation of invasive weeds to higher latitudes (Kralemann et al., 2018).

### Inventory of Parthenocarpic Sources in Tomato

The tomato is an important vegetable crop worldwide and a model for the study of fruit set and development (Foolad, 2007). Harnessing parthenocarpy in this species has been an important breeding objective to uncouple fruit set from environmental constraints and to provide quality traits such as higher soluble solids and ascorbic acid content (**Figure 3L**; Gorguet et al., 2005; Mazzucato and Picarella, unpublished results).

Although recent reviews have generally focussed on describing reverse genetics experiments (Gorguet et al., 2005; Shinozaki and Ezura, 2016; Joldersma and Liu, 2018), in the past inventories have been compiled that provide information on the origin and use of parthenocarpic accessions obtained by mutagenesis and conventional breeding (Philouze, 1983; George et al., 1984; Lukyanenko, 1991).

With possibly the only exception of the parthenocarpic fruit (pat) mutant obtained by EMS treatment (Bianchi and Soressi, 1969), all the parthenocarpic lines described were derived from crosses involving the cultivated tomato and different related species (**Table 1**). pat-2, one of the best studied loci for parthenocarpy in tomato, was first described by Dovedar (1973, cited by Philouze, 1983) in the progeny of a cross involving S. habrochaites S. Knapp & D. M. Spooner (formerly Lycopersicon hirsutum Dunal). pat-k was retrieved in the progeny of a cross involving pat-2, although it segregated as an independent locus (Takisawa et al., 2017). Two parthenocarpic lines obtained in The Netherlands (Zijlstra, 1985) were subsequently classified in the pat series; IVT-1 was related to a digenic control (pat-8/pat-9), whereas IVT-2 was referred to as pat-5 (Gorguet et al., 2008). Both these lines were derived from crosses with different wild species (**Table 1**). Also IL5-1 (pat-6/pat-7) was obtained after a cross with S. habrochaites (Gorguet et al., 2008).

TABLE 1 | Sources of genetic parthenocarpy in tomato described in the literature (species are reported with taxonomic names adopted after Peralta et al., 2008).

<sup>a</sup>Not determined, not applicable.

<sup>b</sup>Derived from a cross between a variant from "Severianin" and a non-parthenocarpic cultivar.

<sup>c</sup>Classified in the IPK seedbank (http://www.ipk-gatersleben.de/genbank/) as a L. esculentum Mill. convar. parvibaccatum Lehm. var. cerasiforme (Dunal) Alef.

<sup>d</sup>Classified in the IPK genebank as L. esculentum Mill. convar. fruticosum Lehm. var. pygmaeum Lehm.

The same involvement of wide crosses is found in the pedigree of those sources of parthenocarpy that were not genetically characterized. A contribution from S. lycopersicum var. cerasiforme (formerly L. esculentum var. cerasiforme) is traced in the first report of parthenocarpic fruits in tomato (Hawthorn, 1937), in the Ukrainian selection Pridneprovskij (Kraevoj, 1949, cited by Philouze, 1983), in PI190256 (Johnson and Hall, 1954), and possibly in the pedigree of the varieties Lyconorma and Lycoprea with the parental accession Heinemänns Jubileum (Reimann-Philipp, 1977 personal communication cited by Philouze, 1983). In addition, line RP 75-59 derived from a cross between Atom and Bubjekosoko, British and Russian cultivars, respectively, was characterized as pat-3/pat-4 (Reimann-Philipp, 1968, cited by Philouze, 1983). Bubjekosoko is a cherry tomato type (Mahmoud and El-Eslamboly, 2014), classified in the IPK seedbank (http://www.ipk-gatersleben.de/genbank/) as a L. esculentum Mill. convar. parvibaccatum Lehm. var. cerasiforme (Dunal) Alef taxon.

Several parthenocarpic selections obtained in Oregon had S. pimpinellifolium and S. habrochaites in their pedigree (Baggett and Frazier, 1978a,b, 1982). S. pimpinellifolium was also a relative of Carobeta, a variety carrying the introgression of the B allele responsible for the orange fruit color due to high content of β-carotene (Georgiev and Mikhailov, 1985).

Facultative parthenocarpy was also found after more distant crosses of the cultivated tomato with S. cheesmaniae (L. Riley) Fosberg (formerly L. cheesmaniae L. Riley; Mikhailov and Georgiev, 1987, cited by Lukyanenko, 1991), S. neorickii D.M. Spooner et al. (formerly L. parviflorum C.M. Rick et al.; Philouze, 1983), S. pennellii Correll [formerly L. pennellii (Correl) D'Arcy; Stoeva et al., 1985, cited by Lukyanenko, 1991], S. sitiens I.M. Johnst. [formerly L. sitiens (I.M. Johnst.) J.M.H. Shaw; R. Chetelat, personal communication], S. peruvianum L. [formerly L. peruvianum (L.) Miller; (Lesley and Lesley, 1953)] and Cyphomandra spp. (Luneva, 1957 cited by Philouze, 1983).

The association of parthenocarpy and wide hybridizations was first addressed by Lesley and Lesley (1953), who attributed the phenotype to an "exceptional combination of genes coming from the two species that involved an excessive production of auxin." After that, this association was mentioned (Philouze, 1983; Ho and Hewitt, 1986), but no specific hypothesis as to the mechanism of this observation was proposed.

### Wide Hybridization as a Force Driving Departures From Normal Sexual Plant Reproduction

Like all developmental processes, sexual plant reproduction is a complex pathway depending on external and internal stimuli and regulated by multi-dimensional checkpoints and interactions. However, early studies underlined the modular and hierarchical structure of reproductive development algorithms in plants (Haig, 1990). This suggestion was developed by modern synthetic biologists, who hold that cells and organisms are organized as a hierarchical combination of functional modules (Benner and Sismour, 2005). Following the extensive amount of data produced by high-throughput sequencing methods, the modular organization of cellular systems has emerged and led to the notion that they could be treated similarly to traditional engineering systems (electrical or mechanical). It seems therefore possible to use novel combinations of existing modules to achieve new functions in a given organism in a predictable way (Cameron et al., 2014).

First inventories of species showing agamospermic behavior revealed that apomixis occurs almost exclusively in taxa characterized by hybrid origin and polyploidy (Asker and Jerling, 1992). Following this evidence, Carman (1997) elaborated the "duplicate-gene asynchrony hypothesis" for the genetic control of apomixis. The theory, also known as "no-gene theory," postulates that modular sets of genes inherited from different species may manifest asynchronous expression in terms of heterochronicity (wrong expression or asynchrony in time) and/or heterotopicity (wrong expression in space) and as such explain modifications of the reproductive system like apospory, diplospory, and apomixis as a whole (Carman, 1997). Thus, apomixis and related reproductive variations would result from developmental programs that are ectopically and/or prematurely expressed due to the misregulation of duplicate genes in polyploids, mesopolyploids, or paleopolyploids (Carman, 1997).

According to Carman's hypothesis, the so-called "stages of evolution" of apomixis begin with weak facultative expression that has been consolidated by mutations. This is corroborated by the fact that "tendencies toward apomixis" are common in natural and synthetic polyploids (Asker and Jerling, 1992; Osborn et al., 2003). Interestingly, according to the "fading borders model," gradual heterotopic variation in the level of expression of floral organ identity genes resulted in the evolution of floral organ morphology across diversification of angiosperms, from the basalmost to the more evolved lineages (Soltis et al., 2007).

Experimental evidence of the "no-gene" theory has recently emerged from analysis of transcriptomes in apomicts. The occurrence of heterochronic gene expression, compared to sexual types, has been experimentally displayed in the diplosporous Tripsacum dactyloides (L.) L. (Grimanelli et al., 2003; Bradley et al., 2007) and Boechera retrofracta (Graham) Á. Löve & D. Löve (Sharbel et al., 2010).

### Parthenocarpy as a Consequence of Wide Hybridization

Examining the common origin from interspecific crosses of the tomato sources of parthenocarpy leads us to postulate a similar "no-gene" (that is, "no-mutation") genetic basis for parthenocarpy as well. When different genomes are "colliding" (sensu Carman, 2001) after interspecific or intraspecific wide crosses, modification of developmental programs controlling fruit set may occur through the overlapping of regulatory signals that may be spatio-temporally asynchronous and thus drive the development of the ovary independently of fertilization. Such a modification can eventually become fixed in populations if the new developmental program carries adaptive advantages, as is found in apomictic plants and parthenocarpic crops.

From an evolutionary point of view, gene interactions are postulated to be functional within species, but incompatible or deleterious in hybrids (Muller, 1942). Hybrid lethality may therefore function as a driver of seed abortion that can lead to stenospermocarpy. However, reunification of divergent genomes may more simply lead to novel patterns of expression in target loci and genetic or epigenetic changes resulting in altered gene expression, gene silencing, novel tissue specificity or activation of transposable elements (Comai et al., 2000). That these events may lead to improved fitness is witnessed by the common hybrid origin of invasive plants (Blair and Hufbauer, 2010) and by the tendency of apomicts to colonize disturbed habitats ("geographical parthenogenesis"; Cosendai and Hörandl, 2010).

Related, sexually compatible species may present different time spans for reproductive developmental modules such as development of sporangia, meiosis, gametogenesis, fertilization, and fruit set. A hybrid between these taxa could inherit different modules that may not be synchronized as in the parents. Although it is known that a specific set of genes is activated exclusively after pollination/fertilization (Vriezen et al., 2008; Ruiu et al., 2015), it is also recognized that fruit set may be driven by fertilization-independent pathways, activation of downstream genes or removal of repressors driven by mutations or hormone treatments (Pascual et al., 2009; Wang et al., 2009; Ruan et al., 2012). In this scenario, the effect of asynchrony in hybrid gene expression may be crucial to induce fruit set positive signals before fertilization can take place.

This hypothesis is supported by a number of observations that have emerged from studies on genetic parthenocarpy in tomato as detailed below:


days after anthesis and at anthesis of genes up-regulated by pollination (PD group) and early fruit growth (FG group) in cv Chico III ovaries (Ruiu et al., 2015) and in M82 and S. pimpinellifolium pericarp, placenta and septum (Tomato Expression Atlas at the Sol Genomics Network website, sgn, https://solgenomics.net; Shinozaki et al., 2018).

module organ development and identity. Ultimately such alterations may affect later processes like ovary growth.


interspecific crosses as in pummelo (C. grandis x C. paradisi) and other Citrus hybrids (Soost and Cameron, 1985; Vardi et al., 2008).

All these observations support the idea that the expression of parthenocarpy in many tomato lines is the consequence of particular combinations of (sets of) genes involved in reproduction, more than that of a single gene that underwent spontaneous or induced mutation. This possibility is supported by the high occurrence of polyploidy among parthenocarpic species that has been described and discussed before (**Figure 2**).

### Transcriptomics of Tomato Fruit Set Supports the Hybrid Origin of Parthenocarpy

A number of studies have focussed on the transcriptomic description of pollination-dependent and pollination-independent fruit set in tomato, comparing systems where parthenocarpy was driven by hormone treatment (Vriezen et al., 2008; Tang et al., 2015), expression of inductive genes (Martinelli et al., 2009; Molesini et al., 2009) or silencing of repressors (Wang et al., 2009; Mounet et al., 2012). Genetic parthenocarpy has been investigated at the transcriptomic level only in the pat3/pat4 (Pascual et al., 2009) and in the pat (Ruiu et al., 2015) mutants, but the former was the only system analyzed where seedlessness was obtained after hybridization. In this study, the authors concluded that the stage of anthesis was the most different between the wild-type and the pat3/pat4 parthenocarpic line and the key point at which many genes are differentially expressed. However, normal and parthenocarpic fruit set were transcriptionally similar, without drastic changes in gene expression between the two genotypes (Pascual et al., 2009). Thus, transcriptomic analysis of fruit set in pat3/pat4 suggested the importance of differential gene expression in time, although this study could not explicitly conclude that heterochronicity was the driving force of the entire process.

Transcriptomic studies at the fruit set stage have also been carried out in tomato wild relatives (Pattison et al., 2015; Dai et al., 2017). However, due to the lack of parallel studies, the available databases offer scarce possibility to evaluate heterochronicity in gene expression between wild and cultivated forms. To get insights into the degree of correlation of gene expression in cultivated and wild forms, genes involved in fruit set were selected from the analysis on the pat mutant (Ruiu et al., 2015) addressing those transcripts that are up-regulated after anthesis in the WT but not in the mutant (Pollination-dependent genes, PD group) and those that are up-regulated after anthesis in the mutant but not in the WT (Fruit growth genes, FG group). For all these genes, the logFC between anthesis and 4/5 DPA was calculated from expression data retrieved in the TEA database in M82 and S. pimpinellifolium, respectively and separately for different ovary tissues (pericarp, placenta, septum).

The correlations found between the two cultivated forms (Chico III and M82) ranged from 0.22 to 0.45, being more differentiated among tissues in the PD than in the FG gene group (**Figure 4**). All the correlations between Chico III and S. pimpinellifolium showed lower values, with a decrement that ranged between 30 and 74% in the PD group and between 39 and 66% in the FG gene group (**Figure 4**). Making allowance for the differences in the experimental systems compared, this analysis provided an indirect indication that specific sets of genes are differentially activated at the fruit set interface between cultivated and wild tomato.
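The comparison described here amounts to computing, for each gene, the log fold change between anthesis and 4/5 DPA in each genotype, correlating the resulting vectors, and expressing the drop in correlation as a percentage. The following is a minimal sketch under stated assumptions: log2 fold changes, a Pearson coefficient and a pseudocount of 1 are all assumptions (the text does not specify the log base, the correlation type or any pseudocount), the function names are hypothetical, and the expression values are made up for illustration.

```python
import math

def log_fc(expr_anthesis, expr_dpa, pseudocount=1.0):
    """log2 fold change of post-anthesis over anthesis expression."""
    return math.log2((expr_dpa + pseudocount) / (expr_anthesis + pseudocount))

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def percent_decrement(r_reference, r_other):
    """Relative drop of r_other with respect to r_reference, in percent."""
    return 100.0 * (r_reference - r_other) / r_reference

# Made-up (anthesis, 4/5 DPA) expression pairs for four genes
cultivated = [(3, 15), (7, 31), (1, 3), (15, 15)]
wild = [(3, 7), (7, 7), (1, 7), (15, 31)]
fc_cult = [log_fc(a, d) for a, d in cultivated]
fc_wild = [log_fc(a, d) for a, d in wild]
r = pearson(fc_cult, fc_wild)
print(fc_cult, fc_wild, round(r, 3))
```

In a real analysis the same gene list (PD or FG group) would be used for both genotypes and each ovary tissue would be treated separately, as described in the text.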

#### Further Commonalities Between Apomixis and Parthenocarpy

A similar mechanistic basis for apomixis and parthenocarpy may also be deduced by the fact that the two phenomena seldom occur in the same taxon, as reported in birch (Bogdanov and Stukov, 1976), in subtropical species of the Asteraceae (Werpachowski et al., 2004), in Citrus (Vardi et al., 2008) and in Musa (Okoro et al., 2011).

Koltunow et al. (2002) treated apomixis and parthenocarpy as phenomena with possible common bases by highlighting a number of commonalities between the two processes. First, they both derive from the disruption of molecular mechanisms that prevent the development of a floral organ (ovule and carpel, respectively) in the absence of fertilization. As such, the ovule becomes a fundamental structure in the molecular signaling underlying these mechanisms. Moreover, the two processes are stochastic and both characterized by facultativeness, that makes possible the coexistence of modified and normal processes within the same individual (Koltunow et al., 2002).

A further common element is the involvement of B-class MADS-box homeotic transcription factors in both apomixis and parthenocarpy. The fact is paradoxical since, according to the ABC model for floral organ formation, B-class genes are typically expressed in the second and third floral whorl and contribute to the identity and development of petals and stamens (Weigel and Meyerowitz, 1994). However, several authors reported the expression of SlDEF [the tomato ortholog of DEFICIENS (DEF) in Antirrhinum majus and of APETALA3 (AP3) in Arabidopsis thaliana] in the fourth floral whorl (Mazzucato et al., 2008; Tang et al., 2015). In the aposporous apomict Hieracium piloselloides Vill., the ovule presents a downregulation of DEFH in a broad zone of the chalaza that coincides with the region where aposporous initials differentiate; such a downregulation is not seen in sexual ovules (Guerin et al., 2000). In parallel, differential expression of DEF homologs has been reported in ovaries showing wild-type or parthenocarpic behavior. In tomato, SlDEF shows a peak of expression in ovaries at anthesis, which coincides with the signal that arrests ovary growth (Vriezen et al., 2008; Wang et al., 2009); such an accumulation is absent in ovaries that develop autonomously in the pat mutant (Mazzucato et al., 2008; Ruiu et al., 2015). In parallel with these findings, mutated alleles (apple; Yao et al., 2001) or epialleles (oil palm; Ong-Abdullah et al., 2015) of B-class MADS box genes have been shown to cause parthenocarpy and defects in their expression showed interference with fruit set in grapevine (Fernandez et al., 2013).

### CONCLUSIONS

The inventory of angiosperm species showing parthenocarpic behavior and of the sources of parthenocarpy in the specific case of tomato offered novel insights into the role that autonomous ovary development may have played in natural evolution and in the human-driven activity of selection and breeding. The search for novel parthenocarpic species, novel spontaneous and induced mutants as well as novel genes involved in the phenomenon will give support to the models proposed and new insights into the control of ovary development in angiosperms. In parallel with apomixis, such insights will pave the way to new opportunities to harness a modification of the reproductive system in tomato and in other fruit crops that is of great interest to modern breeding.

### AUTHOR CONTRIBUTIONS

AM and MP contributed conception and design of the study. MP organized the database and wrote the first draft of the manuscript. AM wrote sections of the manuscript. All authors contributed to manuscript revision, read and approved the submitted version.

#### ACKNOWLEDGMENTS

We warmly thank G. P. Soressi for having inspired research on parthenocarpy at the Tuscia University and developed materials used herein. E. Barcelos, M. Berenbaum, C. Gasser, and the PNAS Editorial Office, M. J. Dìez, I. Kataoka, C. Mesejo, J. Prohens, M. F. Ramos Ordoñez, J. Sardos and the PLOS ONE Editorial Office are acknowledged for use of photographic material of Elaeis oleifera, Pastinaca sativa, Annona squamosa, Cucumis sativus, Actinidia arguta, Citrus clementina, Solanum muricatum, Bursera aptera, and Musa acuminata, respectively. Esther van der Knaap is deeply acknowledged for critical reading and discussion on the manuscript.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2018.01997/full#supplementary-material

#### REFERENCES

Asker, S. E., and Jerling, L. (1992). Apomixis in Plants. Boca Raton, FL: CRC Press.

Baggett, J. R., and Frazier, W. A. (1978a). 'Oregon cherry' tomato. Hort Sci. 13:598.

Baggett, J. R., and Frazier, W. A. (1978b). Oregon T5-4 parthenocarpic tomato line. Hort Sci. 13:599.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Picarella and Mazzucato. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Polyploidy in Fruit Tree Crops of the Genus Annona (Annonaceae)

Carolina Martin†, Maria A. Viruel, Jorge Lora and José I. Hormaza\*

Instituto de Hortofruticultura Subtropical y Mediterránea La Mayora (IHSM-UMA-CSIC), Málaga, Spain

Genome duplication or polyploidy is one of the main factors of speciation in plants. It is especially frequent in hybrids and very valuable in many crops. The genus Annona belongs to the Annonaceae, a family that includes several fruit tree crops, such as cherimoya (Annona cherimola), sugar apple (Annona squamosa), their hybrid atemoya (A. cherimola × A. squamosa) or pawpaw (Asimina triloba). In this work, genome content was evaluated in several Annona species, A. triloba and atemoya. Surprisingly, while the hybrid atemoya has been reported as diploid, flow cytometry analysis of a progeny obtained from an interspecific cross between A. cherimola and A. squamosa showed an unusual ploidy variability that was also confirmed by karyotype analysis. While the progeny from intraspecific crosses of A. cherimola showed polyploid genotypes that ranged from 2.5 to 33%, the hybrid atemoyas from the interspecific cross showed 35% of triploids from a total of 186 genotypes analyzed. With the aim of understanding the possible implications of the production of non-reduced gametes, pollen performance, pollen size and frequency distribution of pollen grains were quantified in the progeny of this cross and the parents. A large polymorphism in pollen grain size was found within the interspecific progeny with higher production of unreduced pollen in triploids (38%) than in diploids (29%). Moreover, using PCR amplification of selected microsatellite loci, while 13.7% of the pollen grains from the diploids showed two alleles, 41.28% of the grains from the triploids amplified two alleles and 5.63% showed up to three alleles. This suggests that the larger pollen grains could correspond to diploid and, at a lower frequency, to triploid pollen. Pollen performance was also affected, with lower pollen germination in the hybrid triploids than in both diploid parents. The results confirm a higher percentage of polyploids in the interspecific cross, affecting pollen grain size and pollen performance. The occurrence of unreduced gametes in A. cherimola, A. squamosa and their interspecific progeny, which may result in abnormalities of ploidy such as the triploids and tetraploids observed in this study, opens an interesting opportunity to study polyploidy in Annonaceae.

Keywords: Annona, Annonaceae, karyotype, polyploidy, triploid, tetraploid, unreduced gametes

#### Edited by:

Emidio Albertini, University of Perugia, Italy

#### Reviewed by:

Richard M. K. Saunders, The University of Hong Kong, Hong Kong Michael David Pirie, Johannes Gutenberg University Mainz, Germany

> \*Correspondence: José I. Hormaza ihormaza@eelm.csic.es

†Present address: Carolina Martin, Rijk Zwaan Ibérica S.A., Almería, Spain

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 08 December 2018 Accepted: 22 January 2019 Published: 11 February 2019

#### Citation:

Martin C, Viruel MA, Lora J and Hormaza JI (2019) Polyploidy in Fruit Tree Crops of the Genus Annona (Annonaceae). Front. Plant Sci. 10:99. doi: 10.3389/fpls.2019.00099

### INTRODUCTION

Polyploidy is believed to be a major mechanism of adaptation and speciation, recognized as a major force in evolution (Van de Peer et al., 2017) and very valuable for crop improvement (Udall and Wendel, 2006; Mason, 2016). Polyploidy is more common in plants than in animals. It is estimated that between 30 and 70% of extant flowering plant species are polyploids (Bretagnolle and Thompson, 1995; Ramsey and Schemske, 1998; Otto and Whitton, 2000; Adams and Wendel, 2005) and all angiosperms are supposed to have descended from polyploid ancestors and, consequently, would indeed be paleopolyploids (Debodt et al., 2005; Cui et al., 2006; Jaillon et al., 2007; Soltis et al., 2009; Amborella Genome Project, 2013).

Diploid species have two sets of homologous chromosomes. Each chromosome of one set may pair with a corresponding one of the other set during meiosis. Such normal meiosis produces haploid gametes. Abnormal meiosis, due to various genetic and environmental factors, can generate unreduced gametes, that is, gametes with the somatic chromosome number (Ahmad et al., 1984; Saini, 1997; Viccini and De Carvalho, 2002; Sun et al., 2004; Bajpai and Singh, 2006; Rezaei et al., 2010; Wang et al., 2017). Studies on polyploid evolution have revealed that the most common mechanism of polyploidy in flowering plants involves unreduced gametes (Bretagnolle and Thompson, 1995; Darlington, 1937, 1965; Harlan and deWet, 1975; deWet, 1979; Karpechenko, 2010; Kreiner et al., 2017). Different theoretical models of polyploidy have been elaborated considering that the success of tetraploids arising by sexual polyploidization within a diploid population is influenced by the frequency with which unreduced gametes are produced by diploids (Felber, 1991; Felber and Bever, 1997; Ramsey and Schemske, 1998; Li et al., 2004). There are two main ways to produce unreduced diploid gametes in plants, as a result of first-division restitution (FDR) or second-division restitution (SDR) during meiosis (Hermsen, 1984; De Storme and Geelen, 2013). In FDR, the first meiotic division fails and, as a consequence, the two chromosomes of the unreduced gamete are non-sister chromatids. In SDR, the second meiotic division fails and the two chromosomes of the unreduced gamete are sister chromatids. A third proposed mechanism has been reported as indeterminate meiotic restitution (IMR), which produces microspores with a disproportionate number of chromosomes due to a restitution mechanism (Lim et al., 2001; De Storme and Geelen, 2013).
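The genetic difference between the two main restitution routes can be illustrated with a toy model. The sketch below assumes a single heterozygous locus and no crossover between the locus and the centromere (a simplification not made in the text); the function name and allele labels are hypothetical, and IMR is not modelled.

```python
def meiotic_products(homolog_1, homolog_2, mode="normal"):
    """Gametes from one meiosis at a single locus, assuming no crossover.

    mode: "normal", "FDR" (first division fails) or "SDR" (second fails).
    Each gamete is a tuple of the alleles it carries.
    """
    # After DNA replication each homolog consists of two sister chromatids.
    sisters_1 = (homolog_1, homolog_1)
    sisters_2 = (homolog_2, homolog_2)
    if mode == "normal":
        # Homologs separate in meiosis I, sisters in meiosis II.
        return [(c,) for c in sisters_1 + sisters_2]
    if mode == "FDR":
        # Homologs stay together; sisters separate, so each gamete
        # receives one chromatid from each homolog (non-sisters).
        return [(sisters_1[i], sisters_2[i]) for i in range(2)]
    if mode == "SDR":
        # Homologs separate; sisters stay together in each gamete.
        return [sisters_1, sisters_2]
    raise ValueError(f"unknown mode: {mode}")

for mode in ("normal", "FDR", "SDR"):
    print(mode, meiotic_products("A", "a", mode))
```

Under this simplification, FDR gametes retain the parental heterozygosity, whereas SDR gametes are homozygous; crossovers between the locus and the centromere would change these outcomes.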

The potential role of unreduced gametes in the origin and evolution of polyploids as well as in plant breeding, has been reviewed in different plant species (Harlan and deWet, 1975; deWet, 1979; Bretagnolle and Thompson, 1995; Ramsey and Schemske, 1998; Brownfield and Kohler, 2011; Sattler et al., 2016). The use of unreduced gametes in plant breeding is an effective tool for the induction of polyploidy and variation (Bringhurst and Köhler, 1984; Negri and Veronesi, 1989; Iwanaga et al., 1991; Dewitte et al., 2009, 2010; Younis et al., 2014; Bradshaw, 2016).

The Annonaceae is the largest extant family in the early-divergent eumagnoliid angiosperm clade (The Angiosperm Phylogeny Group et al., 2016). The Annonaceae contains over 2400 species in more than 130 genera, grouped in four subfamilies (Anaxagoreoideae, Ambavioideae, Annonoideae, and Malmeoideae) and 14 tribes (Chatrou et al., 2012), with a pantropical distribution (Couvreur et al., 2011; Chatrou et al., 2012; Guo X. et al., 2017). A limited number of species in the Annonaceae, belonging to just two genera in the tribe Annoneae of the subfamily Annonoideae [Annona L. and Asimina Adans., since Rollinia A. St.-Hil. has been included in Annona (Rainer, 2007)], produce edible fruits. Examples include cherimoya (Annona cherimola Mill.), sugar apple (A. squamosa L.), soursop (A. muricata L.), atemoya (a hybrid between A. cherimola and A. squamosa), custard apple (A. reticulata L.), ilama (A. macroprophyllata Donn. Sm.), pond apple (A. glabra L.), or pawpaw (Asimina triloba) (Larranaga et al., 2019). Among them, cherimoya and sugar apple have been used as a food source by pre-Columbian cultures in Central and South America (Popenoe, 1989). Their cultivation has continued up to the present day, with a clear niche of expansion in countries with subtropical climates. Pawpaw is a particularly interesting fruit crop in the family because, although the fruit shows an exotic tropical flavor, Asimina is the only genus of the Annonaceae adapted to cold climates (Pomper and Layne, 2010; Losada et al., 2017).
In spite of their phylogenetic position among early-divergent angiosperms and their promise as commercial crops, few studies have evaluated ploidy in the tribe Annoneae, and several discrepancies have been shown among different works, mainly in the karyotype due to the presence of grouped chromosomes (Kumar and Ranadive, 1941; Asana and Adiata, 1945; Bowden, 1945; Darlington and Ammal, 1945; Bowden, 1948; Miege, 1960; Thakur and Singh, 1965, 1969; Walker, 1972; Morawetz, 1986a,b, 1988; Datta and De, 1990; Folorunso and Olorode, 2008). Most Annona species are diploid with the exception of A. glabra (Bowden, 1945, 1948; Darlington and Ammal, 1945; Miege, 1960; Thakur and Singh, 1969; Morawetz, 1986a) and A. lutescens (Morawetz, 1986a) that have been reported as tetraploid, and the previously named Rollinia genus (now included in Annona, Rainer, 2007), showing tetraploid (A. neolaurifolia and A. exsucca, Morawetz, 1986b) and hexaploid species (A. mucosa and A. pulchrinervis, Walker, 1972; Morawetz, 1986a,b). Triploid mutants have also been occasionally reported in Asimina triloba (Bowden, 1949). Interestingly, in the framework of a breeding program in Annona, an unexpectedly high proportion of triploid genotypes was found in the progeny of an interspecific cross involving the diploid species A. cherimola and A. squamosa (Martin, 2013). Spontaneous triploids from crosses involving diploid parents have been reported in intra- and inter-specific crosses in several genera such as Citrus (Esen and Soost, 1971), Populus (Bradshaw and Stettler, 1993), Anthoxanthum (Bretagnolle, 2001), Rhododendron (Ureshino and Miyajima, 2002), Rosa (Crespel et al., 2006), and Asparagus (Ozaki et al., 2014).

Although polyploidy is one of the main processes involved in angiosperm evolution, the mechanisms behind it have not been evaluated previously in the tribe Annoneae, in which polyploidy has been reported in several species. After finding an unexpectedly high number of triploids in the progeny of an interspecific A. cherimola × A. squamosa cross, we decided to carry out a thorough study of ploidy levels in several species of the Annoneae and to analyze in detail the reasons behind the production of unreduced gametes. For that, we first studied ploidy in several Annona species (A. senegalensis, A. montana, A. glabra, A. muricata, A. emarginata, A. neosalicifolia, A. cherimola, and A. squamosa) including the hybrid atemoya (A. cherimola × A. squamosa) and Asimina triloba. Then, we analyzed the ploidy in some backcrosses (A. cherimola × atemoya), self-crosses (atemoya × atemoya) and intraspecific crosses (A. cherimola × A. cherimola). Finally, we evaluated the implications of unreduced gametes on pollen performance, pollen size and frequency distribution of pollen grains in the parents and the progeny that result in polyploidy.

#### MATERIALS AND METHODS

#### Plant Material

Adult trees of eight Annona species (A. senegalensis, A. montana, A. glabra, A. muricata, A. emarginata, A. neosalicifolia, A. cherimola and A. squamosa) including the hybrid atemoya (A. cherimola × A. squamosa) were used in these experiments. We also included Asimina triloba because Asimina is the only genus in the tribe Annoneae in which the presence of natural triploids has been shown (Bowden, 1949). All the plant material analyzed is located in an ex situ germplasm field collection at the IHSM La Mayora-CSIC-UMA, Málaga, Spain, at latitude 36°45′N, longitude 4°4′W and altitude 35 m. The progeny obtained from an interspecific cross between diploid A. cherimola (cv. Fino de Jete, "Fj") and A. squamosa (cv. Thai seedless, "Ts"), four backcrosses (A. cherimola × atemoya), two self-crosses (atemoya × atemoya) and seven intraspecific crosses (A. cherimola × A. cherimola) were also analyzed (**Tables 1**, **2**).

#### Measure of the Relative DNA Content Using Flow Cytometry

Flow cytometry analysis was done with the Cystain UV Precise T Kit (Sysmex, Norderstedt, Germany). Crude nuclei were extracted from young leaves by chopping them with a sharp razor blade in nuclei isolation buffer (Sysmex, Norderstedt, Germany). After chopping, the crude solution was passed through a 30 µm nylon filter and mixed with staining buffer (Sysmex) in a proportion of 1 volume of crude solution to 4 volumes of staining buffer. Relative DNA content was measured using a CyFlow® PA (Partec). DNA content was quantified relative to the 2C DNA content of the tomato cultivar Moneymaker (1.9 pg) (Bennett et al., 2000). The 2C value corresponds to the DNA content of a somatic diploid nucleus (Doležel et al., 1989; De Rocher et al., 1990). DNA content was based on 3000–5000 nuclei per sample and two independent replicates and was calculated using the following formula (Dolezel and Bartos, 2005): Sample 2C DNA content = [(sample G<sub>1</sub> peak mean)/(standard G<sub>1</sub> peak mean)] × standard 2C DNA content (pg DNA). DNA content (pg) was converted to megabase pairs (Mbp), with 1 pg = 980 Mbp (Bennett et al., 2000).
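
The calculation above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the G1 peak positions are hypothetical and are not data from this study, while the tomato 2C value (1.9 pg) and the conversion factor (980 Mbp per pg) are those given in the text.

```python
TOMATO_2C_PG = 1.9   # 2C DNA content of the tomato standard 'Moneymaker' (pg), as stated in the text
MBP_PER_PG = 980     # conversion factor used in the text (1 pg = 980 Mbp)


def sample_2c_pg(sample_g1_mean, standard_g1_mean, standard_2c_pg=TOMATO_2C_PG):
    """Sample 2C DNA content (pg) from the ratio of the G1 peak means."""
    if standard_g1_mean <= 0:
        raise ValueError("standard G1 peak mean must be positive")
    return sample_g1_mean / standard_g1_mean * standard_2c_pg


def pg_to_mbp(pg):
    """Convert a DNA content in picograms to megabase pairs."""
    return pg * MBP_PER_PG


# Hypothetical G1 peak positions (arbitrary fluorescence channels), for illustration only.
content_pg = sample_2c_pg(sample_g1_mean=90.0, standard_g1_mean=100.0)
print(f"{content_pg:.2f} pg = {pg_to_mbp(content_pg):.0f} Mbp")
```

With these made-up peak positions the sample would fall in the 1.63–1.75 pg range that the Results associate with the 2× level.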

#### Karyotype

Chromosomes were observed both in flower buds and in young leaves. Flower buds were fixed and stored in 3:1 (v/v) ethanol:acetic acid. The karyotype was evaluated in somatic cells from young pistils that were hydrolyzed using 1N HCl, incubated for 30 s at 65°C, and squashed in 1% aceto-carmine under a coverslip. Young leaves were hydrolyzed using 45% acetic acid, heated over a flame for 5 s and squashed. Then, the slide was immersed in liquid nitrogen for several seconds and stained with a solution of 1 µg/mL of 4′,6-diamidino-2-phenylindole (DAPI). The methods were modified from Burley (1965), Tanaka and Okada (1972), and Leeton and Fripp (1991).

### DNA Analysis From Single Pollen Grains

#### Sample Handling

Annona cherimola and A. squamosa flowering cycles are characterized by a protogynous dichogamous system (Wester, 1910), a common characteristic in Annonaceae (Gottsberger, 1999). Dehiscent anthers from flowers at the male stage of diploid and triploid hybrids and from the two parents of the interspecific cross were shaken using forceps inside 1.5 ml Eppendorf tubes to release the pollen grains; then the pollen grains were stored in a freezer before manipulation. Pollen grains were mixed with 30 µl sterile distilled water in the 1.5 ml Eppendorf tubes and carefully pipetted back and forth. A drop from the mixture of diluted pollen was placed in a disposable Petri dish and observed under a Leica S6D stereo microscope (2.5 × magnification). Some characteristics of the pollen grain such as the rigid and thick exine (Santos and Mariath, 1999; Parre and Geitmann, 2005) and the size from 0.005 to 0.25 mm in diameter (Perveen and Qaiser, 2001; Sarkissian and Harder, 2001) facilitate handling and, as a consequence, the selection of individual pollen grains. However, in A. cherimola, as in other Annonaceae (Lora et al., 2009, 2014), the four sibling haploid microspores are held together in a persistent pollen mother cell wall that is surrounded by callose until its dissolution when the microspores are shed free. Because of that, the individual pollen grains had to be mechanically separated by moving them within the drop of water with the help of a single hair paintbrush to reduce static electricity.


TABLE 1 | DNA ploidy levels of Annona spp. and Asimina triloba.

TABLE 2 | DNA ploidy levels of the progeny from interspecific and intraspecific crosses, backcrosses and self-crosses.


Ploidy in the parents: diploids<sup>2</sup>; triploids<sup>3</sup>; tetraploids<sup>4</sup>. Fj, Fino de Jete.

#### DNA Extraction

Flow cytometry and chromosome counting do not allow the determination of the parental origin of the extra haploid genome in the triploids. However, polymorphic, heterozygous and codominant markers, like microsatellites, provide an interesting tool to analyze the ploidy level and the genetic origin. This approach can even be used with single pollen grains. For the genotyping analyses, pollen grains were collected from flowers just before anther dehiscence in 15 diploid and 15 triploid hybrids from the same interspecific cross. Twenty pollen grains were isolated per genotype. After collection, individual pollen grains were transferred to a DNA-free PCR tube (0.2 ml capacity) containing 2 µl of extraction buffer (Isagi and Suyama, 2011): 0.01% sodium dodecyl sulfate (SDS), 0.1 g/l proteinase K, 0.01 M Tris-HCl pH 7.8 and 0.01 M EDTA. Following this step, the presence of a single pollen grain in the drop of buffer was checked under the stereo microscope. The PCR tube was then closed and incubated at 37°C for 60 min and 95°C for 11 min, and the resulting DNA extract from the single pollen grain was used directly as a PCR template.

#### Single Pollen Grain Genotyping

The multiplex PCR method (Chamberlain et al., 1988) was used to amplify multiple microsatellite loci simultaneously in a single reaction. Genotypes of each pollen grain were analyzed using two microsatellite primer pairs developed in A. squamosa, LMTS52 and LMTS135 (GenBank KF010995 and KF011078, respectively). Multiplex PCR amplification was performed using a thermal cycler (Bio-Rad Laboratories, Hercules, CA, United States) under the following conditions: an initial activation step at 94°C for 1 min, then 35 cycles of denaturation at 94°C for 30 s, annealing at 50°C for 30 s, and extension at 72°C for 1 min, followed by a final extension at 72°C for 5 min. The volume of the reaction mixture was 10 µl, containing extracted DNA from a single pollen grain, 16 mM (NH4)2SO4, 67 mM Tris-HCl pH 8.8, 0.01% Tween 20, 4 mM MgCl2, 10 mM KCl, 0.1 mM of each dNTP, 2.6 µM of each primer and 0.9 U of BioTaq™ DNA polymerase (Bioline). Forward primers were labeled with a fluorescent dye on the 5′ end. PCR products were analyzed by capillary electrophoresis in a CEQ™ 8000 genetic analyzer (Beckman Coulter, Fullerton, CA, United States).

#### Occurrence and Frequency of 2n Pollen Grains

The identification and analysis of the frequency of unreduced microspores from diploid and triploid plants were also made by analyzing the distribution of pollen grain size. To measure pollen size, 0.05 g of anthers from two flowers in the male stage per genotype were placed on a microscope slide. After adding a drop of 45% acetic acid, the anthers were pressed with forceps to release the pollen grains and the remnants of the anthers were then removed. Autofluorescence from the exine of the pollen grains allowed visualizing the samples without staining. Preparations were observed under an epifluorescence Leica DM LB2 microscope with a 340–380 nm excitation filter. Measurements of the diameters of pollen grains in both parents and the hybrids were made with the ImageJ software on pictures taken with a Canon PowerShot S50 camera attached to the microscope.

The diameter of a minimum of 200 pollen grains per flower was measured and the range and distribution of pollen size were calculated for each plant. The frequency of 2n pollen grains was recorded by counting all the pollen grains of each sample and separating them into ranges of five units, according to their size. Following the frequency distribution of the pollen diameter, a threshold corresponding to the mean size in each individual was established, above which the pollen grains were considered unreduced. The percentage of unreduced microspores was calculated as the cumulative frequency above the mean diameter. Some pollen grains with irregular shapes were observed in the samples; those irregular grains, together with poorly developed ones, were not considered when determining the pollen diameter. A Student's t-test at the 0.05 significance level was used to compare the unreduced pollen production between diploid and triploid genotypes. To detect a putative relationship between the unreduced pollen production and ploidy level, Pearson's correlation coefficient at the 0.01 significance level was computed. Statistical analyses were performed using SPSS 17.0 statistical software (SPSS Inc., Chicago, IL, United States).
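
The thresholding procedure described above can be illustrated with a minimal Python sketch. The diameters below are invented for the example, the function names are hypothetical, and the class width of five is assumed to be in µm; the analyses in this work were run in ImageJ and SPSS, not with this code.

```python
from statistics import mean


def unreduced_percentage(diameters_um):
    """Return (threshold, percentage of grains above the individual's mean diameter)."""
    if not diameters_um:
        raise ValueError("no pollen diameters supplied")
    threshold = mean(diameters_um)  # per-individual threshold = mean diameter
    above = sum(1 for d in diameters_um if d > threshold)
    return threshold, 100.0 * above / len(diameters_um)


def size_classes(diameters_um, width=5):
    """Count grains per size class; keys are the lower bound of each class."""
    counts = {}
    for d in diameters_um:
        lower = int(d // width) * width
        counts[lower] = counts.get(lower, 0) + 1
    return dict(sorted(counts.items()))


# Hypothetical diameters (µm) for one individual, after discarding irregular grains.
sample = [44, 46, 48, 50, 50, 52, 54, 56, 70, 80]
threshold, pct = unreduced_percentage(sample)
print(f"threshold = {threshold} µm, grains above threshold = {pct:.0f}%")
print(size_classes(sample))
```

Because the threshold is the individual's own mean, a roughly symmetrical size distribution will place a substantial share of grains above it regardless of their ploidy, which should be kept in mind when comparing these percentages with the genotyping results.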

#### Pollen Germination

To evaluate in vitro pollen germination, pollen was maintained within the dehisced anthers as reported previously (Rosell et al., 1999); the anthers were hydrated by leaving them in a glass vial placed in a covered tray with wet filter paper for 60 min at room temperature. Then, approximately 0.02 g of pollen with anthers was placed on a 35 mm Petri dish with 1–2 ml of liquid germination medium at room temperature (Lora et al., 2006, 2012). Pollen was considered as germinated when the length of the tube was longer than the grain diameter. Data were collected from four Petri dishes with at least 200 pollen grains counted in each one. The pollen germination medium consisted of 8% sucrose, 200 mg/l MgSO4·7H2O, 250 mg/l Ca(NO3)2·4H2O, 100 mg/l KNO3, and 100 mg/l H3BO3 (Lora et al., 2006).

To evaluate in vivo pollen germination, pollen tube growth was documented using squash preparations of stigmas from hand-pollinated flowers kept in water at room temperature. For this purpose, pistils were fixed in formalin-acetic acid-alcohol (FAA) 24 h after pollination and stored at room temperature. Pistils were washed with water and placed in 1N NaOH for 1 h to soften the tissues. Stigmas were dissected and squash preparations were stained with 0.1% aniline blue in K3PO4 (Currier, 1957; Linskens and Esser, 1957) and observed with a 340–380 nm excitation filter and an LP425 barrier filter.

#### RESULTS

#### Ploidy in Annona

To study polyploidy in the tribe Annoneae, we first analyzed the ploidy and DNA content in eight Annona species and Asimina triloba (**Table 1**). Annona cherimola, A. squamosa, A. senegalensis, A. muricata, A. emarginata, and Asimina triloba showed a DNA content between 1.63 and 1.75 pg, which corresponds to ploidy level 2×. To the best of our knowledge, this is the first report of ploidy in A. senegalensis, which was diploid, and in A. neosalicifolia, which showed a higher DNA content of 4.82 pg and, consequently, can be considered hexaploid (**Table 1**).

To confirm the ploidy in the Annona species studied, we evaluated the karyotype using acetocarmine (**Figure 1**) and DAPI (**Figure 2**). The acetocarmine staining in somatic cells from young flower buds revealed all the different phases of mitosis (**Figure 1**). More specifically, DAPI staining revealed the number of somatic chromosomes, which confirmed the abovementioned ploidy. Thus, A. cherimola, A. squamosa, the atemoyas identified as diploid by flow cytometry and A. muricata showed 14 somatic chromosomes (**Figure 2**). Additionally, A. glabra showed 28 chromosomes (**Figure 2D**), confirming its 4× ploidy level (Bowden, 1948). The hexaploidy of A. neosalicifolia was also confirmed by its karyotype showing 42 chromosomes (**Figure 2E**). Thus, the karyotype and the flow cytometry analyses revealed that the basic chromosome number in Annona is seven.
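
The relationship between the somatic chromosome counts and the ploidy levels reported above is simple arithmetic with a basic number of seven. A minimal Python sketch (the function name is hypothetical) makes it explicit:

```python
BASE_NUMBER = 7  # basic chromosome number (x) reported for Annona in the text


def ploidy_level(somatic_count, base=BASE_NUMBER):
    """Ploidy level implied by a somatic chromosome count, assuming euploidy."""
    level, remainder = divmod(somatic_count, base)
    if level == 0 or remainder != 0:
        raise ValueError(f"{somatic_count} is not a positive multiple of x = {base}")
    return level


# Somatic counts reported in the text: 14 (diploids), 28 (A. glabra), 42 (A. neosalicifolia).
for count in (14, 28, 42):
    print(f"2n = {count} -> {ploidy_level(count)}x")
```

The sketch deliberately rejects counts that are not multiples of seven, since aneuploid counts cannot be assigned a ploidy level in this way.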

#### Polyploidy in the Interspecific Cross A. cherimola × A. squamosa

Annona cherimola and A. squamosa can be intercrossed to produce the cultivated A. squamosa × A. cherimola hybrid, atemoya. We analyzed DNA content in five accessions of atemoya and the results showed a ploidy of 2× (**Table 1**). To further examine ploidy in atemoya, we also studied the ploidy in 186 atemoyas of the progeny from the interspecific cross A. cherimola (cv. Fino de Jete, "Fj") × A. squamosa (cv. Thai seedless, "Ts"). Surprisingly, the interspecific cross showed 65 triploid individuals (35%, **Table 2**). We therefore also studied the progeny from four backcrosses (A. cherimola × atemoya), two self-crosses (atemoya × atemoya) and seven intraspecific crosses (A. cherimola × A. cherimola). The resulting progeny showed diploid, triploid and, interestingly, tetraploid individuals (**Table 2**). Three of the backcrosses were performed using pollen from triploid atemoyas and triploid individuals were observed in the progeny of two of them. The fourth backcross was performed in the two directions with the tetraploid individual. No fruits were obtained when the tetraploid genotype was used as maternal parent but fruits were obtained when it was used as male parent, showing 56% of triploids and 44% of tetraploids in the progeny. Anomalies in ploidy were also found in the intraspecific crosses, with 6.9% of triploid individuals, and were less frequent in the self-crosses (**Table 2**). Consistent with the flow cytometry data, the karyotype of diploid, triploid and tetraploid atemoyas from the interspecific cross A. cherimola × A. squamosa showed 14, 21 and 28 chromosomes, respectively (**Figures 2F–H**). Taken together, anomalous ploidy was more frequent in interspecific crosses than in intraspecific crosses.

### DNA Amplification and Genotyping From Single Pollen Grains

Genotyping of 20 pollen grains per genotype was performed with two microsatellite markers, LMTS52 and LMTS135, which amplified two alleles in the female parent, A. cherimola "Fj" and only one in the male parent, A. squamosa "Ts."

Within each genotype, we observed pollen grains with patterns of only one peak, and diallelic and triallelic patterns with different frequencies in triploids. Among the diploid interspecific hybrids analyzed, 13.7% of the pollen grains showed two alleles for the same locus. Among interspecific triploid hybrids, 41.28% of the pollen grains had two alleles and 5.63% three peaks corresponding to three alleles with clear and stable amplification signals.

FIGURE 1 | Chromosomes revealed using 1% acetocarmine from the progeny of the interspecific cross A. cherimola × A. squamosa during the mitotic cycle. (A) Chromosomes in a diploid hybrid during early prophase. (B,C) Prometaphase in a diploid (B) and triploid hybrid (C). (D,E) Metaphase in a diploid (D) and triploid hybrid (E). (F) Chromosomes in a triploid hybrid in an early anaphase. (G) Late anaphase in a diploid hybrid. (H,I) Telophase in a diploid hybrid. Scale bars, 10 µm.

#### Pollen Performance Was Affected in Triploid Hybrids

Changes of nuclear DNA content could affect pollen performance. Thus, we next studied pollen grain size and pollen germination. Differences in size among pollen grains from the same genotype were observed in the samples analyzed in this study (**Figure 3**).

A high polymorphism in the size of pollen grains was detected within the interspecific progeny. Among the analyzed diploid hybrids, pollen grains with sizes ranging from 22 to 100 µm were observed with an average pollen size of 52 ± 10 µm. In the case of the triploid hybrids, pollen grain size ranged from 18 to 126 µm with an average size of 56 ± 11 µm. Regarding the closely related parental genotypes, pollen grain size ranged from 43 to 63 µm with an average of 52 ± 4 µm in the female parent (A. cherimola, "Fj"), and from 33 to 58 µm with an average of 44 ± 4 µm in the male parent (A. squamosa, "Ts").

(E) Hexaploid A. neosalicifolia. (F–H) Diploid hybrid (F), triploid hybrid (G) and tetraploid hybrid (H) from the interspecific cross A. cherimola × A. squamosa. Scale bars, 10 µm.

We established a threshold in pollen grain size in each analyzed individual corresponding to its mean size to determine the percentage of unreduced pollen. Thus, in general, a higher proportion of unreduced pollen grains was observed in triploids (38 ± 9%) than in diploids (29 ± 10%). Moreover, we observed "giant" grains both in diploids and triploids with diameters larger than 75 µm. The frequency distribution of pollen size of parents and diploid and triploid progeny was clearly distinct (**Figure 3**). While in the parental genotypes mainly three pollen grain diameters can be distinguished (**Figures 3A–D**), both diploid and triploid hybrids showed an overlap in size distribution between reduced and unreduced pollen and a larger variation in the frequency distribution of pollen grain size (**Figures 3E–H**). The production of unreduced pollen differed significantly between diploids and triploids (t = −3.01, df = 38, P = 0.005). The distribution was unimodal and symmetrical in diploids whereas it was skewed toward larger diameter values in triploids (**Figure 3**). A significant correlation was observed between the frequency of unreduced pollen and the ploidy level of the hybrids (Pearson's correlation coefficient r = 0.439, P = 0.005, N = 40). Additionally, individual frequency distribution data revealed that some genotypes from the interspecific progeny, including diploids and triploids, are more capable of producing unreduced pollen than others.
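
The t statistic and the correlation coefficient reported above are numerically linked. When one variable takes only two values (here, diploid or triploid) and the t-test uses a pooled variance, the Pearson coefficient is a point-biserial correlation whose magnitude is |t|/√(t² + df). The short Python sketch below checks this with the reported values; it assumes that ploidy was coded as a two-level variable and that a pooled-variance test was used, neither of which is stated explicitly in the text, and the function name is hypothetical.

```python
import math


def r_from_t(t, df):
    """Magnitude of the point-biserial correlation implied by a two-sample t statistic."""
    if df <= 0:
        raise ValueError("degrees of freedom must be positive")
    return abs(t) / math.sqrt(t * t + df)


# Values reported in the text: t = -3.01 with df = 38 (N = 40 hybrids).
r = r_from_t(-3.01, 38)
print(f"|r| = {r:.3f}")
```

Under these assumptions the result agrees with the reported r = 0.439, so the two tests convey the same information about the difference between diploid and triploid hybrids.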

We next evaluated if these morphological differences among pollen grains result in differences in pollen germination. Percentage of pollen germination in vitro in triploid hybrids ranged from 15 to 45% with an average of 27%. By contrast, pollen germination in vitro in diploid genotypes ranged between 21 and 72% with an average of 36%. Pollen germination in vitro was higher in the parental lines (75% in "Fj" and 55% in "Ts") compared with triploid (19%) and diploid (36%) hybrids. We evaluated pollen germination in vivo on the stigma using pollen from the parental line, "Fj", a triploid hybrid and a tetraploid hybrid from the backcross "Fj" × "FT197." Pollen from "Fj" showed higher germination (58% ± 3.8, n = 646) than pollen from a triploid hybrid (7.7% ± 0.4, n = 267). Pollen from a tetraploid hybrid also showed high germination (49% ± 10.5, n = 419, **Figure 4**).

#### DISCUSSION

Polyploidization is one of the main forces of evolution and crop improvement in flowering plants. Although most species of the genus Annona are diploid, the progeny of a cross between diploid A. cherimola and A. squamosa showed an unexpectedly high level of polyploidy, with a high number of triploid hybrids. Consequently, in order to understand the reasons for this unexpectedly high number of triploids, we analyzed polyploidy and its consequences for pollen performance in different species of the genus Annona and in Asimina triloba.

#### Polyploidy in Annonaceae

It has been suggested that the ancestral basic chromosome number in Annonaceae is seven (Walker, 1972), which has also been considered as the basic chromosome number in flowering plants (Grant, 1963). However, in the tribe Annoneae, a haploid chromosome number of seven has only been reported in the genus Annona; in this tribe, x = 8 has been reported in the genera Asimina (Tanaka and Okada, 1972), Disepalum (Johnson, 1989), Goniothalamus (Sauer and Ehrendorfer, 1984), and Neostenanthera (Walker, 1972). Moreover, x = 8 has also been reported in sister lineages such as the tribes Monodoreae (genera Isolona, Monodora, and Uvariopsis) (Walker, 1972) and Uvarieae (genera Dasymaschalon, Desmos, Fissistigma, Friesodielsia, Melodorum, Mitrella, Sphaerocoryne, and Uvaria) (Walker, 1972; Okada and Ueda, 1984; Sauer and Ehrendorfer, 1984; Van Heusden, 1992) and in the more basal tribes Duguetieae (genera Duguetia and Fusaea) and Xylopieae (genera Artabotrys and Xylopia) (Morawetz, 1984; Van Heusden, 1992). This could indicate that x = 7 in Annona may have arisen by aneuploid chromosome loss from ancestors with x = 8. Indeed, Morawetz (1986a) already proposed x = 8 as the original basic chromosome number in Annonaceae. In this work we have observed the haploid chromosome number seven in A. cherimola, A. squamosa and their hybrid, atemoya, which supports previous karyotype reports in both species (Kumar and Ranadive, 1941; Asana and Adiata, 1945; Thakur and Singh, 1965, 1969; Tanaka and Okada, 1972; Morawetz, 1986a) in contrast to x = 8 reported by Bowden (1945, 1948). The presence of the eighth chromosome could be due to the confusion of distant satellites (or whole arms) with separate chromosomes (Sauer and Ehrendorfer, 1984). Polyploidy has been reported in nine genera of the Annonaceae (Van Heusden, 1992), mainly tetraploids, although triploids have also been reported in Cymbopetalum (Morawetz, 1986a) and Duguetia (Morawetz, 1984). Hexaploid and octoploid ploidy levels were also observed in the former genus Rollinia (Walker, 1972; Van Heusden, 1992), which is now included in the genus Annona (Rainer, 2007). In this work, hexaploidy was also observed in Annona neosalicifolia, previously named Rollinia neosalicifolia.

Although triploid mutants have been occasionally reported in Asimina triloba (Bowden, 1949), their presence is generally unknown in the tribe Annoneae. Interestingly, we observed a high percentage of triploid hybrid atemoyas (35%) from an interspecific cross between A. cherimola and A. squamosa, although atemoya has been considered diploid (Bowden, 1948). The triploid hybrids may result from the fusion of reduced (n) and unreduced (2n) gametes. The production of unreduced pollen grains is mainly controlled genetically (Mok et al., 1975; Ramsey and Schemske, 1998; Barcaccia et al., 2003; De Storme and Geelen, 2011; Mason and Pires, 2015), although there is evidence in many crop species that the genes involved in the control of unreduced pollen production are highly influenced by climatic conditions (Bretagnolle and Thompson, 1995; Ramsey and Schemske, 1998; Bretagnolle, 2001; De Storme et al., 2012). Studies in different species of Solanaceae, Salicaceae, Poaceae, Rosaceae, and Ranunculaceae have shown the influence of environmental conditions, mainly light and temperature, on the production of 2n gametes (Veilleux and Lauer, 1981; Hermsen, 1984; Felber, 1991; Fuzinatto et al., 2008; Rezaei et al., 2010; Pecrix et al., 2011; Guo L. et al., 2017). In particular, changes in the temperatures prevailing at the gamete formation stage can cause various meiotic abnormalities due to changes in the expression of some genes that affect gamete viability (Kumar and Singhal, 2011; Singhal et al., 2011; Pecrix et al., 2011; Guo L. et al., 2017). A similar situation could be occurring in Annona in this work, where crosses were performed in a region with temperature ranges during flowering time that differ from those of the natural range of cultivation in the Neotropics.
While cherimoya is cultivated under an average annual temperature range of 18–21°C in Ecuador with limited annual fluctuations (Van Damme et al., 2000) or between 18°C and 25°C in the summer in Peru (Morton, 1987), under the growing conditions of Southern Spain the fluctuation is higher, with average high temperatures during the flowering season ranging from 19 to 29°C (Lora et al., 2011). Interestingly, this production of triploids is only observed in the progeny of the crosses made under these environmental conditions, since all five accessions of atemoya and the 338 accessions of A. cherimola maintained at the IHSM La Mayora Annona germplasm collection are diploid.

### DNA Amplification and Genotyping From Single Pollen Grains Revealed the Presence of Unreduced Pollen Grains

Due to the small size of pollen and the difficulty of manipulation, DNA extraction from single pollen grains has been attempted by laborious methods such as germination of the pollen (Ziegenhagen et al., 1996; Honsho et al., 2016) and drilling into the pollen wall using a UV-laser microbeam (Matsunaga et al., 1999). However, these methods are unsuitable for processing a large number of samples. In this work we optimized the approach of using single pollen grains through the application of multiplex PCR, or co-amplification of several microsatellite markers in the same reaction (Ghislain et al., 2004; Meudt and Clarke, 2007; Guichoux et al., 2011), to amplify the nuclear genome of a single pollen grain. The markers used in this work allowed us to detect a high level of heterozygosity transmission from the pollen donor plant to the progeny, in which first division restitution (FDR) seems to be involved. However, markers heterozygous in the male parent would be necessary to infer a possible second division restitution (SDR) situation.

Triploids usually produce euploid gametes (n, 2n) (Belling and Blakeslee, 1922; Dermen, 1931; King, 1933; Satina and Blakeslee, 1937a,b; Lange and Wagenvoort, 1973; Dujardin and Hanna, 1988; Kovalsky et al., 2018) but also 3n gametes through gamete non-reduction (Belling and Blakeslee, 1922; Lange and Wagenvoort, 1973; Mok et al., 1975). Thus, as described previously in other species, among the triploid interspecific hybrids between A. cherimola and A. squamosa we observed a frequency of 53% of pollen grains with one allele (n), 41.3% with two alleles (2n) and 5.63% with three alleles (3n). It should be pointed out, however, that diploid pollen grains containing two copies of the same allele will be scored as haploid and, consequently, the number of 2n pollen grains detected is a minimum, and the real number is probably higher. As expected, a larger percentage of gametes with two alleles was observed in the triploid hybrids, which showed approximately three times more unreduced gametes than diploids.
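
The scoring logic described above, including the reason why the 2n frequency is a minimum estimate, can be sketched as follows. This Python example is illustrative only: the allele sizes are invented, a single locus is considered for simplicity, and the function names are hypothetical.

```python
from collections import Counter

# Class assigned from the number of distinct alleles seen at one locus.
# A grain with one allele may be n, or 2n carrying two copies of the same allele.
LABELS = {1: "n (or homozygous 2n)", 2: "2n", 3: "3n"}


def classify_grain(alleles):
    """Minimum ploidy class of a pollen grain from its allele sizes at one locus."""
    distinct = len(set(alleles))
    if distinct not in LABELS:
        raise ValueError(f"unexpected number of alleles: {distinct}")
    return LABELS[distinct]


def class_frequencies(grains):
    """Percentage of grains in each class."""
    counts = Counter(classify_grain(g) for g in grains)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Hypothetical allele sizes (bp) for ten pollen grains from one triploid plant.
grains = [(180,), (192,), (180,), (186,), (192,),
          (180, 192), (180, 186), (186, 192), (180, 192),
          (180, 186, 192)]
print(class_frequencies(grains))
```

Because a grain carrying two identical alleles is indistinguishable from a reduced grain at that locus, the single-allele class absorbs some 2n grains, which is why the detected 2n frequency is a lower bound.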

### Pollen Performance Was Affected in Unreduced Pollen

DNA content can have a direct effect on pollen morphology and performance. Indeed, the examination of the size range of the pollen produced by an individual is the most common method to detect unreduced pollen. This method has been used as an indicator of the presence of 2n pollen in several species (Quinn et al., 1974; Ramanna, 1983; Sala et al., 1989; Orjeda et al., 1990; Bretagnolle, 2001; Crespel et al., 2006; Kovalsky and Solís Neffa, 2012; Nikoloudakis et al., 2018). In these studies, pollen grains with larger size have usually been considered as unreduced.

In this work, pollen grains showed larger diameters in triploid than in diploid genotypes. Size distribution of pollen grains revealed a relative overlap between unreduced and reduced pollen as described previously in other species (Maceira et al., 1993; Yan et al., 1997; Bretagnolle, 2001). Therefore, based on visual observations, we used the mean pollen grain size as a threshold to determine the unreduced pollen grains produced in the parents and in each diploid and triploid genotype analyzed. The production of unreduced pollen differed between triploid (38%) and diploid (29%) genotypes, and a significant positive correlation between the production of unreduced pollen grains and the ploidy level of the pollen-producing plant was observed. However, some of the large pollen may not be 2n (Maceira et al., 1993) and could include 3n gametes. The frequency of unreduced gametes produced by the diploid interspecific hybrids was higher than that observed in other species such as Turnera sidoides (Kovalsky and Solís Neffa, 2012) or Anthoxanthum alpinum (Bretagnolle, 2001), with a low frequency of unreduced pollen production (approximately 1%), but lower than that found in diploid populations of Populus tomentosa, with up to 50% of unreduced gametes (Zhang and Kang, 2010).

We also observed giant pollen grains with diameters larger than 75 µm both in the diploid and triploid progeny, although they occurred more frequently in triploids. These giant pollen grains may be 3n pollen (Tavoletti et al., 2000; Camadro et al., 2008; Zhang and Kang, 2010). The production of these 3n gametes would increase the probability of obtaining tetraploid plants in the progeny obtained after crossing diploid and triploid genotypes.
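
The expected ploidy of the progeny under the gamete combinations discussed above can be enumerated in a short Python sketch. It is a simplification that assumes euploid gametes only and ignores aneuploid gametes and differences in gamete viability; the variable names are hypothetical.

```python
from itertools import product

# Number of chromosome sets carried by each gamete type (euploid gametes only, by assumption).
EGG_SETS = (1, 2)        # reduced (n) or unreduced (2n) egg from a diploid mother
POLLEN_SETS = (1, 2, 3)  # n, 2n or 3n pollen, e.g. from a triploid pollen donor

# Progeny ploidy is the sum of the sets contributed by egg and pollen.
progeny = {(egg, pollen): egg + pollen for egg, pollen in product(EGG_SETS, POLLEN_SETS)}

for (egg, pollen), level in progeny.items():
    print(f"egg {egg}x + pollen {pollen}x -> progeny {level}x")
```

In this simplified scheme, a reduced egg fertilized by 2n pollen gives a triploid, whereas tetraploids can arise either from a reduced egg and 3n pollen or from two unreduced gametes.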

Our data also revealed that some genotypes from the interspecific progeny seemed prone to produce a higher percentage of unreduced pollen than others. Similar patterns have been observed in other plant species (Watanabe and Peloquin, 1989; Haan et al., 1992; Maceira et al., 1992). The frequency of 2n gametes has also been found to vary among the flowers of an individual plant in some cases, such as in Medicago sativa (McCoy et al., 1982; Veilleux et al., 1982), Solanum spp. (Veilleux, 2011), and Turnera sidoides (Kovalsky and Solís Neffa, 2012, 2016). The capacity to produce 2n gametes is governed by different alleles with different degrees of penetrance and expressivity (Bretagnolle and Thompson, 1995), and this could explain the individual variation observed within our population.

In addition to pollen size, pollen viability and germination could be affected by chromatin distribution imbalance in triploids (Bretagnolle and Thompson, 1995; Ramsey and Schemske, 1998). We observed a reduced pollen germination (7.7%) in triploid compared to the diploid (58%) genotypes. Reduction of pollen viability was reported in triploids of Manihot esculenta (cassava) (Nassar, 1992) and rose (Crespel et al., 2006) showing reductions of 19 and 50%, respectively. However, in spite of the reduced pollen viability and germination, the evidence of the functionality of the pollen from triploid hybrid atemoyas was observed in the backcrosses. Similarly, functional pollen from triploids was also reported in potato (Mok et al., 1975; Tarn and Hawkes, 1986; Adiwilaga and Brown, 1991), and Lolium (Thomas et al., 1988) and, consequently, they could be used for introgression of genetic diversity from diploid to polyploid crop varieties (Bretagnolle and Thompson, 1995).

#### Polyploidy, Evolution and Crop Improvement

Recently, molecular tools have provided an interesting insight into the regulatory and genomic consequences of polyploidy. Together with the emerging evidence of ancestral duplication through polyploidization in model plants, knowledge of these consequences has stimulated thinking on the relationship between early polyploidy events, success of the polyploids, and the long-term fate of the new species (Comai, 2005). The "2n gametes pathway" is considered to be the main way of polyploidy origin and evolution in flowering plants (Harlan and deWet, 1975; Bretagnolle and Thompson, 1995; Ramsey and Schemske, 1998; De Storme and Geelen, 2013). Although unreduced pollen grain production has been observed in several species (Bretagnolle, 2001; Ramanna and Jacobsen, 2003; Taschetto and Pagliarini, 2003; Crespel et al., 2006; Gallo et al., 2007; Camadro et al., 2008; Zhang et al., 2009; Gómez-Rodríguez et al., 2012; Xue et al., 2011), the observations made in this work in two different species (A. cherimola and A. squamosa) and their interspecific hybrids could help to explain the emergence of polyploidy and provide valuable information for crop improvement in the Annonaceae. Additional work will be needed to explain the reasons behind the unexpectedly high production of polyploid genotypes in the interspecific and intraspecific crosses in Annona. In any case, polyploidy provides genome buffering, higher allelic diversity and the possibility of new functions for the duplicated genes; all this has important implications for crop improvement (Udall and Wendel, 2006; Mason, 2016) and, indeed, many of the current crops are hybrids of polyploids (Mason, 2016). Interestingly, most of the studies of polyploidy in crops have been performed in non-perennial species (Mason, 2016) and, consequently, there is a lack of information on the mechanisms and extent of polyploidy in most woody perennial crops, such as those studied in this work. The approaches shown here to study the reasons behind the unexpectedly high number of polyploids produced in intraspecific and interspecific crosses involving diploid genotypes can be useful to perform similar analyses in other woody perennial crops.

#### AUTHOR CONTRIBUTIONS

CM, MAV, and JIH conceived the study and designed the experiments. CM performed most of the experiments. JL performed some of the additional crosses, analyzed the progeny and the in vivo pollen germination experiments. CM, JL, MAV, and JIH wrote the manuscript.

#### FUNDING

This research was supported by Ministerio de Economía y Competitividad – European Regional Development Fund, European Union (AGL2016-77267-R, AGL2015-74071-JIN), the BBVA Foundation (BIOCON 08-184/09) and INIA (RFP2015-00009). CM was supported by a predoctoral grant of the Junta de Andalucía (AGR2742).

#### ACKNOWLEDGMENTS

We thank Sonia Civico for help with the karyotype and ploidy analyses and Yolanda Verdún for help with the molecular analyses. We would like to acknowledge the comments received by the two reviewers of the manuscript that have clearly helped to improve the paper from our initial submission. We also acknowledge support of the publication fee by the CSIC Open Access Publication Support Initiative through its Unit of Information Resources for Research (URICI).

#### REFERENCES

plant family Annonaceae: steady diversification and boreotropical geodispersal. J. Biogeogr. 38, 664–680. doi: 10.1111/j.1365-2699.2010.02434.x


microsatellites (SSRs) for genotyping of cultivated potato. Theor. Appl. Genet. 108, 881–890. doi: 10.1007/s00122-003-1494-7




**Conflict of Interest Statement:** CM is currently employed by company Rijk Zwaan Ibérica S.A.

The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Martin, Viruel, Lora and Hormaza. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Identifying and Engineering Genes for Parthenogenesis in Plants

Kitty Vijverberg<sup>1</sup> \*, Peggy Ozias-Akins<sup>2</sup> and M. Eric Schranz<sup>1</sup>

<sup>1</sup> Biosystematics Group, Experimental Plant Sciences, Wageningen University and Research, Wageningen, Netherlands, <sup>2</sup> Department of Horticulture, Institute of Plant Breeding, Genetics and Genomics, University of Georgia, Tifton Campus, Tifton, GA, United States

Parthenogenesis is the spontaneous development of an embryo from an unfertilized egg cell. It naturally occurs in a variety of plant and animal species. In plants, parthenogenesis usually is found in combination with apomeiosis (the omission of meiosis) and pseudogamous or autonomous (with or without central cell fertilization) endosperm formation, together known as apomixis (clonal seed production). The initiation of embryogenesis in vivo and in vitro has high potential in plant breeding methods, particularly for the instant production of homozygous lines from haploid gametes [doubled haploids (DHs)], the maintenance of vigorous F1-hybrids through clonal seed production after combining it with apomeiosis, reverse breeding approaches, and for linking diploid and polyploid gene pools. Because of this large interest, efforts to identify gene(s) for parthenogenesis from natural apomicts have been undertaken by using map-based cloning strategies and comparative gene expression studies. In addition, engineering parthenogenesis in sexual model species has been investigated via mutagenesis and gain-of-function strategies. These efforts have started to pay off, particularly by the isolation of the PsASGR-BabyBoom-Like from apomictic Pennisetum, a gene proven to be transferable to and functional in sexual pearl millet, rice, and maize. This review aims to summarize the current knowledge on parthenogenesis, the possible gene candidates also outside the grasses, and the use of these genes in plant breeding protocols. It shows that parthenogenesis can be inherited and can function independently from apomeiosis and endosperm formation, is expressed and active in the egg cell, and can induce embryogenesis in polyploid, diploid as well as haploid egg cells in plants. It also shows the importance in parthenogenetic reproduction of genes involved in the suppression of transcription and modifications thereof on the one hand, and in embryogenesis for which transcription is allowed or artificially overexpressed on the other. Finally, it emphasizes the importance of functional endosperm to allow for successful embryo growth and viable seed production.

Keywords: apomixis, embryogenesis, embryo induction, PsASGR-BabyBoom-Like (PsASGR-BBML), doubled haploids, parthenogenesis, Pennisetum, Taraxacum

#### Edited by:

Emidio Albertini, University of Perugia, Italy

#### Reviewed by:

Stefan de Folter, Centro de Investigación y de Estudios Avanzados (CINVESTAV), Mexico

Masaru Ohme-Takagi, Saitama University, Japan

> \*Correspondence: Kitty Vijverberg kitty.vijverberg@wur.nl

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 15 November 2018 Accepted: 24 January 2019 Published: 19 February 2019

#### Citation:

Vijverberg K, Ozias-Akins P and Schranz ME (2019) Identifying and Engineering Genes for Parthenogenesis in Plants. Front. Plant Sci. 10:128. doi: 10.3389/fpls.2019.00128

**Abbreviations:** ASGR, apospory-specific genomic region; DHs, doubled haploids; FIS, fertilization-independent seed; MZT, maternal-to-zygote transition; PRC2, polycomb repressive complex 2; TEs, transposable elements; TF, transcription factor; ZGA, zygotic genome activation.

## INTRODUCTION

#### Parthenogenesis and Spontaneous Embryo Development

Parthenogenesis is the spontaneous development of an embryo from an unfertilized egg cell: parthenos = virgin, genesis = creation. It naturally occurs in a variety of plant and animal species, particularly in lower plants such as mosses and algae and species-rich invertebrate groups such as insects, nematodes, and crustaceans, but also in c. 10% of the fern and 1% of the flowering plant species, and as rare examples in vertebrates (Bell, 1982; Suomalainen et al., 1987; Asker and Jerling, 1992; Schön et al., 2009; Hand and Koltunow, 2014; Grusz, 2016). In plants, the egg cell develops within a female gametophyte, which is a multicellular organism that arises from a megaspore, a product of female meiosis. The female gametophyte is thus haploid (1n) and alternates with the diploid (2n) sporophytic generation after fertilization of the egg cell with a haploid male sperm cell (e.g., Niklas and Kutschera, 2010; Schmidt et al., 2015; Bowman et al., 2016). In more primitive plants, the mosses and ferns, female gametophytes are relatively large, free-living organisms and egg cells develop in special regions, the archegonia. In higher plants, the female gametophyte (also embryo sac) is highly reduced. Circa 70% of the angiosperm species produce female gametophytes of the Polygonum-type, which consists of seven cells only: two gametes, the egg cell and central cell, and five accessory cells, the two synergids and three antipodals (Maheshwari, 1950). Animals lack an intermediate organism similar to the female gametophyte. Here, the egg cell is a direct product of meiosis and, as such, similar to the megaspore.

Parthenogenesis usually occurs in combination with a mechanism that keeps or restores the diploid chromosome number, since haploid offspring are usually less fit or nonviable in nature. Depending on the mechanism involved, true or partial clones of the mother are produced. In angiosperms, one of two types of apomeiosis (apo = without) occurs: apospory, in which the gametophyte develops directly from a sporophytic cell of the ovule, or diplospory, in which meiosis is omitted, restituted, or preceded by endoreplication in the megaspore mother cell (Nogler, 1984; Asker and Jerling, 1992). In both cases, true clones of the mother plant are formed given that in diplospory, chromosome restitution happens before crossing-over has initiated, and after endoreplication, copy- rather than sister-chromosome pairing occurs. However, there are reported exceptions of recombination in diplosporous apomicts, e.g., in dandelion (Malecka, 1965). Apomeiosis can also be facultative, in which case part of the offspring is produced by sexual means. This is found in diplosporous species, e.g., Erigeron (Noyes, 2005), but particularly also in pseudogamous aposporous species in which ovule-derived embryo sacs develop next to the reduced embryo sac and the autonomous and the sexually derived embryos compete for resources, e.g., Paspalum (Ortiz et al., 2013). Similar mechanisms of apomeiosis exist in parthenogenetic animals, although here more often meiosis still occurs, involving either haploid offspring or restoration of diploidy through various mechanisms (Avise, 2008).

Successful embryo development depends on a third factor, the nutrition of the embryo. In angiosperms, the embryo is nourished by the endosperm, a tissue that in sexual individuals arises via fertilization of the central cell. The process of double fertilization, in which the egg cell and central cell are each fertilized by one of two clonal sperm cells, is unique to flowering plants (see for a review, e.g., Dresselhaus et al., 2016). In most apomictic species, endosperm development is pseudogamous, requiring fertilization of the central cell, whereas in a minority of apomictic species, the endosperm develops autonomously. In both cases, the usual maternal (m) versus paternal (p) genome ratio of 2m : 1p in the endosperm might be altered, which can severely affect seed development in many plant species (Scott et al., 1998; Autran et al., 2005; Kradolfer et al., 2013). Apomicts evolved different adaptations to overcome this requirement, e.g., in pseudogamous panicoid grasses only four nuclei comprise the aposporous embryo sac with predominantly unreduced, uni-nucleate central cells fertilized by a reduced sperm (Ozias-Akins, 2006). In animals, embryo nutrition is provided by the mother in one of many ways, without the need for a second fertilization event. In some parthenogenetic animal species, however, embryo development needs activation by a sperm without the fusion of gametes, known as gynogenesis or sperm-dependent parthenogenesis. Parthenogenesis and apomeiosis together, combined with either pseudogamous or autonomous endosperm formation, is defined as apomixis (sensu stricto) or agamospermy, i.e., clonal seed formation (reviewed by, e.g., Pupilli and Barcaccia, 2012; Hand and Koltunow, 2014; Conner and Ozias-Akins, 2017).
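The genome dosage arithmetic behind the 2m : 1p ratio can be made explicit with a small sketch. The function names are hypothetical and the scenarios are simplified illustrations, assuming that each polar nucleus of the central cell contributes either one (reduced) or two (unreduced) genome copies and that the sperm contributes one or none; real apomicts show more variation than these four cases.

```python
from fractions import Fraction


def endosperm_dosage(polar_nuclei, polar_copies, sperm_copies):
    """Return (maternal, paternal) genome copies in the endosperm.

    polar_nuclei : number of central cell nuclei that contribute
    polar_copies : genome copies per polar nucleus (1 = reduced, 2 = unreduced)
    sperm_copies : genome copies from the sperm (0 = autonomous endosperm)
    """
    return polar_nuclei * polar_copies, sperm_copies


def matches_2m_1p(maternal, paternal):
    """True if the maternal : paternal dosage equals the usual 2 : 1."""
    if paternal == 0:
        return False  # no paternal genome at all
    return Fraction(maternal, paternal) == 2


# Simplified scenarios: (polar nuclei, copies per nucleus, sperm copies)
scenarios = {
    "sexual, two reduced polar nuclei": (2, 1, 1),
    "pseudogamous, two unreduced polar nuclei": (2, 2, 1),
    "pseudogamous panicoid, one unreduced nucleus": (1, 2, 1),
    "autonomous, two unreduced polar nuclei": (2, 2, 0),
}

for name, args in scenarios.items():
    m, p = endosperm_dosage(*args)
    print(f"{name}: {m}m : {p}p, usual ratio kept: {matches_2m_1p(m, p)}")
```

Under these assumptions, the uni-nucleate unreduced central cell of the panicoid type restores the 2m : 1p ratio after fertilization by a reduced sperm, whereas the other two apomictic scenarios deviate from it.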

Parthenogenesis is one form of apogamy and is sometimes also used in a wider sense, including spontaneous embryo development from (a) gametophytic cell(s) other than the egg cell, which is particularly common in lower plants (Asker and Jerling, 1992). Apogamy sensu lato includes, in addition, the spontaneous development of an embryo from a sporophytic cell, known as somatic embryogenesis. This process lacks the development of an embryo sac, endosperm, and seed coat. A classic example of somatic embryogenesis is the spontaneous embryo formation at leaf margins in Kalanchoë spp. (Garcês et al., 2007). A particular form of it is adventitious embryony or polyembryony, in which the embryo(s) develop(s) from a sporophytic cell of the ovule (Nogler, 1984). Another special form is in vitro embryogenesis, in which embryos develop ex planta, usually from microspores (pollen) or, less frequently, female gametophytic cells (gametophytic embryogenesis), or from protoplasts, leaves, hypocotyls, or other plant tissues (sporophytic embryogenesis), often indirectly via the formation of a callus (Horstman et al., 2017). The reprogramming to progress into embryogenesis occurs under the influence of external stimuli such as hormones, heat stress, or overexpression of particular TFs (Hand et al., 2016; Ikeuchi et al., 2016). Although in vitro embryogenesis is successful in a range of species, many species or particular genotypes can be (very) recalcitrant to it and unable to produce embryos with any of the known stimuli (Ochatt et al., 2010; Solís-Ramos et al., 2012). Identifying a gene that is able to induce parthenogenesis, particularly in these recalcitrant species and genotypes, would be very valuable as a tool in plant breeding.

The different forms of embryogenesis are summarized in **Figure 1**. In this review, we focus on parthenogenesis in the strict sense, concerning the spontaneous development of an embryo from an unfertilized egg cell, and in flowering plants. Nevertheless, the induction of embryo development from any other cell or tissue may include commonalities with this process in the plant egg cell and, where overlapping, this will be considered in addition.

#### Historical Aspects of Parthenogenesis

The use of "Parthenogenesis" to denote asexual reproduction in plants followed its use in insects, as by Owen (1849): "On parthenogenesis, or the successive procreating individuals from a single ovum" and von Siebold (1856): "Wahre Parthenogenesis bei Schmetterlingen und Bienen, ein Beitrag zur Fortpflanzungsgeschichte der Thiere". In fact, apomixis rather than parthenogenesis is meant here, also including apomeiosis and endosperm development. The recognition that some plant species are capable of asexual reproduction came only two decades after the universality of "the law of sexual reproduction" was established, realizing that this also holds for plants (c. 1830) (Nogler, 2007). Many discussions and numerous investigations in the two prior centuries preceded this recognition (reviewed by Bergsma, 1857). Most of these studies included dioecious plant species, particularly Cannabis, Mercurialis, Spinacia, Cucurbits, Silenes, and Ricinus, and some monoecious species with separate male and female flowers, e.g., maize. However, due to the state of knowledge at that time and limited technical possibilities, all studies violated to a certain degree one or both of two essential conditions: (1) complete isolation of the plants and (2) exact observations (Bergsma, 1857). Regarding (1), some studies were done even before the discovery of the pollen or knowledge of its function (e.g., Camerarius, 1694), and regarding (2), particularly the occasional formation of male organs in female flowers remained unrecognized. A relationship between asexual seed production and either annual plants or monoecious species has also been suggested (among the early investigators: Spallanzani, L., c. 1770–1785, Bernhardi, J., c. 1834–1839, Lecoq, H., c. 1858–1867, and Naudin, C., c. 1861–1867; see Bergsma, 1857). 
Ultimately, only one of these reports of asexual seed production was confirmed and is now considered as its first proof, namely, that Coelebogyne ilicifolia (presently: Alchornea ilicifolia; Euphorbiaceae) produces perfect seeds without any apparent action of pollen (Smith, 1841). After better fixing and staining methods became available, this example of asexual seed was revealed to result from polyembryony rather than parthenogenesis (Strasburger, 1877). It took another two decades to confirm that true parthenogenesis indeed exists in angiosperms, verified on the basis of careful observations in Antennaria alpina (Juel, 1898). Apomeiosis in this species involves the omission of meiosis in the megaspore mother cell, followed by two mitotic divisions resulting in four unreduced spores, now known as the Antennaria-type of diplospory. Subsequently, parthenogenesis was proven to occur in other species, including Taraxacum and Hieracium (Murbeck, 1904), and involve additional modes of apomeiosis (Juel, 1906; Rosenberg, 1906; reviewed by Nogler, 2007). According to our current knowledge, ∼400 species from different plant families are able to produce seeds without fertilization, and apomixis evolved numerous times in plants (Carman, 1997; van Dijk and Vijverberg, 2005). This suggests that parthenogenesis likely also relies on more than one genetic mechanism.

### EGG CELL ARREST AND THE TRIGGER FOR EMBRYOGENESIS

In parthenogenetic reproduction, egg cell arrest, as is found in sexual reproduction prior to fertilization, is absent or strongly reduced. Cytological investigations indicate a short period of egg cell arrest at least in some apomictic species, followed by precocious (before anthesis) embryo development (Nogler, 1984), e.g., in dandelion (van Baarlen et al., 2002) and the wild relative of maize, Tripsacum dactyloides (Grimanelli et al., 2003). In sexual plants at the end of female gametophyte patterning (see for a review Tekleyohans et al., 2017), the mature egg cell is characterized by highly condensed repressive chromatin and a relatively quiescent transcriptional state (Garcia-Aguilar et al., 2010; Pillot et al., 2010). This is hypothesized to be necessary for attaining totipotency in the zygote and early embryo (Baroux and Grossniklaus, 2015). After fertilization and karyogamy, structural changes in the chromatin are necessary to enable access to the DNA for the transcription and replication machinery. Whether the egg cell in apomicts also undergoes a (brief) period of chromatin repression and transcriptional silencing before embryogenesis initiates is as yet unknown, but this is likely if such a period is indeed needed to obtain totipotency. Parthenogenesis may then involve factors that are responsible for the spontaneous de-repression of chromatin and activation of transcription. Alternatively, the egg cells in apomicts bypass a chromatin repressive and transcriptionally silent state and need reprogramming. In this context, it would be interesting to know the chromatin state in initial cells of embryogenesis other than egg cells, e.g., in somatic embryogenesis, to search for parallels. In any case, factors that are involved in chromatin-remodeling and transcriptional regulation are candidates to play a role in the parthenogenetic pathway.

In egg cell arrest and embryo development, a role for signals from the surrounding tissue is indicated, particularly from the companion cells, the central cell in the mature gametophyte and endosperm in the developing seeds (Grossniklaus, 2011). In sexual plant reproduction, the central cell also arrests until fertilization, but shows chromatin that is depleted of repressive marks and displays a more active transcriptional competence (Baroux and Grossniklaus, 2015). This allows for the expression of maternal alleles and TEs, the latter thought to serve the production of 24-nucleotide siRNAs to reinforce silencing of TEs in the egg cell (Ibarra et al., 2012; Roche et al., 2016; Martinez and Köhler, 2017). The former, the expression of maternal alleles, may contribute to the differential expression of maternally and paternally inherited alleles in the early endosperm after fertilization, as also results from imprinted genes (Wang and Köhler, 2017). In many species, the endosperm is, therefore, sensitive to the maternal to paternal dosage, and deviations from this lead to endosperm failure and embryo arrest (Scott et al., 1998; Kradolfer et al., 2013). Silencing of TEs in the egg cell by small RNAs from the surrounding nucellar tissue is also reported. A study in Arabidopsis shows reactivation of TEs in the egg cells of plants that are mutant for ARGONAUTE9 (AGO9), a small RNA binding protein of the RNA Induced Silencing Complex (RISC) (Olmedo-Monfil et al., 2010). Another study shows higher overall transcription levels in early embryos of parthenogenetic Tripsacum x maize hybrids as compared to embryos of sexual maize, supporting reduced silencing under parthenogenetic conditions (Garcia-Aguilar et al., 2010). Taken together, these findings suggest that: (1) dedicated (TE-)silencing pathways, involving companion cells and surrounding ovular tissue, result in dynamic patterns of transcriptional suppression in the egg cell, and (2) the m : p balance in the endosperm is important for proper functioning of the endosperm, which in turn is essential for embryo survival. They imply that changes in the accompanying cells and, as a result, in the communication to the egg cell, for example, changes in genes involved in DNA (de-)methylation or small RNA pathways, may have evolved in parthenogenesis.

FIGURE 1 | … embryo ectopically (C,D) after sexual (A) or asexual (B–D) reproduction, with orange indicating the sexual process, blue the asexual or apomictic process, pink apomictic reproduction with fertilization of the central cell, and N = chromosome set after reduction division: (A) zygotic embryogenesis, involving chromosome reduction (N) and gamete fusion (N+N for the embryo, 2N+N for the endosperm), (B) apomictic embryogenesis, occurring in the ovule, either gametophytic apomixis in which an embryo sac arises from an unreduced megaspore (diplospory) or sporophytic cell of the ovule, usually adjacent to a sexually derived spore or developing embryo sac (apospory), and parthenogenetic (spontaneous) embryo development and autonomous (spontaneous) or pseudogamous (after fertilization of the central cell) endosperm formation, or sporophytic apomixis in which the embryo arises directly from a sporophytic cell of the ovule, often as polyembryony and alongside a sexually derived embryo and endosperm, (C) somatic/sporophytic embryogenesis, involving ectopic embryo development from sporophytic cells, and (D) gametophytic embryogenesis, idem from a gametophytic cell. The latter two (C,D) omit the formation of an embryo sac, endosperm, and a seed coat, and occur naturally, for example, from leaf margins or ovular cells (C), gametophytic tissue in lower plants or, e.g., a synergid (D), but are particularly known from in vitro embryogenesis in which embryos are formed in culture, after external induction, particularly from protoplasts, leaf, the hypocotyl or other plant tissues (C), or microspores (D).

In the sexual model species Arabidopsis, central cell arrest requires control by the PRC2, an evolutionarily conserved complex that is involved in the suppression of development via the regulation of epigenetic modulation (reviewed by Mozgova and Hennig, 2015; Wang and Köhler, 2017). The PRC2 maintains the repressive state of its target genes by preserving the trimethylation of the N-terminal tail of histone H3 on lysine 27 (H3K27me3), a mark of transcriptional silencing. Different PRC2s exist, with the one involved in seed development containing the fertilization-independent seed (FIS)-class proteins: MEDEA (MEA) (Grossniklaus et al., 1998), FIS2 (Luo et al., 1999), FERTILIZATION-INDEPENDENT ENDOSPERM (FIE) (Ohad et al., 1999), and MULTICOPY SUPPRESSOR OF IRA1 (MSI1) (Köhler et al., 2003a). Mutations in one of the FIS-class genes result in autonomous endosperm formation, showing diploid nuclei and development until cellularization (Chaudhury et al., 1997). Mutations in MSI1 result in spontaneous embryo development in addition, although with early embryo abortion up to the c. 20-cell stage (Guitton and Berger, 2005). These non-viable, haploid embryos express molecular markers and polarity similar to the diploid wild-type embryos produced by fertilization. Mutants of the FIS-class genes mea and fis2 also rarely show embryo-like structures (Chaudhury et al., 1997). Since the penetrance of FIS-mutants on autonomous endosperm development is highest for msi1 (Köhler et al., 2003a), possibly the egg cell is able to undergo spontaneous development also in other FIS-mutants, but with lower penetrance (Chaudhury et al., 1997). Later studies showed that the functional requirement of the FIS-PRC2 could be bypassed by increasing the maternal genome dosage in the endosperm (Kradolfer et al., 2013), and that the FIS-PRC2 functions in the repression of maternal alleles of paternally expressed imprinted genes (reviewed in Wang and Köhler, 2017). 
The authors proposed that the FIS-PRC2 evolved concomitantly with sexual endosperm and the angiosperms. This is particularly interesting in the context of apomixis, in which the ability to reproduce sexually is lost or modified and the maternal genome dosage in the endosperm is usually increased in pseudogamous apomicts and unique in autonomous apomicts. Apomictic species may thus have become independent from the FIS-PRC2, either because it has a (relatively) modified expression or because they evolved changed requirements for it. Unraveling these changes in more detail may give clues for parthenogenetic reproduction.

Despite the great discoveries discussed above, the precise molecular mechanism(s) by which the egg cell achieves its competence and is activated for embryogenesis is still unknown. In animals, early embryogenesis mainly depends on maternal genetic information deposited in the egg cell before fertilization (Tadros and Lipshitz, 2009; Eckersley-Maslin et al., 2018). During the MZT and ZGA, maternal transcripts are degraded and zygotic ones synthesized. In flowering plants, the large cytoplasm of the egg cell also allows for the deposition of maternally derived molecules. Single cell type transcriptome analyses confirmed that the egg cell of the dicot Arabidopsis (Wüst et al., 2010) and the monocots rice (Anderson et al., 2013) and maize (Chen et al., 2017) is stocked with RNAs, proteins and other molecules that support embryogenesis upon activation. Circa 30–40% of the total number of genes are expressed in the egg cell, a percentage not notably lower than in other (gametic) cells. Several lines of evidence suggest that embryogenesis in plants also mainly relies on maternal transcripts (Autran et al., 2011), although a paternal contribution soon after fertilization is also reported (Del Toro-De León et al., 2014; Anderson et al., 2017), and hypothesized to trigger embryogenesis of fertilized egg cells (Khanday et al., 2018). If particularly or solely maternal transcripts are involved in the initiation of embryogenesis and MZT, parthenogenetic embryo development might be similar to that in sexual reproduction. However, if paternal factors are involved in addition, alternatives for their need should have evolved in parthenogenesis, e.g., by activation of usually silent maternal transcripts. Transcripts over-represented in the egg cells of Arabidopsis include TF-families, particularly those of type I MADS domain, RWP-RK domain, and reproductive meristem (Wüst et al., 2010). 
In addition, PIWI/ARGONAUTE/ZWILLE (PAZ) domain encoding genes are upregulated, supporting a role for epigenetic regulation through small RNA pathways, and the AUXIN RESPONSE FACTOR 17 (ARF17) is enriched, suggesting the involvement of auxin. An interesting recent finding that highlights the importance of auxin in embryogenesis regulation is the identification of an auxin-response network that suppresses embryo development from the suspensor in Arabidopsis (Radoeva et al., 2018). In rice and maize egg cells, TFs are also over-represented, as are genes involved in transcriptional regulation and nucleic acid binding (Anderson et al., 2013; Chen et al., 2017). A comparative transcriptome analysis between egg cells and zygotes in maize shows ZGA to involve c. 10% of the genome (Chen et al., 2017). Particularly genes that encode transcriptional regulators are activated in ZGA and chromatin assembly is modified, while the egg cell becomes primed to activate the translational machinery. In summary, data show that a range of molecules known to play a role in development are stored in the egg cell and ready for use in embryogenesis. They suggest that only a trigger is needed to release the repressive state and activate transcription and translation in order to initiate this.

In vertebrate egg cells, different lines of evidence suggest that the key trigger for egg cell activation is a rise in intracellular Ca<sup>2+</sup>, initiated by the fertilizing sperm and responsible for all further downstream reactions (Horner and Wolfner, 2008; Machaty, 2016). An increase of internal Ca<sup>2+</sup> is also detected in zygotes of maize (Digonnet et al., 1997; Antoine et al., 2001) and wheat (Pónya et al., 2014) in in vitro fertilization experiments after the fusion of the gametes. Subsequently, cell wall material is formed, likely representing a block to polyspermy. A role for Ca<sup>2+</sup> in cell–cell communication during plant fertilization was suggested by detecting a Ca<sup>2+</sup> maximum at pollen tube rupture (Dresselhaus and Franklin-Tong, 2013). A short Ca<sup>2+</sup> transient in both the egg and central cell was associated with pollen tube burst and sperm cell arrival, while a second extended Ca<sup>2+</sup> transient solely in the egg cell was correlated with successful fertilization (Denninger et al., 2014). Although Ca<sup>2+</sup> rises upon the fusion of gametes, a Ca<sup>2+</sup> rise alone apparently is insufficient to trigger parthenogenesis in plants. In some parthenogenetic organisms of other kingdoms, such as insects, stimuli imparted to the egg cell during ovulation or egg-laying, or non-sperm-based signals, e.g., a change in ionic strength or pH, can trigger egg cell activation (Horner and Wolfner, 2008). In vitro protocols for DH-production make use of external stimuli such as a change in ion concentration or other abiotic stress factors to induce embryogenesis in micro- and megaspores (Germanà, 2006; Belogradova et al., 2009; Islam and Tuteja, 2012; Hand et al., 2016). However, these external stimuli are not generally applicable: despite being successful in some species and genotypes, others can be completely recalcitrant to such triggers. Nevertheless, they suggest that the breaking of egg cell arrest and/or release of the repressive chromatin and transcriptionally silent state may (also) involve a change in (internal) physiology, particularly involving Ca<sup>2+</sup>. Searching for factors that underlie such changes will likely aid in defining the molecular basis of parthenogenesis.

In summary, data show that egg cells in sexually reproducing species undergo a period of arrest that goes together with condensed, repressive chromatin and silenced transcription, and an egg cell that is stocked with molecules ready for use in embryo development. The molecular mechanism(s) or trigger by which the egg cell is activated and embryogenesis initiates is as yet unknown, but results suggest the involvement of factors that release the chromatin-repressive and transcriptionally silent state, e.g., genes involved in (de-)methylation, small RNAs, and hormones or a change in physiology. Particularly the inactivation or modification of the FIS-PRC2 may play a role in these changes. Parthenogenetic egg cells lack arrest or arrest for only a (very) short period, and it is unknown whether this implies that chromatin repression and transcriptional silencing are also omitted. Since a quiescent state is hypothesized to be necessary to attain totipotency in the zygote, this state probably also occurs in parthenogenetic eggs, but only for a (very) short period. In any case, factors that are involved in chromatin remodeling or transcriptional regulation are likely candidates in the parthenogenesis developmental pathway.

#### GENETIC CONTROL OF PARTHENOGENESIS IN APOMICTS AND ITS INDEPENDENCY FROM APOMEIOSIS AND POLYPLOIDY

Parthenogenesis in angiosperms has long been thought to be a process that was initiated by apomeiosis and unknown to occur independently from it. Also the genetic control of parthenogenesis and apomeiosis was long assumed to rely on a single (master) gene or one locus with tightly linked genes (Mogie, 1992). However, some of the former genetic models (Richards, 1970; Nogler, 1984; Asker and Jerling, 1992) and more recent genetic mapping and other studies (reviewed by Vijverberg and van Dijk, 2007; Hand and Koltunow, 2014), showed that parthenogenesis is able to segregate from apomeiosis. Among the first evidence for this came from artificial crosses in the common dandelion (Taraxacum officinale), using diploid sexual female x triploid apomictic male crosses, resulting in small amounts of diploid, triploid, and tetraploid hybrid offspring, corresponding to fertilization of the haploid egg cell with a haploid, diploid, or triploid sperm, respectively (Tas and van Dijk, 1999; van Dijk et al., 1999). All tetraploid and some triploid hybrids showed spontaneous seed formation, thus displaying all apomixis elements. The other triploid hybrids produced (near-) diploid (type-A; three plants), triploid (type-B; four plants), or tetraploid (type-C; two plants) offspring after pollination with haploid pollen. Additional cytological investigations (van Baarlen et al., 2002) and molecular marker studies (van Dijk et al., 2003) showed that type-B hybrids were diplosporous, parthenogenetic, and lacked endosperm autonomy, while type-C hybrids were diplosporous, with autonomous endosperm formation, but lacking parthenogenesis. Apart from demonstrating that apomeiosis, parthenogenesis, and also endosperm autonomy, were inherited independently from each other, these results established that parthenogenesis functions independently from endosperm autonomy and vice versa in dandelion. 
Some offspring of the type-B triploids originated from parthenogenetic development (2n + 0 embryos), whereas the remainder resulted from fertilization of the egg cell (2n + n) suggesting incomplete, ∼2/3rd penetrance of parthenogenesis in these hybrids (van Dijk et al., 1999). Apparently, the usual precocious development of the embryo, as occurs in full apomicts, was disturbed, possibly as a result of the separation of parthenogenesis from apomeiosis or from modifiers or enhancers. All type-A triploids and the diploid hybrids gave rise to diploid offspring only after pollination with haploid pollen, suggesting that they were true sexuals. The absence of apomeiosis in diploid hybrids was later confirmed by the absence of a linked microsatellite marker in progeny from a similar cross (van Dijk et al., 2009). It was suggested to be the result of a genetic load associated to long-term asexual reproduction that becomes apparent and lethal in haploid gametes. This would then possibly also hold for parthenogenesis. Although presumed, the absence of parthenogenesis in the type-A triploids and diploid hybrids and the independent acting of parthenogenesis from apomeiosis were not explicitly demonstrated.
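The ploidy bookkeeping behind the dandelion crosses can be made explicit with a small calculation. The following is only a minimal sketch: the function name `offspring_ploidy` and the representation of ploidy as integer multiples of the basic chromosome number x are illustrative assumptions, not part of the cited studies.

```python
def offspring_ploidy(egg_x: int, sperm_x: int = 0) -> int:
    """Embryo ploidy (in multiples of x) from an egg and an optional sperm.

    sperm_x = 0 models parthenogenesis (no paternal contribution).
    """
    return egg_x + sperm_x


# Diploid sexual mother (reduced, haploid egg = 1x) crossed with a triploid
# apomictic father producing haploid, diploid, or triploid sperm.
cross = {sperm: offspring_ploidy(1, sperm) for sperm in (1, 2, 3)}

# Type-B triploid hybrid: the unreduced (3x) egg either develops
# parthenogenetically (2n + 0) or is fertilized by haploid pollen (2n + n).
type_b = {
    "2n + 0": offspring_ploidy(3),
    "2n + n": offspring_ploidy(3, 1),
}

print(cross)   # sperm ploidy -> hybrid ploidy
print(type_b)  # embryo origin -> offspring ploidy
```

Under these assumptions, the three sperm classes give diploid, triploid, and tetraploid hybrids, and only the 2n + 0 embryos of a type-B plant retain the maternal triploid level.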

Shortly afterward, the separate inheritance of apomeiosis and parthenogenesis was confirmed in another diplosporous Asteraceae, Erigeron annuus (Noyes, 2000; Noyes and Rieseberg, 2000). In a follow-up, Noyes et al. (2007) re-confirmed the existence of one locus for diplospory (D) and a separate locus for parthenogenesis and endosperm autonomy (F), and showed that spontaneous development (F) occurred in the presence as well as absence of D, although with early embryo abortion in the meiotic context. Parthenogenesis could thus act as an embryogenesis inducer in the absence of apomeiosis, whereas a possible role in embryo growth needed verification. All offspring investigated in these studies were triploid, implying imbalanced meiosis, resulting in aberrant chromosome numbers in most of the egg cells. Apparently, parthenogenesis can act in an aneuploid context, although with early embryo arrest. More recently, Noyes and Wagner (2014) showed that in autonomously produced di-haploid offspring from a tetraploid synthetic apomict (genotype: Dddd/Ffff) with incomplete, c. 50% penetrance of diplospory, D segregated 1:1 and parthenogenesis (F) was present in all (genotypes: Dd/Ff and dd/Ff). This confirmed that parthenogenesis is able to act and give rise to viable offspring independently from apomeiosis, at least in a di-haploid context. Due to the absence of the genotypes Dd/ff and dd/ff among the offspring obtained without pollination, one could infer that the functioning of F is based on its presence and expression in the female gametophyte rather than on signals from the surrounding sporophytic maternal tissue. The study also mentioned rare, spontaneous development of seeds by some of the di-haploid parthenogenetic plants; these seeds germinated normally and grew out into haploid plants up to flowering (Noyes and Wagner, 2014). This showed that parthenogenesis is able to function in a haploid context, although with (very) low efficiency. The low haploid survival rate is likely a result of recessive-lethal selection against the parthenogenesis locus, as was suggested in one of the previous studies in Erigeron for the absence of apomixis elements from diploid offspring (Noyes and Rieseberg, 2000).
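The segregation reasoning for the Dddd/Ffff plant can be checked by enumerating its reduced gametes. This is a hedged sketch under explicit assumptions that go beyond the text: random chromosome segregation without double reduction, independent assortment of the D and F loci, and parthenogenetic development only of eggs that themselves carry F (gametophytic action). The helper name `dihaploid_gametes` is hypothetical.

```python
from collections import Counter
from itertools import combinations, product


def dihaploid_gametes(genotype: str) -> Counter:
    """Count two-allele gametes drawn from a four-allele genotype.

    Assumes random chromosome segregation (every pair of the four
    homologs is equally likely) and no double reduction.
    """
    return Counter("".join(sorted(pair)) for pair in combinations(genotype, 2))


d_locus = dihaploid_gametes("Dddd")  # simplex for diplospory
f_locus = dihaploid_gametes("Ffff")  # simplex for parthenogenesis

# Assumption: the loci assort independently, and only eggs carrying F
# develop into offspring without pollination.
autonomous = Counter(
    (d, f)
    for d, f in product(d_locus.elements(), f_locus.elements())
    if "F" in f
)

print(d_locus)     # gamete classes at the D locus
print(autonomous)  # genotype classes among autonomously formed offspring
```

With these assumptions, the enumeration yields only the Dd/Ff and dd/Ff classes, in equal numbers, which matches the 1:1 segregation of D and the presence of F in all autonomously produced offspring described above.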

Currently, the separate inheritance of apomeiosis and parthenogenesis is confirmed for most apomictic model species, including aposporous and diplosporous apomicts and monocots as well as dicots, as is implied by, e.g., a cytological study in Poa pratensis that shows two aposporous, non-parthenogenetic individuals (Albertini et al., 2001); flow cytometric analysis in Hypericum perforatum, indicating parthenogenetic development in 10% of the reduced, di-haploid offspring in the presence of pseudogamous endosperm formation (Barcaccia et al., 2006); a similar analysis in the guinea grass Panicum maximum, resolving eight different reproductive pathways of seed development (Kaushal et al., 2008); a gamma deletion mapping study in Hieracium caespitosum, showing one locus for Loss-Of-Apospory (LOA) and one for Loss-Of-Parthenogenesis (LOP) (Catanach et al., 2006); comparative mapping studies in Pennisetum squamulatum versus Cenchrus ciliaris, resolving one aposporous recombinant that lacks parthenogenesis (Conner et al., 2013); and backcrossing experiments in Allium, resulting in diplosporous individuals with and without parthenogenetic development (Yamashita et al., 2012). The independent inheritance of endosperm autonomy from parthenogenesis is also supported by other studies, e.g., in Hieracium, the factor AutE was found to function independently from apospory and parthenogenesis (Ogawa et al., 2013). Mapping studies support the dominant monogenic/-locus inheritance of both apomeiosis and parthenogenesis. The usual co-segregation of the apomixis elements apparently is the result of their location in complex, repeat- and transposon-rich, non-recombining and in some species hemizygous genomic regions, particularly illustrated by the >50 Mbp long ASGR in P. squamulatum (Akiyama et al., 2004) and the Apomixis Controlling Locus in Paspalum simplex (Labombarda et al., 2002). This co-segregation likely evolved as a prerequisite for each of the apomixis elements alone to survive, since their separate occurrence would be untenable in the long term because it creates plant lines with progressively increasing or decreasing ploidy levels (Asker and Jerling, 1992).

Recently, the long-term studies in natural apomicts paid off by resolving the BABY BOOM-Like (BBML) gene in P. squamulatum as a candidate gene for parthenogenesis (Conner et al., 2015; next paragraph). Transgenes of PsASGR-BBML were able to induce parthenogenesis in the tetraploid sexual relative P. glaucum, supporting their function in di-haploid eggs. PsASGR-BBML promoter-GUS analysis provided evidence for the expression of parthenogenesis in the egg cells of P. glaucum, where GUS expression was observed from 1 day before anthesis and in the post-fertilization developing embryos. These observations confirmed the function of a parthenogenesis gene on the basis of its presence and expression in the egg cell, rather than the companion central cell and/or surrounding sporophytic tissue. In a more recent study, it was shown that PsASGR-BBML transgenes were also able to induce parthenogenesis in haploid eggs of sexual diploid rice and maize (Conner et al., 2017). This clearly demonstrated that parthenogenesis is also fully functional in the haploid context, and supports the view that the usual absence of parthenogenesis from haploidy is likely a result of recessive-lethal selection against associated flanking genomic regions.

In summary, studies in natural apomicts show that parthenogenesis usually co-segregates with apomeiosis, but is able to segregate and function independently from it. Parthenogenesis can also segregate and act independently from autonomous endosperm formation. Most studies indicate monogenic/-locus, dominant inheritance of parthenogenesis. It functions normally in di-haploid egg cells, and is able to induce embryogenesis in aneuploid eggs, although, with early embryo abortion. Parthenogenesis is virtually absent from reduced, haploid plant egg cells, but rare observations of its functioning in haploid eggs have been reported for dihaploid parthenogenetic Erigeron (see above) and a diploid apomictic Hieracium plant (Bicknell, 1997). The isolation of the parthenogenesis inducing PsASGR-BBML gene, and the demonstration of its function as a transgene in haploid eggs from Pennisetum, rice, and maize, confirms that parthenogenesis is able to act independently from polyploidy. It suggests that the absence of the trait from haploidy is likely explained by a genetic load in linked genomic regions. Finally, the results support the gametic presence and expression of parthenogenesis rather than non-cell-autonomous signaling from companion cells or surrounding, sporophytic tissue.

### CANDIDATE GENES FOR PARTHENOGENESIS

### The PsASGR-BabyBoom-Like Gene

A BABY BOOM (BBM)-Like gene was discovered in the natural apomictic grass P. squamulatum while skim-sequencing bacterial artificial chromosome clones that were linked to the ASGR (Conner et al., 2008). BBM genes were originally identified in Brassica napus (Boutilier et al., 2002) and shown to induce somatic embryogenesis in B. napus and Arabidopsis upon ectopic expression. They are part of a large gene family characterized by the APETALA 2/ETHYLENE RESPONSE FACTOR (AP2/ERF) DNA-binding domain (Jofuku et al., 1994; Riechmann and Meyerowitz, 1998). This TF-family of almost 150 members in Arabidopsis (Sakuma et al., 2002) and 157 members in rice (Nakano et al., 2006) is divided into groups of genes containing either one or two AP2 domains. The one-domain ERF-like genes typically are involved in biotic or abiotic stress response whereas the two-domain AP2-like genes function in growth and development (Floyd and Bowman, 2007). BBM genes along with AINTEGUMENTA (ANT) and PLETHORA (PLT) genes belong to the AINTEGUMENTA-Like (AIL) subclade within the eudicotANT (euANT) class of AP2/ERF DNA-binding domain genes, all of which function during embryogenesis (Horstman et al., 2014). PsASGR-BBML protein sequence is most similar to other BBML genes from related apomictic species: Cenchrus ciliaris and Pennisetum spp., but also to BBM genes in Setaria italica (foxtail millet) and Oryza sativa (rice), and yet more distantly related to BBM genes in Zea mays and Sorghum bicolor (El Ouakfaoui et al., 2010; Conner et al., 2015). It is surprising that the rice BBM1 is more closely related to PsASGR-BBML than to any BBM copy in maize and sorghum, given that rice and the panicoid grasses that include Pennisetum, sorghum, and maize, diverged from one another around 60 million years ago.

None of the AIL genes in Arabidopsis is expressed in prefertilization gametic cells during sexual reproduction (Horstman et al., 2014). On the other hand, after transformation to sexual P. glaucum (pearl millet), the PsASGR-BBML gene was shown to be expressed in egg cells prior to fertilization and to be sufficient for the initiation of embryos in the absence of fertilization (Conner et al., 2015). Since tetraploid pearl millet was the transgenic background, parthenogenetic development of the reduced eggs gave rise to diploid progeny. A second cycle of parthenogenesis resulted in true haploids, which expectedly were sterile. Sterile haploids also were derived through parthenogenesis of reduced egg cells in rice and maize (Conner et al., 2017). Among these three transgenic cereals, it was demonstrated that both the native PsASGR-BBML promoter and an egg-cell-specific Arabidopsis promoter (DD45; Steffen et al., 2007) provided the appropriate temporal regulation to enable fertilization-independent embryo formation. Mature haploid seed formation was irregular possibly as a result of asynchronous embryo-endosperm development. This first demonstration of parthenogenesis gene function opens the door for synthesizing apomixis in cereal crops once the capacity to produce unreduced gametes at high frequency is installed.
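The stepwise reduction in ploidy across successive cycles of parthenogenesis of reduced eggs can be written out as a short calculation. This is an illustrative sketch only; the function name `parthenogenetic_lineage` and the rule that an odd ploidy level cannot undergo a further balanced reduction (used here as a simple stand-in for the sterility of haploids) are assumptions for the example.

```python
def parthenogenetic_lineage(start_ploidy: int, cycles: int) -> list[int]:
    """Ploidy levels (multiples of x) over repeated parthenogenesis of reduced eggs.

    Each cycle halves the ploidy, because a reduced egg develops into an
    embryo without a paternal contribution.
    """
    lineage = [start_ploidy]
    for _ in range(cycles):
        current = lineage[-1]
        if current % 2:
            # Simplifying assumption: no balanced meiotic reduction from an
            # odd ploidy level, so the lineage stops here.
            raise ValueError(f"no balanced reduction from ploidy {current}x")
        lineage.append(current // 2)
    return lineage


# Tetraploid pearl millet background: first cycle gives diploids,
# second cycle gives haploids.
print(parthenogenetic_lineage(4, 2))
```

Starting from the tetraploid transgenic background, two cycles give the diploid and then the haploid generation; a third cycle is rejected by the sketch, in line with the sterility of the haploids reported above.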

Interestingly, it was recently found that a wild-type rice BBM1 (Os-BBM1) transgene under an Arabidopsis egg-cell-specific promoter (DD45) was also able to initiate embryogenesis in rice egg cells without fertilization (Khanday et al., 2018). This supported the close relationship of PsASGR-BBML with Os-BBM1 and the functionality of the associated AP2-like domain in parthenogenesis rather than an evolved novel capability in functional domains. Most interestingly, it was shown that Os-BBM1 lacks expression in the egg cells of rice, but is expressed in sperm cells, whereas only male BBM1-transcripts are expressed in early zygotes. This suggests the requirement of fertilization in embryogenesis for the transmission of male-genomic factors that are maternally silenced. It would imply that in parthenogenesis, an essential, normally maternally imprinted gene, may have become maternally expressed. This is supported by another interesting recent study that shows maternal expression of the normally imprinted gene PHERES1 (PHE1; Köhler et al., 2003b) in apomictic Boechera (Kirioukhova et al., 2018), as is further discussed in Section "The FIS-PRC2 and RETINOBLASTOMA RELATED1". It nicely brings together the many studies that indicate the involvement of diverse repression mechanisms in egg cell arrest and the associated factors that putatively play a role in the release of this repression (see section "Egg Cell Arrest and the Trigger for Embryogenesis" above).

### The "Salmon System" in Wheat

Investigations in the context of parthenogenesis were done in the "Salmon system" of wheat (Triticum aestivum) already some decades ago. This system was developed for use in haploid production after it was recognized that transfer of the nucleus of the sexual cultivar "Salmon" to cytoplasm of the grass genus Aegilops resulted in lines that were capable of autonomous embryo development (Tsunewaki and Mukai, 1990). In the "Salmon" line, the short arm of chromosome 1B of wheat has been replaced by the short arm of chromosome 1R of rye. Tsunewaki and Mukai (1990) concluded that besides a cytoplasmic Restorer of fertility (Rfv1) factor, two nuclear genes were involved in spontaneous embryo development: the inducer gene Parthenogenesis gain (Ptg) that is under sporophytic control, and the suppressor gene Suppressor of parthenogenesis (Spg) that is under gametophytic control. These two genes were concomitantly exchanged with the chromosome 1 arm. Two other researchers came to the same conclusion in a 1B/1R-translocation system in durum wheat (Hsam and Zeller, 1993). To improve the system for in vivo investigation of parthenogenesis, three isogenic homozygous lines were produced, the male fertile sexual line Ae. aestivum-Salmon (aS), and the male sterile parthenogenetic lines Ae. caudata-Salmon (cS) and Ae. kotschyi-Salmon (kS) (Matzk et al., 1995). Comparative protein analysis from ovary extracts of these three lines resolved one protein that was uniquely expressed in the two parthenogenetic lines from 3 days before and during anthesis. This protein, P115.1, was characterized as an α-tubulin polypeptide. Tubulin α-chains are the major constituent of microtubules and function in GTP binding; given this function, the expression of P115.1 could possibly be a result of parthenogenesis rather than its cause. 
Further studies on isolated egg cells from the three isogenic lines and a common wheat line indicated that parthenogenetic development is independent from ovary-derived signals (Kumlehn et al., 2001). This encouraged the researchers to focus on the egg cells and construct cDNA libraries from them. Analysis of these libraries delivered a number of egg cell-specific candidates, among which were RWP-RK domain (RKD)-containing TFs. These were subject to later studies in Arabidopsis (Köszegi et al., 2011; Tedeschi et al., 2017) and Marchantia polymorpha (Rövekamp et al., 2016; Koi et al., 2016) (see next paragraph). Some eggs of the parthenogenetic lines showed a second nucleolus, a characteristic of zygotes isolated from sexual lines (Naumova and Matzk, 1998; Kumlehn et al., 2001).

Together, the results showed that parthenogenesis apparently is an inherent property of the egg cell and not the surrounding tissue and is able to establish zygotic competence in the absence of fertilization.

### RWP-RK Domain (RKD)-Containing Transcription Factor

Encouraged by their identification in wheat egg cell cDNA libraries (Kumlehn et al., 2001; former paragraph), RKD-TF homologs were searched for and investigated in Arabidopsis thaliana, which resolved a total of five AtRKDs (Köszegi et al., 2011). At least two of them were preferentially expressed in the egg cell and, interestingly, their ectopic expression induced cell proliferation and activated an egg cell-like transcriptome. Members of the RWP-RK domain family include the MINUS DOMINANCE (MID) factor, which is required for gamete differentiation in the distantly related green alga Chlamydomonas reinhardtii (Ferris and Goodenough, 1997). Since the RKD-TFs of Arabidopsis are highly redundant and the genes are conserved over the plant kingdom, the single copy homologous RKD-TF in Marchantia polymorpha was used for functional analysis (Koi et al., 2016; Rövekamp et al., 2016). MpRKD showed wide expression in M. polymorpha, but preferentially high in antheridia, developing egg cells, and sperm precursor cells. Lines with downregulated expression showed large cells at the base of the archegonium, indicating that egg cell specification occurs on the basis of anatomy and position, albeit in the absence of specific molecular markers (Rövekamp et al., 2016). These cells underwent cell divisions instead of entering the quiescent egg cell stage, suggesting a role of MpRKD in establishing and/or maintaining the quiescent state of the egg cell prior to fertilization. MpRKD mutants lacked effects on the overall morphology of reproductive organs, but showed striking defects in egg and sperm cell differentiation (Koi et al., 2016). Together, these results indicate that RKD-TFs are evolutionarily conserved regulators of germ cell differentiation in land plants and particularly act in the gametophyte-to-sporophyte transition by preventing the egg cell from entering mitosis in the absence of fertilization, i.e., by suppressing parthenogenesis.

#### The FIS-PRC2 and RETINOBLASTOMA RELATED1

Until the recent findings of PsASGR-BBML and RKD-TFs, a candidate for parthenogenesis in plants mentioned in literature was the FIS-PRC2 gene MSI1 (Köhler et al., 2003a; Guitton and Berger, 2005; section "Egg Cell Arrest and the Trigger for Embryogenesis"). It was isolated via a mutant screen in the sexual model A. thaliana and searched for because MSI-like homologs in yeast (MSI1) and mammals (RbAp46/48) were found to be involved in chromatin metabolism (Hennig et al., 2003). At that time, it became apparent that the modulation of chromatin structure played an important role in the regulatory decisions and gene expression during development, also in plants. As discussed above, the FIS-PRC2 is involved in gene suppression during seed development, particularly affecting the endosperm, whereas MSI1, and to a lesser extent also other FIS-class genes, affects embryo initiation in addition (Chaudhury et al., 1997). Other studies showed a role for the FIS-PRC2 in balancing the maternal versus paternal gene dosage, by showing plants with an increased maternal dosage resembling FIS-mutant phenotypes (Kradolfer et al., 2013). The results indicate that a release of gene suppression alone is insufficient to obtain viable seeds, but that this, particularly or solely, is a result of failure of the endosperm and maybe not embryo. Investigations on the role of the endosperm showed that, indeed, endosperm cellularization impacts embryo growth, and FIS-mutant embryos could be rescued on appropriate medium in vitro (Hehenberger et al., 2012). Also in other modes of in vivo induced parthenogenesis, such as via pollination with irradiated pollen or triploid inducer lines that results in haploid embryo development in some species (Germanà, 2006), embryos need to be rescued and cultivated in vitro due to failure of the endosperm. 
This indicates that release of the repressive state in the egg cell can be sufficient for the initiation of embryo development, however, finding clues for restoration of endosperm development is also necessary for successful parthenogenetic seed development.

The FIS-class genes FIS2 and MEA are imprinted genes. They are silenced throughout the life cycle of the plant, but become active in the female gametophyte, especially in the central cell, and remain expressed and active in the endosperm after fertilization, whereas the paternal alleles remain silent (Wang and Köhler, 2017). MEA is involved in the control of embryo growth in sexual species by repressing the maternal allele expression of the TF PHE1 (Köhler et al., 2003b). Thus, PHE1 is also imprinted, but expressed from the paternal allele only. A recent study asked what would happen with embryo growth in autonomous apomicts, where paternal alleles are absent (Kirioukhova et al., 2018). It was hypothesized that the silencing of maternal alleles might have become reduced or relieved during the evolution of apomixis, allowing maternally imprinted genes to be expressed from the maternal allele. This was tested for PHE in sexual versus asexual Boechera, a close relative of A. thaliana. In apomictic Boechera, the maternal PHE-like allele indeed was expressed, indicating a reversion of the imprinting status of this gene. In addition, a heavily methylated 3′ MR was deleted from the PHE-alleles in apomicts, allowing an increase of their expression. The authors proposed a model in which parthenogenesis in Boechera evolved via changes in epigenetic regulation of imprinted genes based on changes in DNA methylation (see Figure 3 in Kirioukhova et al., 2018). This shows parallels to an artificially induced case of parthenogenesis in mice through the loss of distal DNA methylation, resulting in maternal activation of the paternally expressed Insulin-like growth factor 2 (Igf2) gene (Kono et al., 2004). Thus, a modified role in transcriptional regulation of maternal alleles is indicated and interesting to further investigate in the context of parthenogenesis.

FIS-mutant phenotypes are resembled also by phenotypes of RETINOBLASTOMA RELATED1 (RBR1) mutants (Ebel et al., 2004; Johnston et al., 2010). This gene is related to the tumor suppressor gene RB in mammals, which has a role in inhibiting cell cycle progression. RBR1 in plants functions in cell cycle control during gametogenesis, with mutants showing supernumerary nuclei at the micropylar end and impaired cellularization (Johnston et al., 2008, 2010). Polar nuclei do not fuse in rbr gametophytes and cell-type-specific markers usually lack expression. RBR1 represses the G1/S-phase transition through inhibiting E2F transcription, and this, in turn, involves RBR1-phosphorylation that influences the RBR1-E2F interaction (Boniotti and Gutierrez, 2001; Kuwabara and Gruissem, 2014). Cyclin-dependent kinase (CDK) A combined with cyclin (CYC) regulatory subunit D (CDKA/CYCD; serine-threonine protein kinase) is involved in this phosphorylation. In interaction with MSI1, RBR1 also plays a role in the downregulation of METHYLTRANSFERASE1 (MET1) (Jullien et al., 2008). Reduction of MET1 in the central cell is essential for the activation of FIS2 and thus for the FIS-PRC2. Indeed, FIS2 expression is reduced in rbr-gametophytes (Johnston et al., 2010). Thus, RBR1 and genes associated to its functioning and/or to cell cycle progression are additional candidates to be involved in the suppression of spontaneous embryo and endosperm development and, as such, to have a role in parthenogenesis.

### Genes Involved in the Induction of Ectopic Embryogenesis

A number of other genes, mostly TFs, have been reported to be involved in the induction of embryogenesis ectopically and/or in vitro after overexpression (excellent reviews by Radoeva and Weijers, 2014; Horstman et al., 2017). They are mentioned here for completeness, but we refer to the other reviews for their listing and details, since a specific role in parthenogenesis is yet undetermined. Whether they are part of one or a few larger networks also needs further elucidation. Among them are AP2-TF family genes that support embryogenesis, including BBM (AIL2/PLT4) (see also section "The PsASGR-BabyBoom-Like gene", above), most other AIL-genes, and WOUND INDUCED DEDIFFERENTIATION 1 (WIND1) (Horstman et al., 2017). In addition, they include most genes of the "LAFL"-network, namely, LEAFY COTYLEDON 1 (LEC1), LEC1-Like (L1L), and LEC2; another member of the RWP-RK domain-containing family, RKD4 (see also section "RWP-RK Domain (RKD)-Containing Transcription Factor"); and the homeodomain TF WUSCHEL (WUS). There are also genes that function more indirectly by increasing the capacity for embryogenesis, such as AGAMOUS-Like 15 (AGL15) and SOMATIC EMBRYOGENESIS RECEPTOR KINASE 1 (SERK1). Last, some genes are known to be involved in the suppression of embryogenesis, including the chromatin-helicase-DNA binding gene (CHD3/4)-Like chromatin remodeling factor encoded by PICKLE (PKL), genes of the PRC1 and -2 (see sections "Egg Cell Arrest and the Trigger for Embryogenesis" and "The FIS-PRC2 and RETINOBLASTOMA RELATED1"), and the HIGH-LEVEL EXPRESSION OF SUGAR-INDUCIBLE GENES VAL1 and VAL2. Now that the putative role of parental expression of Os-BBM1 in embryogenesis is known, it is relevant to investigate this for the other genes mentioned as well.

In summary, a PsASGR-BBML gene has been isolated and verified as the first parthenogenesis gene by demonstrating its functionality in related sexual grasses such as pearl millet and rice, but not yet in eudicots. Other candidates and studies support that the suppression of spontaneous embryo and endosperm development in sexual reproduction is under tight epigenetic control and that release of this control allows for the initiation of spontaneous embryo and endosperm development. This is shown to involve FIS-PRC2 genes and genes associated with it and with cell cycle control. Although embryo and endosperm development is initiated, mutants of these genes show early embryo arrest and endosperm development only up to cellularization, indicating that a release of transcriptional suppression alone is not enough to obtain viable seeds. Functional endosperm is important in addition, both because of its probable roles in the regulation of embryogenesis and, especially, to nourish the embryo. Restoring endosperm development is, therefore, necessary for successful seed development via parthenogenesis. Alternatively, the haploid embryos can be cultivated in vitro after embryo rescue, as is also done in some of the other haploid induction methods currently used in DH production. Interesting recent results show that Os-BBM1 is paternally expressed, maternally silenced, and hypothesized to induce embryogenesis in rice egg cells after fertilization. Other recent results support a role for evolved changes in apomicts in this context, by showing the normally paternally expressed PHE-Like genes to be maternally expressed in apomictic Boechera. The results converge upon the importance in parthenogenetic reproduction of, on the one hand, genes involved in the suppression of transcription and modifications thereof in apomicts and, on the other hand, genes involved in the developmental process for which transcription is either allowed or artificially overexpressed. In **Table 1** and **Figure 2** this convergence is summarized.

### THE USE OF PARTHENOGENESIS IN PLANT BREEDING

The ultimate goal of identifying a gene for parthenogenesis is to apply it in protocols for breeding line production in order to induce either gametophytic or sporophytic embryogenesis in a variety of cell types and plant species. This is of particular interest in the context of DHs production, a method that is widely used for the instant production of homozygous lines via haploid induction technology followed by chromosome doubling methods (reviewed by, e.g., Germanà, 2006, 2011; Belogradova et al., 2009; Islam and Tuteja, 2012; Hand et al., 2016; Ikeuchi et al., 2016; Horstman et al., 2017). Current methods include, among others, in vivo induction of spontaneous haploid embryo formation at (very) low frequencies via wide crosses, crosses with triploids, or by using irradiated pollen, usually followed by embryo rescue and in vitro embryo cultivation, and in vitro induction of embryogenesis in response to different (a)biotic stress factors. These methods, however, need time-intensive and species-specific protocol development and lack application in a number of important crops. Although successful in particular species or genotypes, others can be completely recalcitrant to produce DHs. Understanding the genetic basis of egg cell activation and the initiation of embryogenesis will largely contribute to the production of a wide variety of DHs.



<sup>1</sup>After ectopic and overexpression; <sup>2</sup>in culture, in vitro; <sup>3</sup>and bHLH-network.

Most interesting is the combination of parthenogenesis with various forms of modified meiosis. The combination with apomeiosis, the omission of first meiosis, which results in unreduced gametes that maintain all or most of the genetic and epigenetic variation of the mother plant, is particularly interesting for producing clonal seeds. This enables the maintenance of vigorous F1-hybrids, which are usually produced via extensive crossing and careful selection procedures involving five or more growing seasons but segregate in the subsequent generation. Being able to re-grow valuable F1-hybrids over more than one generation has high potential for food security and for meeting the increasing demand for food. Proof-of-principle for synthetic clonal reproduction was obtained by Marimuthu et al. (2011) using either the DYAD mutant [Ravi et al., 2008; one allele of SWITCH1 (SWI1): Mercier et al., 2001; Agashe et al., 2002] or the "turning MEIOSIS into MITOSIS" (MiMe) variant (d'Erfurth et al., 2009) to obtain unreduced gametes in combination with the CENTROMERE-SPECIFIC HISTONE 3 (CENH3) mutant (Ravi et al., 2011) to fertilize the central cell without a genomic contribution to the embryo. This method is now awaiting improvements to produce unreduced gametes at high frequency as well as to identify or produce CENH3-Like variants in crops. Another interesting modified meiosis variant is the omission of second meiosis to produce (near-)homozygous gametes, using mutants of OMISSION OF SECOND DIVISION 1 (OSD1) (d'Erfurth et al., 2009). This is particularly of interest in the (near-)Reverse Breeding approach in which (near-)homozygous parental lines and chromosome substitution lines are produced in one generation (Dirks et al., 2009). Reverse Breeding relies on the suppression of recombination during the first meiosis and omission of chromatid separation in the second. 
All these protocols need the induction of embryo development from the gametes produced and are awaiting capacity for in vivo or in vitro embryo induction. A third very useful application of parthenogenesis in plant breeding is the potential to link diploid with polyploid gene pools, in alternation with apomeiosis. This makes it, for instance, much easier to cross-in interesting characters of diploid wild relatives to the usual tetraploid crop varieties in potato. All three aims contribute to the control of plant reproduction and breeding and are highly relevant in order to optimize crop development and increase plant productivity.

### CONCLUSION AND FUTURE PERSPECTIVE


This review provides a summary of current knowledge on parthenogenesis in plants obtained from studies in natural apomicts and mutants in sexual model species. Several lines of evidence from natural apomicts support that parthenogenesis is inherited and functions independently from apomeiosis and endosperm formation, and that it is a monogenic and dominant trait. Results also show that parthenogenesis is expressed and active in the egg cell and independent from signals from the surrounding sporophytic tissue. Parthenogenesis functions normally in reduced di-haploid egg cells, resulting in viable embryos that grow into di-haploid plants. It also works in true haploid eggs, but with (much) lower frequency in producing viable plants and with the haploid offspring usually being infertile, and in aneuploid eggs, although with early embryo abortion. The reason that parthenogenesis is usually absent from haploids in nature is indicated to be a linked deleterious genetic load. The relationship of parthenogenesis to the central cell and endosperm is more complicated. The central cell plays a role in egg cell suppression in sexual reproduction (see **Figure 2**), and functional endosperm is needed for successful embryo development. It is not entirely clear whether the parthenogenetic egg cell also undergoes a (short) period of quiescence in which the chromatin is repressed and transcription is silenced. These processes might be necessary for obtaining totipotency in the zygote, but this has to be confirmed. Studies in sexual model systems, particularly Arabidopsis, support that the release of egg cell repression and transcriptional silencing results in the initiation of embryo development. Genes that are involved in this, particularly genes of the FIS-PRC2 or genes related to cell cycle control, e.g., RBR1, and the recently uncovered RKD-TFs, may therefore have changed or become ineffective in apomicts. 
A possible mechanism for this could be a change in the target gene sequence by which the silencing is reduced, e.g., by a deletion of a region involved in heavy methylation as was found in PHE-alleles in apomictic Boechera (Kirioukhova et al., 2018). Similarly, the functioning of PsASGR-BBML in apomicts, and the maternal silencing of the related Os-BBM1 in sexual rice, may hint at a reduction of maternal silencing of BBM-Like in apomictic grasses. The release of gene suppression, particularly in the FIS-PRC2 mutants, also affects the central cell, leading to spontaneous endosperm development up to cellularization. Embryos arrest at an early stage in these mutants, and this may also involve failure of the endosperm, since they can be rescued by in vitro cultivation. Embryos obtained with PsASGR-BBML also need either embryo rescue or fertilization of the central cell in order to allow endosperm development and embryo growth progression. The sexual endosperm depends on the maternal-to-paternal genome dosage in most species (2m : 1p), whereas in apomicts, this dosage is usually highly disturbed without any obvious effect. Apparently, apomicts have evolved several mechanisms to overcome these requirements. One study supports that the function of the FIS-PRC2, and thus transcriptional silencing, can be suppressed by an increase of the maternal dosage in the endosperm, leading to a FIS-mutant-like phenotype. Whether this mimics the situation in the central cell of apomicts has to be resolved (see **Figure 2**). Altogether, the results show that parthenogenesis involves changes in epigenetic regulation needed to allow genes that are essential in embryo induction to be expressed from the maternal allele. They also show that parthenogenesis functions independently from endosperm development, but that the absence or failure of the endosperm impacts successful embryo and plant development. 
Insight into the mechanisms that have developed in apomicts to overcome the failure of the endosperm, and the development of methods to restore endosperm production, for example, through artificial crosses to fertilize the central cell, is needed and is the next challenge in successful seed production via parthenogenesis.

In the near future, establishing the function of the emerging parthenogenesis candidates in a wider sense is one of the main priorities as is the identification and/or engineering of parthenogenesis in non-grass species. A second priority is to transfer the knowledge to crop species to make it useful in plant breeding and in in vitro embryogenesis protocols. Third is to improve the experimental separation between the functioning of the egg cell and embryo growth progression on one hand and the dependence of the central cell and endosperm on the other in research on parthenogenesis, and to further unravel the mechanisms that underlie spontaneous endosperm formation. Fourth, to combine parthenogenesis with the different forms of "omission of meiosis" in order to use it as a tool in plant breeding, e.g., for clonal seed production/apomixis. Also interesting is to further unravel other mechanisms that have evolved in apomicts, such as the possible differences in egg cell quiescence as well as the ways in which the endosperm overcomes the maternal to paternal dosage requirements.

## AUTHOR CONTRIBUTIONS

KV designed and wrote the manuscript. PO-A wrote some parts of the manuscript. PO-A and MS gave valuable review comments.

## FUNDING

The review is written as part of the Dutch Scientific Organization (NWO), Applied and Engineering Sciences (STW/TTW) fellowship 13700: PARtool: The molecular basis of plant embryo development without fertilization (Parthenogenesis), and its use as a tool in breeding line production, to KV and MS.

#### REFERENCES



Asker, S. E., and Jerling, L. (1992). Apomixis in Plants. Boca Raton FL: CRC Press.


Biol. J. Linn. Soc. 61, 51–94. doi: 10.1111/j.1095-8312.1997.tb01778.x


Kalanchoë. Proc. Natl. Acad. Sci. U.S.A. 104, 15578–15583. doi: 10.1073/pnas.0704105104



aestivum, L.) egg cells reveal [Ca<sup>2+</sup>]<sub>cyt</sub> oscillation of intracellular origin. Int. J. Mol. Sci. 15, 23766–23791. doi: 10.3390/ijms151223766



Hörandl, U. Grossniklaus, P. J. van Dijk, and T. F. Sharbel (Ruggell: Gantner Verlag), 137–158.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Vijverberg, Ozias-Akins and Schranz. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Overcoming Cabbage Crossing Incompatibility by the Development and Application of Self-Compatibility-QTL-Specific Markers and Genome-Wide Background Analysis

Zhiliang Xiao†, Fengqing Han†, Yang Hu, Yuqian Xue, Zhiyuan Fang, Limei Yang, Yangyong Zhang, Yumei Liu, Zhansheng Li, Yong Wang, Mu Zhuang\* and Honghao Lv\*

#### Edited by:

Jianjun Chen, University of Florida, United States

#### Reviewed by:

Ryo Fujimoto, Kobe University, Japan

Ping Lou, Dartmouth College, United States

#### \*Correspondence:

Mu Zhuang, zhuangmu@caas.cn

Honghao Lv, lvhonghao@caas.cn

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 29 August 2018 Accepted: 05 February 2019 Published: 26 February 2019

#### Citation:

Xiao Z, Han F, Hu Y, Xue Y, Fang Z, Yang L, Zhang Y, Liu Y, Li Z, Wang Y, Zhuang M and Lv H (2019) Overcoming Cabbage Crossing Incompatibility by the Development and Application of Self-Compatibility-QTL-Specific Markers and Genome-Wide Background Analysis. Front. Plant Sci. 10:189. doi: 10.3389/fpls.2019.00189

Institute of Vegetables and Flowers, Chinese Academy of Agricultural Sciences, Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture, Beijing, China

Cabbage hybrids, which clearly present heterosis, are widely used in agricultural production. We compared two S5 haplotype (Class II) cabbage inbred lines, 87–534 and 94–182: the former is highly self-compatible (SC), while the latter is highly self-incompatible (SI). Sequence analysis of SI-related genes, including SCR, SRK, ARC1, THL1, and MLPK, indicated some SNPs in ARC1 and SRK of 87–534, and semi-quantitative analysis indicated that the SI-related genes were transcribed normally from DNA to mRNA. To unravel the genetic basis of SC, we performed whole-genome mapping of the quantitative trait loci (QTLs) governing self-compatibility using an F<sup>2</sup> population derived from 87–534 × 96–100. Eight QTLs were detected, and high contribution rates (CRs) were observed for three QTLs: qSC7.2 (54.8%), qSC9.1 (14.1%) and qSC5.1 (11.2%). The cross CB201 × 96–100 yields an excellent hybrid, 06–88. However, F<sup>1</sup> seeds cannot be produced at the anthesis stage because the parents share the same S-haplotype (S57, class I). To overcome this crossing incompatibility, we performed rapid introgression of the self-compatibility trait from 87–534 to 96–100 using two self-compatibility-QTL-specific markers, BoID0709 and BoID0992, as well as 36 genome-wide markers that were evenly distributed along nine chromosomes for background analysis in recurrent back-crossing (BC). The transfer process showed that the proportion of recurrent parent genome (PRPG) in BC4F<sup>1</sup> was greater than 94%, and the ratio of individual SC plants in BC4F<sup>1</sup> reached 100%. The newly created line, designated SC96–100, exhibited agronomic traits similar to those of 96–100 and a compatibility index (CI) greater than 5.0, and was successfully used in the production of the commercial hybrid 06–88. This study provides new insight into the genetic basis of self-compatibility in cabbage and facilitates cabbage breeding using SC lines in the male-sterile (MS) system.

Keywords: cabbage, crossing incompatibility, quantitative trait loci, genetic mapping, genomic background analysis

#### INTRODUCTION


Cabbage (Brassica oleracea L. var. capitata), a cole crop species, is a vegetable of worldwide economic importance due to its strong resistance, wide adaptability, favorable taste and healthcare-related value (Fang, 2000). Cabbage hybrids, which clearly present heterosis vigor, are widely used in cabbage production, and self-incompatible (SI) lines and male-sterile (MS) lines are two important tools for utilizing cabbage heterosis. Since the 1970s, attention has been paid to the selection of SI lines for the seed production process. However, self-incompatibility-based hybrid seed production is labor intensive, as the two parental lines can be reproduced only by artificial pollination; moreover, the purity of hybrid seeds never reaches 100%. Since the late 1990s, breeders have been turning to applying cytoplasmic male-sterile (CMS) or dominant genetic male-sterile (DGMS) lines for cabbage hybrid seed production (Pelletier et al., 1983; Earle et al., 1994; Fang et al., 1997), and many SI lines are still used in the MS system.

Self-incompatibility, the recognition and rejection of self-pollen, is a widespread mechanism by which flowering plants prevent self-fertilization. Pollen genotypes are determined by diploid pollen wall proteins: when the stigma and pollen have the same S-haplotype, they will present an incompatible reaction. This system retains genetic variation and avoids inbreeding depression in more than half of angiosperm species. Cabbage sporophytic SI is regulated by a single multi-allelic S-locus (Bateman, 1954; Thompson, 1957), which consists of the following three pollen–stigma recognition-related genes: SLG (S-locus glycoprotein), SRK (S-locus receptor kinase), and SCR/SP11 (S-locus cysteine-rich protein/S-locus protein 11) (Nasrallah, 1974; Stein and Nasrallah, 1993; Kusaba et al., 2000; Takasaki et al., 2000; Watanabe et al., 2000; Nettancourt, 2001; Takayama et al., 2001; Takayama and Isogai, 2005; Fujimoto et al., 2006). The specific interaction between SRK, which is the female determinant expressed in stigmas, and SCR/SP11, which is the male determinant expressed in pollen grains, induces a signaling cascade within stigma epidermal cells. This signaling network provides stigmas with the ability to recognize and reject self-pollen grains. In addition, ARC1 (arm repeat containing 1), THL1 (thioredoxin-h-1), and MLPK (M-locus protein kinase), which are not associated with the S locus, are also involved in pathways related to pollen–stigma interactions. THL1 is a negative regulator (Cabrillac et al., 2001; Haffani et al., 2004), while ARC1 and MLPK are positive regulators (Stone et al., 1999, 2003; Murase et al., 2004). As the counterpart of self-incompatibility, self-compatibility has also been analyzed in Brassica crops, including B. oleracea, B. rapa, and B. napus (Okuda et al., 2000; Murase et al., 2004; Okamoto et al., 2007; Boggs et al., 2009). 
Some studies have shown that genetic variation in SI-related genes may be responsible for self-compatibility, such as variation in SCR (Okamoto et al., 2007), SRK (Gaude et al., 1993; Isokawa et al., 2010), THL1 (Bower et al., 1996), ARC1 (Stone et al., 1999), and MLPK (Murase et al., 2004; Hatakeyama et al., 2010). In addition, Kakita et al. (2007) and Isokawa et al. (2010) reported self-compatibility traits that are inconsistent with known S-loci and concluded that new loci may lead to self-compatibility.

Self-compatible (SC) lines facilitate cabbage hybrid seed production in two ways. (i) These lines reduce costs during the parental line reproduction process; the male parental line and the maintainer line can be made SC lines, which can be reproduced by honeybee pollination. (ii) These lines avoid crossing incompatibility during the hybrid seed production process, as parental lines with the same S-haplotype cannot produce hybrid seed by honeybee pollination. Therefore, the development of SC lines is urgently needed to improve cabbage breeding. Approximately 50 S haplotypes have been characterized in cabbage (Bateman, 1952; Thompson, 1957; Ockendon, 2000). Based on the sequence similarities between SLG and SRK, the S haplotypes are categorized as Class I or Class II. In B. oleracea, only three Class II S haplotypes have been identified (i.e., S2, S5, and S15). Class I S haplotypes are generally dominant over Class II S haplotypes. Tian et al. (2013) identified 26 S-haplotypes in cabbage, e.g., S2, S5, and S15, which can provide a basis for quick analyses of S-haplotypes in cabbage. Stern et al. (1982) and Fang et al. (1983) proposed methods of compatibility index (CI) calculations and fluorescence microscopy observations to assess the self-compatibility phenotype of cabbage; these methods are widely used in compatibility identification.

In the current study, we determined that 87–534 was an elite cabbage line that carried the S5 haplotype (Class II) and had a high compatibility index (CI; i.e., >10.0). Line 94–182 carried the same haplotype but had a very low CI value (<1.0). We conducted sequence and expression analyses of the SI-related genes in 87–534 and 94–182 plants to clarify why 87–534 was highly self-compatible, and mapped the quantitative trait loci (QTLs) associated with self-compatibility using a segregating population derived from 87–534 × 96–100. We also applied genome-wide background markers and self-compatibility-QTL-specific markers in recurrent back-crossing (BC) for rapid introgression of the self-compatibility trait from 87–534 to 96–100, and used the newly developed line SC96–100 to overcome the crossing incompatibility and generate an excellent hybrid similar to 06–88. This study provides new insight into the genetic basis of self-compatibility in cabbage and facilitates cabbage breeding via SC lines in the MS system.

#### MATERIALS AND METHODS

#### Plant Materials

Line 87–534 is an elite line originating from the cultivar 'Flstacus' and introduced from Germany by IVF-CAAS in 1987; its CI is greater than 10.0 at the anthesis stage, and its S-haplotype is S5 (class II), as identified in our previous study (Tian et al., 2013). The field and podding performance of this line is shown in **Figure 1A** (a1 and a2).

Line 94–182, introduced to China from the United States by IVF-CAAS in 1994, is a cabbage inbred line with a CI value < 1.0. Its S haplotype is also S5 (Class II).

Line 96–100 carries the S-haplotype S57 (class I) (Tian et al., 2013), and its CI (number of seeds per pod) at the anthesis stage is < 1.0. Its field and podding performance is shown in **Figure 1B** (b1 and b2). Line 96–100 is a parent of several excellent cabbage hybrids, including Zhonggan 18, Zhonggan 828, and Zhonggan 588 (Yang et al., 2004; Zhang et al., 2014).

Line CB201 is an elite line derived from the cultivar 'CB201' and introduced from Thailand by IVF-CAAS in 2002; its S-haplotype is S5 (class II).

Hybrid 06–88 is an elite F<sup>1</sup> derived from CB201 × 96–100 (**Figure 1C**). Because the parents share the same haplotype, F<sup>1</sup> seeds cannot be produced via pollination at the flower stage, although the CI is normal at the bud stage [**Figure 1** (c1 and c2)].

To study the genetic basis of self-compatibility by QTL mapping and transfer the self-compatibility trait to 96–100, the highly SC line 87–534 was crossed with 96–100 to produce an F<sup>1</sup> population, from which an F<sup>2</sup> population comprising 230 individuals was obtained. Each F<sup>2</sup> individual was self-pollinated at the anthesis stage, and the CI was calculated at the podding stage.

#### Phenotyping

The self-compatibility phenotype was assessed based on the CI and fluorescence microscopy. The CI was determined using a published procedure (Fang et al., 1983). Three individuals were manually self-pollinated (approximately 10 flowers per individual were selected) 1 or 2 days after the flowers had fully opened. The CI was then calculated at the podding stage via the following formula: CI = number of seeds/number of pollinated flowers. The self-incompatibility/self-compatibility levels were rated as follows: self-incompatibility: CI < 1; moderate self-compatibility (MSC): 1 ≤ CI ≤ 4; self-compatibility: CI > 4.
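The CI formula and rating scheme above can be written as a short sketch. The function names and the seed and flower counts in the usage lines are hypothetical, and the moderate class is read here as CI from 1 to 4 inclusive.

```python
def compatibility_index(n_seeds: int, n_pollinated_flowers: int) -> float:
    """CI = number of seeds / number of pollinated flowers."""
    if n_pollinated_flowers <= 0:
        raise ValueError("at least one pollinated flower is required")
    return n_seeds / n_pollinated_flowers


def classify_ci(ci: float) -> str:
    """Rate a plant from its CI (assumed reading: SI < 1, MSC 1 to 4, SC > 4)."""
    if ci < 1:
        return "SI"
    if ci <= 4:
        return "MSC"
    return "SC"


# Hypothetical counts: 10 self-pollinated flowers per plant.
ci_high = compatibility_index(132, 10)  # 13.2 seeds per pollinated flower
ci_low = compatibility_index(6, 10)     # 0.6 seeds per pollinated flower
print(ci_high, classify_ci(ci_high))
print(ci_low, classify_ci(ci_low))
```

Boundary values (CI exactly 1 or 4) fall into the moderate class under this reading.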

Cabbage samples were also analyzed by fluorescence microscopy as previously described (Stern et al., 1982). For each individual, five flowers were manually self-pollinated 1 or 2 days after they had fully opened. The stigmas were harvested 24 h later and fixed in formalin–acetic acid–alcohol fixative for 24 h, followed by preservation in 70% ethanol. The samples were treated with 1 M NaOH, incubated in a water bath at 60°C for 1 h, washed three times with distilled water, and then stained with 0.1% aniline blue solution (0.1 M K<sub>3</sub>PO<sub>4</sub> and 0.1% aniline blue) for 12 h. A fluorescence microscope was used to count the number of pollen tubes that had penetrated the stigmas (NPT) (Martin, 1959). The self-incompatibility/self-compatibility levels were rated as follows: self-incompatibility: NPT < 10; MSC: 10 ≤ NPT ≤ 25; self-compatibility: NPT > 25.

Three independent repetitions were performed to obtain a mean value. All pollinations were conducted in the experimental greenhouse of IVF-CAAS, Beijing, China, between 9:00 and 10:00 a.m. on sunny days in late April at 20–25°C to avoid weather conditions unsuitable for pollination.
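The NPT rating can be sketched in the same way. Averaging the counts over the pollinated flowers of one individual is an assumption made here for illustration; the function names and counts are hypothetical.

```python
from statistics import mean


def classify_npt(npt: float) -> str:
    """Rate a stigma sample by the number of penetrating pollen tubes (NPT)."""
    if npt < 10:
        return "SI"
    if npt <= 25:
        return "MSC"
    return "SC"


def rate_individual(npt_counts: list[int]) -> tuple[float, str]:
    """Average the per-flower NPT counts (assumed) and rate the individual."""
    avg = mean(npt_counts)
    return avg, classify_npt(avg)


# Hypothetical counts from five self-pollinated flowers of one plant.
print(rate_individual([30, 28, 35, 27, 31]))
print(rate_individual([2, 0, 5, 3, 1]))
```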

#### Polymerase Chain Reaction Amplification and Expression Analysis of Self-Incompatibility–Related Genes

Total RNA was extracted from the stigmas and anthers of 87–534 and 94–182 plants using the EasyPure™ Plant RNA Kit (TransBionovo Co., Beijing, China). The cDNA templates for all genes, except for SCR, were prepared using RNA extracted from anthers. The purified RNA was reverse transcribed to cDNA using the EasyScript™ Reverse Transcriptase Kit (TransBionovo). For polymerase chain reaction (PCR) amplifications and semi-quantitative analysis of SI-related genes, specific primers (**Supplementary Table S1**) were designed to amplify whole coding sequences. The reference sequences of S-related genes were obtained from the NCBI [SRK (AB024416.1), SCR (AB067448), ARC1 (EU344909), MLPK (AB121973), THL1 (AF273844.1)]; we also blasted these genes on the Brassica genome '02-12'<sup>1</sup> (Liu et al., 2014) and the 'TO1000' genome<sup>2</sup> (Parkin et al., 2014) to obtain more information. The primers were designed by Premier (version 5.0)<sup>3</sup>. The actin 1 gene served as the reference gene (Zhao et al., 2012). The PCR products were separated by 1% agarose gel electrophoresis (130 V). The target DNA bands were purified using the EasyPure™ Quick Gel Extraction Kit (TransBionovo) and sequenced. Three technical replicates were performed for each gene. The relative changes in gene expression levels were calculated using the 2<sup>−ΔΔCT</sup> method (Livak and Schmittgen, 2001).
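For readers unfamiliar with the Livak and Schmittgen calculation cited above, the following is a minimal sketch of the standard formula. The threshold-cycle (Ct) values are invented for illustration, and the choice of calibrator sample is an assumption, since the text does not state one.

```python
from statistics import mean


def relative_expression(ct_target_sample: float, ct_ref_sample: float,
                        ct_target_calibrator: float, ct_ref_calibrator: float) -> float:
    """Fold change of a target gene by the 2^(-ddCt) method.

    dCt  = Ct(target) - Ct(reference gene), computed per sample
    ddCt = dCt(sample) - dCt(calibrator)
    """
    d_ct_sample = ct_target_sample - ct_ref_sample
    d_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    return 2.0 ** -(d_ct_sample - d_ct_calibrator)


# Hypothetical Ct values, each the mean of three technical replicates.
target_sample = mean([24.0, 24.1, 23.9])
actin_sample = mean([18.0, 18.0, 18.0])
target_calibrator = mean([25.0, 25.1, 24.9])
actin_calibrator = mean([18.0, 18.0, 18.0])

fold = relative_expression(target_sample, actin_sample,
                           target_calibrator, actin_calibrator)
print(f"fold change relative to calibrator: {fold:.2f}")
```

A fold change of 1 means equal normalized expression in the sample and the calibrator; one cycle of difference in ΔCt corresponds to a twofold change.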

#### Molecular Marker Design and Genotyping

Total genomic DNA was extracted from the young leaves of all individuals using cetyltrimethylammonium bromide according to a published method (Murray and Thompson, 1980). The DNA concentrations were determined using a NanoDrop ND-100 spectrophotometer (Thermo Fisher Scientific Co., Wilmington, DE, United States), and the DNA was then diluted to a working concentration of 40–50 ng/µl for subsequent PCR.

To design primers for insertion–deletion (InDel) markers, whole-genome resequencing (approximately 10× coverage) of the parental lines (i.e., 87–534 and 96–100) was performed. Totals of 7.2 and 7.1 Gb of Illumina paired-end reads were generated for 87–534 and 96–100, respectively. The B. oleracea '02-12' reference sequence was retrieved from 'BRAD' for resequencing data alignment and for detecting sequence polymorphisms between the parental lines (Cheng et al., 2011; Liu et al., 2013; Lv et al., 2014a). To avoid the detection of false polymorphisms, multi-hit reads were filtered and removed from the dataset, and only single-hit reads were used to design primers. All the primers used were designed in accordance with the following parameters: amplicon length, 100–200 bp; primer length, 19–25 bp; differential fragment length, 3–6 bp; and melting temperature, 53–58°C. A total of 2000 primer pairs for the polymorphic InDel markers were designed, and 1000 pairs that were evenly distributed across nine chromosomes were selected for further analyses. These InDel primers were used for whole-genome background analyses in 87–534 and 96–100 to identify the polymorphic markers on different chromosomal segments. Individual F<sup>2</sup> plants were then screened via the polymorphic markers.
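The primer design parameters listed above amount to a simple filter over candidate primer pairs. The sketch below is illustrative only: the `PrimerPair` record, its field names, and the example candidates are hypothetical, and all bounds are treated as inclusive and applied to both primers of a pair.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrimerPair:
    marker_id: str
    amplicon_bp: int   # amplicon length
    fwd_len: int       # forward primer length (bp)
    rev_len: int       # reverse primer length (bp)
    indel_bp: int      # fragment length difference between the parents
    tm_fwd: float      # melting temperature of the forward primer (°C)
    tm_rev: float      # melting temperature of the reverse primer (°C)


def passes_design_rules(p: PrimerPair) -> bool:
    """Apply the stated ranges: amplicon 100-200 bp, primers 19-25 bp,
    InDel 3-6 bp, Tm 53-58 °C."""
    return (
        100 <= p.amplicon_bp <= 200
        and all(19 <= n <= 25 for n in (p.fwd_len, p.rev_len))
        and 3 <= p.indel_bp <= 6
        and all(53.0 <= t <= 58.0 for t in (p.tm_fwd, p.tm_rev))
    )


candidates = [
    PrimerPair("demo_1", 150, 20, 21, 4, 55.0, 56.5),
    PrimerPair("demo_2", 250, 20, 21, 4, 55.0, 56.5),  # amplicon too long
    PrimerPair("demo_3", 180, 22, 22, 2, 54.0, 57.0),  # InDel too small
]
kept = [p.marker_id for p in candidates if passes_design_rules(p)]
print(kept)
```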

Each 10 µl PCR mixture contained 1 µl of PCR buffer (10×, Mg<sup>2+</sup> included), 0.8 µl of dNTPs (2.5 mM each), 0.2 µl of Taq DNA polymerase (2.5 U/µl), 2.5 µl of DNA template (40 ng/µl), 0.3 µl each of the forward and reverse primers (10 µM), and 9.8 µl of ddH<sub>2</sub>O. The reaction mixture was incubated in a GeneAmp PCR system 9700 (Applied Biosystems Inc., Foster City, CA, United States), and the PCR profile was as follows: initial 5 min at 94°C; 35 cycles of 30 s of DNA denaturation at 94°C, 30 s of annealing at 55°C and 45 s of extension at 72°C; and a final extension of 7 min at 72°C. For polyacrylamide gel electrophoresis (PAGE), the PCR products were separated on 8% (w/v) polyacrylamide gels at 160 V for 1.5 h and then visualized with silver staining.

For each marker, individuals homozygous for the 87–534 allele were categorized as 'a', individuals homozygous for the 96–100 allele were categorized as 'b', and heterozygous individuals showing the F1 banding pattern were categorized as 'h'.
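When setting up a reaction from a recipe like this, it can help to compute the water volume as the remainder after all other components. The sketch below does that for the non-water volumes listed above; the helper name is hypothetical. With those volumes, the non-water components total 5.1 µl, so a 10 µl reaction would take 4.9 µl of water, which suggests the printed water volume should be checked against the intended total.

```python
# Non-water component volumes (µl) as listed in the text.
NON_WATER_UL = {
    "10x PCR buffer (Mg2+ included)": 1.0,
    "dNTPs (2.5 mM each)": 0.8,
    "Taq DNA polymerase (2.5 U/µl)": 0.2,
    "DNA template (40 ng/µl)": 2.5,
    "forward primer (10 µM)": 0.3,
    "reverse primer (10 µM)": 0.3,
}


def water_volume(total_ul: float, components: dict[str, float]) -> float:
    """Water needed to bring the listed components up to the total volume."""
    used = sum(components.values())
    if used > total_ul:
        raise ValueError("components exceed the total reaction volume")
    return round(total_ul - used, 2)


water_ul = water_volume(10.0, NON_WATER_UL)
print(f"non-water components: {sum(NON_WATER_UL.values()):.1f} µl")
print(f"water for a 10 µl reaction: {water_ul} µl")
```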

#### QTL Mapping of Self-Compatibility

A linkage map was constructed using the JoinMap 4.0 program with a minimum logarithm of odds (LOD) score of 4.0 (Van Ooijen, 2006). The Kosambi function was used to convert recombination values to genetic distances (Kosambi, 1944). A χ<sup>2</sup> test for goodness of fit to the expected 1:1 Mendelian segregation ratio was performed to identify significantly skewed markers (P < 0.01).
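The two calculations in this paragraph can be sketched with the standard formulas. The Kosambi mapping function converts a recombination fraction r into centimorgans as d = 25 ln[(1 + 2r)/(1 − 2r)], and the goodness-of-fit statistic sums (observed − expected)²/expected over genotype classes. The genotype counts below are invented, and the closed-form P value applies only to one degree of freedom (two classes).

```python
import math


def kosambi_cm(r: float) -> float:
    """Kosambi map distance in cM for a recombination fraction 0 <= r < 0.5."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination fraction must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def chi_square_stat(observed: list[int], ratio: list[float]) -> float:
    """Goodness-of-fit statistic against an expected segregation ratio."""
    total = sum(observed)
    expected = [total * x / sum(ratio) for x in ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df1(stat: float) -> float:
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2))


print(f"r = 0.10 -> {kosambi_cm(0.10):.2f} cM")

# Hypothetical two-class marker scored in 230 plants, tested against 1:1.
stat = chi_square_stat([120, 110], [1, 1])
print(f"chi-square = {stat:.3f}, P = {p_value_df1(stat):.3f}")
print("skewed" if p_value_df1(stat) < 0.01 else "not skewed")
```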

Quantitative trait loci analysis was performed using QTL IciMapping version 4.0 (Meng et al., 2015) and QTL Cartographer version 1.13 (Basten et al., 2004). A forward–backward stepwise regression was performed to choose co-factors before performing QTL detection. A permutation test was performed with QTL Cartographer to estimate the appropriate significance threshold for analysis. A LOD threshold of 3.0, which corresponded to a genome-wide significance level of 0.10, was chosen. The resulting QTL names consisted of an abbreviated trait name followed by the chromosome and QTL codes. For example, qSC4.1 represents the first QTL on chromosome 4 for self-compatibility.
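The naming convention described above is regular enough to parse mechanically. The sketch below assumes names of the form `q` + trait abbreviation + chromosome number + `.` + QTL code; the regular expression and function name are illustrative and not part of either mapping program.

```python
import re

# Assumed pattern: 'q', letters for the trait, chromosome number, '.', QTL code.
QTL_NAME = re.compile(r"^q([A-Za-z]+)(\d+)\.(\d+)$")


def parse_qtl_name(name: str) -> tuple[str, int, int]:
    """Split a QTL name into (trait abbreviation, chromosome, QTL code)."""
    match = QTL_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized QTL name: {name!r}")
    trait, chromosome, code = match.groups()
    return trait, int(chromosome), int(code)


print(parse_qtl_name("qSC4.1"))  # first self-compatibility QTL on chromosome 4
print(parse_qtl_name("qSC7.2"))
```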

#### Development of Self-Compatibility Marker Combinations and Genomic Background Markers

The strategy for developing self-compatibility marker combinations involved selecting markers closely linked to the self-compatibility QTLs with high contribution rates (CRs) and testing these markers, singly or in combination, on individual F<sup>2</sup> plants. Based on the CI, the best marker or marker combination was then used to identify the self-compatibility phenotype and applied to marker-assisted recurrent BC.

For genomic background analysis of the back-cross populations, markers that were polymorphic between 87–534 and 96–100 and evenly distributed across the genome were selected.

#### Marker-Assisted Recurrent Backcrossing

The donor parent 87–534 was crossed with the recipient parent 96–100 to obtain F1-generation plants, which were then successively back-crossed with 96–100 to obtain back-cross populations. The best individuals of every population (200 individuals) were selected for further experiments; the genomic DNA was extracted from all individuals and subsequently analyzed with self-compatibility marker combinations, and the individual SC plants were saved. All the individual SC plants were subjected to genomic background analyses via background markers. The plants were phenotyped to characterize the overall performance of various plant traits. The main agronomic traits of back-cross individuals were examined, with 87–534 and 96–100 plants serving as reference materials. The individual plants that were phenotypically similar to 96–100 were transplanted to the greenhouse and subjected to vernalization. Based on CI and fluorescence microscopy, the best individual SC plant was selected for further BC. Finally, individuals that were highly SC with almost the same genetic background as that of the 96–100 plants were self-pollinated to generate materials that were homozygous for self-compatibility, which were named SC96–100 lines. SC96–100 was crossed with CB201 to test the transfer results (the self-incompatibility/self-compatibility ratio and the self-compatibility based on 06–88).

<sup>1</sup>http://brassicadb.org/brad/

<sup>2</sup>http://plants.ensembl.org/Brassica\_oleracea

<sup>3</sup>http://www.premierbiosoft.com/

To further test the performance of SC96–100, the agronomic traits (head weight, length, width, and core length) of SC96–100 and SC06–88 were evaluated and compared with those of 96–100 and 06–88, respectively, according to the methods described in Lv et al. (2017).
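The background-selection step of the back-crossing scheme can be illustrated with a small calculation of the proportion of recurrent parent genome (PRPG). The text does not give the formula used, so this sketch assumes a common convention: each background marker contributes two recurrent-parent alleles when scored 'b', one when scored 'h', and none when scored 'a'. The marker scores are invented.

```python
def prpg(genotypes: str) -> float:
    """Proportion of recurrent parent genome from background marker scores.

    'b' = homozygous for the recurrent parent (96-100) allele,
    'h' = heterozygous, 'a' = homozygous for the donor (87-534) allele.
    Assumed formula: recurrent-parent alleles / total alleles scored.
    """
    weights = {"b": 2, "h": 1, "a": 0}
    if not genotypes or any(g not in weights for g in genotypes):
        raise ValueError("scores must be a non-empty string of 'a', 'b', 'h'")
    return sum(weights[g] for g in genotypes) / (2 * len(genotypes))


def expected_prpg(n_backcrosses: int) -> float:
    """Theoretical mean PRPG after n backcrosses without marker selection."""
    return 1 - 0.5 ** (n_backcrosses + 1)


# Hypothetical BC4F1 plant scored with 36 background markers.
scores = "b" * 33 + "h" * 3
print(f"observed PRPG: {prpg(scores):.3f}")
print(f"expected PRPG at BC4 without selection: {expected_prpg(4):.3f}")
```

Comparing the observed value with the unselected expectation shows how much background selection accelerates recovery of the recurrent parent genome in a given plant.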

### RESULTS

#### Phenotyping

The CI values at the anthesis stage for 87–534 and 96–100 were 13.2 and 0.6, respectively. Additionally, microscopy analyses of the 87–534 samples revealed that more than 25 pollen tubes clustered together, germinated, and penetrated the stigmas (**Figures 2A,a**). In contrast, in the 96–100 samples, most pollen tubes failed to penetrate the stigmas. We observed that callose was deposited on the stigma surface and that malformed pollen tubes failed to grow (**Figures 2B,b,c**). The CI and microscopy results indicated that the 87–534 and 96–100 plants exhibited completely different SI phenotypes (i.e., the 87–534 plants were highly SC, while the 96–100 plants were highly SI). With respect to the hybrid 06–88, the CI value was 0.1, and microscopy analysis revealed that its pollen tubes also failed to penetrate the stigmas (**Figure 2C**). The results obtained via CI values and microscopy analysis were consistent, as was also the case in the study by Zhao (2007). For convenience, the CI value was used as the main evaluation criterion in this study.

FIGURE 2 | Observations of pollen tube germination at the stigmas of 87–534, 96–100, and 06–88. Panels (A,a) show the side views of the stigma of 87–534; Pt indicates the pollen tubes; panels (B,C) show the side views of the stigma of 96–100 and 06–88, respectively; panels (b,c) show the front views of the stigma of 87–534 and 96–100, respectively.

### Analysis of the SI–Related Genes in Lines 87–534 and 94–182

Whole coding sequences of SI-related genes from 87–534 and 94–182 plants were amplified using gene-specific primers (**Supplementary Table S1**). The amplified sequences were aligned using the DNAMAN program (version 7.0)<sup>4</sup>. There were no differences between the SCR, MLPK, and THL1 fragments of 87–534 and 94–182 plants. The ARC1 sequence was also similar between 87–534 and 94–182 plants (97.07% similarity), with some single nucleotide polymorphisms (SNPs) causing 17 amino acid differences. For SRK, we detected three SNPs in the coding sequence and some SNPs in the intron region; however, these SNPs did not cause amino acid changes. The alignments of ARC1 and SRK are shown in **Supplementary Figures S1**, **S2**.

The cDNA prepared using RNA extracted from 87–534 and 94–182 stigmas and anthers was subjected to semi-quantitative analysis, with an Actin1 gene serving as a reference. For all SI-related genes (i.e., SRK, SCR/SP11, THL1, MLPK, and ARC1), we obtained distinct DNA bands via PCR (**Supplementary Figure S3**). This result indicated that the SI-related genes were transcribed normally.

#### Linkage Map Construction

Based on the whole-genome resequencing data (approximately 10× coverage) of the parental lines 87–534 and 96–100, we chose 1000 primer pairs as polymorphic InDel markers for nine chromosomes. We then selected 335 primer pairs that produced reliable PCR products to genotype the F<sup>2</sup> mapping population. Of these markers, 302 were co-dominant, and 33 were dominant.

JoinMap 4.0 software was used to construct nine linkage groups consisting of 329 markers (six markers did not map to any of the linkage groups) with a LOD threshold of 4.0 (**Figure 3**). The map spanned 969.5 cM, with an average marker interval of 2.95 cM. The linkage group lengths ranged from 59.3 to 156 cM, with 24–54 markers. The nine groups were anchored to their corresponding reference chromosomes (i.e., chromosomes C01–C09) according to the physical positions of the markers. The longest (156 cM) and shortest (59.3 cM) linkage groups were on chromosomes 3 and 1, respectively. The maximum (4.82 cM) and minimum (1.76 cM) average distances occurred on chromosomes 9 and 2, respectively. Chromosome 2 had the most markers (54), while chromosome 1 had the fewest (24). The largest interval between markers was 34.04 cM on chromosome 4 (between B767 and B444). Overall, the markers were relatively evenly distributed on the nine chromosomes (**Table 1**).
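The genome-wide average interval quoted above can be checked with simple arithmetic. The following is a minimal sketch that uses only the totals reported in the text; the function name is illustrative and is not part of the mapping software, and the definition used (total length divided by the number of mapped markers) is an assumption that happens to reproduce the reported value.

```python
def average_marker_interval(map_length_cm: float, n_markers: int) -> float:
    """Average spacing in cM, taken here as total map length / number of mapped markers."""
    if n_markers <= 0:
        raise ValueError("n_markers must be positive")
    return map_length_cm / n_markers


# Totals reported in the text: 969.5 cM spanned by 329 mapped markers.
genome_wide = average_marker_interval(969.5, 329)
print(f"{genome_wide:.2f} cM")  # prints "2.95 cM"
```

The same function could be applied per linkage group once the individual lengths and marker counts from Table 1 are available.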

<sup>4</sup>http://www.lynnon.com/

TABLE 1 | Distribution characteristics of the genetic linkage map.

A total of 43 skewed markers (31.9%) were detected based on a χ² test for goodness of fit to the expected 1:1 Mendelian segregation ratio (P < 0.01). Twenty-four of the skewed markers were from the parent 96–100. Although the segregation ratio for some markers was considerably different from the expected 1:1 ratio, there was an adequate distribution of markers in the different linkage groups. These results were comparable to those of previous studies involving other mapping populations of Brassica crop species (Foisset et al., 1996; Voorrips et al., 1997; Lv et al., 2014b, 2016).
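A goodness-of-fit test of this kind can be sketched as follows. This is an illustrative example, not the authors' script: the marker counts are hypothetical, and the critical value (6.635) is the standard χ² threshold for one degree of freedom at P = 0.01, so the default only applies when two genotype classes are compared.

```python
def chi_square_gof(observed, expected_ratio):
    """Pearson chi-square statistic for observed counts against an expected ratio."""
    if len(observed) != len(expected_ratio):
        raise ValueError("observed and expected_ratio must have the same length")
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Chi-square critical value for 1 degree of freedom at P = 0.01.
CRITICAL_DF1_P01 = 6.635


def is_skewed(observed, expected_ratio=(1, 1), critical=CRITICAL_DF1_P01):
    """Flag a marker whose segregation departs from the expected ratio."""
    return chi_square_gof(observed, expected_ratio) > critical


# Hypothetical allele-class counts for two markers.
print(is_skewed([52, 48]))  # chi-square = 0.16 -> False
print(is_skewed([70, 30]))  # chi-square = 16.0 -> True
```

For a test with more than two classes, a different critical value matching the degrees of freedom would have to be passed in.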

### Major QTLs Associated With Self-Incompatibility Were Identified

The self-compatibility phenotype was characterized using the CI evaluation method for each individual F<sup>2</sup> plant; the CI values are listed in **Supplementary Table S2**, which also shows their frequency histogram. The CI exhibited a continuous and skewed distribution; more than 70% of the individuals had CI values lower than 1.0, indicating that the self-compatibility phenotype was controlled by major effect genes.

IciMapping software was used to perform QTL analysis for self-compatibility traits based on the constructed linkage groups and self-compatibility trait values. With a LOD threshold of 3.0, eight QTLs were detected: qSC4.1, qSC5.1, qSC6.1, qSC7.1, qSC7.2, qSC8.1, qSC9.1, and qSC9.2. Three high-CR loci were identified, including qSC7.2 (54.81%), qSC9.1 (14.14%), and qSC5.1 (11.25%) (**Table 2**). qSC7.2, which presented the highest CR, was located at an interval on C07 harboring the S-locus, indicating that it might be a main effect locus conferring the self-compatibility phenotype to the plants. Additionally, two other QTLs (qSC9.1 and qSC5.1) had relatively high CRs, although lower than that of qSC7.2, indicating that loci other than the S-locus also contribute to self-compatibility. qSC9.1 exhibited a negative additive effect for self-compatibility (the additive effect was −0.20).

qSC7.1, qSC4.1, qSC6.1, qSC8.1 and qSC9.2 had low CRs of 9.36, 2.55, 3.42, 3.04, and 2.41%, respectively.


<sup>a</sup>Name of the QTL. <sup>b</sup>Position of the peak marker or marker interval. <sup>c</sup>Peak marker or marker interval. <sup>d</sup>Peak marker or marker interval. <sup>e</sup>Peak marker or marker interval. <sup>f</sup>Peak marker or marker interval. <sup>g</sup>Peak marker or marker interval. <sup>h</sup>Peak marker or marker interval. <sup>i</sup>Additive effect: a positive additivity indicated that 96-100-308 carries the allele for an increase in the trait value, while a negative additivity means that 01-20 carries the allele for an increase in the trait value. <sup>j</sup>Proportion of phenotypic variance explained by each QTL. <sup>k</sup>Robust QTLs are indicated in bold.

### Self-Compatibility-Specific Markers and the Development of Background Markers

Three QTLs (qSC7.2, qSC5.1, and qSC9.1) with high CRs were obtained by QTL mapping; these QTLs' linkage markers (BoID0709, BoID0329, and BoID0992) were selected to develop self-compatibility marker combinations. The marker combinations were as follows: BoID0709, BoID0992 + BoID0329, BoID0709 + BoID0992, and BoID0709 + BoID0329. In addition, qSC7.2 and qSC5.1 had a positive additive effect, while qSC9.1 had a negative additive effect. Therefore, we screened individual plants that had the same allele as that in 87–534 for markers BoID0709 and BoID0329 and plants that had the same allele as that in 96–100 for marker BoID0992. According to the CI of the individual F<sup>2</sup> plants, the accuracy of the identification of the self-compatibility marker combinations was analyzed.

The results showed that BoID0709 + BoID0992 yielded the highest accuracy (94.12%), followed by BoID0709 + BoID0329 (83.33%), BoID0709 (74.24%), and BoID0992 + BoID0329 (23.53%). Based on the data above, BoID0709 and BoID0992 were applied to marker-assisted selection (MAS) for self-compatibility. The primer sequences used are shown in **Supplementary Table S3**.
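One way such an accuracy figure could be computed is sketched below. The text does not give the exact formula, so this is an assumption: accuracy is taken as the fraction of marker-selected plants whose CI reaches a self-compatibility threshold. The plant records, the allele codes ("A" for the 87–534 allele, "B" for the 96–100 allele), and the threshold value are all hypothetical.

```python
def selection_accuracy(plants, required, ci_threshold):
    """Fraction of marker-selected plants whose CI meets the threshold.

    plants: list of dicts with a 'ci' value and a 'genotype' dict (marker -> allele code).
    required: dict mapping each marker to the allele code a selected plant must carry.
    Returns None when no plant matches the marker combination.
    """
    selected = [
        p for p in plants
        if all(p["genotype"].get(marker) == allele for marker, allele in required.items())
    ]
    if not selected:
        return None
    hits = sum(1 for p in selected if p["ci"] >= ci_threshold)
    return hits / len(selected)


# Hypothetical F2 records; "A" = 87–534 allele, "B" = 96–100 allele.
plants = [
    {"ci": 6.2, "genotype": {"BoID0709": "A", "BoID0992": "B"}},
    {"ci": 0.4, "genotype": {"BoID0709": "A", "BoID0992": "B"}},
    {"ci": 5.1, "genotype": {"BoID0709": "A", "BoID0992": "A"}},
    {"ci": 0.2, "genotype": {"BoID0709": "B", "BoID0992": "B"}},
]
combo = {"BoID0709": "A", "BoID0992": "B"}

# The threshold of 1.0 is purely illustrative.
print(selection_accuracy(plants, combo, ci_threshold=1.0))  # 0.5
```

Running the same function over each marker combination would give a table directly comparable to the percentages reported above.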

Based on the linkage map, 36 polymorphic markers that were evenly distributed on each chromosome were selected as background markers; the information on these markers is shown in **Supplementary Table S3**.

### Successful Introgression of Self-Compatibility From 87–534 to 96–100

For every back-cross population, individual plants that had the same allele as that in 87–534 for marker BoID0709 and plants that had the same allele as that in 96–100 for marker BoID0992 were chosen for further BC; the numbers of individuals that met these criteria in every population were 100 (F1), 102 (BC1), 98 (BC2), 99 (BC3), and 106 (BC4). The segregation ratio for both the F<sup>1</sup> and BC populations conformed to a Mendelian ratio of 1:1, according to results of a χ² test.

Thirty-six polymorphic markers were used to analyze the genetic backgrounds of these individual plants after the self-compatibility marker combinations were used. The proportion of the recurrent parent genome (PRPG) in each population was 43.10–55.17% (F<sup>1</sup>), 56.90–68.97% (BC<sup>1</sup>), 64.14–79.31% (BC<sup>2</sup>), 77.93–86.21% (BC<sup>3</sup>), and 86.55–93.10% (BC<sup>4</sup>), which indicated that the genetic background of the individual plants carrying the self-compatibility trait in every population gradually became similar to that of 96–100. We selected 30 individual plants that had a genetic background similar to that of the 96–100 plants for BC with 87–534 plants to generate BC<sup>4</sup> plants, whose backgrounds were also analyzed as described above.
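For orientation, the observed PRPG ranges can be compared with the textbook expectation for unselected backcrossing, in which the expected recurrent-parent fraction after n backcrosses is 1 − (1/2)^(n+1). The sketch below computes that expectation and a commonly used marker-based PRPG estimate. The estimator and the genotype codes are assumptions for illustration; the text does not state the exact formula the authors used.

```python
def expected_rpg(n_backcrosses: int) -> float:
    """Expected recurrent parent genome fraction after n backcrosses (n = 0 is the F1)."""
    if n_backcrosses < 0:
        raise ValueError("n_backcrosses must be non-negative")
    return 1.0 - 0.5 ** (n_backcrosses + 1)


def observed_prpg(genotypes) -> float:
    """PRPG (%) from background-marker calls.

    'R' = homozygous recurrent parent, 'H' = heterozygous, 'D' = homozygous donor.
    Any other code (e.g., a missing call) is ignored.
    """
    scored = [g for g in genotypes if g in ("R", "H", "D")]
    if not scored:
        raise ValueError("no scorable marker calls")
    recurrent_alleles = sum(2 if g == "R" else 1 if g == "H" else 0 for g in scored)
    return 100.0 * recurrent_alleles / (2 * len(scored))


for n, label in enumerate(["F1", "BC1", "BC2", "BC3", "BC4"]):
    print(label, f"{100 * expected_rpg(n):.2f}%")

# Hypothetical calls at 36 background markers for one plant.
calls = ["R"] * 30 + ["H"] * 6
print(f"{observed_prpg(calls):.2f}%")  # 91.67%
```

The expectation assumes no selection and unlinked loci, so individual plants are expected to scatter around it.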

Phenotypic observations were performed to characterize the overall performance of various plant traits. Half of the plants that were similar to the 96–100 plants were selected for transplantation to the greenhouse. In addition, the back-cross individuals were phenotypically somewhat similar to the 96–100 plants, and the phenotypes of typical individual plants of each population are shown in **Figure 4A**. Fifteen individual plants from each population were self-pollinated at the anthesis stage, and fluorescence microscopy revealed that the number of pollen tubes gradually increased (**Figure 4B**).

At the seed-filling stage, we also observed that the seed setting of every population gradually increased (**Figure 4C**). The CI values were 0.17–1.10 (F1), 0.42–1.61 (BC1), 0.67–2.59 (BC2), 0.71–3.33 (BC3), and 2.30–4.71 (BC4) (**Supplementary Table S4**), which is consistent with the observed results. In addition, some BC<sup>4</sup> individuals reached the self-compatibility level.

BC4F<sup>1</sup> individuals were ultimately generated from self-pollinated progenies of the BC<sup>4</sup> individuals. We also identified 200 BC4F<sup>1</sup> individuals by self-compatibility-specific markers and obtained 195 individuals with self-compatibility traits. The CI and fluorescence microscopy analyses revealed that the BC4F<sup>1</sup> individuals reached the self-compatibility level (the CI was 5.19–7.60). The background similarity between BC4F<sup>1</sup> individuals and 96–100 individuals ranged from 95 to 97%, and the main agronomic characteristics of the former were generally the same as those of 96–100. We therefore named the BC4F<sup>1</sup> plants SC96–100; this newly developed SC96–100 line met the requirements for self-compatibility and is currently being applied for the production of cabbage hybrids.

Moreover, every back-cross population line was pollinated with CB201; the field performance, fluorescence microscopy and seed setting performance are shown in **Figure 5**. The results showed that the field performance of CB201 × SC96–100 was similar to that of CB201 × 96–100, and the number of pollen tubes and the CI also gradually increased. Furthermore, the CB201 × BC4F<sup>1</sup> line reached the self-compatibility level and was named SC06–88, which can be applied for breeding.

The agronomic traits of 20 SC96–100 plants and 20 SC06–88 plants were compared with those of 96–100 plants and 06–88 plants, respectively, and the results are shown in **Supplementary Table S5**. Upon comparison, the agronomic traits of 96–100 and SC96–100 were, respectively, 0.7 kg/0.7 kg (head weight), 13.3 cm/12.9 cm (head length) and 5.3 cm/5.3 cm (core length); similarly, the agronomic traits of 06–88 and SC06–88 were, respectively, 1.0 kg/1.1 kg (head weight), 13.7 cm/13.9 cm (head length) and 6.0 cm/5.9 cm (core length). The agronomic traits of SC96–100 are similar to those of 96–100, and those of SC06–88 are also similar to those of 06–88. These results further showed that we successfully transferred self-compatibility from 87–534 to 96–100 and produced a new hybrid, SC06–88.

#### DISCUSSION

#### Self-Incompatibility–Related Genes and Self-Compatibility Traits

The SRK and SCR genes are the female and male determinants, respectively, during pollen–stigma recognition in the SI process. Their interaction enables the stigma to recognize self-pollen. Mutations to either of the determinants may result in a shift from SI to SC, further confirming their separate functions (Goring et al., 1993; Schopfer et al., 1999; Takayama et al., 2000; Nasrallah et al., 2002; Ekuere et al., 2004). Nasrallah et al. (1992) observed that in a self-compatible B. oleracea mutant, the expression of SRK did not result in the recognition of self-pollen. However, pollen grains can still interact with the stigmas of plants with certain S haplotypes. Cabrillac et al. (1999) reported that the stigmas of plants with a mutated SCR were incapable of recognizing self-pollen, but still interacted with the pollen grains of specific S haplotypes. Nasrallah et al. (2002) showed that gene transfer of the stigma receptor kinase SRK and its pollen-borne ligand SCR from one S-locus haplotype of the SI and cross-fertilizing Arabidopsis lyrata is sufficient to impart the SI phenotype in self-fertile Arabidopsis thaliana, which lacks functional orthologs of these genes. These results suggest that cross and reciprocal cross tests during the anthesis stage may enable researchers to deduce whether the stigmas and pollen grains are functionally normal (i.e., whether SRK and SCR are functional). Therefore, we analyzed 87–534 (highly self-compatible) and 94–182 (highly self-incompatible) plants, which carry the S5 haplotype (Class II).

Previous studies revealed that in addition to SRK and SCR, some SI-related genes are important for normal SI functions (Boyes et al., 1991). For example, THL1/THL2 is a negative regulator of the SI system that interacts with SRK and inhibits its activity in the absence of the SCR pollen protein (Bower et al., 1996; Cabrillac et al., 2001; Haffani et al., 2004). Additionally, SLG can enhance SI interactions, though it is not a required component (Gaude et al., 1993; Dixit et al., 2000; Takasaki et al., 2000). ARC1 is a positive regulator of the SI system that can be activated by SRK, leading to the ubiquitination and degradation of stigma proteins, which ultimately results in pollen rejection (Stone et al., 1999, 2003). MLPK, which is another positive regulator, may form a complex with SRK to regulate upstream SI interactions (Murase et al., 2004; Kakita et al., 2007). Mutations to these SI-related genes may lead to SC. For example, MLPK belongs to the RLCK gene sub-family, and has eight conserved amino acids (G-G–G-V, A-K, E, DL-N, DFG, APE, D-WS-G, and R) (Steven and Hanks, 1991). Hatakeyama et al. (2010) constructed a detailed linkage map of B. rapa from the F<sup>2</sup> progeny, and mapping of SI-related genes revealed that these QTL were co-localized with SLG on R07 and MLPK on R03. Murase et al. (2004) determined that Gly194 was replaced by Arg. Because Gly194 is a highly conserved amino acid in the protein kinase VIa sub-family, this mutation resulted in the loss-of-activity of MLPK and the development of the SC phenotype.


To confirm whether the SC traits in line 87–534 were caused by mutations in SI-related genes, we analyzed the sequences of these genes in 87–534 and 94–182 plants. Sequence analysis of SI-related genes, including SCR, SRK, ARC1, THL1, and MLPK, indicated some mutations in SRK and ARC1. We detected some mutations in the coding and conserved domains of the 87–534 genes. Semi-quantitative PCR results indicated that these genes were normally expressed. These results imply that there are novel genetic factors associated with the SC phenotype of 87–534 plants.

### Quantitative Trait Locus Analysis of Self-Compatibility Traits

In addition to the known SI-related genes, other loci are associated with SI/SC traits. Ma et al. (2009) mapped an S suppressor locus using a segregating population derived from S-1300 (SI) × 97-wen135 (SC) in B. napus. In Arabidopsis, Hülskamp et al. (1995) found that the wax synthesis-related CER genes were required for pollen–stigma recognition.

To date, the mechanism of SC in B. oleracea has not been characterized. To unravel the novel genomic loci conferring SC traits to 87–534 plants, we conducted whole-genome QTL mapping experiments for the F<sup>2</sup> population derived from the hybridization between 87–534 and 96–100 (highly self-incompatible; S57 haplotype, Class I). Eight QTLs were detected on six chromosomes. No QTLs were detected on chromosomes 1–3. qSC7.2 had the highest CR value, and was located in the same marker interval as SRK and SCR on chromosome 7, indicating that the S-locus contained the main-effect genes conferring the SC phenotype. However, the other QTLs were not associated with any of the SI-related genes, including QTLs with high CR values [i.e., qSC9.1 (14.14%), qSC7.1 (9.36%), and qSC5.1 (7.06%)]. This observation suggests that there are novel genetic factors associated with SC traits. Additionally, the relationship between S-haplotypes and SC traits was consistent with the mapping results.

The candidate genes in the QTL regions were also discussed, and some of them might be good candidates, such as the genes involved in embryogenesis and pollen development. However, further work using a larger population is still needed to fine-map and verify these genes. The markers used in this study could be applied in MAS of SC lines; we used markers at both qSC7.2 (BoID0709) and qSC9.1 (BoID0992) to guarantee a high selection efficiency. Additionally, the results provide new insight into the mechanism of self-compatibility and can improve cabbage breeding via SC lines in the MS system.

#### Powerful Breeding Method: Trait-Specific Markers Combined With Genomic Background Analyses

Marker-assisted selection is an important tool that is widely used in breeding and enables direct genotypic selection and effective gene polymerization via specific markers associated with target characteristics. In the study of resistance genes in cabbage (black rot, clubroot, and Fusarium wilt resistance, etc.), previous researchers have screened markers closely linked to these resistance genes and have applied them to cabbage breeding for disease resistance, which has provided important help for breeding resistant varieties (Landry et al., 1992; Pu et al., 2012; Kifuji et al., 2013). Genomic background analysis has also been applied to MAS to enable rapid and accurate breeding. Liu et al. (2017) and Yu et al. (2017) succeeded in transferring the Fusarium wilt resistance gene and the Ogu-CMS restorer gene by combining application of background selection and disease resistance-specific marker-assisted foreground selection.

In our study, we obtained self-compatibility-specific marker combinations for foreground selections and 36 genome-wide markers for background selections. Additionally, phenotypic observations (CI determination, fluorescence microscopy observation and agronomic traits) were also performed after our transfer process. A high selection efficiency in our transferring process indicated that MAS via multiple means will greatly improve the efficiency of breeding.

#### AUTHOR CONTRIBUTIONS

ZX wrote and revised the manuscript. ZX, FH, and YH isolated the samples and performed trait measurements, molecular experiments and marker assays. HL, YX, and MZ analyzed the trait and trial data and revised the manuscript. MZ, HL, and ZF conceived the idea and critically reviewed the manuscript. LY, YZ, YL, and ZL coordinated and designed the study. All authors read and approved the final manuscript.

#### REFERENCES


### FUNDING

This work was financially supported by grants from the Key Projects in the National Key Research and Development Program of China (2016YFD0100307 and 2016YFD0101804), the Science and Technology Innovation Program of the Chinese Academy of Agricultural Sciences (CAAS-ASTIP-IVFCAAS), and the earmarked fund for the Modern Agro-Industry Technology Research System, China (CARS-23).

### ACKNOWLEDGMENTS

The work reported here was performed at the Key Laboratory of Biology and Genetic Improvement of Horticultural Crops, Ministry of Agriculture, Beijing 100081, China.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00189/full#supplementary-material

FIGURE S1 | Alignment of ARC1 amino acid sequence.

FIGURE S2 | Alignment of SRK amino acid sequences.

FIGURE S3 | Expression analysis for SI-related genes.

TABLE S1 | Primers used in SI-related gene analysis.

TABLE S2 | The CI score and S-haplotype of F<sup>2</sup> individuals; frequency histogram of F<sup>2</sup> individuals.

TABLE S3 | Information concerning QTL linkage markers and background markers.

TABLE S4 | Information on each back-cross individual.

TABLE S5 | Information on agronomic traits.



in class-I S, haplotypes of Brassica campestris, (syn. rapa ) L. FEBS Lett. 473, 139–144. doi: 10.1016/S0014-5793(00)01514-3


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Xiao, Han, Hu, Xue, Fang, Yang, Zhang, Liu, Li, Wang, Zhuang and Lv. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The DOF Transcription Factor SlDOF10 Regulates Vascular Tissue Formation During Ovary Development in Tomato

Pilar Rojas-Gracia, Edelín Roque, Mónica Medina, María Jesús López-Martín, Luis A. Cañas, José Pío Beltrán and Concepción Gómez-Mena\*

Department of Plant Development and Hormone Action, Biology and Biotechnology of Reproductive Development, Instituto de Biología Molecular y Celular de Plantas, CSIC-UPV, Valencia, Spain

The formation of fruits is an important step in the life cycle of flowering plants. The process of fruit development is highly regulated and involves the interaction of a complex regulatory network of genes in both space and time. To identify regulatory genes involved in fruit initiation in tomato we analyzed the transcriptomic profile of ovaries from the parthenocarpic PsEND1:barnase transgenic line. This line was generated using the cytotoxic gene barnase targeted to the anthers with the PsEND1 anther-specific promoter from pea. Among the differentially expressed genes we identified SlDOF10, a gene coding for a DNA-binding with one finger (DOF) transcription factor which is activated in unpollinated ovaries of the parthenocarpic plants. SlDOF10 is preferentially expressed in the vasculature of the cotyledons and young leaves and in the root tip. During floral development, expression is visible in the vascular tissue of the sepals, the flower pedicel and in the ovary connecting the placenta with the developing ovules. The induction of the gene was observed in response to exogenous gibberellin and auxin treatments. To evaluate the gene function during reproductive development, we have generated SlDOF10 overexpressing and silencing stable transgenic lines. In particular, down-regulation of SlDOF10 activity led to a decrease in the area occupied by individual vascular bundles in the flower pedicel. Associated with this phenotype we observed induction of parthenocarpic fruit set. In summary, expression and functional analyses revealed a role for the SlDOF10 gene in the development of the vascular tissue specifically during reproductive development, highlighting the importance of this tissue in the process of fruit set.

Keywords: tomato, parthenocarpy, DNA with one finger, development, vascular tissue

#### Edited by:

Andrea Mazzucato, Università degli Studi della Tuscia, Italy

#### Reviewed by:

Barbara Molesini, University of Verona, Italy; Simona Masiero, University of Milan, Italy

\*Correspondence: Concepción Gómez-Mena cgomezm@ibmcp.upv.es

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 14 October 2018 Accepted: 08 February 2019 Published: 26 February 2019

#### Citation:

Rojas-Gracia P, Roque E, Medina M, López-Martín MJ, Cañas LA, Beltrán JP and Gómez-Mena C (2019) The DOF Transcription Factor SlDOF10 Regulates Vascular Tissue Formation During Ovary Development in Tomato. Front. Plant Sci. 10:216. doi: 10.3389/fpls.2019.00216

### INTRODUCTION

The reproductive phase of angiosperms is characterized by the appearance of flowers and fruits. The flower is the reproductive organ of the plant and contains the male and female reproductive organs. The formation and development of the fruit is closely linked to the formation of the flower and under the control of both environmental and hormonal factors. Accordingly, flower and fruit development require the joint and coordinated action of a network of transcription factors (TFs) that act through the regulation of gene expression (Karlova et al., 2014). The establishment of distinct transcriptional domains is a fundamental mechanism for determining different cell fates within tissues and organs (Moreno-Risueno et al., 2012). TFs regulate gene expression by binding specific cis-regulatory elements in the promoter region of the target genes. In tomato, at least 998 putative TFs have been identified from 62 different TF families that correspond to 2.87% of the estimated total number of genes (Guo et al., 2008; The Tomato Genome Consortium et al., 2012).

The DNA binding with one finger (DOF) proteins constitute a plant-specific family of TFs harboring a DNA-binding domain, which forms a single zinc-finger (Noguero et al., 2013). The highly conserved DOF domain is a region of 52 amino acid residues structured as a Cys2/Cys2 (C2/C2) zinc finger that recognizes a cis-regulatory element containing the common core sequence 5′-(T/A)AAAG-3′ (Yanagisawa, 2004). Besides the N-terminal conserved DNA-binding domain, these proteins contain a more variable C-terminal transcriptional regulation domain having diverse amino acid sequences (Yanagisawa, 2001; Gupta et al., 2015). DOF proteins are present across the plant lineage, from green unicellular algae to higher angiosperms, and represent a unique class of TFs having bifunctional binding activities with both DNA and proteins (Gupta et al., 2015). The number of DOF genes varies considerably among species, ranging from 9 genes identified in Physcomitrella patens to 36 and 54 DOF genes identified in Arabidopsis and maize, respectively (Gupta et al., 2015).
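To make the core recognition sequence concrete, the following minimal sketch scans both strands of a DNA sequence for the 5′-(T/A)AAAG-3′ motif. It is an illustration only: the function names and the example sequence are hypothetical, and a match to this short core is far from sufficient evidence of functional DOF binding.

```python
import re

# Core DOF-binding motif from the text: 5'-(T/A)AAAG-3'.
# The lookahead lets finditer report every start position, including overlapping ones.
DOF_CORE = re.compile(r"(?=([TA]AAAG))")
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
MOTIF_LENGTH = 5


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper- or lower-case DNA string."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def find_dof_sites(seq: str):
    """Return sorted (position, strand, motif) tuples.

    Positions are 0-based and always refer to the leftmost base on the forward strand.
    """
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in DOF_CORE.finditer(seq)]
    rc = reverse_complement(seq)
    for m in DOF_CORE.finditer(rc):
        # Convert the start on the reverse strand to a forward-strand coordinate.
        hits.append((len(seq) - m.start() - MOTIF_LENGTH, "-", m.group(1)))
    return sorted(hits)


promoter = "GGTAAAGCCCTTTAGG"  # hypothetical sequence
print(find_dof_sites(promoter))
# [(2, '+', 'TAAAG'), (9, '-', 'TAAAG')]
```

Reporting minus-strand hits in forward coordinates makes it straightforward to overlay the sites on an annotated promoter.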

DNA-binding with one finger TFs play key roles in a variety of biological processes during development and in response to environmental stimuli. They are often associated with plant-specific processes such as light-responsiveness, tuber formation, seed development, seed germination, flowering, and plant hormone responses [reviewed by Noguero et al. (2013)]. The DOF proteins are also involved in general cellular activities such as cell cycle progression, cell expansion, and metabolism regulation. During plant development these proteins regulate the formation of diverse structures including stomata guard cells (Negi et al., 2013), pollen (Peng et al., 2017), and the vascular system (Konishi and Yanagisawa, 2007; Guo et al., 2009; Gardiner et al., 2010).

In tomato, 34 DOF proteins have been identified distributed in 11 chromosomes and classified in 4 classes and 6 clusters (Cai et al., 2013). In addition to the highly conserved DOF domain, up to 25 conserved domains have been identified in this gene family. These additional domains result in a high divergence in the structure of the genes between the different groups or subgroups (Cai et al., 2013). Despite the importance of this gene family during plant growth, only a small number of members have been functionally characterized in tomato. A group of five tomato DOF genes, homologous to Arabidopsis Cycling DOF Factors (CDFs) are reported to be involved in the control of flowering time and abiotic stress responses (Corrales et al., 2014). More recently, the TDDF1 gene was characterized and shown to be involved in circadian regulation and stress resistance (Ewas et al., 2017). Therefore, additional work is required to fully understand the role of DOF genes during tomato plant growth and development.

Tomato is a horticultural crop of major economic importance worldwide. The identification of regulatory genes involved in the control of fruit set will provide new molecular targets to implement breeding programs in this species. In this work we compared the transcriptome of ovaries from wild-type and parthenocarpic tomato plants (PsEND1::barnase) looking for differentially expressed TFs. In these plants parthenocarpic fruit development is triggered by early anther ablation. We selected the SlDOF10 gene from the DOF family of TFs for extensive expression analyses and functional characterization. Our results support a role for the SlDOF10 gene in the development of the vascular tissue, specifically during reproductive development in tomato. We also discuss the role of the vascular system in the control of fruit set in this species.

#### MATERIALS AND METHODS

#### Plant Material and Growth Conditions

Tomato plants (Solanum lycopersicum L.) from the Micro-Tom (MT) cultivar were used as the wild-type genotype. The transgenic line PsEND1:barnase MT TR1d (Roque et al., 2007) was used in the transcriptomic analyses. Plants were grown in pots with coconut fiber at 25–30°C (day) and 18–20°C (night) and were irrigated daily with Hoagland's solution. Natural light was supplemented with Osram lamps (Powerstar HQI-BT, 400 W) to get a 16 h light photoperiod.

IAA (2000 ng/ovary; Duchefa) and GA<sup>3</sup> (2000 ng/ovary; Duchefa) were applied to unpollinated ovaries, on the day equivalent to anthesis, in 10 µl of 5% ethanol, 0.1% Tween 20 solution (Serrani et al., 2008). Control ovaries were treated with the same volume of solvent solution. Samples were collected in pools 10, 30, 60, and 120 min after the treatment, frozen in liquid N<sup>2</sup> and kept at −80°C until processed for expression analysis.

For fruit analyses the first four inflorescences of 10 independent plants were collected. To assess facultative parthenocarpy, 12 flowers from each genotype were emasculated 2 days before anthesis. After 18 days ovaries were collected and weighed on a precision balance.

#### Microarray Experiment Design

Comparative gene expression profiling was conducted using the microarray chip TOM2 (Cornell University, United States), a long oligonucleotide array representing 11862 tomato unigenes. Total RNA was isolated from tomato ovaries of wild-type (MT) and transgenic (PsEND1:barnase MT-TR1d) plants collected at 5 different time points during development. The selected floral stages were 6, 4, and 2 days before anthesis (dba), anthesis, and 2 days after anthesis (daa). The experiment was conducted with three biological replicates for each sample and an RNA reference sample obtained by pooling equivalent amounts of all RNA samples. After labeling, the reference sample was mixed with each individual sample to be used as a probe (30 hybridizations) using a dye swap approach. Microarray hybridization with labeled cDNA was performed using the protocols provided by the Tomato Functional Genomics Database (TFGD) at http://ted.bti.cornell.edu/cgi-bin/TFGD/array/TOM2\_hybridization.cgi. The microarray slide was scanned for spot intensity using a GenePix 4000B scanner (Molecular Devices) at 10 µm resolution. GenePix Pro software was used to quantify the spot intensity after background subtraction and optimization of the signal-to-noise ratio.

#### Microarray Data Analysis


Data files were imported into Acuity 4.0 (Axon Instruments), and background-subtracted intensity was normalized by using the Lowess normalization method (Yang and Speed, 2002) with Acuity default values (smoothing filter, 0.4; iterations, 3; δ = 0.01). Finally, only spots with valid values in 80% of the hybridizations were considered for further analyses. To detect differentially expressed genes, a one-way analysis of variance (ANOVA) was performed to compare the mean Lowess-normalized values for a gene between experimental groups (parthenocarpic and wild type). A P-value cutoff of 0.05 was used to flag genes as being differentially expressed. Mean values of differential genes were calculated from each sample as log2 values. For the visual presentation of the results showing differential expression of the genes between wild-type and transgenic lines, as well as for Wilcoxon rank sum test calculation, MapMan software was used (Thimm et al., 2004).
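The spot-validity filter described above can be illustrated with a short sketch. This is not the Acuity implementation: the data layout, the spot identifiers, and the reading of the criterion as "at least 80% of hybridizations have a valid value" are assumptions made for the example.

```python
def passes_validity_filter(values, min_fraction=0.8):
    """True when the share of non-missing hybridization values reaches min_fraction.

    Missing or flagged measurements are represented as None.
    """
    if not values:
        return False
    valid = sum(1 for v in values if v is not None)
    return valid / len(values) >= min_fraction


def filter_spots(spots, min_fraction=0.8):
    """Keep only the spots that pass the validity filter."""
    return {
        spot_id: values
        for spot_id, values in spots.items()
        if passes_validity_filter(values, min_fraction)
    }


# Hypothetical normalized log ratios across 10 hybridizations.
spots = {
    "spot_A": [0.5, 0.7, None, 0.6, 0.4, 0.9, 0.8, None, 0.5, 0.6],   # 8 of 10 valid
    "spot_B": [0.1, None, None, 0.2, None, 0.3, 0.2, 0.1, 0.2, 0.3],  # 7 of 10 valid
}
print(sorted(filter_spots(spots)))  # ['spot_A']
```

Only the retained spots would then go forward to the ANOVA comparison between genotypes.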

#### Quantitative RT-PCR

Total RNA was extracted using the RNeasy Plant Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's instructions. One microgram of total RNA was used to synthesize first-strand cDNA, using the SuperScript First-Strand Synthesis System for RT-PCR (Invitrogen, Carlsbad, CA, United States). Quantitative RT-PCR (qRT-PCR) was carried out using the SYBR GREEN PCR Master Mix (Applied Biosystems, Carlsbad, CA, United States) in an ABI PRISM 7000 Sequence Detection System (Applied Biosystems, Carlsbad, CA, United States) following the manufacturer's recommendations. In a single experiment, each sample was assayed in triplicate. Expression levels were calculated relative to the constitutively expressed SlACTINE8 gene (Martin-Trillo et al., 2011) using the ΔΔCt method. qRT-PCR data were obtained using three biological replicates. Primers were designed using Primer Express software from Applied Biosystems and are listed in **Supplementary Table S1**.
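For readers unfamiliar with the ΔΔCt calculation, the arithmetic can be sketched as follows. This is a minimal illustration assuming roughly 100% amplification efficiency (a doubling per cycle); the function name, argument names, and Ct values are hypothetical and are not taken from the study.

```python
from statistics import mean


def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target gene by the ddCt method (2 ** -ddCt).

    Each argument is a list of technical-replicate Ct values. The reference
    gene (for example an actin gene) normalizes for input cDNA; the calibrator
    is the condition to which expression is compared.
    """
    delta_sample = mean(ct_target_sample) - mean(ct_ref_sample)
    delta_calibrator = mean(ct_target_calibrator) - mean(ct_ref_calibrator)
    delta_delta = delta_sample - delta_calibrator
    return 2.0 ** (-delta_delta)


# Invented triplicate Ct values: the target amplifies two cycles earlier in
# the sample than in the calibrator, with an unchanged reference gene.
fold = relative_expression(
    ct_target_sample=[24.0, 24.5, 23.5],
    ct_ref_sample=[18.0, 18.0, 18.0],
    ct_target_calibrator=[26.0, 26.5, 25.5],
    ct_ref_calibrator=[18.0, 18.0, 18.0],
)
print(fold)  # 4.0
```

A two-cycle difference in normalized Ct therefore corresponds to a fourfold difference in transcript level under the doubling assumption.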

#### In situ Hybridization

RNA in situ hybridization with digoxigenin-labeled probes was performed on 8 µm longitudinal paraffin sections of tomato seedlings and inflorescences as described previously (Gomez-Mena and Roque, 2018). The RNA antisense and sense probes were generated using T7 polymerase and a fragment of SlDOF10 (positions 289–734 from the ATG codon) cloned in both orientations into the pGEM-T Easy vector (Promega).

#### Histological Techniques

For histological studies, tissue was fixed and embedded in paraffin or resin (Technovit 7100; Kulzer, Wehrheim, Germany). Thin sections (1 µm) were stained with 0.05% toluidine blue in 0.1 M phosphate buffer at pH 6.8 (O'Brien et al., 1964). For whole-mount GUS detection, tissues were fixed for 10 min in ice-cold 90% acetone and GUS activity was revealed by incubation in 100 mM NaPO<sub>4</sub> (pH 7.2), 2.5 mM 5-bromo-4-chloro-3-indolyl-β-D-glucuronide, 0.5 mM K<sub>3</sub>Fe(CN)<sub>6</sub>, 0.5 mM K<sub>4</sub>Fe(CN)<sub>6</sub> and 0.25% Triton X-100. Plant tissue was incubated at 37 °C for 20 h. After staining, chlorophyll was cleared from the samples by dehydration through an ethanol series. For GUS detection in sectioned tissues, seedlings were first stained for GUS, followed by fixation and sectioning as for in situ hybridization. Digital images were processed (cropping, brightness, contrast, and color balance) with Adobe Photoshop (Adobe Systems) and analyzed quantitatively using ImageJ<sup>1</sup>. Whole-mount GUS images were obtained from 10 z-stack images corresponding to different focal planes.

#### Subcellular Localization of SlDOF10 Protein by Transient Expression in N. benthamiana

The SlDOF10 coding sequence was cloned via Gateway LR reaction into the destination vectors pEarleyGate101 and pEarleyGate104 (Earley et al., 2006) to obtain N- and C-terminal fusions to the yellow fluorescent protein (YFP). The constructs were transformed into Agrobacterium tumefaciens C58 GV3850, and overnight cultures were diluted in infiltration buffer and used to infiltrate 4-week-old Nicotiana benthamiana leaves (Sparkes et al., 2006). Observations were performed on leaf disks 48 h after infiltration under a confocal laser scanning microscope (LSM 780, Zeiss).

#### Plasmid Construction and Stable Plant Transformation

To make the GUS reporter fusion, approximately 2.5 kb of the 5′ promoter region of the SlDOF10 gene was amplified using Advantage 2 Polymerase (Clontech) and oligonucleotides DOF10pro-For and DOF10pro-Rev, then cloned in the pCR8 vector (Invitrogen). Destination vector pKGWFS7,0 (VIB/Gent, Belgium) was used to generate the SlDOF10pro::GUS construct. To make the 35S::SlDOF10 construct, SlDOF10 cDNA was amplified with oligos SlDOF10-ORF-for and SlDOF10-ORF-rev and inserted into the pK2GW7 binary vector (VIB/Gent, Belgium) using Gateway technology (Invitrogen), which placed the cDNA under the cauliflower mosaic virus 35S promoter. The SlDOF10-RNAi construct was generated using a 428 bp fragment of the SlDOF10 gene (positions 349–777 from the ATG codon), amplified using primers SlDOF10-RNAi-For and SlDOF10-RNAi-Rev and cloned into the pK7GWIWG2(I) vector (VIB/Gent, Belgium). The fragment used in this construct is located outside of the conserved motifs of the protein (DOF domain and bipartite NLS signal). Primers are listed in **Supplementary Table S1**.

These three binary vectors were then introduced into A. tumefaciens LBA4404 by electroporation. The cotyledon cocultivation method (Ellul et al., 2003) was used to transform wild-type tomato plants (cv. Micro-Tom). The transgenic plants were screened on antibiotic plates and transformants were transferred to soil for propagation.

<sup>1</sup>http://rsb.info.nih.gov/ij/

#### Transactivation Assay in Nicotiana benthamiana Leaves

A reporter plasmid was generated that consists of a fusion of two consecutive DOF-binding motifs (TAAAG), the minimal TATA region of the 35S promoter and the Firefly Luciferase (LUC) gene. The two copies of the DOF cis-DNA element were produced by annealing the complementary single-stranded oligonucleotides 2 × DOF For and 2 × DOF Rev (**Supplementary Table S1**). In the same plasmid, the Renilla (REN) LUC under the control of the CaMV 35S promoter was used as control. The effector plasmid contains the complete SlDOF10 cDNA driven by the CaMV 35S promoter. A. tumefaciens C58C1 (pMP90) was transformed by electroporation with the independent constructs.

Equivalent amounts of the LUC fusion plasmid and effector plasmid (in combination with the suppressor of gene silencing p19) were coinfiltrated in 4-week-old N. benthamiana leaves. After infiltration, plants were incubated at 22 °C with a 16 h photoperiod for 2 days before analysis. The luciferase activity was measured using the dual-luciferase reporter assay system (Promega, United States) according to the manufacturer's instructions. Relative light units were measured on a GloMax 96 Microplate Luminometer (Promega). The relative luciferase activity was calculated as the ratio between the LUC and the control REN luciferase activity. Four biological repeats were measured for each sample in three independent experiments.
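The LUC/REN normalization described above amounts to a per-sample ratio followed by averaging over repeats. The sketch below illustrates that arithmetic only; the function names and the relative light unit (RLU) readings are hypothetical and are not measurements from this study.

```python
from statistics import mean


def relative_luciferase(luc_rlu, ren_rlu):
    """Firefly LUC signal normalized to the Renilla (REN) control signal."""
    if ren_rlu <= 0:
        raise ValueError("REN reading must be positive to normalize")
    return luc_rlu / ren_rlu


def mean_relative_activity(readings):
    """Average LUC/REN ratio over biological repeats.

    `readings` is an iterable of (luc_rlu, ren_rlu) pairs, one per repeat.
    """
    return mean(relative_luciferase(luc, ren) for luc, ren in readings)


# Invented readings for four repeats of one reporter/effector combination.
with_effector = [(5000, 1000), (6000, 1000), (4000, 1000), (5000, 1000)]
reporter_only = [(1000, 1000), (1200, 1000), (800, 1000), (1000, 1000)]

activation = mean_relative_activity(with_effector) / mean_relative_activity(reporter_only)
print(activation)
```

Normalizing to REN, which is expressed from the same plasmid, corrects for differences in infiltration and transformation efficiency between leaf samples.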

#### Statistical Analyses

Statistical analyses were performed using SPSS version 16.0 for Windows (IBM). Data were analyzed by Student's t-test and one-way ANOVA followed by Tukey's correction for multiple comparisons (P < 0.05). Different letters above the data bars represent significant differences between treatments.

### RESULTS

#### Identification of SlDOF10 Gene and Sequence Analysis

To identify regulatory genes that participate in the process of fruit set in tomato, we looked for genes precociously activated in tomato parthenocarpic plants. We compared the transcriptome of ovaries from the parthenocarpic line PsEND1::barnase (Roque et al., 2007; Medina et al., 2013) and wild-type plants using the TOM2 oligo array (**Figure 1**). We compared the two samples over 5 independent time points that corresponded to 5 floral stages (6, 4, and 2 days before anthesis = dba; at anthesis and 2 days after anthesis = daa). The largest number of changes in gene expression corresponded to stage 1 (6 dba) and stage 5 that corresponded to anthesis (**Figure 1A**). We focused our study on the earliest floral stage that presents significant gene modulation associated with precocious ovary growth in the parthenocarpic plants. Microarray analyses revealed 437 up-regulated and 507 down-regulated genes (**Figure 1B** and **Supplementary Table S2**) at this floral stage using a 2-fold change threshold. Genes were classified according to their annotated function and we selected a set of 89 unigenes that were included in the functional category "Regulation of transcription" (**Figure 1C** and **Supplementary Table S3**).

Among the differentially expressed TFs we identified a unigene that presented a conserved domain characteristic of DOF TFs. Unigene SGN-U584226 corresponded to the SlDOF10 gene (Solyc02g090310) and was up-regulated in PsEND1::barnase ovaries at 6 dba (**Supplementary Figure S1**). The SlDOF10 cDNA sequence was 1059 bp long and contained a 783 bp open reading frame flanked by 5′ untranslated (5′ UTR) and 3′ untranslated (3′ UTR) sequences of 97 and 176 bp, respectively. The SlDOF10 protein contained a conserved N-terminal binding domain of 52 residues spanning a single C2/C2 zinc finger structure (DOF domain) and a bipartite nuclear localization signal (NLS) (**Supplementary Figure S1**), also described in Arabidopsis DOF proteins (Krebs et al., 2010).

Genome-wide analysis of the tomato DOF family (34 members) revealed that gene family expansion originated after several duplication events where SlDOF10 and SlDOF31 are paralogs located on different chromosomes (Cai et al., 2013). However, SlDOF10 and SlDOF31 genes strongly differed in exon/intron structure in terms of intron number and exon length (**Figure 2A**) and presented low protein homology outside the DOF domain (**Supplementary Figure S2**). To analyze the duplication event of these paralogs in the context of time we inferred a phylogeny using a nucleotide dataset containing the two paralogs and several DOF homologs from a variety of species. The topology of the phylogenetic tree showed that SlDOF10 and SlDOF31 proteins placed in different clades, suggesting that the duplication resulting in these paralogs occurred early and prior to the speciation of the Solanaceae species included in the phylogenetic tree (**Figure 2B**). The homology among the proteins included in the SlDOF10 clade is not restricted to the DOF domain but extended to several stretches of amino acids throughout the whole protein (**Figure 2C**). Our data indicate a strong structural divergence of the two paralogs after duplication both at the DNA and protein level that might result in functional diversification.

#### SlDOF10 Protein Is Located in the Nucleus and Shows Transcriptional Activity

SlDOF10 protein contains a highly conserved bipartite NLS characteristic of this family of proteins (Krebs et al., 2010). In order to determine the subcellular location of SlDOF10 we fused YFP to either the C- or the N-terminal end of the protein. YFP-tagged proteins were transiently expressed in N. benthamiana leaves and observed under the confocal microscope. The fusion protein localized exclusively to the nucleus of the epidermal cells (**Figures 3A,B**), consistent with a role for SlDOF10 as a transcription factor.

It has been reported that DOF proteins bind to the (T/A)AAAG core sequence motif found in many plant promoters (Mena et al., 2002; Yanagisawa, 2004). To test the transcriptional activation activity of SlDOF10 we performed transient transactivation assays in N. benthamiana leaves.

The reporter construct contains 2 consensus DOF-binding sequences and a minimal 35S promoter. The effector plasmid expressing the full-length SlDOF10 protein was also constructed (**Figure 3C**). When both plasmids were co-expressed, the expression of the LUC reporter was significantly activated compared to negative controls (**Figure 3C**). These results support the idea that SlDOF10 has the ability to bind specific DNA sequences and activate transcription.

#### SlDOF10 Expression During Plant Development

In order to understand the function of the SlDOF10 gene during plant development we studied the expression patterns of the gene in different tissues from seedlings and adult plants. qRT-PCR analyses showed that the gene is expressed during both vegetative and reproductive development (**Figure 4A**). SlDOF10 was expressed in 2-week-old seedlings in the apical region containing the cotyledons, shoot and leaf primordia and in the basal region (hypocotyl and root), as well as in expanded true leaves and roots from adult plants. In the reproductive organs the expression of SlDOF10 was detected at early stages of flower development (6 dba) and decreased during flower maturation (**Figure 4A**). In dissected floral organs from flowers at anthesis the expression was higher in sepals than in the other floral organs (**Figure 4B**). Among the analyzed tissues the highest level of expression was observed in the roots from 1-month-old adult plants.

The expression of the gene at the tissue level was analyzed by in situ hybridization. SlDOF10 mRNA was visible in apical shoots from 2-week-old seedlings and ovaries from 2 dba flowers, specifically localized in the vascular tissue (**Figure 5A**). In ovaries, the signal was observed in the vascular tissues of the funiculus and the placenta (**Figures 5B,C**). We also generated reporter lines by fusing 2.4 kb of the SlDOF10 promoter to the GUS reporter gene and transforming tomato plants. Consistent with the SlDOF10 pattern detected by in situ hybridization, SlDOF10pro::GUS expression in tomato plants was also observed in vascular tissue from seedlings and adult plants (**Figures 5D–I**). SlDOF10pro::GUS seedlings showed expression of the GUS reporter in the vasculature of cotyledons, hypocotyls, root tips and lateral root primordia (**Figures 5D,F**). Fragments of leaf pedicels and stems were analyzed and showed no expression of the gene in the vascular tissue. However, we detected rapid activation of SlDOF10pro::GUS expression near wound sites, although further analyses are required to establish the wound-induced expression of the gene (**Supplementary Figure S3**). During floral development, expression accumulated in the receptacle, the pedicel and in the vascular tissue of sepals (**Figure 5H**). Transversal sections of SlDOF10pro::GUS flower pedicels showed expression activity in the vascular ring (**Figure 5I**). These results suggest that SlDOF10 could have a transcriptional regulatory role in the formation of vascular tissues during reproductive development in tomato.

FIGURE 2 | SlDOF10 belongs to a clade of DOF proteins conserved in the Solanaceae family. (A) Structure of SlDOF10 and SlDOF31 genes. White boxes indicate exons and gray boxes indicate DOF domains. (B) Phylogenetic tree for SlDOF10 protein and homologous proteins from several plant species. The SlDOF10 clade is highlighted by a gray square. Sl, Solanum lycopersicum; St, Solanum tuberosum; Ca, Capsicum annuum; Nto, Nicotiana tomentosiformis; Nt, Nicotiana tabacum; Pt, Populus trichocarpa; Rc, Ricinus communis; Vv, Vitis vinifera; Md, Malus domestica; Gm, Glycine max; Cc, Citrus clementina; At, Arabidopsis thaliana; Sb, Sorghum bicolor. (C) Protein alignment of DOF proteins from the SlDOF10 clade (marked by a gray box in panel B) showing stretches of conserved amino acids throughout the complete protein. Identical amino acids are marked by stars.

To evaluate whether the 2.4 kb fragment from the SlDOF10 promoter can be used as a tissue-specific marker for vascular tissues, we transformed Arabidopsis thaliana with the SlDOF10pro::GUS construct. In seedlings from the transgenic plants, GUS expression was observed in the vascular tissues of cotyledons, true leaves and primary roots (**Supplementary Figure S4**). During reproductive development GUS staining was visible in the vascular tissues of all floral organs (**Supplementary Figure S4D**). As reported in tomato plants, in the ovary GUS staining was observed in the placenta and the vascular tissue of the funiculus (**Supplementary Figure S4E**). This pattern of expression was maintained in the mature fruit (**Supplementary Figure S4G**). These results indicate that the 2.4 kb fragment from the SlDOF10 promoter used in the construct contains cis-regulatory elements that are conserved between tomato and Arabidopsis. Moreover, the promoter from the SlDOF10 gene could be used as a vascular-tissue-specific promoter for additional studies.

#### Functional Analysis of SlDOF10 Gene

To elucidate the function of SlDOF10 during plant development, transgenic tomato plants with reduced levels of the gene (SlDOF10-RNAi) were generated. Additionally, as a complementary strategy, gain-of-function lines (35S:SlDOF10) were also generated (see Materials and Methods for details of the constructs). The expression of the targeted gene was analyzed in 14 T0 RNAi lines, and 3 of the lines showed a reduction of 80% in the expression level of SlDOF10 (**Supplementary Figure S5A**). Four independent T0 35S:SlDOF10 plants were generated with increased SlDOF10 transcript levels that ranged from two- to fivefold (**Supplementary Figure S5B**). Vegetative growth was not altered in the overexpressing or RNAi transgenic lines, as expected from the absence of expression in the plant stems and leaf pedicels (**Supplementary Figure S3**). These plants were able to produce flowers and fruits. Two RNAi lines (L29 and L31) and the overexpression line with the highest level of expression (L16) were selected for further characterization in the T2 generation.

Mild to severe defects were observed in the flowers of the RNAi lines, consisting of the incomplete fusion of the staminal cone (**Figure 6A**). These defects were shown by 45% of the flowers, with only 16% showing severe defects in anther fusion. The overexpression line showed a greater proportion of affected flowers (56%) and also a higher rate of severe defects (27%). Despite these defects in stamen formation, overexpressing plants did not show alterations in the size of the fruits, the number of seeds or the formation of parthenocarpic fruits (**Figures 6B,C**). In contrast, SlDOF10-RNAi lines showed smaller fruits than the wild type and a high number of seedless fruits (**Figures 6B,C**). In tomato, a relationship between fruit weight/size and seed content within a variety has been reported (Pet and Garretsen, 1983). Accordingly, the fruits from the SlDOF10-RNAi lines contained a reduced number of seeds and the occasional presence of pseudo-embryos (**Figure 6B**). However, histological sections of anthesis flowers showed that ovule development was not affected in the RNAi lines (**Supplementary Figure S6**). On the other hand, results from **Figure 6D** showed that MT plants (the wild-type genotype) have a natural tendency to produce seedless fruits under our growing conditions. This tendency is maintained in the 35S:SlDOF10 line and greatly enhanced in the SlDOF10-RNAi lines. Therefore, facultative parthenocarpy was evaluated in these plants by emasculation of unpollinated flowers. The ovaries from wild-type and overexpressing lines arrested growth whereas all the ovaries from the RNAi lines continued growing in the absence of pollination (**Figure 6C**). In the RNAi lines the weight of the ovaries (measured 18 days after emasculation) ranged from 3 to 33 times higher than the average ovary weight of the emasculated wild-type ovaries. This experiment suggests that silencing of the SlDOF10 gene promotes the autonomous growth of the ovary in the absence of pollination and fertilization.

According to the expression analyses, the SlDOF10 transcript is located in the vascular tissue of the ovary (**Figures 5B,C,I**) that connects the ovules with the placental tissue and the flower pedicel. We then analyzed possible changes in the development of the ovary vascular tissue caused by altered function of the SlDOF10 gene. We prepared histological sections of flower pedicels from the overexpressing and silenced transgenic lines. The pedicel is the nearest tissue connected with the ovary, and the use of histological sections allows morphological studies of the vascular system in a two-dimensional distribution. Cross sections of tomato pedicels showed a vascular ring with 10–12 vascular bundles. The arrangement of the vascular bundles is bicollateral, where xylem is lined with phloem on both its inner and outer faces (**Figure 7A**). Cross sections of flower pedicels showed that the area of the vascular ring was smaller in the SlDOF10-RNAi lines and bigger in the overexpressing lines when compared to the control (**Figure 7B**). However, these differences in the size of the vascular ring were not the result of changes in the number of vascular bundles (**Figure 7C**). Looking at the individual vascular bundles, we observed alterations in their total area and also in the number of cells in the xylem and phloem. In fact, cell quantification showed a higher number of cells in 35S:SlDOF10 and the opposite effect in the SlDOF10-RNAi line (**Figure 7D**). These results suggest a regulatory role for the SlDOF10 protein in the control of cell proliferation during the development of the vascular tissue in the flower.

#### In silico Analysis of Cis-Acting Regulatory Elements in SlDOF10 5′ Regulatory Region

We scanned the 5′ regulatory region (2463 bp) used in the pSlDOF10::GUS construct for the presence of putative cis-acting regulatory elements registered in PlantCARE (Lescot et al., 2002) and PLACE (Higo et al., 1999). Several functionally significant cis-acting regulatory elements associated with different processes in plant development were identified upstream of the SlDOF10 gene. The names of the identified putative cis-acting elements and their predicted functions are tabulated in **Supplementary Table S4**. Among them we identified at least twenty cis-acting regulatory elements involved in light responsiveness (GAG-motif and G-box). The region also contains sequences involved in different stress responses. In addition, we identified cis-acting regulatory elements involved in hormone responsiveness, including cytokinin, salicylic acid, jasmonic acid, gibberellin, and auxin response (**Supplementary Table S4**).
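The principle behind such a scan can be illustrated with a simple pattern search on both DNA strands. The sketch below is not the PlantCARE or PLACE procedure and does not use their motif collections: the two patterns (the (T/A)AAAG DOF core and the canonical CACGTG G-box core), the function names, and the example sequences are assumptions chosen for illustration.

```python
import re

# Illustrative motifs written as regular expressions (assumed, not a database export).
MOTIFS = {
    "DOF core (T/A)AAAG": "[TA]AAAG",
    "G-box core": "CACGTG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def scan_motifs(seq, motifs=MOTIFS):
    """Return {motif name: [(position, strand), ...]}.

    Positions are 0-based and refer to the leftmost base of the hit on the
    forward strand. Palindromic motifs such as CACGTG are reported on both
    strands at the same position.
    """
    seq = seq.upper()
    rc = reverse_complement(seq)
    hits = {}
    for name, pattern in motifs.items():
        # A lookahead is used so that overlapping occurrences are all found.
        regex = re.compile(f"(?=({pattern}))")
        found = [(m.start(), "+") for m in regex.finditer(seq)]
        for m in regex.finditer(rc):
            length = len(m.group(1))
            found.append((len(seq) - m.start() - length, "-"))
        hits[name] = sorted(found)
    return hits


# Short made-up sequences, not part of the SlDOF10 promoter.
print(scan_motifs("GGTAAAGCC"))
print(scan_motifs("CCCTTTACC"))
```

Dedicated databases additionally attach curated annotations (organism, function, supporting literature) to each motif, which a bare pattern search cannot provide, and short motifs such as these are expected to occur frequently by chance.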

We paid special attention to cis-acting auxin regulatory elements because fruit set and development processes are initiated by auxin-induced changes in gene expression and followed by gibberellin (Serrani et al., 2008). Therefore, we subsequently treated tomato ovaries with auxin (IAA) and gibberellins (GA3) and examined the expression levels of SlDOF10 by qRT-PCR. The results showed that auxin treatments rapidly activated SlDOF10 expression after 30 min of treatment, with maximum expression 1 h after auxin application (**Figure 8A**). Gibberellin treatment induced a rapid and strong activation of the gene after 30 min, after which expression declined (**Figure 8A**). In addition, we tested the expression of a reporter gene driven by 2.5 kb of the 5′ promoter region of the SlDOF10 gene. A single exogenous treatment of flowers with auxin was sufficient to induce strong GUS expression in SlDOF10pro::GUS plants (**Figures 8B,C**), especially in the vascular system of sepals and in the stamens. Taken together, these results suggest that the SlDOF10 gene could be regulated by gibberellins and auxins during reproductive development.

### DISCUSSION

We have characterized the SlDOF10 gene, encoding the first tomato DOF transcription factor known to be involved in the regulation of plant development. Our results revealed that SlDOF10 controls the formation of the vascular tissue during reproductive development. Several DOF proteins have been reported to regulate vascular system development in Arabidopsis. Indeed, in Arabidopsis half of the identified DOF TFs have been found to be expressed in the vascular tissues (Le Hir and Bellini, 2013). Most of the DOF genes characterized so far are required during the vegetative growth phase. Both AtDof5.8 and AtDof2.4 promoters become sequentially activated at early but distinct stages of procambium formation in leaf primordia. However, AtDOF5.8 is also activated during the development of flower buds, in developing stamens at the early developmental stage and in carpels at the later developmental stages (Konishi and Yanagisawa, 2007).

Despite the similarities between the pattern of expression of SlDOF10 and AtDOF5.8 genes during flower development, phylogenetic analyses showed that they belong to separate clusters (Cai et al., 2013). Three additional members of the Arabidopsis DOF family (DOF2.1, DOF4.6, and DOF5.3) were also activated at early stages of vascular strand formation in the leaf (Gardiner et al., 2010). Experimental manipulation of leaf vascular patterning correlated with changes in the expression of these genes, suggesting that DOF expression identifies characteristic steps in vein ontogeny (Gardiner et al., 2010). The role of these genes during reproductive development was not investigated. Additional DOF genes have been identified using genetic approaches. Dof5.6/HCA2 (HIGH CAMBIAL ACTIVITY 2) encodes a DOF protein with an EAR-motif associated with transcription repression. The Dof5.6/HCA2 transcript was ubiquitously expressed in all the plant organs and hca2 mutants showed pleiotropic effects on plant morphology. Interestingly, although the flowers of hca2 mutants were normal, the hca2 siliques were shorter and contained fewer seeds per silique (Guo et al., 2009). Similarly, down-regulation of SlDOF10 activity affected fruit set and seed development (**Figure 6**).

Several genes from the DOF family have been identified to be specifically involved in seed development, including DAG1/AtDOF3.7 (Papi et al., 2000; Gualberti et al., 2002), AtDof6 (Rueda-Romero et al., 2012), and DAG2 (Gualberti et al., 2002). In particular homozygous dag1 plants showed twisted siliques with a reduced number of seeds that do not develop dormancy and germinate in the absence of light (Papi et al., 2000). This phenotype correlates with the expression pattern of the DAG1 (AtDOF3.7) gene that was observed in the gynoecium, specifically localized in the vascular tissue and the funiculus that connects the placenta to the ovule (Papi et al., 2000). This expression pattern is similar to the expression of SlDOF10 in tomato ovaries (**Figure 5B**) and the pattern shown in DAG2:GUS lines (Gualberti et al., 2002) suggesting a common function for these genes. However, in the case of tomato, additional experiments are required to evaluate a possible role for SlDOF10 during seed germination and dormancy. Taken together, the results from Arabidopsis and tomato suggest an important and conserved role for this subset of DOF genes during reproductive development and in particular in the formation of flowers and seeds.

Phylogenetic analyses of the DOF family in Arabidopsis and rice revealed four major clusters and nine subfamilies of orthologous genes named A, B1, B2, C1, C2, C3, D1, D2, and D3 (Lijavetzky et al., 2003). The tomato, Arabidopsis and rice DOF families contain a similar number of members (34, 35, and 30, respectively) and similar phylogenetic relationships (Cai et al., 2013), suggesting that they may have evolved conservatively. In the tomato family segmental duplication is predominant for DOF gene evolution, although tandem duplication is also involved, giving rise to ten pairs of paralogous genes (Cai et al., 2013). The SlDOF31 gene was recognized as the putative paralog of SlDOF10 that probably resulted from ancient whole-genome duplication (Song et al., 2012). These two genes are located on two different chromosomes and showed important differences in the structure of the genes and the size and sequence of the protein (**Figure 2** and **Supplementary Figure S2**). This strong structural divergence of the two paralogs and the non-overlapping expression patterns suggest that functional diversification might have occurred after duplication. On the other hand, our results showed that SlDOF10 is expressed in vegetative and reproductive tissues during plant development (**Figures 4**, **5**). However, SlDOF10-RNAi plants did not show obvious defects during vegetative development, possibly due to genetic or functional redundancy. In this regard, previous expression analyses of tomato DOF genes showed that SlDof1, SlDof29, SlDof10, and SlDof32 have similar expression patterns (Cai et al., 2013), implying possible redundant functions. At present, the lack of functional analyses of most of these genes does not permit evaluation of the genetic interactions among them.

The SlDOF10 gene encodes a protein of 260 amino acids with a well conserved DOF domain (**Supplementary Figure S1**). The protein was localized in the nuclei and showed transcriptional activity, supporting its function as a transcription factor. Remarkably, SlDOF10 is the only protein from the tomato family where no additional conserved motifs were identified (Cai et al., 2013). Phylogenetic analyses using homologous proteins from different species showed that SlDOF10 clusters together with members of the Solanaceae family, forming a small protein clade (**Figure 2**), with AtDOF1.1 being the closest Arabidopsis homolog. AtDof1.1 (OBP2) is part of a regulatory network controlling glucosinolate biosynthesis in Arabidopsis (Skirycz et al., 2006). Interestingly, OBP2 expression was observed in the vascular tissue and stimulated by wounding and MeJA. Hormonal regulation of DOF expression was also reported in barley and rice in response to gibberellins (Mena et al., 2002; Washio, 2003) and in tobacco in response to auxin (Baumann et al., 1999). Our data indicate that SlDOF10 expression is transcriptionally regulated by auxins and gibberellins, key regulators of fruit set initiation (Goetz et al., 2006; Serrani et al., 2008), during reproductive development. Moreover, SlDOF10 shows an overlapping expression pattern with the auxin response factors ARF8 and SlARF7 within the ovary (Goetz et al., 2006). A recent study using laser-capture microdissection and high-throughput RNA sequencing reported a comprehensive tissue-specific transcriptomic analysis during early tomato fruit development (Pattison et al., 2015). Interestingly, ten members of the C2C2-DOF family of TFs, including SlDOF10, form a coexpression cluster with several auxin-related genes including the auxin-efflux carriers PIN-FORMED1 (PIN1) and PIN7, two AUX/IAA family genes (IAA13 and IAA17) involved in auxin signal transduction, and a GH3 family gene involved in auxin conjugation (Pattison et al., 2015). A transcriptional association between C2C2-DOF TFs and their potential target genes involved in auxin transport and signaling has been suggested (Pattison et al., 2015). Taken together, the data suggest that the transcriptional regulation of SlDOF10 and its function largely depend on the hormone dynamics during tomato reproductive development.

Polar auxin transport controls multiple developmental processes in plants, including the formation of vascular tissue (Gälweiler et al., 1998). During tomato fruit development, the application of auxin transport inhibitors that block export of auxins from the ovary stimulates parthenocarpic fruit set (Serrani et al., 2010). In addition, the down-regulation of the auxin efflux carrier SlPIN4 leads to parthenocarpic fruit growth (Mounet et al., 2012). SlPIN1 has also been shown to play central roles in leaf initiation and fruit development, promoting the basipetal auxin efflux from the ovary to the flower pedicel (Shi et al., 2017). SlDOF10 down-regulation reduced vascular tissue development in the flower pedicel (**Figure 7**) and induced parthenocarpic fruit growth. Precocious ovary growth could be the consequence of reduced polar auxin transport from the ovary, leading to changes in the local distribution of hormones. Although additional experiments are needed to confirm this hypothesis, the functional analyses of the SlDOF10 gene highlight the importance of the vascular tissue in the process of fruit set.

In tomato, further work is needed to investigate the function of the DOF gene family during plant development. However, the functional characterization of SlDOF10, the first tomato DOF gene involved in vascular tissue formation, provides insight into the role of this family of TFs during reproductive plant development.

### DATA AVAILABILITY

All datasets generated for this study are included in the manuscript and/or the **Supplementary Files**.

### AUTHOR CONTRIBUTIONS

CG-M and PR-G conceived and performed the experiments and analyzed the data. PR-G, ER, MM, and ML-M performed the experiments and analyzed the data. CG-M, JB, and LC wrote the grant that funded this work. CG-M, ER, JB, and LC wrote, reviewed, and edited the manuscript.

### FUNDING

This work was supported by grants from the Spanish Ministerio de Economía y Competitividad (MINECO, BIO2013-40747-R and Intramural 2017401041).

### ACKNOWLEDGMENTS

We thank Maricruz Rochina and Marisol Gascón for technical assistance. We acknowledge support of the publication fee by the CSIC Open Access Publication Support Initiative through its Unit of Information Resources for Research (URICI).

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00216/full#supplementary-material

FIGURE S1 | Gene sequence and conserved protein domains of the SlDOF10 gene. (A) Coding sequence and predicted protein sequence. Start and Stop codons are boxed. (B) Schematic representation of the SlDOF10 protein. The conserved amino acid of the DOF domain and the bipartite nuclear localization signal (NLS) are highlighted in red and blue, respectively.

FIGURE S2 | Protein alignment of the putative paralogs SlDOF10 and SlDOF31. The position of the DOF domain and NLS is underlined.

FIGURE S3 | Histochemical GUS staining of stems and leaf pedicels of SlDOF10pro::GUS transgenic tomato plants. (A) Leaf pedicel and (B) stem. Scale bars: 2 mm.

FIGURE S4 | Histochemical GUS staining of transgenic Arabidopsis carrying the GUS-coding region fused to SlDOF10 promoter. SlDOF10pro::GUS expression in the vascular tissue of seedlings (A), roots (B), secondary roots (C), and flowers (D). Dissected flower showing GUS staining at the vascular tissue from the funiculus (E). Mature fruit showing GUS staining in the apical and basal region and the margin of the valve (F). Detail from mature fruit showing blue staining in the funiculus (Fu) and vascular tissue from the fruit (G). Scale bars are: 1 mm.

FIGURE S5 | Expression level of the SlDOF10 gene in the RNAi and overexpressing lines. (A) Relative expression of SlDOF10 in the SlDOF10-RNAi lines measured by qRT-PCR. (B) Relative expression of SlDOF10 in the 35S:SlDOF10 lines measured by qRT-PCR. The asterisks denote a significant difference between the transgenic lines and the wild type at p < 0.05.

FIGURE S6 | Histological sections of ovules from flowers in anthesis. (A) Wild type ovules (Micro-Tom cv.). (B) SlDOF10-RNAi plants. (C) 35S: SlDOF10 plants.

TABLE S1 | Primers used in this work.

TABLE S2 | Genes differentially expressed in the ovaries of the parthenocarpic line PsEND1::barnase at 6 days before anthesis (dba).

TABLE S3 | Differentially regulated genes from the functional category "regulation of transcription."

TABLE S4 | Predicted cis-acting elements in SlDOF10 promoter region identified using PlantCARE and PLACE.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Rojas-Gracia, Roque, Medina, López-Martín, Cañas, Beltrán and Gómez-Mena. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Partitioning Apomixis Components to Understand and Utilize Gametophytic Apomixis

Pankaj Kaushal<sup>1</sup> \*, Krishna K. Dwivedi<sup>2</sup> , Auji Radhakrishna<sup>2</sup> , Manoj K. Srivastava<sup>2</sup> , Vinay Kumar<sup>1</sup> , Ajoy Kumar Roy<sup>2</sup> and Devendra R. Malaviya<sup>3</sup>

<sup>1</sup> ICAR-National Institute of Biotic Stress Management, Raipur, India, <sup>2</sup> ICAR-Indian Grassland and Fodder Research Institute, Jhansi, India, <sup>3</sup> ICAR-Indian Institute of Sugarcane Research, Lucknow, India

Apomixis is a method of reproduction that generates clonal seeds and offers tremendous potential to fix heterozygosity and hybrid vigor. The process of apomictic seed development is complex and comprises three distinct components, viz., apomeiosis (leading to formation of an unreduced egg cell), parthenogenesis (development of the embryo without fertilization) and functional endosperm development. Recently, these three components have been reported to be uncoupled in many crops, leading to their partitioning. This review provides insight into the current status of our understanding of partitioned apomixis components in gametophytic apomictic plants and the research avenues that partitioning offers to help understand the biology of apomixis. Possible consequences leading to diversity in seed developmental pathways, resources to understand apomixis, inheritance and identification of candidate gene(s) for partitioned components, as well as contribution towards creation of variability are all discussed. The potential of Panicum maximum, an aposporous crop, is also discussed as a model crop to study the partitioning principle and its effects. Modifications in cytogenetic status, as well as endosperm imprinting effects arising due to partitioning, open up new opportunities to understand and utilize apomixis components, especially towards synthesizing apomixis in crops.

#### Edited by:

Emidio Albertini, University of Perugia, Italy

#### Reviewed by:

Joann Acciai Conner, University of Georgia, United States; Ross Bicknell, The New Zealand Institute for Plant & Food Research Ltd., New Zealand; Diego Carlos Zappacosta, Universidad Nacional del Sur, Argentina

\*Correspondence: Pankaj Kaushal, pkaushal70@gmail.com

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 15 November 2018 Accepted: 18 February 2019 Published: 08 March 2019

#### Citation:

Kaushal P, Dwivedi KK, Radhakrishna A, Srivastava MK, Kumar V, Roy AK and Malaviya DR (2019) Partitioning Apomixis Components to Understand and Utilize Gametophytic Apomixis. Front. Plant Sci. 10:256. doi: 10.3389/fpls.2019.00256

Keywords: apomixis, apomeiosis, endosperm, Panicum maximum, parthenogenesis, partitioning

### INTRODUCTION

#### Overview of Apomixis Phenomenon: Genetics and Regulation

Apomixis is a natural method of clonal reproduction through seeds, whereby the progeny is represented exactly by the maternal genotype (Asker and Jerling, 1992). This phenomenon has tremendous potential in agriculture by virtue of its capacity to fix heterozygosity and hybrid vigor (Sailer et al., 2016). It has been equated with an "Asexual revolution" (Calzada et al., 1996) and is proposed as a "next generation breeding technology" (Hand and Koltunow, 2014).

Apomixis is widespread in the plant kingdom and occurs naturally in 326 genera representing 78 families of Angiosperms, the majority belonging to Poaceae, Asteraceae and Rosaceae (Hojsgaard et al., 2014b). Apomixis may be of gametophytic or sporophytic origin based on the tissue involved in the formation of the female gametophyte. Gametophytic apomixis is represented by either diplospory or apospory, based on the origin of embryo-sacs (ES) in the ovule from megagametophyte or a nucellar cell, respectively (Nogler, 1984a; Asker and Jerling, 1992). When the mode of seed formation is exclusively through apomixis or sexual pathways, the plant is designated as obligate (apomictic or sexual, respectively). However, when both modes are represented in the same plant (co-existing in the same ovule, or in different ovules of the same plant), it is regarded as facultative in reproduction.

The basic mechanism leading to the seed clonality relies on bypassing the two phases of variability generation, meiotic recombination and fertilization, during the seed formation, eventually resulting in seeds with copied maternal genotype. Meiosis is avoided (or eliminated/modified) via formation of apomeiotic embryo sacs (ES) while fertilization is bypassed via parthenogenetic development of the egg cell (Nogler, 1984a). Different modes of formation of apomeiotic ES to generate an unreduced egg cell have been widely discussed (Crane, 2001; Koltunow and Grossniklaus, 2003). Cyto-embryological and molecular processes were studied in model aposporous (e.g., Brachiaria, Pennisetum, Cenchrus, Hieracium, Paspalum, Poa, etc.) and diplosporous taxa (e.g., Erigeron, Taraxacum, Tripsacum, etc.), and important insights into the phenomenon have been presented (Grimanelli et al., 2001; Ozias-Akins and van Dijk, 2007; Pupilli and Barcaccia, 2012; Schmid et al., 2015; Schmidt et al., 2015; Hojsgaard, 2018).

The genetics of apomixis, as investigated in these model species, appeared broadly to be under the control of one or a few major dominant genes demonstrating Mendelian segregation (reviewed in Savidan, 2000; Ozias-Akins and van Dijk, 2007; Barcaccia and Albertini, 2013). In parallel, a Hybridization-derived Floral Asynchrony (HFA) hypothesis was also proposed, advocating the apomictic phenotype as a result of asynchronous expression of duplicate genes controlling female gametophyte development (Carman, 1997).

Information on inheritance models, genetic recombination potentials, molecular markers and molecular mapping studies in gametophytic apomicts has been compiled in recent reviews (Ozias-Akins and van Dijk, 2007; Pupilli and Barcaccia, 2012; Barcaccia and Albertini, 2013; Hand and Koltunow, 2014; Brukhin, 2017; León-Martínez and Vielle-Calzada, 2019). In general, dominance, polyploidy, hybrid origin and suppressed recombination are common features related to apomixis in the majority of the apomictic species (Nogler, 1984a), e.g., Apospory Specific Genomic Region (ASGR) in Pennisetum spp. and Cenchrus ciliaris (Akiyama et al., 2005; Conner et al., 2008), Apomixis Controlling Locus (ACL) in Paspalum simplex (Calderini et al., 2006), Loss of Apomeiosis (LOA) in Hieracium subgenus Pilosella (Okada et al., 2011) and Apospory (Apo) locus in Panicum maximum (Ebina et al., 2005; Takahara et al., 2014).

Comparative gene expression studies including transcriptome analysis were conducted in many of these species and differentially expressed genes during different stages of apomictic and sexual seed formation were identified (reviewed in Brukhin, 2017; Conner and Ozias-Akins, 2017). Some candidate genes have been shortlisted as potential key genetic factors (Bicknell and Koltunow, 2004; Albertini et al., 2005; Laspina et al., 2008; Sharbel et al., 2010; Silveira et al., 2012; Okada et al., 2013).

Recent studies of gene expression networks support the view that sexual and apomictic reproduction are closely related developmental pathways. Apomixis is suggested to be a heterochronic phenotype which relies on deregulation of the timing of reproductive events (especially entry into apomeiosis/meiosis during ES development and parthenogenetic/zygotic embryogenesis), rather than on the alteration of a specific component of the reproductive pathway (Grimanelli et al., 2003; Tucker et al., 2003; Sharbel et al., 2010; Koltunow et al., 2011; Hojsgaard et al., 2012; Hojsgaard, 2018). The eventual expression of the mode of reproduction is also believed to be regulated by modifiers, supernumerary chromatin and epigenetic modifications that may operate on account of hybridity and/or polyploidization (reviewed in Roche et al., 2001; Hand and Koltunow, 2014; Bocchini et al., 2018).

#### Apomixis and Polyploidy

One of the key features of apomixis expression is its close relationship with polyploidy. In general, most of the naturally occurring apomictic species are polyploids, whereby lower forms (generally diploids) are sexually reproducing (Nogler, 1984a; Carman, 1997). However, the recent recovery of natural and experimental diploids expressing apomixis indicates that, though affected by the change in ploidy, apomixis expression is not restricted to polyploids (Visser and Spies, 1994; Siena et al., 2008; Lovell et al., 2013; Noyes and Wagner, 2014; Hojsgaard et al., 2014a; Schinkel et al., 2016, 2017; Klatt et al., 2018a). The effect of ploidy on apomixis expression has been studied in different apomictic systems using ploidy level variations at inter- or intraspecific levels (Savidan, 2000). Artificial polyploidization was observed to enhance (Quarin and Hanna, 1980; Quarin et al., 2001; Nassar, 2006) or reduce (Asker, 1967, 1980) its expression. Within a ploidy level, genotypic effects were more profound than ploidy effects in expressing the mode of reproduction in many species such as Poa pratensis, Boechera spp., Ranunculus kuepferi and Panicum maximum (Matzk et al., 2005; Voigt-Zielinski et al., 2012; Schinkel et al., 2016; Kaushal et al., 2018). Such reports have challenged the belief that a rise in ploidy is the most important driver of apomixis evolution (Carman, 1997). In fact, the importance of hybridity over polyploidy in governing apomixis has been recently demonstrated (Delgado et al., 2016; Barke et al., 2018).

### THE APOMIXIS COMPONENTS AND THEIR UNCOUPLING/PARTITIONING

#### Components of Apomixis

Seeds of sexual origin generate from a meiotically derived ES, generally Polygonum-type (8-nucleated), containing a reduced egg cell (1n), which develops into an embryo (2n) after fertilization with a reduced male gamete (1n). Endosperm in such seeds is a triploid (3n) tissue which develops from fertilization of a male gamete (1n) with two fused polar nuclei (1n + 1n = 2n). This pathway of seed formation may be represented as meiotic-ES:zygotic-embryogenesis:pseudogamous-endosperm. In contrast, a generalized model of seed development through gametophytic apomixis (apospory or diplospory) essentially contains three components: apomeiosis (leading to formation of an unreduced embryo sac); parthenogenesis (development of the embryo without fertilization); and functional endosperm development (autonomous or pseudogamous) (Nogler, 1984a). These three components are linked functionally to generate apomictic seeds. Apomeiosis leads to the formation of meiotically unreduced embryo-sacs that contain an egg cell and polar nucleus/nuclei with the sporophytic chromosome number (2n). The 2n egg cell then develops parthenogenetically (2n + 0 = 2n) to generate a 2n embryo. Embryogenesis is followed by development of endosperm either through fertilization of unreduced polar nucleus/nuclei (pseudogamy) or without fertilization (autonomous). This pathway of apomictic seed formation is represented as apomeiosis:parthenogenesis:functional-endosperm development (Asker and Jerling, 1992). The biological functions of the individual components and the progression from one stage to the next are presently under intense investigation (Grimanelli, 2012; Schmidt et al., 2015; Mirzagadheri and Horandl, 2016; Bocchini et al., 2018; Juranic et al., 2018).

#### Partitioning Apomixis Components: Principle and Consequences

The apomixis components were believed to be strictly under control of "one major locus," eventually generating an apomictic phenotype, in the majority of the agamic species (Savidan, 2000; Ozias-Akins and van Dijk, 2007). Accordingly, breeding strategies and molecular studies were designed for cultivar development, understanding the mechanism, mutagenesis and traits-introgression from related wild species. Occasional deviations from expected phenotypic frequencies and ploidies were considered as spontaneous off-types (Muntzing and Muntzing, 1943; Asker, 1980; Nakagawa and Hanna, 1990; Berthaud, 2001; Caceres et al., 2001). However, recent studies support the view that, at least in some of the apomictic taxa (see subsequent sections), the three apomixis components, viz. apomeiosis, parthenogenesis and functional endosperm development, may be uncoupled (Barcaccia and Albertini, 2013). Contrary to the "one locus" theory, the partitioning principle suggests that apomixis is under the control of three distinct genetic determinants, each controlling an individual component, with recombination between them possible. These recombinants have been isolated phenotypically in many apomictic species and the molecular principles (molecular markers, structural and functional genomics) underlying the mechanism are being investigated (Noyes, 2006; Zavesky et al., 2007; Kaushal et al., 2008; Koltunow et al., 2011; Conner et al., 2013).

The uncoupling may lead to newer combinations of partitioned apomixis components during the seed development process. As stated earlier, generation of clonal embryos relies on operating the apomeiosis:parthenogenesis pathway (2n + 0) and this functional linkage is necessary to maintain ploidy and clonality. However, as a consequence of uncoupling of apomixis components, the functional linkage between apomeiosis and parthenogenesis is lost, and thus the apomeiotically derived unreduced egg cell (2n) loses the capacity for parthenogenesis and requires fertilization with a male gamete (1n) for embryo development, eventually leading to the formation of a triploid embryo (2n + 1n = 3n). This recombined pathway may be represented as apomeiosis:zygotic-embryogenesis. Alternatively, in a typical sexually derived egg cell (1n), the requirement of fertilization for embryo development is lost/replaced by parthenogenesis and the resultant embryo develops without fertilization, yielding a haploid embryo (1n + 0 = 1n), following a meiosis:parthenogenesis pathway (**Figure 1**). Triploids derived through 2n + n hybridization are termed BIII hybrids and the haploids (1n + 0) M1 plants (Rutishauser, 1948). Similarly, sexually derived diploids and obligate apomicts are designated as MII and BII, respectively (Rutishauser, 1948; Aliyu et al., 2010). Broad categories of seed formation arising through partitioning events, in Polygonum-, Hieracium- and Panicum-type ES, are summarized in **Table 1**.

FIGURE 1 | [...] require fertilization for embryo development leading to triploid (2n + n = 3n; BIII) progeny. Alternatively, a meiotically derived haploid egg cell acquires the parthenogenetic capacity and develops without fertilization, leading to formation of haploid (1n + 0 = 1n; M1) progeny.
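The ploidy arithmetic behind these pathways can be tabulated in a few lines. The following Python sketch is illustrative only: the names `EGG_PLOIDY`, `SPERM_CONTRIBUTION`, `LABELS` and `embryo_ploidy` are hypothetical, ploidy is counted in units of n, and the male gamete is assumed to be reduced (1n) whenever fertilization occurs.

```python
# Illustrative sketch: embryo ploidy (in units of n) for each combination of
# embryo-sac origin and mode of embryo development described in the text.
# Assumption: the male gamete, when involved, is reduced (1n).

EGG_PLOIDY = {"meiotic": 1, "apomeiotic": 2}
SPERM_CONTRIBUTION = {"zygotic": 1, "parthenogenetic": 0}

# Descriptive labels; BIII and M1 are the classes arising from partitioning.
LABELS = {
    ("meiotic", "zygotic"): "sexual diploid",
    ("apomeiotic", "parthenogenetic"): "clonal apomictic progeny",
    ("apomeiotic", "zygotic"): "BIII hybrid",
    ("meiotic", "parthenogenetic"): "M1 haploid",
}


def embryo_ploidy(es_origin: str, embryo_mode: str) -> int:
    """Return the embryo ploidy as egg contribution plus sperm contribution."""
    return EGG_PLOIDY[es_origin] + SPERM_CONTRIBUTION[embryo_mode]


if __name__ == "__main__":
    for (es_origin, embryo_mode), label in LABELS.items():
        ploidy = embryo_ploidy(es_origin, embryo_mode)
        print(f"{es_origin}:{embryo_mode} -> {ploidy}n ({label})")
```

Only the two off-diagonal combinations change the ploidy of the progeny, which is why BIII and M1 plants serve as phenotypic indicators of uncoupling.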

Such partitioning events are largely believed to be a consequence of recombination between apomixis components; however, they are also influenced by modifiers and epialleles, and may show varied expressivity and penetrance (Noyes, 2006; Zavesky et al., 2007; Kaushal et al., 2008, 2018; Conner et al., 2013). This is expected in view of the suggested origin of apomixis via hybridization (maintaining a state of heterozygosity) (Nogler, 1984a; Carman, 1997; Talent, 2009). These "heterozygotes" may harbor genetic determinants for both apomixis and sexual reproduction and become amenable to uncoupling, owing to the above-mentioned factors. A similar heterozygous situation also prevails in progeny derived from experimental crosses between sexual × apomictic parents. As an illustration, haploid (M1) progeny between sexual and apomictic parents in Potentilla collina was recovered through "parthenogenetic development of reduced ovules," whereby the tendencies for formation of reduced gametes and for development of an egg cell without fertilization were derived from the sexual and apomictic parent, respectively (Muntzing and Muntzing, 1943). Partitioning is also reported in experimental crosses between sexual and apomictic forms (intra- as well as inter-specific hybridization) in Ranunculus, Panicum, Pennisetum, Cenchrus, and Poa (Savidan, 2000; Matzk et al., 2001, 2005; Ozias-Akins and van Dijk, 2007; Kaushal et al., 2008; Barcaccia and Albertini, 2013).

Similarly, for the third component of apomixis, viz., functional endosperm development, the mode of formation (pseudogamous or autonomous) may also get modified as a consequence of partitioning. For example, an autonomous mode of endosperm development may be induced in otherwise pseudogamous species by acquiring additional genetic determinants or by removal of suppressors that restrict the proliferation of polar nuclei in the absence of fertilization, or vice versa. Such modifications are reported in several apomictic species (e.g., Taraxacum officinale, Panicum maximum, Hieracium spp., etc.) and in mutants mimicking apomixis components in otherwise sexual crops (van Dijk et al., 2003; Bicknell and Koltunow, 2004; Kaushal et al., 2008; Schmidt et al., 2015; Brukhin, 2017). During apomixis, although clonal embryos are generated from all categories of apomeiotic ES, the genomic ratios in endosperm largely depend on the developmental category of ES. For example, genome ratios (embryo:endosperm) in seeds derived from Polygonum- and Panicum-type ES are 2Em:3En, while it is 2Em:4En or 2Em:5En in Hieracium-type ES derived through autonomous or pseudogamous development, respectively (**Table 1**). Relative genome contribution of maternal and paternal genomes in embryo and endosperm constitution and the imprinting effects are thereby being studied to gain an insight into key genetic and epigenetic factors for successful endosperm development. Diversity in endosperm development, molecular mechanisms and the constraints of endosperm imprinting effects are some of the issues of recent investigations (Pupilli and Barcaccia, 2012; Gehring and Satyaki, 2017; Henderson et al., 2017; Brauning et al., 2018; Depetris et al., 2018).

TABLE 1 | Genome constitution in embryo and endosperm are depicted from seeds originated from three types of ES (Polygonum-, Hieracium- and Panicum-type). Embryo development may be zygotic or parthenogenetic, while endosperm development may be pseudogamous or autonomous. <sup>a</sup>Ratio of genomes in embryo:endosperm; <sup>b</sup>Contribution of maternal (m) and paternal genomes (p) in embryo (Em) and endosperm (En).
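The embryo:endosperm genome ratios quoted in the text follow from the number and ploidy of the polar nuclei in each ES type. The Python sketch below reproduces them under stated assumptions that are not spelled out in the text: the Polygonum-type ES is taken as the meiotic (sexual) case with two reduced polar nuclei, the Panicum-type ES is taken to carry one unreduced polar nucleus, the Hieracium-type ES two unreduced polar nuclei, and any fertilizing sperm is reduced (1n). All names are hypothetical.

```python
# Illustrative sketch of embryo:endosperm genome ratios (in units of n).
# Assumed polar-nucleus configurations per embryo-sac (ES) type:
#   (number of polar nuclei, ploidy of each polar nucleus)
POLAR_NUCLEI = {
    "Polygonum": (2, 1),  # meiotic ES: two reduced polar nuclei (assumed)
    "Panicum": (1, 2),    # aposporous ES: one unreduced polar nucleus (assumed)
    "Hieracium": (2, 2),  # aposporous ES: two unreduced polar nuclei (assumed)
}

# The embryo is 2n in every case considered here:
# 1n egg + 1n sperm (sexual) or 2n egg + 0 (parthenogenetic).
EMBRYO_GENOMES = 2


def genome_ratio(es_type: str, endosperm_mode: str) -> tuple:
    """Return (embryo genomes, endosperm genomes) for one seed type."""
    count, ploidy = POLAR_NUCLEI[es_type]
    # A pseudogamous endosperm receives one reduced sperm; an autonomous one none.
    sperm = {"pseudogamous": 1, "autonomous": 0}[endosperm_mode]
    return EMBRYO_GENOMES, count * ploidy + sperm


if __name__ == "__main__":
    for es_type, mode in [
        ("Polygonum", "pseudogamous"),
        ("Panicum", "pseudogamous"),
        ("Hieracium", "autonomous"),
        ("Hieracium", "pseudogamous"),
    ]:
        em, en = genome_ratio(es_type, mode)
        print(f"{es_type}-type, {mode}: {em}Em:{en}En")
```

Under these assumptions the Panicum-type ES restores the 2:3 ratio of sexual seeds, whereas the Hieracium-type ES departs from it in both endosperm modes.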

It would be interesting to debate whether apomeiosis and parthenogenesis have independent origins (Horandl, 2009; Talent, 2009; Tilquin and Kokko, 2016). In many taxa, the independent occurrence of the capacity to generate unreduced gametes or haploid parthenogenesis suggests their independent origin during evolutionary lineage. However, their recurrent occurrence over generations will either polyploidize them out of existence or lead to haploid sterility. Interestingly, a combination of these two components stabilizes the system by maintaining the ploidy state, in spite of their individual capacities to modify it. From an evolutionary perspective, it seems logical that linkage between the apomixis components is essential for survival and perpetuation of the species to maintain the hybridity and ploidy, overcoming the constraints of genomic imbalances and ploidy levels of parental species. Species demonstrating partitioned apomixis components are regarded as evolutionarily young apomicts, as compared to the species where recombination is suppressed (Pupilli and Barcaccia, 2012; Hand and Koltunow, 2014).

FIGURE 2 | Representative ES of guinea grass (cleared ovules). (A) Aposporous ES, (B) Sexual (or meiotic) ES, (C) Multiple ES (three ES seen), (D) Ovule showing proliferating polar nuclei in absence of pollination, as an indicator of AED; a cluster of four nuclei is visible in one plane, (E) Ovule showing an aborted ES. e, egg cell; p, polar nucleus; a, antipodals; ES, embryo sac. Reprinted by permission from Springer Nature: Kaushal et al. (2018).

In a strict sense, the manifestation of sexuality might occur at two stages during seed development: the formation of meiotic (or sexual ES) which allows meiotic progression to generate variability in the gametes (in obligate sexual or facultative individuals), as well as during fertilization between male and female gametes (syngamy), irrespective of the meiotic or apomeiotic origin of the ES (yielding BII/MII or BIII hybrids, respectively). Interestingly, isolated apomixis components generate variability, and act as a driving force in evolving agamic species (Berthaud, 2001). The situation may be more complex in facultative individuals, as both sexual and apomeiotic factors are present in the same genotype, though with different extensions (Delgado et al., 2016; Kaushal et al., 2018).

#### Detection Methods to Identify Partitioned Apomixis Components

Modifications arising due to partitioning of apomixis components (ploidy of the progeny and mode of endosperm development) can be identified by characterizing the embryo-sacs and estimating the ploidy of the embryo of the progeny (Crane, 2001).

Histological differences between reduced (e.g., Polygonum-type: 8-nucleated sexual types) and unreduced ES (e.g., 4-nucleated Panicum-type) have been utilized to characterize the presence of apomeiosis/meiosis. Methods to analyse ES structure development within the ovule have been modified from classical sectioning procedures to more rapid callose deposition tests (Peel et al., 1997; Tucker et al., 2001) and ovule clearing techniques (Young et al., 1979; Herr, 2000; Crane, 2001). Cleared ovules are now extensively utilized to characterize the mode of ES formation, to quantify apospory and abortive ES, and to observe autonomous endosperm development (AED) (**Figure 2**).

The potential for parthenogenesis can be tested using the "auxin test" (or auxin-induced parthenocarpy) (Matzk, 1991). Auxins replace the endosperm effect, thereby allowing initial development of the embryo in the absence of endosperm, provided parthenogenesis genes are available. The auxin test has been successfully utilized to identify parthenogenesis potential in Poa spp., Hypericum spp., the wheat-salmon system (Matzk et al., 2007) and Dichanthium annulatum (Gupta et al., 1999).

Triploids (BIII) and haploids (M1) were identified through classical chromosome-counting methods (Asker, 1980), and more recently using flow cytometric measurements of sporophytic DNA (Aliyu et al., 2010; Conner et al., 2013; Kaushal et al., 2018). The principle of flow cytometry was also utilized to develop a highly efficient and rapid screen, described as the Flow Cytometric Seed Screen (FCSS) (Matzk et al., 2000), which analyzes relative DNA contents of embryo and endosperm cells (from single/bulked matured seeds) to estimate their ploidies (**Figure 3**). When appropriately supplemented with information on the mode of ES development, the ploidy of contributing male gametes (reduced/unreduced) and the mode of endosperm development (autonomous/pseudogamous) can also be estimated, thereby enabling reconstruction of the possible reproductive pathways of seed formation. FCSS has been successfully implemented in confirming partitioning effects in diverse aposporous and diplosporous apomicts, such as Brachiaria spp., C. ciliaris, Panicum maximum, Boechera spp., Hypericum spp., Poa spp., Tripsacum dactyloides, Hieracium spp., Paspalum simplex, Onosma spp., Rosa canina, Capsella bursa-pastoris, Crataegus spp., Ranunculus auricomus complex, etc. (reviewed in Krahulcova and Rotreklova, 2010; Kolarcik et al., 2018). FCSS also provides an opportunity (over progeny analysis) to analyze those proportions of seeds that might fail to germinate, owing to disturbed embryo:endosperm ratios, hence providing a better estimate of the partitioning events (Kaushal et al., 2008; Conner et al., 2013).
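The reconstruction step can be pictured as a small enumeration: given the embryo and endosperm ploidies estimated from a seed and the ES type known from ovule clearing, list every pathway that yields those values. The Python sketch below is a simplified illustration and not the published FCSS protocol. It assumes a reduced (1n) sperm, at most one sperm contributing to the endosperm, and the hypothetical polar-nucleus counts given in `POLAR_NUCLEI_COUNT`; real data may need unreduced pollen or other configurations to be added.

```python
from itertools import product

# Assumed number of polar nuclei per embryo-sac (ES) type (hypothetical table).
POLAR_NUCLEI_COUNT = {"Polygonum": 2, "Hieracium": 2, "Panicum": 1}


def candidate_pathways(embryo_ploidy: int, endosperm_ploidy: int, es_type: str) -> list:
    """List (ES origin, embryo mode, endosperm mode) triples that reproduce
    the observed embryo and endosperm ploidies, both in units of n."""
    n_polar = POLAR_NUCLEI_COUNT[es_type]
    hits = []
    for reduced, egg_fertilized, polar_fertilized in product((True, False), repeat=3):
        gamete = 1 if reduced else 2  # ploidy of the egg and of each polar nucleus
        embryo = gamete + (1 if egg_fertilized else 0)
        endosperm = n_polar * gamete + (1 if polar_fertilized else 0)
        if (embryo, endosperm) == (embryo_ploidy, endosperm_ploidy):
            hits.append((
                "meiosis" if reduced else "apomeiosis",
                "zygotic" if egg_fertilized else "parthenogenetic",
                "pseudogamous" if polar_fertilized else "autonomous",
            ))
    return hits


if __name__ == "__main__":
    # A seed with a 3n embryo and 3n endosperm from a Panicum-type ES.
    print(candidate_pathways(3, 3, "Panicum"))
```

Supplying the ES type matters because the same embryo and endosperm ploidies can arise from different polar-nucleus configurations, which is the reason the text pairs FCSS with information on the mode of ES development.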

In addition to above analytical techniques, molecular markers tightly linked to individual components are also being utilized to identify partitioning events (see later sections for details on molecular markers) (Pupilli and Barcaccia, 2012; Barcaccia and Albertini, 2013; Conner et al., 2013; Hand and Koltunow, 2014; Brukhin, 2017).

#### Partitioning Apomixis Components in Natural Apomictic Systems

Genetic analysis and utilization of efficient screening techniques led to identification of apomictic species with possible recombination between apomixis components, both in aposporous apomicts such as R. auricomus (Nogler, 1984b), Poa pratensis (Albertini et al., 2001; Matzk et al., 2005), Hieracium spp. (Catanach et al., 2006), Panicum maximum (Kaushal et al., 2008) and Hypericum perforatum (Matzk et al., 2001; Schallau et al., 2010), as well as diplosporous apomicts, such as Erigeron annuus (Noyes and Rieseberg, 2000) and T. officinale (van Dijk and Bakx-Schotman, 2004) (reviewed in Krahulcova and Rotreklova, 2010; Pupilli and Barcaccia, 2012; Barcaccia and Albertini, 2013).

Cytogenetical and genetic mapping studies demonstrated the possibility and consequences of recombination between the apomixis components. It has been suggested that the recombination between apomeiosis and parthenogenesis (and/or functional endosperm development) may not be mutually exclusive, along with involvement of minor loci or modifiers in governing the phenotype (Barcaccia et al., 2006, 2007). As an illustration, in H. perforatum, most parthenogenetic plants were aposporic; however, several aposporic plants were non-parthenogenetic, and recombinants for parthenogenesis were 10-fold higher than recombinants for apospory (Schallau et al., 2010). Similarly, in apomictic P. maximum germplasm accessions, parthenogenesis was uncoupled from apospory in about 26% of cases (Kaushal et al., 2008). Similar results were also reported in P. pratensis and H. perforatum (Matzk et al., 2001, 2005). However, there are a couple of reports on complete and independent expression of apomeiosis, including an apomeiotic non-parthenogenetic inter-specific hybrid between two sexual diploid species, viz., Pennisetum glaucum and P. orientale (Kaushal et al., 2010), LOA and LOP mutants in Hieracium (Koltunow et al., 2011) and an ASGR recombinant in C. ciliaris (Conner et al., 2013).

As the expression of apomixis and its components is largely affected by genotype and hybridity, identification of partitioning events relies on exploring sufficiently large and diverse germplasm collections, including (experimental) hybrids between sexual and apomictic parents. A survey of a sufficiently large germplasm base identified the occurrence of partitioning in P. pratensis (Matzk et al., 2005), H. perforatum (Matzk et al., 2001) and Panicum maximum (Kaushal et al., 2008), as also in experimental hybrids, e.g., in R. auricomus (Nogler, 1984b), P. collina (Muntzing and Muntzing, 1943), P. maximum (Kaushal et al., 2008) and C. ciliaris (Conner et al., 2013). Although in Paspalum, parthenogenesis and apospory were reported to be inherited together (Pupilli and Barcaccia, 2012), in intervarietal crosses between sexual and apomictic parents, BIII hybrids were reported to occur (though in low frequency) in most of the apomictic progenies, and the uncoupling between apospory and parthenogenesis occurred in up to 50% of cases (Caceres et al., 2001). Similarly, in P. maximum, wherein apomixis was believed to be monogenic (Savidan, 2000), uncoupling events were demonstrated in a wide-scale screening of 669 genotypes (including a global germplasm collection), as well as in experimental hybrids (Kaushal et al., 2008, 2009). Recently, uncoupling of apomixis components has also been reported in C. ciliaris progenies obtained from sexual × apomictic lines utilizing FCSS and molecular marker analysis (Conner et al., 2013; Indian Grassland and Fodder Research Institute [IGFRI], 2013). These reports suggest that, among the many crops in which partitioned apomixis components have not yet been reported, a greater diversity in reproductive development regarding uncoupling of apomixis components is expected to be discovered by screening a larger and more diverse germplasm base, including crosses between parents with contrasting reproductive capacities, and by utilizing more efficient screening techniques.

### GENETIC REGULATION OF PARTITIONED APOMIXIS COMPONENTS

#### Induction and Inheritance of Apomixis Components

Hybridization and polyploidy have been tested experimentally for their potential to induce individual apomixis components, since these two are major contributory forces in the origin of apomixis. Reports on de novo appearance of the apomeiosis component through hybridization and/or polyploidization are more frequent than those on parthenogenesis and modification in endosperm development (reviewed in Mason and Pires, 2015; Kreiner et al., 2017).

From an apomixis perspective, induction of apomeiosis (apospory) was reported as early as 1967 in certain hybrids of Sanguisorba (Nordborg, 1967) and in the intergeneric hybrid Raphanobrassica (Ellerstrom and Zagorcheva, 1977; Asker, 1980). However, empirical results on the appearance of spontaneous apospory by inter-varietal or inter-specific hybridization between two sexually reproducing species, which eventually modified the mode of embryo-sac formation, have been recently reported in Pennisetum (Kaushal et al., 2010) and R. auricomus (Hojsgaard et al., 2014a). An interspecific hybrid (2n = 16, genome GO) between two diploid and sexually reproducing species (Polygonum-type ES), viz. P. glaucum (2n = 14; GG) and P. orientale (2n = 18; OO), showed a transition from obligate sexuality (Polygonum-type ES) to apospory (>83% Panicum-type aposporous ES). Parthenogenesis was completely absent in this plant, and it produced only BIII hybrids (2n = 23; GGO) when backcrossed with P. glaucum. The capacity for apomeiosis and zygotic embryogenesis was stable and heritable in this hybrid, although a dosage effect was observed whereby adding sexual genome(s) from P. glaucum or an apomictic genome from P. squamulatum reduced or enhanced the expressivity of apospory, respectively. The hybrid (GO) also demonstrated de novo induction of AED (proliferation of polar nuclei), suggesting that the induction of this component is also affected by hybridity. Similarly, sterility effects were overcome in interspecific and inter-ploidy crosses in Ranunculus by resorting to spontaneous apospory as the mode of ES formation, eventually forming viable triploid seeds (Hojsgaard et al., 2014a). A novel phenomenon was also described for induction of apomeiosis through second division restitution in an interspecific cross between Saccharum officinarum and S. spontaneum, whereby formation of a 2n female gamete was triggered by the male gamete (Hermann et al., 2012). The induction depends on the ploidy of the S. spontaneum male gamete in a dose-dependent manner and has the potential to be utilized for in vivo production of doubled haploids in intergeneric crosses. The induction of apomeiosis is known to be affected by hybridity and/or polyploidy, explainable on the basis of the HFA theory (Carman, 1997) as well as epigenetic reprogramming of the genes involved in embryo-sac and endosperm development (Grimanelli, 2012; Kreiner et al., 2017; Hojsgaard, 2018).

Reports on induction of parthenogenesis through interspecific hybridization are rare, although inter- or intra-specific hybridization has been used to trigger haploidy via alternative pathways, such as uniparental genome elimination, utilizing genetic and cytogenetic stocks and alloplasmic cytoplasms (reviewed in Forster et al., 2007; Ishii et al., 2016). In apomictic systems, the parthenogenesis component is generally contingent upon apospory or diplospory (Ozias-Akins and van Dijk, 2007). Partitioning it from apomeiosis is comparatively easy; however, independent recurrent parthenogenesis is rarely reported in natural plant populations, though it has been achieved experimentally (e.g., lop mutants in Hieracium; PsBBML in Pennisetum) (Koltunow et al., 2011; Conner et al., 2015; Mirzagadheri and Horandl, 2016).

Inheritance studies showed dominant inheritance of the partitioned apomixis components, albeit with variable penetrance and expressivity that were influenced by genotype and ploidy (reviewed in Ozias-Akins and van Dijk, 2007; Pupilli and Barcaccia, 2012; Barcaccia and Albertini, 2013; Hand and Koltunow, 2014). In P. pratensis, a multigene inheritance model has been proposed (Matzk et al., 2005); however, inheritance of parthenogenesis (PARTH1) as a dominant single gene was also proposed (Porceddu et al., 2002). Apomixis in T. officinale is under the control of two independent loci, one for diplospory (DIP) and the other for parthenogenesis (PAR) (Vijverberg et al., 2004). Similarly, a model with two independent dominant loci has been proposed in diplosporous Erigeron annuus, one locus for diplospory (D) and the other (F) for both parthenogenesis and AED (Noyes et al., 2007). Three dominant loci, viz., LOA, LOP and AutE, control individual apomixis components in Hieracium subgenus Pilosella (Koltunow et al., 2011; Henderson et al., 2017).

#### Candidate Genes for Individual Components

The fact that apomictic and sexual systems share a common network of gene actions during seed development (Hand and Koltunow, 2014) has supported efforts to identify genes that mimic apomixis components in sexual systems. Mutants of genes/genomic regions from sexual systems that exhibit apomixis components have been identified, including those involved in essential functions during megasporogenesis, meiosis initiation and progression, megagametogenesis, embryogenesis and endosperm development (e.g., DYAD, SWI1, Elongate1, SERK, ARG, MiMe sets, AGO, DMT, hap, BBM, FIE, MEA, DME, etc.) (reviewed in Pupilli and Barcaccia, 2012; Barcaccia and Albertini, 2013; Schmidt et al., 2015; Brukhin, 2017). Transcriptomes of ovular tissues during apomictic and sexual development have been compared in aposporous apomicts (e.g., Brachiaria brizantha, Pennisetum interspecific hybrids, C. ciliaris, P. maximum, Paspalum notatum, H. perforatum) and diplosporous apomicts (e.g., Boechera), and differentially expressed genes were identified (reviewed in Conner and Ozias-Akins, 2017).

Additionally, detailed molecular analysis of genomic regions governing apomixis in natural apomictic systems led to the identification, characterization and isolation (in several cases) of key genes involved in apomictic reproduction per se or its components. These include genes controlling apomeiosis, such as APOLLO (Apomixis linked locus; Boechera spp.) (Corral et al., 2013), HAPPY, HpARI (ARIADNE7; H. perforatum) (Schallau et al., 2010); those controlling parthenogenesis, such as ASGR-BBML (Apospory Specific Genomic Region-Baby Boom; Pennisetum squamulatum) (Conner et al., 2015); as well as those modulating endosperm development, such as PsORC3a (Origin Recognition Complex; P. simplex) (Siena et al., 2016) and AutE (AED; Hieracium subgenus Pilosella species) (Henderson et al., 2017). Promising results towards introduction of the parthenogenesis component of apomixis have been obtained with the PsASGR-BBML gene, which successfully induced parthenogenetic haploids in sexual crops such as pearl millet, rice and maize (Conner et al., 2015, 2017), and is reported to be conserved across Paniceae species (Worthington et al., 2016).

#### Factors Affecting Uncoupling and Expression of Partitioned Apomixis Components

Partitioning of apomixis components and their expression are largely influenced by genotypic effects; however, they are also affected by ploidy levels and dosage effects, as well as by stress and environmental factors. Modifying elements present in the genetic background are also presumed to modulate the expressivity of apomixis components (Koltunow et al., 1998; Bicknell et al., 2000; Hand et al., 2015), mostly through epigenetic regulatory networks (Curtis and Grossniklaus, 2008; Galla et al., 2017; Bocchini et al., 2018).

In apomictic plants, genotypic effects were found to be more profound than ploidy effects in determining the mode of reproduction, as well as the penetrance and expressivity of the component traits. Although modification of ploidy levels affects partitioning, the effect is largely genotype-dependent (Burson et al., 2002; Matzk et al., 2005; Kaushal et al., 2008, 2018; Krahulcova and Rotreklova, 2010; Sharbel et al., 2010; Krahulec et al., 2011; Voigt-Zielinski et al., 2012; Delgado et al., 2014; Noyes and Wagner, 2014). Higher ploidy may accumulate relative doses of the apomeiotic or sexual factors, which in turn affects the eventual expression of the trait, especially in facultative genotypes. Such dosage effects on expression of apomeiosis have been reported in apomicts, such as R. auricomus, Erigeron interspecific hybrids, Paspalum rufum, P. maximum, Pennisetum interspecific crosses and Pilosella spp. (Nogler, 1984b; Noyes, 2005; Kaushal et al., 2008, 2010; Krahulcova et al., 2011; Delgado et al., 2016). Interestingly, an enhancement in sexuality (or reduction in apospory) has been reported with rise in ploidy in a P. maximum ploidy series (2n = 6x to 11x) (Kaushal et al., 2018).

In general, occurrence of the apomeiosis:zygotic-embryogenesis pathway (leading to BIII hybrids) is reported more frequently than the meiotic:parthenogenesis pathway (M1, di/poly-haploids) (Bicknell et al., 2003; Aliyu et al., 2010; Hojsgaard et al., 2014a; Schinkel et al., 2017; Klatt et al., 2018a). However, BIII formation is found to be largely genotype-dependent, and ploidy level has little effect on the expression of partitioned apomeiosis. In fact, partitioning and formation of BIII hybrids have been recently reported in diploid individuals in agamic complexes of Boechera and Ranunculus (Aliyu et al., 2010; Hojsgaard et al., 2014a; Schinkel et al., 2017; Klatt et al., 2018b; Barke et al., 2018). On the other hand, expression of the parthenogenesis component is highly influenced by ploidy variation, exhibiting a high positive correlation with increasing ploidy (Kaushal et al., 2009; Aliyu et al., 2010; Noyes and Wagner, 2014). Recently, a strong relationship was identified between rise in ploidy and frequency of haploid production in plants with 6x ploidy and more (2n = 6x to 2n = 11x) in an exhaustive ploidy series of P. maximum (Kaushal et al., 2018), suggesting that these "parthenogenetic factors" may also act in a dosage-dependent manner. The different effects of changes in ploidy level on expression of apomeiosis and parthenogenesis suggest that different mechanisms control these two traits (reviewed in Sokolov et al., 2008). Haploids (or polyhaploids) resulting from haploid parthenogenesis are rare in diploid plants, explainable on the basis of the minimum gene-dosage model, segregation-distortion model, or gametophyte-expressed lethal model (reviewed in Bicknell and Koltunow, 2004; Talent, 2009; Cosendai and Horandl, 2010). From an evolutionary perspective, these partitioned components act as a natural phenomenon to enrich species diversity and speciation through polyploid-polyhaploid-polyploid cycles, as demonstrated in D. annulatum (de Wet, 1968), P. maximum (Savidan and Pernes, 1982), Eragrostis curvula (Mecchia et al., 2007), Boechera spp. (Aliyu et al., 2010) and Erigeron spp. (Noyes and Wagner, 2014).

In addition to the above factors (genotype and ploidy), environmental stresses, such as higher elevations, extreme temperatures and edaphic factors, seasonal variations, nutrition, herbivory and diseases, as well as pollination timings, are also known to affect the expressivity and penetrance of apomeiosis and parthenogenesis traits (Cosendai and Horandl, 2010; Mason and Pires, 2015; Schinkel et al., 2016; Shah et al., 2016; Kreiner et al., 2017; Rodrigo et al., 2017; Kirchheimer et al., 2018; Klatt et al., 2018b). A role of stress hormone signaling has been proposed for initiating such responses and has been studied from biochemical as well as evolutionary perspectives (Koltunow et al., 2001; Polegri et al., 2010; Horandl and Hadacek, 2013). In fact, it would be interesting to identify a stress-activated molecular switch that can trigger the expression of apomixis components or vice versa. Timing of pollination is also reported to be a factor affecting the frequency of BIII hybridization events (Martinez et al., 1994; Burson et al., 2002; Espinoza et al., 2002).

Recombination between components may also modify the mode of endosperm development. Such modifications are found to be largely genotype-dependent, with little effect of ploidy level, and are modulated by still unknown regulatory factors (Li et al., 2014; Hand et al., 2015; Gehring and Satyaki, 2017; Henderson et al., 2017; Kaushal et al., 2018).

### PANICUM MAXIMUM AS A MODEL SYSTEM TO STUDY PARTITIONING OF APOMIXIS COMPONENTS

Panicum maximum Jacq. (syn. Megathyrsus maximus, family: Poaceae, subfamily: Panicoideae, tribe: Paniceae), commonly known as guinea grass, is a suitable system for polyploidy and apomixis research. It is a tall, high yielding, nutritious, perennial, high seed setter, and multi-cut forage grass, adapted to humid, semi-arid and arid environments. This crop possesses substantial variability in morphology, breeding and agronomic traits (Malaviya, 1998; Kaushal et al., 1999; Sukhchain, 2010), and the global germplasm diversity has been characterized for cytological, biochemical and molecular features (Jain et al., 2003, 2006; Ebina et al., 2007; Chandra and Tiwari, 2010; Sousa et al., 2011).

Naturally occurring forms are predominantly apomictic tetraploid cytotypes (2n = 4x = 32), with occasional reports of sexual diploids (2n = 16) and facultative hexaploids (2n = 48) (Savidan, 2000; Jain et al., 2003; Kaushal et al., 2008). Sexually reproducing tetraploid lines are also reported to occur naturally as well as among experimentally induced polyploids (Nakajima et al., 1979; Hanna and Nakagawa, 1994). It has a small genome size (ca. 500 Mbp) and ca. 0.9 pg sporophytic DNA content (in diploid strains) (Akiyama et al., 2008; Kaushal et al., 2009). The availability of sexual as well as apomictic forms within the same ploidy level makes it a suitable system to generate desired populations for inheritance and molecular biology studies.

The mode of seed formation is apospory:parthenogenesis:pseudogamous-endosperm in apomictic forms, while sexual genotypes produce seeds by syngamy of reduced male and female gametes, followed by pseudogamous endosperm development. Apomeiosis is characterized by Panicum-type aposporous ES (2 synergids, 1 egg cell, 1 polar nucleus; all four nuclei are unreduced), while sexual lines exhibit typical Polygonum-type reduced ES (2 synergids, 1 egg cell, 2 polar nuclei, 3 antipodals; all eight nuclei are reduced). Anatomical differences between aposporous and sexual ES permit rapid identification of the mode of reproduction in germplasm and segregating populations (Nakagawa, 1990).

A dominant single gene model for the control of apomixis in guinea grass has been widely accepted (Savidan, 1980, 2000; Ebina et al., 2005), with the apomixis phenotype governed in simplex condition (Aaaa). Development of aposporous ES has been extensively studied cytologically and ultra-structurally (reviewed in Chen and Guan, 2012). Although its genome is still to be sequenced, this crop is rich in available genomic resources. Molecular markers (RAPD, RFLP, AFLP, SSR and ESTs) linked to apomixis have been developed and the aposporous linkage group has been constructed (Ebina et al., 2005; Bluma-Marques et al., 2014). Expressed sequence tags (ESTs) led to the identification of aposporous ovary-specific genes (Yamada-Akiyama et al., 2009). The richness of molecular resources in this crop is further strengthened by the availability of extensive genomic databases in its close relative Panicum virgatum (switch grass) (Sharma et al., 2012; Zhang et al., 2013).
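The segregation expected under the simplex model can be illustrated with a short enumeration. The sketch below is illustrative only and is not taken from the cited studies: it assumes random chromosome segregation in a tetraploid (every pair of the four homologues equally likely in a gamete) with no double reduction, and the function names are hypothetical. Under these assumptions, a simplex apomict (Aaaa) produces Aa and aa gametes in equal proportion, so a cross onto a sexual nulliplex seed parent (aaaa) is expected to segregate 1 apomictic : 1 sexual.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations


def gamete_frequencies(genotype):
    """Frequencies of 2x gametes from a tetraploid genotype such as 'Aaaa'.

    Assumption: random chromosome segregation, i.e. each of the six
    possible pairs of homologues is equally likely; no double reduction.
    """
    pairs = ["".join(sorted(pair)) for pair in combinations(genotype, 2)]
    counts = Counter(pairs)
    return {gamete: Fraction(n, len(pairs)) for gamete, n in counts.items()}


def apomictic_fraction(pollen_parent, seed_parent="aaaa"):
    """Expected fraction of progeny carrying at least one dominant allele A."""
    male = gamete_frequencies(pollen_parent)
    female = gamete_frequencies(seed_parent)
    return sum(
        p_male * p_female
        for g_male, p_male in male.items()
        for g_female, p_female in female.items()
        if "A" in g_male + g_female
    )


simplex_gametes = gamete_frequencies("Aaaa")
print(simplex_gametes)             # Aa and aa gametes, each with frequency 1/2
print(apomictic_fraction("Aaaa"))  # expected apomictic fraction: 1/2
```

The same enumeration shows why half of the reduced gametes of an Aaaa plant lack the dominant allele altogether (aa), which is the class from which sexual polyhaploids could arise if such a gamete developed parthenogenetically.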

An Apomixis Specific Gene (ASG-1), which showed stage-specific expression in developing buds of apomictic types only, has been identified and characterized through comparative gene expression analysis (Chen et al., 1999, 2005). Transcriptome data have been generated comparing the gametogenesis stages between apomictic and sexual forms (Radhakrishna et al., 2018). Recently, irradiation-induced deletion mutants for the apomixis-controlling genomic region in tetraploid guinea grass showed a loss-of-apomixis phenotype, in which aposporous (Panicum-type) ES were replaced by sexual (Polygonum-type) ES (Takahara et al., 2016).

In contrast to the general understanding of apomixis as being under monogenic control, a wide-scale screening of guinea grass germplasm suggested multigene control of the trait. Uncoupling of the apomixis components was demonstrated in more than 67% of the global germplasm accessions, suggesting frequent occurrence of recombination between apomeiosis and parthenogenesis components (Kaushal et al., 2009). Germplasm lines with high BIII and M<sup>1</sup> formation were also identified. Reproductive diversity for seed formation, estimated through reconstruction of reproductive pathways (utilizing ES and FCSS analysis) in tetraploid and hexaploid guinea grass lines, suggested that the three components (apomeiosis, parthenogenesis and functional endosperm development) recombined freely and that all phenotypic classes expected from such recombination events were recovered (**Figure 1** and **Table 1**) (Kaushal et al., 2008). Identification of certain modified pathways, e.g., the presence of two polar nuclei in aposporous ES fusing prior to fertilization, and fusion of only one polar nucleus in a sexual ES, provides the opportunity for better insights into seed development processes. The flexibility of guinea grass to demonstrate aposporous and sexual ES, parthenogenetic and zygotic embryo development, and pseudogamous and AED (in ovules and matured seeds) offers advantages for understanding the interaction effects arising from recombination between these apomixis components.
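If the three components recombine freely, the expected phenotypic classes are simply all combinations of two alternative states per component. The following sketch enumerates them; it is a schematic illustration under that assumption, not a reproduction of Table 1. The state names, the `n + n` style annotations and the function name are introduced here for illustration, and the progeny labels (BII, BIII, M1, MII) follow the usage in the text.

```python
from itertools import product

# Two alternative states for each component (illustrative labels).
EMBRYO_SAC = ("meiotic", "apomeiotic")       # reduced vs unreduced egg
EMBRYO = ("zygotic", "parthenogenetic")      # fertilized vs unfertilized egg
ENDOSPERM = ("pseudogamous", "autonomous")   # fertilized central cell vs AED

# Progeny class, determined by the embryo-sac and embryo states only.
PROGENY_CLASS = {
    ("meiotic", "zygotic"): "BII (n + n)",
    ("meiotic", "parthenogenetic"): "M1 (n + 0)",
    ("apomeiotic", "zygotic"): "BIII (2n + n)",
    ("apomeiotic", "parthenogenetic"): "MII (2n + 0)",
}


def reproductive_pathways():
    """All combinations of the three components with their progeny class."""
    return [
        (sac, embryo, endosperm, PROGENY_CLASS[(sac, embryo)])
        for sac, embryo, endosperm in product(EMBRYO_SAC, EMBRYO, ENDOSPERM)
    ]


for sac, embryo, endosperm, label in reproductive_pathways():
    print(f"{sac:<11} {embryo:<16} {endosperm:<13} -> {label}")
```

Two states for each of three components give eight classes in total, with each progeny class occurring once with pseudogamous and once with autonomous endosperm development.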

FIGURE 4 | Scheme for production of ploidy series (Kaushal et al., 2009, 2015b). Plants representing different ploidies viz., 3x, 4x, 5x, 6x, 7x, 8x, 9x, and 11x, were generated from a single 4x progenitor through HAPA. The recovery of plants with specified ploidy and their pathways of formation (M1, BII or BIII) is depicted. Information in parenthesis shows maternal (m) and paternal (p) genomic contribution. In all cases depicted here, male gamete was always reduced, while female gamete might be reduced or unreduced, and the embryo development may be through parthenogenesis or fertilization dependent. Reprinted by permission from the Springer Nature: Kaushal et al. (2018).

permission from John Wiley and Sons: Kaushal et al. (2009).

The consequences of partitioned apomixis components, leading to the formation of triploids (3n, BIII hybridization) and/or haploids (1n, M<sup>1</sup> progeny), were utilized to develop a Hybridization-supplemented Apomixis-components Partitioning Approach (HAPA) for ploidy manipulations without using any chemical agent or in vitro processing. Utilizing HAPA, an exhaustive ploidy series has been developed from a single 4x (2n = 32) progenitor, represented by 3x, 4x, 5x, 6x, 7x, 8x, 9x, and 11x cytotypes (Kaushal et al., 2009, 2015b) (**Figures 4**, **5**). Such an exhaustive ploidy series offers an excellent system to understand ploidy-regulated trait expression with respect to apomixis and its component traits. Male fertility is maintained at all these ploidy levels, providing a better scope for genetical and breeding experiments (Kaushal et al., 2018). Guinea grass is, thus, found to possess extraordinary flexibility to accommodate extreme genome dosage (2n = 2x to 11x), chromosome numbers (2n = 2x = 16 to 2n = 11x = 88) and sporophytic DNA content (1.8 pg to 5.0 pg), and is still capable of producing functional female gametes (both reduced and unreduced) and male (mostly reduced) gametes.
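The ploidy arithmetic behind such a series can be sketched as follows. This is a simplified illustration, not the actual crossing scheme of Figure 4: it assumes a base chromosome number of x = 8 (from 2n = 4x = 32), a male gamete that is always reduced, and gametes that carry exactly half (reduced) or all (unreduced) of the parental genome dose. The function names and the example crosses are hypothetical.

```python
from fractions import Fraction

BASE_CHROMOSOME_NUMBER = 8  # x = 8, since 2n = 4x = 32 in guinea grass


def progeny_ploidy(maternal, paternal, pathway):
    """Ploidy (in units of x) of progeny from a given seed-formation pathway.

    Assumptions: reduced gametes carry exactly half of the parental dose,
    unreduced eggs carry all of it, and the male gamete is always reduced.
    """
    maternal, paternal = Fraction(maternal), Fraction(paternal)
    reduced_egg, unreduced_egg = maternal / 2, maternal
    reduced_sperm = paternal / 2
    if pathway == "M1":    # reduced egg, parthenogenesis
        return reduced_egg
    if pathway == "BII":   # reduced egg, fertilized
        return reduced_egg + reduced_sperm
    if pathway == "MII":   # unreduced egg, parthenogenesis
        return unreduced_egg
    if pathway == "BIII":  # unreduced egg, fertilized
        return unreduced_egg + reduced_sperm
    raise ValueError(f"unknown pathway: {pathway}")


def chromosome_number(ploidy):
    """Somatic chromosome number (2n) for a given ploidy level."""
    return ploidy * BASE_CHROMOSOME_NUMBER


# Hypothetical crosses (seed parent, pollen parent, pathway), for illustration.
for cross in [(4, 4, "M1"), (4, 4, "BIII"), (6, 4, "BIII"), (8, 6, "BIII")]:
    ploidy = progeny_ploidy(*cross)
    print(cross, "->", f"{ploidy}x, 2n = {chromosome_number(ploidy)}")
```

Under these assumptions a BIII event always adds half of the pollen parent's genome dose to the full maternal dose, while an M1 event halves the maternal dose, which is how repeated rounds can step the ploidy up or down without chemical treatment.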

Ploidy effects on the overall expression of apomixis revealed that the eventual phenotype depends on relative doses of apospory and parthenogenesis factors (Kaushal et al., 2018). Intriguingly, the proportion of facultatively reproducing progenies increased with ploidy level. BIII hybridization and AED were less affected by changes in ploidy and depended mostly on genotypic effects. In contrast, formation of M<sup>1</sup> progeny was highly affected by a rise in ploidy and appeared only in plants with 6x ploidy or more (Combes, 1975; Kaushal et al., 2008, 2018). The availability of genotypes with similar ploidy level but contrasting capacities for partitioned components (extremely high or low frequency of formation of BIII or M<sup>1</sup> progeny) is an important resource for identifying differentially expressed genes governing these partitioned components.

Endosperm development in guinea grass is intriguing, considering that the typical 2:3 genome ratio is conserved in the embryo and endosperm of matured seeds by virtue of a modification in the embryo-sac, which is a 4-nucleate Panicum type (as discussed earlier). However, it shows extraordinary flexibility in tolerating excessive deviance from the typical 2em:3end genome ratio, as well as from typical maternal and paternal genome contributions in developing embryos and endosperms. Em:End genome ratios tolerated are 2:3 (in BII/MII progenies), 1:1 (in BIII) and 1:3 (in M1) (**Table 1**). As an illustration, an 11x (2n = 88) plant will have 11em:16.5end and ≈16.5em:16.5end genome ratios in typical apomictic and BIII seed, respectively. Successful recovery of seeds representing almost all categories (BII/MII, BIII and M1) from plants representing the ploidy series (see **Table 1**) suggests that EBN and endosperm imprinting constraints are largely relaxed in this crop (Kaushal et al., 2018). Recovery of fertile seeds from such diverse categories is also important for studying nucleo-cytoplasmic as well as embryo-endosperm interactions and the ovule molecular machinery capable of bearing such high genomic content.
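The genome ratios quoted above follow directly from the embryo-sac types described earlier. The sketch below works them out under explicitly simplified assumptions: pollen comes from a plant of the same ploidy, the male gamete is reduced, the endosperm is pseudogamous, the meiotic (Polygonum-type) ES contributes two reduced polar nuclei, and the aposporous (Panicum-type) ES contributes one unreduced polar nucleus. The function name is hypothetical.

```python
from fractions import Fraction


def embryo_endosperm_doses(ploidy, pathway):
    """Return (embryo, endosperm) genome doses in units of x.

    Assumptions: pollen parent has the same ploidy as the seed parent,
    sperm nuclei are reduced, and the endosperm is pseudogamous.
    """
    full = Fraction(ploidy)
    reduced = full / 2
    sperm = reduced
    if pathway in ("BII", "M1"):      # meiotic ES: two reduced polar nuclei
        egg, central_cell = reduced, 2 * reduced
    elif pathway in ("MII", "BIII"):  # aposporous ES: one unreduced polar nucleus
        egg, central_cell = full, full
    else:
        raise ValueError(f"unknown pathway: {pathway}")
    fertilized = pathway in ("BII", "BIII")
    embryo = egg + sperm if fertilized else egg
    endosperm = central_cell + sperm
    return embryo, endosperm


for pathway in ("BII", "MII", "BIII", "M1"):
    embryo, endosperm = embryo_endosperm_doses(11, pathway)
    print(f"{pathway:<4} {float(embryo):>5}em : {float(endosperm)}end"
          f"  (ratio {embryo / endosperm})")
```

For an 11x plant this gives 11em:16.5end (ratio 2/3) for BII and MII seed, 16.5em:16.5end (ratio 1) for BIII seed and 5.5em:16.5end (ratio 1/3) for M1 seed, in line with the 2:3, 1:1 and 1:3 ratios stated in the text.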

The diversity in pathways of seed formation, availability of plants representing different modes of reproduction and different ploidy levels, male fertility in plants with higher ploidies and successful recovery of seeds at extremely higher ploidies (some of them expressing BIII and M<sup>1</sup> hybridization), all make this crop a potentially useful system for undertaking investigations in apomixis genetics and breeding, as well as cytogenetical and molecular studies on partitioned apomixis components.

### UTILIZATION OF PARTITIONED APOMIXIS COMPONENTS

Understanding the partitioning phenomenon, as well as utilizing the partitioned apomixis components, has experimental and applied consequences. The foremost importance lies in a better understanding of apomeiotic ES development and in elucidating the parthenogenetic factors responsible for autonomous egg cell development, as plant material polymorphic for these components is now available (Koltunow et al., 2011; Sahu et al., 2012; Hojsgaard et al., 2014a). It may also shed light on embryo-endosperm interactions, especially EBN and endosperm imprinting effects, chromatin dynamics, evolution of components, and more specifically the progression of the components in the apomixis process.

Partitioning apomeiosis from parthenogenesis also allows for generation of variability, because the two stages of variability generation, viz., meiosis and fertilization, respectively, are rendered operational. Such possibilities challenge the perception of apomixis as an evolutionary dead-end and a roadblock to plant breeding (Darlington, 1939; Grant, 1981). Variability has been successfully generated through addition of genomes utilizing BIII hybridization in otherwise apomictic species, such as Brachiaria decumbens, Panicum maximum, Poa pratensis, E. curvula, C. ciliaris, Pennisetum orientale, etc. (Bashaw and Hignight, 1990; Nogler, 1994; Naumova et al., 1999; Matzk et al., 2005; Kaushal et al., 2015a, 2018). In fact, BIII hybrids are formed directly, without the intermediary of sexual relatives, and thus give rise to new apomictic biotypes, thereby further increasing the polymorphism of the agamic species complex (Nogler, 1994). Additionally, polyhaploids generated through the M<sup>1</sup> pathway offer an added advantage for understanding apomixis expression at diploid/haploid levels. Polyploid-polyhaploid cycles for generation and fixation of variability in natural polyploids have already been discussed (reviewed in Berthaud, 2001). Although rarely reported, partitioning also presents a possibility of obtaining sexual polyhaploids from apomictic polyploids where apomixis is under monogenic control, such as P. maximum (Aaaa). Such sexual polyhaploids would be a resource for breeding apomictic crops where naturally occurring apomicts are polyploids.

Recombination between apomixis components presents a system to study diversity in reproductive pathways of seed development. Such systems, when duly coupled with polyploidy, offer advantages for precise understanding of the various mechanisms, leading to interaction effects between apomixis components as well as their interface with genetic, epigenetic and environmental factors (Matzk et al., 2001, 2005; Aliyu et al., 2010; Hojsgaard, 2018). Additionally, it also serves as a stable system to generate newer cytotypes, through BIII, M<sup>1</sup> or HAPA (Kaushal et al., 2009, 2015a).

Although there are several reports whereby mutagenesis in natural apomictic plants converted them to sexual (Takahara et al., 2014, 2016), reports of a single mutation converting a sexual plant to apomictic are extremely rare (Chen et al., 2018; Gaafer et al., 2018). The strategy and application rely on generating (or inducing) individual components (say apomeiosis and parthenogenesis) separately and then attempting to reconstruct the apomixis phenotype by combining these components into one background. Organizing partitioned apomixis elements to develop an apomictic crop is a major challenge for plant breeders. In fact, the most plausible approach to engineer apomixis into present-day crop plants would be an applied synergy between "evaluation" and "synthesis" approaches. Evaluation generates information from natural apomicts for identification of genes (apomixis per se or its components), gene actions and other required factors (e.g., environment, ploidy, etc.), which may be appropriately utilized in the "synthesis" approach to transfer/induce them into sexually reproducing crops of interest, or to engineer key genes governing sexuality (Bicknell and Koltunow, 2004; Kaushal et al., 2004, 2005; Barcaccia and Albertini, 2013; Hand and Koltunow, 2014; Mieulet et al., 2016; Khanday et al., 2019). Key master-regulatory sequences that may govern entry into apomeiotic/meiotic ES development, as well as parthenogenetic/zygotic development of the egg cell, have long been sought; however, the underlying mechanism is still poorly understood. Accordingly, based on recent discoveries of key genes, a strategy to introduce a transgene "apomixis cassette" containing dominant genes conferring apomeiosis, parthenogenesis and autonomous development has been proposed to generate an apomictic crop (Conner and Ozias-Akins, 2017). Two alternative pathways have been suggested: utilizing an artificial miRNA (amiRNA) MiMe cassette and an egg-specific promoter fused with a weak CENH3 variant cassette to generate a MiMe + GEM apomictic transgene line, or using an amiRNA MiMe cassette to create a MiMe + PsASGR-BBML apomictic transgene line. MiMe lines may generate unreduced egg cells by replacing meiosis with mitosis (d'Erfurth et al., 2009), while CENH3/GEM (Ravi and Chan, 2010; Marimuthu et al., 2011) and PsASGR-BBML (Conner et al., 2015) may induce parthenogenesis. The endosperm in these cases will maintain the required 2m:1p ratio. Partial apomixis in rice has been recently achieved by triggering parthenogenesis in MiMe-generated unreduced female gametes through ectopic expression of a male-specific OsBBM gene in unfertilized ovules (Khanday et al., 2019). Another plausible approach to develop an apomictic cereal could be to introduce/reassemble apomeiosis and parthenogenesis, along with AED; however, to avoid gene flow, the final genotype must be male sterile (Kaushal et al., 2004).

### CONCLUSION

The apomictic mode of reproduction is a complex phenomenon whose eventual expression depends on numerous major and minor factors, in addition to genotypic effects. The recombination potential between its three components (apomeiosis, parthenogenesis and functional endosperm development) offers advantages for understanding the origin, evolution, genetics and molecular biology of the phenomenon. With the increasing state of knowledge and efficient technological back-up, the biology of these components, both independently and when linked, has been subjected to intense investigation. Large-scale characterization of the reproductive diversity in agamic complexes is expected to yield detailed insights into the possibility of partitioning. Amongst the components, apomeiosis has been investigated in detail; however, information on parthenogenesis and endosperm development is still inadequate. This is also important with a view to developing a universal model for generating apomictic crops. Comparison of the molecular mechanisms governing apomeiosis, parthenogenesis and relaxed endosperm imprinting in apomicts with the development of unreduced egg cells, haploid embryos and endosperm in sexual crops through alternate pathways (e.g., restitution nuclei, endomitosis, uniparental chromosome-elimination, alloplasmic systems) is expected to yield important insights into possible overlaps during the seed formation process. Identification of the master regulatory switch triggering apomixis in sexual crops and sexuality in apomictic crops is a plant breeder's dream. Though amalgamation of information gathered from apomictic and sexual systems (evaluation and synthesis approach) has led to the proposed model for developing apomictic crops (Pupilli and Barcaccia, 2012; Conner and Ozias-Akins, 2017; Khanday et al., 2019), studies of signaling pathways and cell-to-cell interactions (Juranic et al., 2018), as well as protein and metabolome investigations, may greatly strengthen the state of knowledge.

### AUTHOR CONTRIBUTIONS

PK was involved in conceptualization, literature collection, compilation and writing the manuscript. KD, AR, MS, and VK performed literature collection and wrote the manuscript. AR and DM performed compilation and wrote the manuscript.

### FUNDING

This work was supported by the Indian Council of Agricultural Research, India, the Department of Science and Technology, India, and the Department of Biotechnology, India for Apomixis Research.

### REFERENCES

apomict Hypericum perforatum L. Heredity 96, 322–334. doi: 10.1038/sj.hdy.6800808


praealtum and its loss of parthenogenesis (lop) mutant. BMC Plant Biol. 18:206. doi: 10.1186/s12870-018-1423-1


Caceres, M. E., Matzk, F., Busti, A., Pupilli, F., and Arcioni, S. (2001). Apomixis and sexuality in Paspalum simplex: characterization of the mode of reproduction in segregating progenies by different methods. Sex. Plant Reprod. 14, 201–206. doi: 10.1007/s00497-001-0109-1

Calderini, O., Chang, S. B., de Jong, H., Busti, A., Paolocci, F., Arcioni, S., et al. (2006). Molecular cytogenetics and DNA sequence analysis of an apomixis-linked BAC in Paspalum simplex reveal a non-pericentromere location and partial micro-colinearity with rice. Theor. Appl. Genet. 112, 1179–1191. doi: 10.1007/s00122-006-0220-7

Calzada, J.-P. V., Crane, C. F., and Stelly, D. M. (1996). Apomixis: the asexual revolution. Science 274, 1322–1323. doi: 10.1126/science.274.5291.1322

Carman, J. G. (1997). Asynchronous expression of duplicate genes in angiosperms may cause apomixis, bispory, tetraspory, and polyembryony. Biol. J. Linn. Soc. 61, 51–94. doi: 10.1111/j.1095-8312.1997.tb01778.x

Catanach, A. S., Erasmuson, S. K., Podivinsky, E., Jordan, B. R., and Bicknell, R. A. (2006). Deletion mapping of genetic regions associated with apomixis in Hieracium. Proc. Natl. Acad. Sci. U.S.A. 103, 18650–18655. doi: 10.1073/pnas.0605588103

Chandra, A., and Tiwari, K. K. (2010). Isolation and characterization of microsatellite markers from guinea grass (Panicum maximum) for genetic diversity estimate and cross-species amplification. Plant Breed. 129, 120–124. doi: 10.1111/j.1439-0523.2009.01651.x

Chen, L., and Guan, L. (2012). "Ultrastructural mechanisms of aposporous embryo sac initial cell appearance and its developmental process in gametophytic apomicts of Guinea grass (Panicum maximum)," in The Transmission Electron Microscope, ed. K. Maaz (Rijeka: InTech). doi: 10.5772/34912

Chen, L., Guan, L., Sio, M., Hoffmann, F., and Adachi, T. (2005). Developmental expression of ASG-1 during gametogenesis in apomictic guinea grass (Panicum maximum). J. Plant Physiol. 162, 1141–1148. doi: 10.1016/j.jplph.2005.02.010

Chen, L., Miyazaki, C., Kojima, A., Saito, A., and Adachi, T. (1999). Isolation and characterization of a gene expressed during early embryo sac development in apomictic Guinea grass (Panicum maximum). J. Plant Physiol. 154, 55–62. doi: 10.1016/S0176-1617(99)80318-6

Chen, X., Lai, H. G., Sun, Q., Liu, J. P., Chen, S. B., and Zhu, W. L. (2018). Induction of apomixis by dimethyl sulfoxide (DMSO) and genetic identification of apomictic plants in cassava. Breed. Sci. 68, 227–232. doi: 10.1270/jsbbs.17089

Combes, D. (1975). Polymorphisme et mode de reproduction dans la section des Maximae du genre Panicum (Graminees) en Afrique. Coll. Menoires ORSTOM 77, 1–100.

Conner, J. A., Goel, S., Gunawan, G., Cordonnier-Pratt, M. M., Johnson, V. E., Liang, C., et al. (2008). Sequence analysis of bacterial artificial chromosome clones from the apospory-specific genomic region of Pennisetum and Cenchrus. Plant Physiol. 147, 1396–1411. doi: 10.1104/pp.108.119081

Conner, J. A., Gunawan, G., and Ozias-Akins, P. (2013). Recombination within the apospory specific genomic region leads to the uncoupling of apomixis components in Cenchrus ciliaris. Planta 238, 51–63. doi: 10.1007/s00425-013- 1873-5

Conner, J. A., Mookkan, M., Huo, H., Chae, K., and Ozias-Akins, P. (2015). A parthenogenesis gene of apomictic origin elicits embryo formation from unfertilized eggs in a sexual plant. Proc. Natl. Acad. Sci. U.S.A. 112, 11205– 11210. doi: 10.1073/pnas.1505856112

Conner, J. A., and Ozias-Akins, P. (2017). Apomixis: engineering the ability to harness hybrid vigor in crop plants. Methods Mol. Biol. 1669, 17–34. doi: 10. 1007/978-1-4939-7286-9\_2

Conner, J. A., Podio, M., and Ozias-Akins, P. (2017). Haploid embryo production in rice and maize induced by PsASGR-BBML transgenes. Plant Reprod. 30, 41–52. doi: 10.1007/s00497-017-0298-x

Corral, J. M., Vogel, H., Aliyu, O. M., Hensel, G., Thiel, T., Kumlehn, J., et al. (2013). A conserved apomixis-specific polymorphism is correlated with exclusive exonuclease expression in premeiotic ovules of apomictic Boechera species. Plant Physiol. 163, 1660–1672. doi: 10.1104/pp.113.222430

Cosendai, A.-C., and Horandl, E. (2010). Cytotype stability, facultative apomixis and geographical parthenogenesis in Ranunculus kuepferi (Ranunculaceae). Ann. Bot. 105, 457–470. doi: 10.1093/aob/mcp304


de Wet, J. M. J. (1968). Diploid–tetraploid–haploid cycles and the origin of variability in Dichanthium agamo species. Evolution 22, 394–397. doi: 10.1111/ j.1558-5646.1968.tb05906.x

Delgado, L., Galdeano, F., Sartor, M. E., Quarin, C. L., Espinoza, F., and Ortiz, J. P. A. (2014). Analysis of variation for apomictic reproduction in diploid Paspalum rufum. Ann. Bot. 113, 1211–1218. doi: 10.1093/aob/mcu056

Delgado, L., Sartor, M. E., Espinosa, F., Soliman, M., Galdeano, F., and Ortiz, J. P. A. (2016). Hybridity and auto polyploidy increase the expressivity of apospory in diploid Paspalum rufum. Plant Syst. Evol. 302, 1471–1481. doi: 10.1007/s00606- 016-1345-z

Depetris, M. B., Acuna, C. A., Pozzi, F. I., Quarin, C. L., and Felitti, S. A. (2018). Identification of genes related to endosperm balance number insensitivity in Paspalum notatum. Crop Sci. 58, 813–822. doi: 10.2135/cropsci2017. 04.0260

d'Erfurth, I., Jolivet, S., Froger, N., Catrice, O., Novatchkova, M., and Mercier, R. (2009). Turning meiosis into mitosis. PLoS Biol. 7:e1000124. doi: 10.1371/ journal.pbio.1000124

Ebina, M., Kouki, K., Tsuruta, S., Akashi, R., Yamamoto, T., Takahara, M., et al. (2007). Genetic relationship estimation in guinea grass (Panicum maximum Jacq.) assessed on the basis of simple sequence repeat markers. Grassl. Sci. 53, 155–164. doi: 10.1111/j.1744-697X.2007.00086.x

Ebina, M., Nakagawa, H., Yamamoto, T., Araya, H., Tsuruta, S., Takahara, M., et al. (2005). Co-segregation of AFLP and RAPD markers to apospory in guinea grass (Panicum maximum Jacq.). Grassl. Sci. 51, 71–78. doi: 10.1111/j.1744-697X. 2005.00011.x

Ellerstrom, S., and Zagorcheva, L. (1977). Sterility and apomictic embryo sac formation in Raphanobrassica. Hereditas 87, 107–120. doi: 10.1111/j.1601-5223. 1977.tb01251.x

Espinoza, F., Pessino, S. C., Quarin, C. L., and Valle, E. M. (2002). Effect of pollination timing on the rate of apomictic reproduction revealed by RAPD markers in Paspalum notatum. Ann. Bot. 89, 165–170. doi: 10.1093/aob/mcf024

Forster, B. P., Heberle-Bors, E., Kasha, K. J., and Touraev, A. (2007). The resurgence of haploids in higher plants. Trends Plant Sci. 12, 368–375. doi: 10.1016/j. tplants.2007.06.007

Gaafer, R. M., El Shanshoury, A. R., El Hisseiwy, A. A., AbdAlhak, M. A., Omar, A. F., El Wahab, M. M. A., et al. (2018). Induction of apomixis and fixation of heterosis in Egyptian rice Hybrid1 line using colchicine mutagenesis. Ann. Agric. Sci. 62, 51–60. doi: 10.1016/j.aoas.2017.03.001

Galla, G., Zenoni, S., Avesani, L., Altschmied, L., Rizzo, P., Sharbel, T. F., et al. (2017). Pistil transcriptome analysis to disclose genes and gene products related to aposporous apomixis in Hypericum perforatum. Front. Plant Sci. 8:79. doi: 10.3389/fpls.2017.00079

Gehring, M., and Satyaki, P. R. (2017). Endosperm and imprinting, inextricably linked. Plant Physiol. 173, 143–154. doi: 10.1104/pp.16.01353

Grant, V. (1981). Plant Speciation. New York, NY: Columbia University Press.

Grimanelli, D. (2012). Epigenetic regulation of reproductive development and the emergence of apomixis in angiosperms. Curr. Opin. Plant Biol. 15, 57–62. doi: 10.1016/j.pbi.2011.10.002

Grimanelli, D., Garcia, M., Kaszas, E., Perotti, E., and Leblanc, O. (2003). Heterochronic expression of sexual reproductive programs during apomictic development in Tripsacum. Genetics 165, 1521–1531.

Grimanelli, D., Leblanc, O., Perotti, E., and Grossniklaus, U. (2001). Developmental genetics of gametophytic apomixis. Trends Genet. 17, 597–604. doi: 10.1016/S0168-9525(01)02454-4

Gupta, S., Bhat, B. V., Bhat, V., Gupta, M. G., and Ahmad, S. T. (1999). Estimation of frequency of apomixis by auxin induced parthenocarpy technique in Dichanthium annulatum (Forssk.) Stapf. Range Manag. Agrofor. 19, 53–58.



Pennisetum glaucum. Mol. Biotechnol. 51, 262–271. doi: 10.1007/s12033-011- 9464-9


Savidan, Y. (2000). Apomixis, genetics and breeding. Plant Breed. Rev. 18, 13–85.


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Kaushal, Dwivedi, Radhakrishna, Srivastava, Kumar, Roy and Malaviya. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Construction of the First SNP-Based Linkage Map Using Genotyping-by-Sequencing and Mapping of the Male-Sterility Gene in Leaf Chicory

Fabio Palumbo<sup>1</sup>, Peng Qi<sup>2,3</sup>, Vitor Batista Pinto<sup>4</sup>, Katrien M. Devos<sup>2,3</sup> and Gianni Barcaccia<sup>1</sup>\*

<sup>1</sup> Laboratory of Genomics for Plant Breeding, Department of Agronomy Food Natural Resources Animals and Environment (DAFNAE), University of Padova, Legnaro, Italy, <sup>2</sup> Institute of Plant Breeding, Genetics and Genomics, Department of Plant Biology, University of Georgia, Athens, GA, United States, <sup>3</sup> Institute of Plant Breeding, Genetics and Genomics, Department of Crop and Soil Sciences, University of Georgia, Athens, GA, United States, <sup>4</sup> Department of General Biology, Federal University of Viçosa, Viçosa, Brazil

#### Edited by:

Roberto Papa, Marche Polytechnic University, Italy

#### Reviewed by:

Therese Bengtsson, Swedish University of Agricultural Sciences, Sweden Ezio Portis, University of Turin, Italy

> \*Correspondence: Gianni Barcaccia gianni.barcaccia@unipd.it

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 14 December 2018 Accepted: 20 February 2019 Published: 11 March 2019

#### Citation:

Palumbo F, Qi P, Pinto VB, Devos KM and Barcaccia G (2019) Construction of the First SNP-Based Linkage Map Using Genotyping-by-Sequencing and Mapping of the Male-Sterility Gene in Leaf Chicory. Front. Plant Sci. 10:276. doi: 10.3389/fpls.2019.00276

We report the first high-density linkage map constructed through genotyping-by-sequencing (GBS) in leaf chicory (Cichorium intybus subsp. intybus var. foliosum, 2n = 2x = 18) and the SNP-based fine mapping of the linkage group region carrying a recessive gene responsible for male-sterility (ms1). An experimental BC<sub>1</sub> population, segregating for the male-sterility trait, was specifically generated and 198 progeny plants were preliminarily screened through a multiplexed SSR genotyping analysis to identify microsatellite markers linked to the ms1 locus. Two backbone SSR markers belonging to linkage group 4 of the available Cichorium consensus map were found to be genetically associated with the ms1 gene, at distances of 5.8 and 12.1 cM. A GBS strategy was then used to produce a high-density SNP-based linkage map, containing 727 genomic loci organized into 9 linkage groups and spanning a total length of 1,413 cM. Thirteen SNPs proved to be tightly linked to the ms1 locus based on a subset of 44 progeny plants analyzed. The map position of these markers was further validated by sequence-specific PCR experiments using an additional set of 64 progeny plants, which confirmed that four of them fully co-segregated with male-sterility. A mesosynteny analysis revealed that 10 genomic DNA sequences encompassing the 13 selected SNPs of chicory mapped to a peripheral region of chromosome 5 of lettuce (Lactuca sativa L.) spanning about 18 Mbp. Since a MYB103-like gene, encoding a transcription factor involved in callose dissolution of tetrads and exine development of microspores, was found in the same chromosomal region, this ortholog was chosen as a candidate for male-sterility. The amplification and sequencing of its CDS using accessions with contrasting phenotypes/genotypes (i.e., 4 male sterile mutants, ms1ms1, and 4 male fertile inbreds, Ms1Ms1) revealed an INDEL of 4 nucleotides in its second exon, responsible for a premature stop codon in the male sterile mutants. This polymorphism was subsequently validated through allele-specific PCR assays and found to fully co-segregate with male-sterility, using 64 progeny plants of the same BC<sub>1</sub> mapping population. Overall, our molecular data could be practically exploited for genotyping plant materials and for marker-assisted breeding schemes in leaf chicory.

Keywords: Cichorium intybus, genetic linkage map, male sterility, ms1 locus, single nucleotide polymorphism (SNP) markers, genotyping-by-sequencing (GBS), transcription factor MYB103

#### INTRODUCTION

Linkage maps based on molecular markers play a key role in the study of the genetics and genomics of crop plants. Among the possible applications, the development of high-density linkage maps has simplified the discovery of Mendelian genes (Kaur et al., 2014; Zhao et al., 2015; Huang et al., 2016) and quantitative trait loci (QTL) (Colasuonno et al., 2014; Marcotuli et al., 2017; Schumann et al., 2017). The first genetic linkage map of chicory (Cichorium intybus subsp. intybus L.), a leafy vegetable crop belonging to the family Asteraceae and widely cultivated in many European countries, consisted of 431 SSR and 41 EST markers, and covered 878 cM (Cadalen et al., 2010).

Chicory is a diploid plant species (2n = 18) that is naturally allogamous, due to an efficient sporophytic self-incompatibility system (Barcaccia et al., 2003; Lucchin et al., 2008). In addition, outcrossing is promoted by a number of traits, including: (i) a floral morpho-phenology (i.e., proterandry, with the anthers maturing before the pistils) unfavorable to selfing in the absence of pollen donors (Pécaut, 1962; Desprez et al., 1994); and (ii) a competitive advantage of allo-pollen grains and tubes (i.e., pollen genetically diverse from that produced by the seed parents) (Desprez and Bannerot, 1980; Eenink, 1982). Two main botanical varieties can be recognized within C. intybus subsp. intybus to which all the cultivated types of chicory belong. The first is var. foliosum, which traditionally includes Witloof chicory, Pain de sucre, Catalogne and Radicchio and all the cultivar groups whose commercial products are the leaves (i.e., leaf chicory). The second is var. sativum and comprises all the types whose commercial product, either destined to industrial transformation or direct human consumption, is the root (i.e., root chicory). In root chicory, Gonthier et al. (2013) identified molecular markers associated with the Nuclear Male-Sterility 1 (NMS1) locus and the Sporophytic Self-Incompatibility (SSI) locus. These two loci were both mapped to narrow genomic regions belonging, respectively, to linkage groups 5 and 2 of the genetic map developed by Cadalen et al. (2010). Similarly, in leaf chicory, Barcaccia and Tiozzo Caenazzo (2012, 2014) mapped molecular markers linked to the male-sterility gene (ms1) within linkage group 4, according to the map by Cadalen et al. (2010).

Recently, a chicory genetic linkage map spanning 1,208 cM was developed by Muys et al. (2014) using an F<sub>2</sub> population composed of 247 plants. This map comprised 237 markers (i.e., 170 AFLP, 28 SSR, 27 EST-SNP, and 12 EST-SSR markers) and covered about 84% of the chicory genome. The markers were then used to find potential orthologs, based on sequence homology, among mapped lettuce EST clones from the Compositae Genome Project Database (Muys et al., 2014). A total of 27 putative orthologous pairs were retained, pinpointing seven potential blocks of synteny that covered 11% of the chicory genome and 13% of the lettuce genome and opening new avenues for the comparative analysis of these two species.

Mapping of the self-incompatibility and male-sterility mechanisms in chicory is important, not only to understand the genetic basis of the main reproductive barriers that act in flowering plants, but also because of the potential applications of these loci for breeding F<sub>1</sub> hybrid varieties. In fact, although in the past chicory varieties were mainly synthetics produced by intercrossing a number of phenotypically superior plants, selected on the basis of morpho-phenological and commercial traits, private breeders and seed firms have recently developed methods for producing F<sub>1</sub> hybrids.

In the last century, male sterile mutants have allowed the exploitation of heterosis (i.e., hybrid vigor) through the development of F<sub>1</sub> hybrid varieties in many agricultural and horticultural crops. In general, male-sterility is defined as the failure of plants to develop anthers or to form functional pollen grains, and it is more prevalent than female-sterility. In nature, male sterile plants retain reproductive potential because they can still set seeds, as female-fertility is unaffected by most of the mutations responsible for male-sterility. Male-sterility is known to occur spontaneously via mutations in nuclear and/or cytoplasmic genes involved in the development of anthers and pollen grains. Barcaccia and Tiozzo Caenazzo (2012, 2014) have recently identified and characterized a spontaneous male sterile mutation in cultivated populations of leaf chicory, namely Radicchio (C. intybus subsp. intybus var. foliosum L.). Cytological analyses revealed that microsporogenesis proceeds regularly up to the development of tetrads, when the microspores arrest their developmental program and show a collapse of the exine. At the beginning of microgametogenesis, non-viable shrunken microspores were clearly visible within anthers. Moreover, genetic segregation data derived from replicated F<sub>2</sub> and BC<sub>1</sub> populations clearly supported a nuclear origin, monogenic control and recessive nature of the male-sterility trait in the leaf chicory mutants (Barcaccia and Tiozzo Caenazzo, 2012, 2014).

In this work, taking advantage of the method of genotyping based on 27 mapped microsatellite marker loci scattered throughout the linkage groups of leaf chicory (Ghedina et al., 2015) and the first genome sequence draft of leaf chicory with the functional annotation of more than 18,000 unigenes (Galla et al., 2016), we successfully constructed a high-density linkage map and finely mapped the ms1 locus in leaf chicory. After a preliminary genetic mapping of the ms1 locus using SSR and EST markers, a Genotyping-By-Sequencing (GBS) methodology was used to narrow down the chromosomal window around the ms1 gene, first by developing well-saturated linkage groups for this species and then by selecting molecular markers and candidate genes for male-sterility exploitable for marker-assisted breeding and gene cloning programs.

### MATERIALS AND METHODS

### Plant Materials and Genomic DNA Extraction

Several male sterile mutants sharing the same genotype at the ms1 locus were discovered by T&T Srl Agricola (Blumen Group SpA) within local varieties of radicchio "Red of Chioggia" that stemmed from recurrent phenotypic selection programs (Barcaccia and Tiozzo Caenazzo, 2012, 2014). A backcross (BC<sub>1</sub>) population segregating 1:1 for the male-sterility trait and comprising 198 individual plants was generated as follows. A male-sterile mutant plant (genotype msms), belonging to a cultivated population of radicchio, was crossed with a male-fertile plant (genotype MsMs) selected from a local accession of wild chicory, in order to maximize genetic diversity and polymorphism levels. An individual F<sub>1</sub> male-fertile plant heterozygous at the male-sterility locus (genotype Msms) was selfed and one F<sub>2</sub> male-sterile progeny (genotype msms) was then backcrossed as seed parent to a sister F<sub>1</sub> male-fertile plant (genotype Msms) used as pollen donor (**Figure 1**). The agronomic field trials for plant phenotyping (discrimination of male-fertile and male-sterile phenotypes) were conducted at the experimental farm "Lucio Toniolo" of the University of Padova, located in Legnaro (Padova, Italy) (45°21′5.68″N, 11°57′2.71″E; 8 m above sea level). Seeds were sown in February and seedlings were grown in a heated greenhouse under a 12/12 h light/dark cycle at a temperature of 20°C. Uniformly sized, 4-week-old seedlings were transplanted in the field under polyethylene tunnels and under controlled pollination conditions. The soil texture was the following: 46% sand, 24% clay and 30% loam; pH = 7.9; electrical conductivity 112 µS; organic carbon 1.1%. Flowering started around 150 days after sowing and all individuals of the BC<sub>1</sub> population were phenotyped using three flowers per plant by visual observations of the anthers (in vivo screening for the presence/absence of pollen) and cytological investigations (in vitro staining of pollen). In more detail, cytological investigations were accomplished using both aceto-carmine and DAPI solutions. The aceto-carmine staining technique was used to measure discriminant phenotypic descriptors, such as shape, size and coloration of pollen grains, as reported by Janssen and Hermsen (1976) and implemented by Barcaccia et al. (1998), while DAPI staining was accomplished as described by Barcaccia and Tiozzo Caenazzo (2012). Under the microscope, the mutant phenotype was characterized by shapeless, small and shrunken microspores as compared to the wild-type ones (**Figure 1**). It is worth mentioning that in mutant plants at the stage of dehiscent anthers, microspores were arrested in their development at the uninucleate stage, and collapsed before their release from the tetrads. Viable pollen grains were never detected in mature anthers, demonstrating full expression of the male-sterility trait (Barcaccia and Tiozzo Caenazzo, 2012, 2014).
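The 1:1 segregation expected in the BC<sub>1</sub> population follows directly from the crossing scheme described above (msms seed parent × Msms pollen donor, with ms recessive). The following minimal Python sketch enumerates the equally likely allele combinations of that cross; the function names are illustrative and are not part of the original analysis.

```python
from collections import Counter
from itertools import product

def cross(seed_parent, pollen_donor):
    """Enumerate the equally likely offspring genotypes of a monogenic cross.

    Each parent is a tuple holding its two alleles, e.g. ("Ms", "ms").
    """
    offspring = Counter()
    for a, b in product(seed_parent, pollen_donor):
        # Sort so that ("ms", "Ms") and ("Ms", "ms") count as one genotype.
        offspring[tuple(sorted((a, b)))] += 1
    return offspring

def phenotype(genotype):
    """ms is recessive: only ms/ms plants are male-sterile."""
    return "male-sterile" if genotype == ("ms", "ms") else "male-fertile"

# Backcross described in the text: msms (seed parent) x Msms (pollen donor).
bc1 = cross(("ms", "ms"), ("Ms", "ms"))
ratio = Counter()
for genotype, count in bc1.items():
    ratio[phenotype(genotype)] += count
print(dict(ratio))
```

Both phenotypic classes receive the same count, which is the 1:1 expectation against which the observed BC<sub>1</sub> phenotypes can be compared.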

Total genomic DNA of the parents and progeny was isolated from 100 mg of fresh leaf tissue using the DNeasy® Plant Mini Kit (Qiagen, Hilden, Germany) following the recommendations of the manufacturer. Quality and concentration of DNA samples were estimated by spectrophotometric analysis (NanoDrop 2000c UV-Vis, Thermo Fisher Scientific, San Jose, CA, United States) and quality was also assayed by agarose gel electrophoresis (1.0% w/v agarose TAE 1× gel containing 1× SYBR® Safe, Thermo Fisher Scientific).

#### Preliminary Genetic Mapping of the ms1 Locus With Simple Sequence Repeat (SSR) and Cleaved Amplified Polymorphic Sequence (CAPS) Markers

The entire BC<sub>1</sub> population of leaf chicory was used for genetic mapping of the male-sterility gene using three selected SSR markers (M4.12, M4.11b, and M4.10b) and one EST-derived CAPS marker previously mapped on linkage group 4 (Cadalen et al., 2010; Barcaccia and Tiozzo Caenazzo, 2012, 2014). This preliminary step was applied in order to determine and validate upstream and downstream backbone DNA markers flanking the ms1 locus.

An AFLP-derived amplicon corresponding to marker E02M09 (Barcaccia and Tiozzo Caenazzo, 2012), whose sequence was found to encompass a microsatellite region (GenBank accession JF748831), was converted into an SSR marker and renamed M4.12 by Ghedina et al. (2015). This sequence-tagged site marker was mapped on linkage group 4 and was therefore included in the genetic analysis of the BC<sub>1</sub> population. Among the microsatellite marker loci publicly available for the leaf chicory genome (Cadalen et al., 2010) and assigned to linkage group 4, M4.11b (Ghedina et al., 2015) (synonym EU03H01) contained an imperfect [(TG)5CG(TG)7] microsatellite motif (GenBank accession KF880802) and M4.10b (Ghedina et al., 2015) [synonym EU07G10 (Cadalen et al., 2010)] carried a (CT)8TT(CT)5CC(CT)3TT(CT)7 microsatellite motif (GenBank accession KX534081).

For SSR amplification, the three-primer strategy reported by Schuelke (2000) was adopted, with some modifications. Briefly, forward primers were tagged at their 5′ end with universal sequences (M13 or PAN2, unpublished) and used in PCR reactions in combination with sequence-specific reverse primers, and M13 and PAN2 oligonucleotides labeled with the fluorophores 6-FAM and NED, respectively.

The PCR reactions were performed in a total volume of 10 µl containing approximately 20 ng of gDNA template, 1× Platinum® Multiplex PCR Master Mix (Applied Biosystems, Carlsbad, CA, United States), 10% GC enhancer (Applied Biosystems), 0.05 µM tailed forward primer (Invitrogen Corporation, Carlsbad, CA, United States), 0.1 µM reverse primer (Invitrogen Corporation) and 0.23 µM universal primer (Invitrogen Corporation). The following thermal conditions were adopted for all reactions: 2 min at 95°C for the initial denaturation step, followed by 45 cycles at 95°C for 30 s, 55°C for 30 s and 72°C for 45 s. A final extension step at 72°C for 30 min terminated the reaction and filled in any protruding ends of the newly synthesized strands.

**FIGURE 1** | … of the segregating population was then backcrossed as seed parent to a sister F<sub>1</sub> male-fertile plant (genotype Msms) used as pollen donor. Individuals of the BC<sub>1</sub> population were phenotyped by visual observations of anthers (in vivo screening for the presence/absence of pollen) and cytological investigations (in vitro staining of pollen using aceto-carmine and DAPI solutions; for detailed information on pollen viability analysis please see Barcaccia and Tiozzo Caenazzo, 2012). At flowering, anthers were preliminarily screened for the absence vs. presence of pollen and microscopy was then used to validate the sterile vs. fertile phenotype (mutants were characterized by shapeless, smaller and shrunken microspores as compared to the wild-type ones).

Amplicons were initially separated and visualized on 2% agarose gels in 1× TAE containing 1× SYBR Safe DNA stain (Life Technologies, Carlsbad, CA, United States). The remainder of the fluorescently labeled PCR products (8 µl) was subjected to capillary electrophoresis on an ABI PRISM 3130xl Genetic Analyzer (Thermo Fisher). LIZ500 (Applied Biosystems) was used as the molecular weight standard.

The CAPS marker was developed from a MADS-box gene (GenBank accession AF101420) which was initially considered (Cadalen et al., 2010) but later disproven (Barcaccia et al., 2016) to be a candidate gene for male-sterility. Several primer pairs were designed for nested PCR assays using PerlPrimer v1.1.21 and used to amplify the full-length sequence and sub-regions of the MADS-box gene from plants of the BC<sub>1</sub> population phenotyped for male-sterility. Amplification reactions were performed in a 9700 Thermal Cycler (Applied Biosystems) with the following conditions: initial denaturation at 94°C for 5 min followed by 30 cycles at 94°C for 30 s, 57°C for 30 s, 72°C for 60 s and a final extension of 10 min at 72°C, and then held at 4°C. The quality of PCR products was assessed on a 2% (w/v) agarose gel stained with 1× SYBR® Safe™ DNA Gel Stain (Life Technologies). Amplicons were Sanger-sequenced to detect SNPs potentially associated with male-sterility and to locate restriction sites. PCR products were then cleaved using NcoI (Promega, Madison, WI, United States) as the endonuclease specific for the single nucleotide variants, following the protocol suggested by the manufacturer. Amplicons were digested at 37°C for 2 h. CAPS variants were visualized on 2.5% (w/v) agarose gels (Life Technologies) stained with 1× SYBR® Safe™ DNA Gel Stain (Life Technologies).

Segregation data from the three SSR markers and the MADS-box gene-specific CAPS marker were analyzed with JoinMap® v. 2.0 (Stam, 1993) using the BC<sub>1</sub> population type option. Genetic association between each of the markers and the male-sterility trait was assessed by recording the target ms1 locus as a qualitative trait. The grouping module was applied with a LOD threshold of 3 and a maximum recombination frequency r of 40%. The genetic distance between each marker locus and the target locus, expressed in centiMorgans (cM), was calculated from the recombination frequency corrected with Kosambi's mapping function (Kosambi, 1943). MapChart v.2.3 (Voorrips, 2002) was used to display the map.
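For readers unfamiliar with the two quantities used here, the sketch below shows how a recombination frequency can be converted into a Kosambi map distance and how a two-point LOD score can be computed for a backcross. It is an illustration only and does not reproduce JoinMap's internals; the function names and the recombinant counts in the usage lines are hypothetical.

```python
import math

def kosambi_cm(r):
    """Kosambi map distance (cM) for a recombination frequency 0 <= r < 0.5.

    d = 25 * ln((1 + 2r) / (1 - 2r)), i.e. 1/4 * ln(...) Morgans.
    """
    if not 0.0 <= r < 0.5:
        raise ValueError("r must lie in [0, 0.5)")
    return 25.0 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r))

def two_point_lod(recombinants, n):
    """Two-point LOD score in a backcross of n informative progeny.

    Compares the likelihood at the estimated r (recombinants / n)
    with the likelihood under free recombination (r = 0.5).
    """
    r_hat = recombinants / n
    if r_hat >= 0.5:
        return 0.0  # no evidence for linkage
    log_likelihood = 0.0
    if recombinants > 0:
        log_likelihood += recombinants * math.log10(r_hat)
    if recombinants < n:
        log_likelihood += (n - recombinants) * math.log10(1.0 - r_hat)
    return log_likelihood - n * math.log10(0.5)

# Hypothetical example: 20 recombinants among 200 backcross progeny.
r = 20 / 200
print(round(kosambi_cm(r), 2), round(two_point_lod(20, 200), 1))
```

A marker pair would be grouped under the settings quoted above when its LOD is at least 3 and its estimated r does not exceed 0.40.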

### Construction of a SNP-Based Linkage Map Using Genotyping-by-Sequencing (GBS)

Genomic DNA from 22 male sterile progeny and 22 male fertile progeny from the BC<sub>1</sub> population, along with 2 DNA samples from the parental plants, was quantified with the dsDNA BR assay on a Qubit® 1.0 fluorometer (Invitrogen, Carlsbad, CA, United States) and DNA concentrations were normalized to 20 ng/µl. DNA samples were shipped to LGC Genomics (Berlin, Germany) for GBS library preparation, sequencing and subsequent bioinformatic analysis. Briefly, DNA samples were digested with the restriction enzyme MslI (NEB, Beverly, MA, United States) and the indexed Illumina libraries were prepared using the Ovation Rapid DR Multiplex System (Nugen, Leek, Netherlands) according to the protocol provided by the manufacturer. The 46 libraries were pooled, PCR primers and small amplicons were removed by Agencourt XP bead purification (Beckman Coulter, High Wycombe, United Kingdom), and the pool was then normalized using the Trimmer Kit (Evrogen, Moscow, Russia). The normalized library pool was amplified using MyTaq polymerase (Bioline, Taunton, MA, United States) and standard Illumina TruSeq amplification primers (Illumina Inc., San Diego, CA, United States). The nGBS library was finally size selected on an LMP-agarose gel, removing fragments smaller than 300 bp and larger than 500 bp, and sequenced on a single lane of an Illumina NextSeq 500 v2 (2 × 150 bp, Illumina Inc., San Diego, CA, United States).

Raw reads were de-multiplexed and split according to their barcodes using Illumina bcl2fastq 2.17.1.14 software (Illumina, San Diego, CA, United States) and sequencing adapter remnants were clipped. After this step, reads were processed with proprietary LGC Genomics software as follows: (i) trimming of the 3′ end (to get a minimum average Phred quality score of 20 over a window of ten bases); (ii) discarding reads with 5′ ends not matching the restriction enzyme site; (iii) removing all reads containing undetermined (N) bases; (iv) discarding reads with final length < 64 bases. CD-HIT-EST (Fu et al., 2012) clustered all the processed sequences into a reference catalog of consensus loci, allowing up to 5% difference. BWA v.0.7.12<sup>1</sup> was used to align the reads from each sample against the newly constituted reference catalog and Freebayes v1.0.2-16<sup>2</sup> was then employed at default settings for variant discovery. The raw SNP variants were filtered by applying customized Perl scripts according to the following rules: (i) minimum allele count exceeding eight reads; (ii) allele frequency across all samples between 5 and 95%; (iii) genotypes observed in at least 32 samples; (iv) discarding adjacent SNPs and SNPs with more than two alleles (i.e., only biallelic SNPs were taken into account).
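The four SNP filtering rules can be sketched in Python as follows. This is not the customized Perl code used in the study, which is not available in the text; the `Snp` record, the function names and the example data are hypothetical. Three interpretations are assumptions made for the sketch: each allele must be supported by more than eight reads summed over samples, allele frequency is estimated from read counts, and "adjacent" means consecutive positions on the same contig.

```python
from dataclasses import dataclass

@dataclass
class Snp:
    contig: str
    pos: int
    allele_reads: dict  # allele -> supporting reads summed over samples (assumed layout)
    n_genotyped: int    # number of samples with a genotype call

def passes_site_filters(snp, min_reads=8, min_freq=0.05, max_freq=0.95,
                        min_samples=32):
    """Apply rules (i)-(iii) and the biallelic part of rule (iv)."""
    if len(snp.allele_reads) != 2:            # (iv) biallelic SNPs only
        return False
    counts = list(snp.allele_reads.values())
    if min(counts) <= min_reads:              # (i) allele count must exceed 8 reads
        return False
    freq = counts[0] / sum(counts)            # (ii) frequency between 5 and 95%
    if not (min_freq <= freq <= max_freq):
        return False
    return snp.n_genotyped >= min_samples     # (iii) called in >= 32 samples

def filter_snps(snps):
    """Keep SNPs passing the site filters and having no neighbor at pos +/- 1."""
    kept = [s for s in snps if passes_site_filters(s)]
    occupied = {(s.contig, s.pos) for s in kept}
    return [s for s in kept
            if (s.contig, s.pos - 1) not in occupied
            and (s.contig, s.pos + 1) not in occupied]

# Hypothetical records illustrating each rule.
example = [
    Snp("c1", 10, {"A": 40, "G": 30}, 44),           # passes
    Snp("c1", 50, {"A": 40, "G": 5}, 44),            # fails (i)
    Snp("c1", 60, {"A": 30, "C": 20, "T": 15}, 44),  # fails (iv), triallelic
    Snp("c2", 5, {"C": 50, "T": 50}, 30),            # fails (iii)
    Snp("c2", 20, {"C": 50, "T": 40}, 40),           # fails (iv), adjacent
    Snp("c2", 21, {"G": 45, "A": 45}, 40),           # fails (iv), adjacent
    Snp("c3", 7, {"A": 990, "G": 10}, 46),           # fails (ii)
]
print([(s.contig, s.pos) for s in filter_snps(example)])
```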

Segregation data were analyzed using a modified version of MapMaker v3 software<sup>3</sup> and consisted of GBS-SNP data, genotypic scores for SSR markers M4.10b, M4.12, M2.6, and M2.4 (Ghedina et al., 2015) and qualitative scores for the male-sterility locus ms1. SSRs M2.6 and M2.4 were included because they have been shown by Gonthier et al. (2013) to be associated with the self-incompatibility locus. Linkage groups were formed at a LOD threshold of 5. Marker orders within single linkage groups were determined using the MapMaker functions 'order,' 'try,' and 'ripple,' and were checked manually to ensure optimal placement of the marker loci. Genetic distances were calculated using the Kosambi mapping function and graphically represented using MapChart v2.3 (Voorrips, 2002).

The newly developed genetic map was enriched by locating markers on the first genome draft of leaf chicory (Galla et al., 2016) using a BLASTN approach (similarity >90%, E-value < 1E-50). Putative functional annotation of genes present in the mapped contigs was performed using BLASTX against the TAIR10 (Berardini et al., 2015) database (E-value < 1E-5).
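As an illustration of how the BLASTN thresholds quoted above could be applied, the sketch below filters hits in the standard 12-column BLAST tabular layout (qseqid, sseqid, pident, length, mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore). It assumes that the alignments were exported in that layout and that the best-scoring hit per query is retained; neither detail is stated in the text, and the identifiers in the example are hypothetical.

```python
import csv
import io

FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def best_hits(tabular_text, min_identity=90.0, max_evalue=1e-50):
    """Return {query: subject} for the top-scoring hit passing both thresholds."""
    best = {}
    reader = csv.DictReader(io.StringIO(tabular_text), fieldnames=FIELDS,
                            delimiter="\t")
    for row in reader:
        identity = float(row["pident"])
        evalue = float(row["evalue"])
        bits = float(row["bitscore"])
        # Thresholds from the text: similarity > 90% and E-value < 1E-50.
        if identity <= min_identity or evalue >= max_evalue:
            continue
        current = best.get(row["qseqid"])
        if current is None or bits > current[1]:
            best[row["qseqid"]] = (row["sseqid"], bits)
    return {query: subject for query, (subject, _) in best.items()}

# Hypothetical hits: only snp_tag_1 has a hit passing both thresholds.
example = "\n".join([
    "snp_tag_1\tcontig_A\t98.5\t150\t2\t0\t1\t150\t300\t449\t1e-70\t260",
    "snp_tag_1\tcontig_B\t92.0\t150\t12\t0\t1\t150\t10\t159\t1e-55\t200",
    "snp_tag_2\tcontig_C\t88.0\t150\t18\t0\t1\t150\t10\t159\t1e-52\t180",
    "snp_tag_3\tcontig_D\t95.0\t80\t4\t0\t1\t80\t10\t89\t1e-30\t120",
])
print(best_hits(example))
```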

#### Validation of SNP Variants Linked to the Male-Sterility Locus Through Allele Specific (AS)-PCR Assays

All SNPs potentially associated with the male-sterility ms1 locus and exhibiting a maximum of three recombinant events with ms1 were validated in a larger number of progeny through allele-specific PCR (AS-PCR) assays.

<sup>1</sup>http://bio-bwa.sourceforge.net/

<sup>2</sup>https://github.com/ekg/freebayes#readme

<sup>3</sup>http://rna-informatics.uga.edu/malmberg/rlmlab

A total of 64 genomic DNA samples (i.e., 32 male sterile and 32 male fertile plants) from the same BC<sub>1</sub> population but not included in the GBS analysis were amplified using two sets of primers for each male-sterility associated SNP. Each primer set consisted of a different allele-specific forward primer and a common locus-specific reverse primer. The two allele-specific PCR primers were designed so that the 3′ nucleotide was complementary to one allele of the putative polymorphism. Amplicons were produced in a 9700 Thermal Cycler (Applied Biosystems) with the following conditions: initial denaturation at 94°C for 5 min followed by 30 cycles at 94°C for 30 s, 55°C for 30 s, 72°C for 60 s and a final extension of 10 min at 72°C, and then held at 4°C. PCR products were separated on 2.0% w/v agarose TAE 1× gels containing 1× SYBR® Safe stain (Thermo Fisher Scientific). Segregation data for ms1 and ms1-associated markers obtained in the entire set of 108 (44 used in GBS + 64 used in AS-PCR) BC<sub>1</sub> progeny were used to build a new genetic linkage map for the chromosomal block carrying the target locus with JoinMap® v. 2.0 (Stam, 1993). The BC<sub>1</sub> population type option was adopted and the Kosambi mapping function was used to calculate genetic distances. The resulting map was drawn with MapChart v.2.3 (Voorrips, 2002).
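The principle of the allele-specific forward primers described above, in which each primer ends at its 3′ terminus on the polymorphic base, can be sketched as follows. The function name, the fixed primer length of 20 nt and the template sequence are hypothetical; the primers actually used are those listed in Table 1.

```python
from typing import Dict, Tuple


def allele_specific_forward_primers(template: str, snp_index: int,
                                    alleles: Tuple[str, str],
                                    length: int = 20) -> Dict[str, str]:
    """Return one forward primer per allele, each ending (3' end) on the SNP.

    `template` is the sense strand, `snp_index` the 0-based SNP position.
    The primers differ only at their last base."""
    start = snp_index - (length - 1)
    if start < 0:
        raise ValueError("not enough upstream sequence for this primer length")
    upstream = template[start:snp_index].upper()
    return {allele: upstream + allele.upper() for allele in alleles}


if __name__ == "__main__":
    # Hypothetical locus with an A/G SNP at position 20 (0-based)
    locus = "ACGTACGTACGTACGTACGTAGGCTTACGGATCC"
    for allele, primer in allele_specific_forward_primers(locus, 20, ("A", "G")).items():
        print(allele, primer)
```

Each of the two primers would then be paired, in a separate reaction, with the same locus-specific reverse primer.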

#### Micro-Synteny Comparison Between Lactuca sativa and Cichorium intybus Homologous Chromosomal Segments

Because the recently released high-quality Lactuca sativa genome assembly (Reyes-Chin-Wo et al., 2017) serves as a reference assembly for the whole Asteraceae family, the chicory contigs carrying the 13 SNPs associated with the ms1 locus were aligned against the lettuce genome using a BLASTN approach (E-value < 1E-20). Robust evidence of microsynteny between the two species made MYB103 worth investigating because of its possible functional involvement in male sterility (Zhang et al., 2007). Since contig\_119275 from the chicory genome draft showed the best match with the Lsat\_1\_v5\_gn\_5\_561 locus, annotated in L. sativa as MYB103, it was used to design three pairs of primers covering the entire coding DNA sequence (CDS, **Table 1**). Amplicons of four commercial male sterile accessions (namely D49, A89, B86, and D17) and four male fertile inbred lines (namely 2111, 202, 231, and 334) were Sanger-sequenced to detect polymorphisms between these two groups with contrasting microgametogenesis. Based on a four-nucleotide insertion observed in all male sterile accessions within the putative MYB103 gene, two additional pairs of primers were used in an AS-PCR assay to test the 64 genomic DNA samples (i.e., 32 male sterile and 32 male fertile plants) from the BC<sub>1</sub> population previously used to validate the ms1-associated SNPs (**Table 1**). In detail, two allele-specific forward primers were designed so that one amplified the wild type allele (without the insertion) while the other amplified only the mutated allele (with the insertion). Two reverse primers were designed within a conserved region of the second exon in order to produce amplicons of different sizes, distinguishable by standard agarose gel electrophoresis (i.e., MYB103\_MS and MYB103\_MF, **Table 1**). The two PCR reactions specifically designed for the wild type and the mutated alleles were performed separately for each DNA sample under the same conditions used for the previous AS-PCR analysis. The two PCR products of each individual were mixed and run together on a 2.0% w/v agarose TAE 1× gel containing 1× SYBR® Safe stain (Thermo Fisher Scientific).

## RESULTS

#### Molecular Mapping of the SSR and CAPS Markers in the Linkage Group Carrying the Male-Sterility Locus

The fine genetic mapping of three selected SSR markers (**Table 1**) was successfully completed, and the genetic recombination estimates were validated using 198 BC<sub>1</sub> individual plants of the segregating population on the basis of chi-square values against independent assortment patterns (**Supplementary Table S1**). The microsatellite markers M4.12 (EU02M09) and M4.11b (EU03H01) mapped 5.8 and 12.1 cM, respectively, from the male-sterility gene. An additional microsatellite marker, M4.10b (EU07G10), belonging to the same linkage group, was located downstream of the ms1 locus at a genetic distance of 31.2 cM (**Figures 2A,B**). The DNA sequences of the genomic regions containing these SSR markers were deposited in GenBank as accessions KX534081, KF880802, and JF748831. According to the nomenclature of Cadalen et al. (2010), the gene responsible for male-sterility was assigned to linkage group 4, as predicted by Barcaccia and Tiozzo Caenazzo (2012, 2014).
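A chi-square test against independent assortment, of the kind mentioned above, compares the observed numbers of parental and recombinant progeny with the 1:1 ratio expected for unlinked loci in a backcross. The sketch below uses hypothetical counts and is not the authors' script; for one degree of freedom the P-value is obtained from the complementary error function.

```python
import math
from typing import Tuple


def chi_square_independence(parental: int, recombinant: int) -> Tuple[float, float]:
    """Chi-square statistic (1 d.f.) and P-value for the 1:1 ratio of
    parental to recombinant classes expected under independent assortment."""
    n = parental + recombinant
    expected = n / 2.0
    chi2 = ((parental - expected) ** 2 + (recombinant - expected) ** 2) / expected
    # For 1 degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical example: 90 parental and 10 recombinant progeny
    chi2, p = chi_square_independence(90, 10)
    print(chi2, p)
```

A large statistic (small P-value) leads to rejection of independent assortment, i.e., the marker and the trait are considered linked.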

Sequence data published by Galla et al. (2016) on the genome draft of leaf chicory were used to discover, predict and annotate all genes of the genomic contigs encompassing the three molecular markers used for the SSR analysis. The sequence of M4.12 (Ghedina et al., 2015), the mapped marker closest to the ms1 locus, was found to match contig\_84164, which is 36,250 nucleotides long and putatively linked with gene models AT3G11330, AT4G03600, AT5G50170, and AT5G63130 (**Table 2**). It is worth noting that the first of these (AT3G11330) encodes PIRL9, a member of the Plant Intracellular Ras-group-related LRRs (leucine-rich repeat proteins), which is required for differentiation of microspores into pollen grains.

TABLE 1 | List of primer pairs used to amplify the mapped markers and the candidate gene MYB103 of linkage group 9 [corresponding to linkage group 4 of Cadalen et al. (2010)].

Cadalen et al., 2010; Ghedina et al., 2015; Barcaccia and Tiozzo Caenazzo, 2012, 2014; Present paper.

The name of the locus, the GenBank accession of the trait amplified, the polymorphism observed in ms mutants and the primer pairs are reported. SNPs marked with an asterisk were not tested with allele-specific primer combinations.

Concerning the candidate MADS-box L2/R21 gene (AF101420), the alignment of sequences of part of its 5′-UTR region, exon 1 and the early region of intron 1 recovered from both male-sterile mutants and wild-type plants made it possible to map the MADS-box locus on linkage group 4 (Cadalen et al., 2010) by means of a CAPS marker. In fact, three SNPs determining a restriction site were discovered in the amplified sequence of the first exon of the MADS-box gene. The cleavage site of the six-base cutter NcoI endonuclease was found to include a polymorphism at position 61 of the nucleotide sequence of the male fertile genotype when compared with the male sterile genotype (GenBank accessions KX455840 and KX455841). The amplification-restriction protocol for the detection of the CAPS marker alleles was applied to all 198 BC<sub>1</sub> individual plants of the mapping population. In particular, 14 individuals scored recombinant genotypes when compared with the male-sterile and male-fertile phenotypes, so that the MADS-box gene was mapped at a genetic distance of 7.3 cM from the ms1 locus (**Figures 2A,B**). A genomic sequence corresponding to contig\_95308, 8,023 nucleotides long, was found to match the sequence encompassing the CAPS marker; a BLASTX search against the TAIR database showed that it is annotated as AT4G24540, encoding a protein involved in flowering (**Table 2**).
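The logic of a CAPS marker based on NcoI, as used for the MADS-box locus, can be illustrated with a short sketch: an allele carrying an intact recognition site (CCATGG, cut after the first C) is cleaved into two fragments, whereas an allele in which a SNP disrupts the site remains uncut. The amplicon sequences below are hypothetical toy sequences, and which of the two chicory alleles carries the intact site is not assumed here.

```python
from typing import List

NCOI_SITE = "CCATGG"  # NcoI cuts after the first C (C^CATGG)


def ncoi_fragments(amplicon: str) -> List[int]:
    """Fragment lengths (bp) expected after complete NcoI digestion."""
    seq = amplicon.upper()
    cuts = []
    pos = seq.find(NCOI_SITE)
    while pos != -1:
        cuts.append(pos + 1)
        pos = seq.find(NCOI_SITE, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


def caps_band_pattern(allele_1: str, allele_2: str) -> List[int]:
    """Band sizes expected on a gel for a plant carrying the two alleles."""
    bands = set(ncoi_fragments(allele_1)) | set(ncoi_fragments(allele_2))
    return sorted(bands, reverse=True)


if __name__ == "__main__":
    cut_allele = "A" * 60 + "CCATGG" + "T" * 34    # hypothetical, intact site
    uncut_allele = "A" * 60 + "CCATAG" + "T" * 34  # hypothetical, site disrupted by a SNP
    print(caps_band_pattern(cut_allele, cut_allele))
    print(caps_band_pattern(cut_allele, uncut_allele))
```

In a backcross population, heterozygous plants would show the uncut band together with the two digestion fragments, while homozygous plants would show only one of the two patterns.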

#### The First SNP-Based Linkage Map of Chicory Through GBS

A GBS approach was applied to a subset of the BC<sub>1</sub> population, consisting of 22 male sterile and 22 male fertile plants, in order to build the first SNP-based linkage map in Cichorium spp. and to identify markers associated with the ms1 locus. The Illumina NextSeq 500 v2 platform produced 419,884,246 raw reads. After quality and adapter trimming, we obtained 339,081,743 reads that were used to create a reference catalog of 1,192,451 consensus loci. A raw pool of 16,353 SNPs was identified using Freebayes v1.0.2-16. After removal of (1) SNPs with more than 30% of missing data, (2) SNPs with a sequence depth ≤ 8×, (3) tri- and tetra-allelic SNPs, and (4) SNPs with allele frequencies across all samples ≤5% or ≥95%, 1,995 SNPs were retained for the construction of a genetic linkage map. A total of 727 SNPs clustered and mapped into 9 linkage groups spanning a total length of 1,413 cM (**Figure 3** and **Supplementary Table S2**). Each mapped SNP-carrying read was used to anchor the first genome draft of leaf chicory by conducting a default BLASTN analysis. A total of 688 out of 727 genetically mapped reads (95%) strongly matched (similarity >90%, E-value < 1E-50) sequences in at least one genomic contig. Considering top hits only, 3.7 Mb (0.3% of the whole genome) of the chicory genome sequence was anchored. Among the mapped contigs, 18.6% aligned with expressed regions from the TAIR database (BLASTX, E-value < 1E-5). This made it possible to organize 128 coding regions over the 9 linkage groups (**Supplementary Table S3**). Among them, it was possible to identify 46 different enzymatic proteins, 8 membrane proteins and 4 transcription factors.

locus. (D) Chromosomal region around the ms1 locus, considering the recombinant data of the aforesaid 13 SNPs and a total number of 108 BC<sub>1</sub> samples.

TABLE 2 | List of markers that, aligning against contigs of the first genome draft (Galla et al., 2016), were functionally annotated on the basis of matches with the Arabidopsis protein database (TAIR10).

Cadalen et al., 2010; Ghedina et al., 2015; Barcaccia and Tiozzo Caenazzo, 2012, 2014; Present paper; <sup>∗</sup>For T4402, the best match with the NR database is reported; na, not available; nd, no significant match detected.

Localization of the marker within the genic region is also reported.
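The four SNP filtering criteria reported for the GBS data in this section can be expressed compactly as a single predicate. The sketch below is not the customized Perl script used in the study; the record fields, their names and the example values are assumptions introduced for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SnpRecord:
    """Summary statistics for one SNP (hypothetical field names)."""
    name: str
    n_alleles: int           # number of distinct alleles observed
    missing_fraction: float  # fraction of samples without a genotype call
    depth: float             # sequence depth supporting the call
    allele_freq: float       # allele frequency across all samples


def passes_filters(snp: SnpRecord, max_missing: float = 0.30,
                   min_depth: float = 8.0, min_af: float = 0.05,
                   max_af: float = 0.95) -> bool:
    """True if the SNP survives the four removal criteria."""
    return (snp.missing_fraction <= max_missing   # (1) at most 30% missing data
            and snp.depth > min_depth             # (2) depth greater than 8x
            and snp.n_alleles == 2                # (3) biallelic only
            and min_af < snp.allele_freq < max_af)  # (4) 5% < frequency < 95%


def filter_snps(snps: List[SnpRecord]) -> List[SnpRecord]:
    return [s for s in snps if passes_filters(s)]


if __name__ == "__main__":
    # Hypothetical records
    raw = [
        SnpRecord("snp_a", 2, 0.10, 20, 0.40),
        SnpRecord("snp_b", 3, 0.00, 30, 0.30),
        SnpRecord("snp_c", 2, 0.35, 30, 0.30),
    ]
    print([s.name for s in filter_snps(raw)])
```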

The ms1 locus, along with the M4.10b and M4.12 SSR markers, mapped to linkage group 9 of our genetic map (**Figures 2C**, **3**), allowing us to associate it with linkage group 4 of Cadalen et al. (2010) (see also **Figure 2A**). In particular, marker M4.12 was linked to the target gene and mapped at 7.8 cM from the ms1 locus (**Figure 2C**). Thirteen SNPs exhibited ≤3 recombination events with the target ms1 locus, seven of which (T11292, T4393, T4402, T4399, T4401, T4390, and T4395) co-segregated with ms1 in the population of 44 BC<sub>1</sub> progeny (**Figure 2C**). The GBS reads corresponding to these 13 markers have been deposited in GenBank under accession numbers KX789069–KX789080 and KX789082. Ten of the chicory contigs carrying the mapped SNPs showed a significant match (E-value < 1E-5) with the TAIR database (**Table 2** and **Supplementary Table S3**).

Since a recent study located the SSI locus (Gonthier et al., 2013) in linkage group 2 of Cadalen et al. (2010), two SSR markers from this group (namely M2.6 and M2.4) were used to genotype the BC<sub>1</sub> samples employed for the GBS strategy. This analysis allowed us to associate the S-locus of leaf chicory with our linkage group 5, and the genetic distance between the two SSR markers was 7.9 cM (**Figure 3**).

E-value < 1E-5). The correspondence between each underlined tag and the best TAIR match is reported in Supplementary Table S3. The male-sterility locus (ms1) was assessed by recording the target locus as a putative gene fully co-segregating with the trait. Four SSR markers, M4.10b, M4.12 [from linkage group 4 of Cadalen et al. (2010)], M2.6 and M2.4 [from linkage group 2 of Cadalen et al. (2010)], were used to genotype the same samples employed for the SNP-based map and were integrated into the linkage map. M4.10b and M4.12 were chosen because they are linked to ms1 (Barcaccia and Tiozzo Caenazzo, 2012, 2014), whereas M2.6 and M2.4 were selected because they were found to be associated with the sporophytic self-incompatibility (SSI) locus by Gonthier et al. (2013).

#### Validation of the SNPs Linked to the ms1 Locus

The map positions of the 13 male sterility-associated SNPs were validated by analyzing an additional 64 BC<sub>1</sub> progeny (32 male sterile and 32 male fertile). This brought the number of BC<sub>1</sub> progeny analyzed to 108, including the initial pool of 44 BC<sub>1</sub> samples analyzed by GBS. For each of the SNP markers, two pairs of primers targeting the two alleles were used in separate reactions. Because BC<sub>1</sub> progeny are either heterozygous or homozygous for the recurrent parent allele, the primer set that amplified the recurrent parent allele generated amplification products in all BC<sub>1</sub> progeny and hence acted as positive control. The other primer set amplified the alternate allele which was present only in heterozygous progeny (**Supplementary Figure S1**). Linkage analysis across the 108 progeny showed that all 13 SNPs mapped to a 9.9 cM region on linkage group 9 [corresponding to linkage group 4 according to Cadalen et al. (2010), **Figure 2D**]. Four SNPs (T11292, T4393, T4401, and T4402) co-segregated with the ms1 locus (**Figure 2D** and **Supplementary Table S4**).
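The scoring logic of the two AS-PCR reactions described above can be sketched as follows: the recurrent-parent reaction acts as a positive control, and the presence of the alternate-allele band identifies heterozygous progeny. The function names, the genotype labels and the assumption about which phenotype the heterozygous class is expected to show (passed as a parameter) are illustrative only.

```python
from typing import List


def call_bc1_genotype(recurrent_band: bool, alternate_band: bool) -> str:
    """Score one BC1 plant from the two allele-specific reactions."""
    if not recurrent_band:
        # The recurrent-parent allele is expected in every BC1 plant,
        # so a missing band indicates a failed reaction.
        return "failed"
    return "heterozygous" if alternate_band else "homozygous_recurrent"


def count_recombinants(marker_calls: List[str], phenotypes: List[str],
                       het_phenotype: str = "fertile") -> int:
    """Count plants whose marker genotype disagrees with the phenotype.

    `het_phenotype` is the phenotype assumed for heterozygous plants
    (an assumption of this sketch); failed reactions are ignored."""
    other = "sterile" if het_phenotype == "fertile" else "fertile"
    expected = {"heterozygous": het_phenotype, "homozygous_recurrent": other}
    return sum(1 for call, pheno in zip(marker_calls, phenotypes)
               if call in expected and expected[call] != pheno)


if __name__ == "__main__":
    # Hypothetical band scores (recurrent, alternate) for four plants
    calls = [call_bc1_genotype(r, a)
             for r, a in [(True, True), (True, False), (True, True), (False, True)]]
    print(calls, count_recombinants(calls, ["fertile", "sterile", "sterile", "fertile"]))
```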

The GBS reads carrying the SNP markers T11292 (KX789069) and T4401 (KX789077) aligned against contig\_55191 and contig\_71514 in the chicory draft genome assembly (Galla et al., 2016). These contigs had significant matches with gene models AT3G61700 (contig\_55191) and AT3G61420 (contig\_71514) (**Table 2** and **Supplementary Table S3**), which encode, respectively, a helicase with zinc-finger protein and a BSD domain (BTF2-like transcription factors, synapse-associated proteins and DOS2-like proteins). The SNPs were located downstream from the coding region. The read carrying SNP T4402 (KX789078) aligned with contig\_2496, but this sequence did not show any significant BLAST hits with proteins in the TAIR database. Extending the BLASTX analyses to NCBI's NR database, contig\_2496 was found to match significantly (E-value = 1E-43, similarity 44%) with a locus from Helianthus annuus (Asteraceae family) encoding a transposase of the MuDR family (XP\_022024718). Moreover, the discriminant SNP was non-synonymous and located in the zinc-finger domain (ZNF\_PMZ) of the protein. The read carrying SNP T4393 (KX789073) did not show any significant match with either the TAIR or the NR database.

#### Micro-Synteny Relationships Between L. sativa and C. intybus Narrow Chromosomal Segments Enabled the Isolation of MYB103 as Candidate Gene for Male-Sterility

A BLASTN search against the lettuce genome showed that 10 of the 13 contigs carrying the ms1-associated SNPs aligned (E-value < 1E-20) with as many loci of L. sativa, all located in a peripheral region of ∼18 Mbp of chromosome 5 (**Figure 4A**). A transcription factor of the MYB family, known as MYB103 and functionally required for anther development in Arabidopsis thaliana (Zhang et al., 2007), mapped to the same chromosomal region of L. sativa. In a new BLASTN alignment against the lettuce genome, contig\_119275 showed 92% similarity (E-value = 0.0) with the MYB103 locus of L. sativa (namely Lsat\_1\_v5\_gn\_5\_561). Since a reciprocal BLASTN search using the chicory genome draft as database and Lsat\_1\_v5\_gn\_5\_561 as query confirmed the match (91% similarity, E-value = 0.0), the CDS region entirely included within contig\_119275 was considered the putative ortholog of MYB103 in chicory. The DNA segment carrying the two exons of the gene as well as the intron between them was amplified by target-sequence PCR and Sanger-sequenced in eight chicory accessions (four male sterile accessions, namely D49, A89, B86, and D17, and four male fertile inbred lines, namely 2111, 202, 231, and 334) using a strategy of overlapping primer walking. The multiple sequence alignment revealed an insertion of four nucleotides (TTAA) at position 1497 of the contig, within the second exon, in the male sterile individuals (**Figure 4B**). Moreover, based on the AS-PCR assay (**Figure 4C**) performed using 64 BC<sub>1</sub> accessions (32 male sterile and 32 male fertile), MYB103 was found to fully co-segregate with ms1 and no recombinants were detected. Sequence data of the entire MYB103 gene were deposited in GenBank under accession nos. MK285053–MK285054.
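The reciprocal search used to support orthology follows the reciprocal-best-hit criterion: the best hit of the chicory contig in lettuce must, in turn, retrieve the same chicory contig as its best hit. A minimal sketch is given below, assuming that the best hit of each query in each direction has already been extracted from the BLASTN output; the function name and the dictionary layout are hypothetical.

```python
from typing import Dict


def is_reciprocal_best_hit(forward_best: Dict[str, str],
                           reverse_best: Dict[str, str],
                           query: str) -> bool:
    """True if `query` and its best hit retrieve each other.

    forward_best maps each query (e.g., a chicory contig) to its best
    subject in the other genome; reverse_best maps in the other direction."""
    subject = forward_best.get(query)
    return subject is not None and reverse_best.get(subject) == query


if __name__ == "__main__":
    # Identifiers taken from the text; the dictionaries are illustrative
    chicory_vs_lettuce = {"contig_119275": "Lsat_1_v5_gn_5_561"}
    lettuce_vs_chicory = {"Lsat_1_v5_gn_5_561": "contig_119275"}
    print(is_reciprocal_best_hit(chicory_vs_lettuce, lettuce_vs_chicory, "contig_119275"))
```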

## DISCUSSION

To the best of our knowledge, this is the first time that male-sterile mutants have been genetically characterized in leaf chicory (C. intybus subsp. intybus var. foliosum). The main goals of our study were to build the first SNP-based linkage map in this species and to accomplish the molecular mapping of the male-sterility trait.

A thorough understanding of nuclear male-sterility systems is extremely important for their exploitation in plant breeding, as male-sterility is one of the most effective methods to produce F<sub>1</sub> hybrid varieties in crop plants (Rajeshwari et al., 1994; Havey, 2004; Barclay, 2010; Acquaah, 2012). F<sub>1</sub> hybrids are usually developed by crossing two highly homozygous parental lines, selected to obtain highly heterozygous progeny, which are characterized not only by high uniformity of phenotypic traits, but also by strong heterosis in terms of productivity.

A recent cytological study of a naturally occurring male-sterile mutant in leaf chicory has shown that microsporogenesis proceeds regularly up to the development of tetrads (Barcaccia and Tiozzo Caenazzo, 2012, 2014). After that, all microspores arrest their developmental program. At the beginning of microgametogenesis, non-viable shrunken microspores were clearly visible within anthers. Moreover, detailed investigations indicated the occurrence of meiotic abnormalities in the male-sterile mutants, especially at prophase I. Abnormal pairings and chromosomal loops were observed during pachytene. This new mutant, whose male-sterility is caused by a recessive nuclear gene, has been applied in the production of F<sub>1</sub> hybrids of Radicchio, and has recently been patented (Barcaccia and Tiozzo Caenazzo, 2012, 2014). However, beyond the fact that the NMS was discovered and mapped in root chicory (Desprez et al., 1994; Gonthier et al., 2013) and leaf chicory (Barcaccia and Tiozzo Caenazzo, 2012, 2014) within linkage groups 5 and 4, respectively, no genetic information is available about this locus.

Exploiting an SSR-based approach, the gene responsible for male-sterility in leaf chicory was found to be genetically linked to the genomic locus M4.12 (JF748831), an AFLP-derived amplicon encompassing an SSR region, about 5.8 cM from the ms1 locus. Two additional SSR markers, corresponding to genomic loci M4.10b (KX534081) and M4.11b (KF880802), were mapped in the same linkage group at genetic distances of 31.2 and 12.1 cM, respectively, from the ms1 locus. Although mapped on a chromosomal region surrounding the gene of interest, the three discovered SSR markers were found to be only loosely linked to the ms1 gene. Considering the functional annotation of the three SSR-containing genomic sequences, E02M09, the closest marker to the ms1 locus, was found to be putatively linked with a TAIR gene (AT3G11330, PIRL9 protein) required for differentiation of microspores into pollen grains (Forsthoefel et al., 2013), which is precisely the process described as disrupted in our ms1 mutants based on cytological observations (Barcaccia and Tiozzo Caenazzo, 2012, 2014). However, the relatively high number of recombinants detected in the BC<sub>1</sub> population, supporting a genetic distance of 5.8 cM from the ms1 locus, proved that it could not be responsible for male-sterility in leaf chicory.

The diagnostic CAPS marker derived from the MADS-box L2/R2 gene (KX455840–KX455841) was found to be genetically linked to the ms1 locus at a distance of 7.3 cM. This marker encompasses a genomic block that includes a MADS-box protein (AT4G24540) and a protein that acts as a floral repressor (AT4G22540). Functional analyses by molecular genetic studies in model eudicots, such as A. thaliana L., have shown that the proteins encoded by these two genes are both essential for the regulation of various aspects of flower development (Yamaguchi and Hirano, 2006), but no information about their involvement in the male-sterility mechanism is available. Moreover, the relatively high number of recombinants detected in the BC<sub>1</sub> population raises some doubts regarding its role in pollen development in leaf chicory.

A genotyping-by-sequencing approach allowed us to construct the first SNP-based linkage map and to narrow down the genomic window around the ms1 locus in leaf chicory. A total of 727 SNP-carrying reads were clustered into 9 linkage groups. The first genome draft of leaf chicory was then used to anchor 688 contigs and 128 coding regions to the genetic map. Among the enzymes mapped, it was possible to identify proteins involved in the biosynthesis of N-glycan (AT5G19690), cutin, suberin and wax (AT5G55340), diterpenoid (AT5G25900) and amino acids, including valine, leucine and isoleucine (AT1G31180). Other enzymes were specifically involved in metabolic processes such as glycine, serine and threonine metabolism (AT4G29840), glyoxylate and dicarboxylate metabolism (AT2G05710), inositol phosphate metabolism (AT5G42810) and ascorbate and aldarate metabolism (AT5G56490). Two different proteins, VAMP713 (AT5G11150) and VAMP714 (AT5G22360), with a key role in vacuolar trafficking during salt stress (Leshem et al., 2006), were mapped in linkage groups 7 and 8, respectively. Finally, a noteworthy SNP marker mapped in linkage group 9 was associated with a genomic contig that, in turn, matched an ERF BUD ENHANCER in Arabidopsis (EBE, AT5G61890). This transcription factor, a member of the APETALA2/ETHYLENE RESPONSE FACTOR (AP2/ERF) transcription factor superfamily, was found to promote cell proliferation, leading to enhanced callus growth, to stimulate axillary bud formation and outgrowth, and to affect shoot branching, acting in cell cycle regulation and dormancy breaking (Mehrnia et al., 2013).

Curiously, 22 mapped contigs matched mitochondrial genes of Arabidopsis, in particular ATMG00300 (14 contigs) and ATMG00750 (3 contigs). In the TAIR database they are annotated as 'Gag-Pol-related retrotransposon family protein' and 'Gag-Pol-Env polyprotein,' respectively. These two classes of mitochondrial retrotransposons have been detected in multiple copies throughout the nuclear genome of several species. In this regard, according to the GenBank database, at least three ATMG00300-like copies are located within chromosomes 3, 7, and 15 of Malus domestica as well as in linkage groups 2, 5, and 6 of Glycine max. In A. thaliana, the same locus was also found within chromosome 2. This is in accordance with the large and unexpected organellar-to-nuclear gene-transfer events highlighted in species like rice (Ueda et al., 2005) and Arabidopsis (Lin et al., 1999). Conversely, we cannot exclude the opposite situation, in which transposable elements were originally transferred from the nuclear genome to the mitochondrial one, as highlighted in Malus × domestica (Goremykin et al., 2012). However, the abundance of retroelement sequences identified within the chicory map (e.g., 24 out of 128 coding regions were retrotransposon family proteins) is consistent with what has already been reported in other species like rice or maize, where retroelements represent, respectively, 14% (Sasaki et al., 2002) and 49% (Meyers et al., 2001) of the whole genome.

Two SSR markers (M2.6 and M2.4), mapped on linkage group 2 by Cadalen et al. (2010), which is known to carry the SSI locus (Gonthier et al., 2013) in root chicory, were used to genotype the 44 samples employed for the GBS. This analysis made it possible to associate the self-incompatibility locus with linkage group 5 of leaf chicory. The genetic distance between these two mapped SSR markers was 7.9 cM, whereas they were 14.9 cM apart in the genetic map developed by Cadalen et al. (2010).

Recording the target ms1 locus as a putative gene fully co-segregating with the trait, and using two SSR markers [i.e., M4.10b and M4.12, according to Ghedina et al. (2015)] linked to ms1, made it possible to overlap our linkage group 9 with linkage group 4 of Cadalen et al. (2010) (see **Figures 2A,B**). At least 13 SNPs were found to be tightly linked to the target locus, exhibiting three or fewer recombinants (**Figure 2C**). In particular, seven out of 13 SNPs (i.e., T11292, T4393, T4402, T4399, T4401, T4390, and T4395) did not show recombinants. To increase the robustness of this finding, an AS-PCR assay was developed focusing on these 13 loci and increasing the number of BC<sub>1</sub> samples assayed up to 108. This new round of analysis proved to be fast, cheap and highly efficient and, most importantly, allowed us to finely map all the SNPs in a chromosomal DNA region spanning 9.9 cM (**Figure 2D**) in length, both upstream and downstream of the ms1 locus. The allelic variants of four SNPs still fully co-segregated with the target trait, and the corresponding reads were mapped at 0 cM from ms1. Interestingly, two of these genomic sequences, namely T11292 (KX789069) and T4401 (KX789077), retrieved a significant match with two TAIR gene models: a helicase with zinc-finger protein (AT3G61700) and a BSD domain (BTF2-like transcription factors, synapse-associated proteins and DOS2-like proteins, AT3G61420) characterizing the RNA polymerase II transcription factor B. The fact that these two genes are also closely associated on chromosome 3 of A. thaliana (∼100 kb from each other) strengthens the possibility that they may interact synergistically. Moreover, according to Honys and Twell (2004), both genes are differentially expressed during microgametogenesis. Nevertheless, no information about their involvement in the male-sterility mechanism is available, and the position of the two SNPs downstream of the coding regions raises some doubts regarding their role in pollen development.

Patterns of conserved synteny among the genomes of different organisms play a central role in molecular biology (Veltri et al., 2016). In this regard, the 13 contigs carrying the SNPs associated with the ms1 locus were aligned against the lettuce genome, currently considered the reference assembly for the Asteraceae family (Reyes-Chin-Wo et al., 2017). Ten of these contigs significantly matched as many loci mapping in a small peripheral region representing 5.5% of chromosome 5 of L. sativa. It is worth noting that the 10 chicory contigs spanned a chromosomal region of 9.9 cM, again constituting 5.5% of linkage group 4 of C. intybus. Considering the micro-mesosynteny observed between these two species (defined as the conservation of gene content but not of gene order or orientation; Ohm et al., 2012), the transcription factor MYB103 stood out as the most interesting locus to investigate. In fact, according to Zhang et al. (2007), a mutation in the first exon of MYB103 in A. thaliana is responsible for a male sterile phenotype. In particular, tapetum development and callose dissolution were altered in MYB103-defective plants. Moreover, most of the microspores in mature anthers were degraded and surviving microspores lacked exine. This is fully in agreement with the cytological findings reported in chicory by Barcaccia and Tiozzo Caenazzo (2012, 2014) as well as in other crops like wheat, rice, canola, and cotton (Phan et al., 2012). A BLASTN approach made it possible to identify the ortholog of MYB103 in chicory, and the nucleotide similarity with the same gene in lettuce was very high (91%, E-value = 0.0). In a BLASTX alignment, the amino acid similarity between the putative MYB103 of chicory and the orthologous gene in A. thaliana was also very high (90%, E-value = 1E-84). The sequence conservation observed among these three species is consistent with the conservation of gene function already reported by Phan et al. (2012). Preliminary analyses carried out on eight unrelated accessions (four male sterile and four male fertile inbred samples) highlighted, within the male sterile group, the occurrence of a four-nucleotide insertion in the second exon of the putative CDS (see **Figure 4**). The insertion introduced a premature stop codon in the MYB103 coding sequence, and the protein predicted in silico was 123 amino acids shorter (i.e., 198 amino acids long for the mutants rather than 321 for the wild types). Zhang et al. (2007) highlighted that mutations of MYB103 negatively affect the expression of both A6, a putative β-1,3-glucanase involved in callose dissolution, and MS2, a fatty acid reductase putatively involved in sporopollenin synthesis and therefore in exine formation. Our hypothesis is that a truncated variant of MYB103, lacking more than a third of the amino acid sequence, may affect the process of pollen development in leaf chicory. As a matter of fact, we demonstrated that MYB103 fully co-segregates with ms1, further corroborating the evidence that it could be responsible for male sterility in C. intybus.
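The way a four-nucleotide insertion can truncate a predicted protein is illustrated below with a toy example: because four is not a multiple of three, the insertion shifts the reading frame, and a stop codon is soon encountered in the new frame. The coding sequence used here is an invented 27-nt sequence, not the chicory MYB103 CDS, and the insertion point is arbitrary; only the standard genetic code is assumed.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate(cds: str) -> str:
    """Translate a CDS from its first base up to the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def insert_bases(cds: str, position: int, bases: str) -> str:
    """Return the CDS with `bases` inserted before 0-based `position`."""
    return cds[:position] + bases + cds[position:]


if __name__ == "__main__":
    wild_type = "ATGGCTGAAGGTCTTAAACGTTGGTAA"     # hypothetical toy CDS
    mutant = insert_bases(wild_type, 9, "TTAA")  # four-nucleotide insertion
    print(translate(wild_type), translate(mutant))
```

In the toy example the mutant protein shares only its first residues with the wild type and terminates early, which is the same qualitative effect predicted in silico for the chicory insertion.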

## CONCLUSION

The male-sterility gene (ms1) of leaf chicory was first confined to a chromosomal region spanning 5.8 cM through an SSR-based approach. The construction of a SNP-based linkage map and the application of an AS-PCR assay made it possible to narrow down the genomic window around the target locus and to select SNPs mapped at 0 cM from the ms1 locus. Moreover, the newly developed genetic linkage map combined with the micro-mesosynteny analysis made it possible to identify MYB103, a gene encoding a transcription factor of the MYB family that fully co-segregates with the ms1 locus and could be considered a primary candidate gene for male-sterility.

#### DATA AVAILABILITY

The datasets generated for this study can be found in GenBank (Accession Numbers: KX789069, KX789070, KX789071, KX789072, KX789073, KX789074, KX789075, KX789076, KX789077, KX789078, KX789079, KX789080, KX789082, MK285053, and MK285054).

#### AUTHOR CONTRIBUTIONS

FP and GB designed the research. FP and VP conducted and controlled the experiments. PQ carried out the bioinformatics analyses. FP and KD analyzed data. FP, GB, and KD wrote the manuscript. All authors contributed to editing the manuscript.

#### FUNDING

This project was financially supported by T&T Srl Agricola, Rosolina (Rovigo, Italy), within a Research Contract stipulated with DAFNAE, University of Padova (Italy), and titled "Development of new radicchio F1 hybrid varieties using molecular marker-assisted breeding methods".

#### ACKNOWLEDGMENTS

We wish to thank Mr. Roberto Tencani, CEO of Blumen Group SpA (Bologna, Italy), for kindly allowing the use of data related to the European Patent EP2713705-B1 "Cichorium spp. male-sterile mutants", Intl. application no. PCT/EP2011/058765, for scientific purposes. This research was carried out in partial fulfillment of the Ph.D. Program of FP by taking advantage of the Doctoral Research Fellowship funded by the University of Padova (Italy). VP, who worked for 6 months at the Laboratory of Genomics for Plant Breeding, University of Padova (Italy), received financial support from CAPES/PSDE, Brazil (Process No. 88881.132586/2016-01). We also wish to thank Drs. Alice Patella and Mariano Giannone, Blumen Group SpA (Italy), for their assistance with breeding materials of leaf chicory, and Dr. Ibrahim Hmmam, Faculty of Agriculture, University of Cairo (Egypt), for his invaluable technical help with some of the preliminary SNP-specific PCR experiments performed during December 2016 at the Laboratory of Genomics for Plant Breeding, University of Padova (Italy).

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00276/full#supplementary-material

#### REFERENCES

Barclay, A. (2010). Hybridizing the world. Rice Today 9, 32–35.

Janssen, A. W. B., and Hermsen, J. G. T. (1976). Estimating pollen fertility in Solanum species and haploids. Euphytica 25, 577–586. doi: 10.1007/BF00041595


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Palumbo, Qi, Pinto, Devos and Barcaccia. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Genomics of Flower Identity in Grapevine (Vitis vinifera L.)

Fabio Palumbo†, Alessandro Vannozzi\*†, Gabriele Magon, Margherita Lucchin and Gianni Barcaccia

Department of Agronomy, Food, Natural Resources, Animals, and Environment, University of Padua, Legnaro, Italy

The identity of the four characteristic whorls of typical eudicots, namely, sepals, petals, stamens, and carpels, is specified by the overlapping action of homeotic genes, whose single and combined contributions have been described in detail in the so-called ABCDE model. Through continuous species-specific refinements and translations, this model has provided the basis for understanding the genetic and molecular mechanisms of flower development in model organisms, such as Arabidopsis thaliana, and in other main plant species. Although grapevine (Vitis vinifera L.) represents an extremely important cultivated fruit crop globally, studies related to the genetic determinism of flower development are still rare, probably because of the limited interest in sexual reproduction in a plant that is predominantly propagated asexually. Nonetheless, several studies have identified and functionally characterized some ABCDE orthologs in grapevine. The present study is intended to provide a comprehensive snapshot of the transcriptional behavior of 18 representative grapevine ABCDE genes encoding MADS-box transcription factors over a developmental time course, from preanthesis to the postfertilization stage, and in different flower organs, namely, the calyx, calyptra, anthers, filaments, ovary, and embryos. The transcript levels found were compared with the proposed model for Arabidopsis to evaluate their biological consistency. With a few exceptions, the results confirmed the expression pattern expected based on the Arabidopsis data.

Keywords: ABCDE genes, whorls, MADS, blooming, anthesis

#### INTRODUCTION

Grapevine (Vitis vinifera L.) represents one of the most cultivated fruit crops on a global scale, with a production reaching approximately 75 million tons of berries and covering approximately 7.5 million hectares (OIV, 2016). Based on statistics from the OIV, wine is the main product of viticulture (68%), followed by fresh grapes for consumption (30%), raisins (1.8%), and minor products, such as juices, jellies, ethanol, vinegar, grape seed oil, tartaric acid, and fertilizers (0.2%). Considering the high economic value of global viticulture products, the lack of information and studies related to the genetic control of flower development and, more widely, to grapevine reproduction is quite surprising. In reality, the relatively limited interest of the scientific and producer communities in this issue is mainly a consequence of the static nature of viticulture, at least as far as the "old world" is concerned. The European and, on a larger scale, the global wine industry is mainly focused on a few major cultivars, markedly restricting the range of genetic solutions for yield and quality improvement and relying almost exclusively on the improvement

Edited by:

Sergio Lanteri, University of Turin, Italy

#### Reviewed by:

David Smyth, Monash University, Australia; Osvaldo Failla, University of Milan, Italy

\*Correspondence:

Alessandro Vannozzi alessandro.vannozzi@unipd.it

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 15 December 2018; Accepted: 27 February 2019; Published: 21 March 2019

#### Citation:

Palumbo F, Vannozzi A, Magon G, Lucchin M and Barcaccia G (2019) Genomics of Flower Identity in Grapevine (Vitis vinifera L.). Front. Plant Sci. 10:316. doi: 10.3389/fpls.2019.00316


and optimization of management and oenological techniques. Moreover, although novel varieties have been introduced to the market in recent years, as exemplified by the first 10 disease-resistant grapevines produced in Italy by the University of Udine and the Institute of Applied Genomics (IGA) or by the PIWI varieties (pilzwiderstandsfähig, or fungus-resistant grape varieties) obtained by crossing European grape varieties with American or Asian fungus-resistant varietals, there is still a strong conservatism in viticulture, which is predominantly based on the use of clonally propagated traditional varieties.

Despite this, a complete understanding of the genetic mechanisms underlying the flowering process remains of primary importance since it profoundly affects winemaking and grapevine production. In fact, each stage of flower formation is critical for the development of the resulting population of berries. Regarding the anatomy of the Vitis spp. reproductive system, wild V. vinifera, along with some American and Asian species, is dioecious, with either male or female flowers, whereas all the main varieties employed for grape and wine production have inflorescences characterized by hermaphroditic flowers (Battilana et al., 2013). In this latter case, the conical panicle-shaped grapevine inflorescence is characterized by three levels of branching along the rachis, and triplets of flowers (dichasium) represent the last level of this complex structure. From the outside, each flower is characterized by four concentric whorls: sepals, petals, stamens, and carpels. The calyx represents the outermost ring and is formed by five sepals, while the calyptra or cap is a modified epidermal tissue composed of five joined petals. Although both structures play a protective role toward the inner reproductive whorls, the calyx is a permanent layer, while the cap is released when the pollen is mature. The androecium is organized in five stamens that, in turn, are each composed of a long filament ending with a bilocular anther. The anther, containing four pollen sacs, consists of three layers: the tapetum, endothecium, and epidermis. Finally, the gynoecium (or pistil) is the innermost layer. The three major components of the pistil are the stigma, responsible for pollen reception; the style, through which the pollen tube grows; and the ovary, where four ovules are protected and compartmentalized in two locules (Carmona et al., 2008; Vasconcelos et al., 2009).

The specification of such floral organs is controlled by a complex genetic regulatory network that acts in a coordinated way through a set of promotive and antagonistic interactions (Vasconcelos et al., 2009). This multitude of processes is summarized in the "ABC model" (Coen and Meyerowitz, 1991), which links the overlapping expression patterns of homeotic genes to specific structures that are arranged in the four aforementioned whorls (Weigel and Meyerowitz, 1994). This model initially included only the homeotic genes of classes A, B, and C, but later it was extended to also include genes belonging to classes D and E (Jordan, 2006). These genes encode MADS-domain transcription factors and, in one case, an AP2 TF (Irish, 2017), and their specific interactions lead to the differentiation of each flower whorl. In more detail, the A-class genes specify the identity of the sepals (first whorl) when expressed alone and of the petals (second whorl) when expressed in combination with the B-class genes (Jack, 2004). Moreover, they were found to repress the C-class genes in these whorls (Coito et al., 2018). The C-class genes alone specify the carpel identity, whereas in combination with some B-genes, they specify the stamen identity (Coen and Meyerowitz, 1991). It was demonstrated that the C-class genes repress the A-class genes in the third and fourth whorls; thus, A- and C-gene activities are mutually repressive (Jack, 2004). The D-class genes, together with some C-class genes, are involved in ovule identity specification within the carpel (Favaro et al., 2003; Carmona et al., 2008; Vasconcelos et al., 2009). The discovery of the importance of some genes later grouped in the E-class led to a revision of the ABC model (Honma and Goto, 2001; Theißen, 2001): these genes, discovered and characterized only recently due to genetic redundancy and overlapping functionality (Vasconcelos et al., 2009), are expressed in all floral whorls (Boss et al., 2002) and seem to function redundantly, forming complexes with A, B, C, and D proteins (Vandenbussche et al., 2003; Castillejo et al., 2005).
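The combinatorial logic of the model described above can be summarized as a small lookup table. The following Python sketch is only an illustration of the classic model as stated in the text: the data layout and the names `ORGAN_BY_CLASSES`, `organ_identity`, and `respects_mutual_repression` are assumptions introduced here, and the E-class is included in every combination simply because these genes are described as being expressed in all floral whorls.

```python
# Illustrative encoding of the ABCDE model as summarized in the text.
# The data layout and all names below are hypothetical and are not
# taken from the original study.

ORGAN_BY_CLASSES = {
    frozenset("AE"): "sepal",    # A alone (with E) -> first whorl
    frozenset("ABE"): "petal",   # A + B -> second whorl
    frozenset("BCE"): "stamen",  # B + C -> third whorl
    frozenset("CE"): "carpel",   # C alone -> fourth whorl
    frozenset("CDE"): "ovule",   # C + D -> ovule within the carpel
}


def organ_identity(active_classes):
    """Return the organ specified by a set of active gene classes, or None."""
    return ORGAN_BY_CLASSES.get(frozenset(active_classes))


def respects_mutual_repression(active_classes):
    """A- and C-class activities are mutually repressive, so they should not co-occur."""
    return not ({"A", "C"} <= set(active_classes))


if __name__ == "__main__":
    for combo in ("AE", "ABE", "BCE", "CE", "CDE", "ACE"):
        print(combo, organ_identity(combo), respects_mutual_repression(combo))
```

A combination such as A + C + E has no entry in the table and fails the mutual-repression check, which mirrors the statement that A- and C-class activities exclude each other.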

In the last 15 years, the molecular and genetic bases of floral development have been mainly investigated through studies on three dicots: Antirrhinum majus, Petunia hybrida, and Arabidopsis thaliana (Jack, 2004). Studies on Arabidopsis, in particular, contributed so substantially to this research that its floral development model has since been translated to a wide range of plant species with agro-economic importance (Coen and Meyerowitz, 1991; Fornara et al., 2003). Compared to the information available for the ABCDE model in Arabidopsis, data on the genetic and molecular processes involved in the grapevine reproductive phase remain limited (Carmona et al., 2008). Only a few genes have been functionally characterized, and based on their expression pattern, they were associated with certain stages of development and specific processes. Nonetheless, the grapevine ontogenetic mechanisms of organ formation and development are quite different with respect to other annual herbaceous or woody polycarpic plants (Carmona et al., 2007a,b), making it an extremely interesting system for the study of specific aspects of plant reproductive development.

Taking advantage of a recent reclassification of the grapevine MADS-box gene family performed by Grimplet et al. (2016), we took a transversal approach to the genes involved in the ABCDE model from a transcriptional point of view, focusing on the V. vinifera cv Pinot noir genotype, characterized by monoecious plants and hermaphroditic flowers. In particular, we evaluated the expression of 16 grapevine MADS-box orthologs in different floral tissues and at different time points before and after anthesis to ascertain whether the transcriptional behavior of these genes is in agreement with that observed in Arabidopsis and other plant species.

#### MATERIALS AND METHODS

#### Plant Material and Sample Collection

Grapevine samples (V. vinifera L. cv Pinot noir, clone 115, grafted onto Kober 5BB rootstock) were collected from a Guyot-trained germplasm collection vineyard established in 2009 and located in the experimental farm "Lucio Toniolo" in Legnaro (PD) (45°21′5.68″N 11°57′2.71″E, −8 m above the sea level) during the growing season 2017/2018. The soil texture was as follows: 46% sand, 24% clay, and 30% loam; pH = 7.9; electric conductivity, 112 µS; and organic carbon, 1.1%. With the aim of following the main stages of flower development in different whorls both before and after anthesis, we considered seven time points over a time range of 22 days. Five samplings were performed before anthesis (50% of caps off), which took place on May 22nd, whereas two additional samplings were performed after anthesis. More precisely, preanthesis samples were collected 14 (I), 11 (II), 8 (III), 6 (IV), and 1 (V) days before flowering, whereas postanthesis samples were collected 6 (VI) and 8 (VII) days after flowering, when fecundation had already occurred. At each time point considered, three inflorescences from three different plants were sampled and immediately frozen in liquid nitrogen. For each inflorescence, a consistent pool of single flowers was collected, each of which was dissected into its whorls with the aid of a scalpel and under a stereomicroscope. Flowers collected from preanthesis inflorescences were dissected into the calyx, ovary, anthers, anther filaments, and cap, whereas those collected after anthesis were dissected into the calyx, ovary, and embryos.
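The sampling design described above can be tabulated with a few lines of Python. This is a bookkeeping sketch only: the stage offsets and tissue lists are taken from the text, whereas the variable names and the derived sample count (which assumes that every stage-by-tissue combination was obtained from all three biological replicates) are illustrative assumptions.

```python
# Days relative to anthesis for each sampling stage (negative = before anthesis).
STAGE_OFFSETS = {"I": -14, "II": -11, "III": -8, "IV": -6, "V": -1, "VI": 6, "VII": 8}

# Tissues dissected before and after anthesis, as listed in the text.
PRE_TISSUES = ["calyx", "ovary", "anthers", "filaments", "cap"]
POST_TISSUES = ["calyx", "ovary", "embryos"]

BIOLOGICAL_REPLICATES = 3  # three inflorescences from three different plants


def tissues_for(stage):
    """Return the tissues dissected at a given stage."""
    return PRE_TISSUES if STAGE_OFFSETS[stage] < 0 else POST_TISSUES


# Total time range covered by the seven samplings.
span_days = max(STAGE_OFFSETS.values()) - min(STAGE_OFFSETS.values())

# Every stage-by-tissue combination in the design.
combos = [(stage, tissue) for stage in STAGE_OFFSETS for tissue in tissues_for(stage)]

# Assumed number of RNA samples if each combination has three biological replicates.
n_samples = len(combos) * BIOLOGICAL_REPLICATES

print(span_days, len(combos), n_samples)  # prints: 22 31 93
```

The 22-day span matches the time range stated in the text; the counts of 31 combinations and 93 samples follow only from the stated assumption and are not reported by the authors.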

#### RNA Extraction and cDNA Synthesis

For each sample, whorl-variable amounts of tissue were ground in liquid nitrogen, and total RNA was extracted using the Spectrum™ Plant Total RNA Kit (Sigma-Aldrich, United States) following the manufacturer's instructions. RNA quality and quantity were checked by means of conventional electrophoresis and spectrophotometry using a NanoDrop-1000 (Thermo Fisher Scientific). cDNA was synthesized starting from 500 ng of RNA using the Invitrogen™ SuperScript™ IV VILO™ Master Mix (Thermo Fisher Scientific) according to the manufacturer's instructions.

#### RT-qPCR Expression Analyses

Sixteen grapevine ABCDE genes were selected as described by Grimplet et al. (2016) based on their orthology with A. thaliana genes and confirmed by means of a BLASTN approach. Primers were designed using the Primer-BLAST program of the National Center for Biotechnology Information (Rockville Pike, Bethesda, MD, United States) with melting temperatures between 58.83°C and 62.01°C, a length between 19 and 23 bp, and an amplicon length between 71 and 180 bp. The PN40024 accessions selected, together with the specific oligonucleotide sequences, are reported in **Supplementary Table S1**. All the RT-qPCRs were performed on a StepOnePlus Real Time PCR system following the PowerUp SYBR Green Master Mix method (Applied Biosystems, Foster City, CA, United States). Each reaction was carried out in a volume of 10 µL, which contained 5 µL of SYBR Green, 1.2 µL of forward primer, 1.2 µL of reverse primer, 0.6 µL of sterilized water, and 2 µL of 1:10-diluted cDNA as a template. The run method was as follows: initial denaturation at 95°C for 20 s, followed by 40 cycles of denaturation at 95°C for 3 s and combined primer annealing, extension, and fluorescence signal acquisition at 60°C for 30 s. Subsequently, a melting curve analysis was performed to verify the specificity of the primers with the following program: 95°C/15 s, 60°C/1 min, and 95°C/15 s. The baseline and threshold cycles (Ct) were automatically determined by the software of the system. Three technical replicates were run for each biological replicate. The ubiquitin-conjugating enzyme gene (VIT\_08s0040g00040) was used as an internal control. The relative expression level for all selected genes was calculated using the QGene method (Muller et al., 2002).
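As an illustration of how a normalized expression value can be derived from Ct data, the Python sketch below uses a simplified efficiency-based ratio of the kind implemented in Q-Gene, in which the reference-gene term is divided by the target-gene term. Several points are assumptions made here and are not stated in the text: an amplification efficiency of 2 for both genes, averaging of the technical-replicate Ct values before normalization, and all Ct values, which are invented for the example. The function names are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def normalized_expression(ct_target, ct_ref, e_target=2.0, e_ref=2.0):
    """Normalized expression of a target gene relative to a reference gene.

    ct_target, ct_ref: technical-replicate Ct values for one biological replicate.
    e_target, e_ref: amplification efficiencies (2.0 = perfect doubling; assumed).
    """
    return (e_ref ** mean(ct_ref)) / (e_target ** mean(ct_target))


def mean_and_se(values):
    """Mean and standard error of the mean across biological replicates."""
    return mean(values), stdev(values) / sqrt(len(values))


if __name__ == "__main__":
    # Hypothetical Ct values: three biological replicates, each with three
    # technical replicates for the target gene and the reference gene.
    biological_replicates = [
        {"target": [24.1, 24.0, 23.9], "ref": [22.0, 22.1, 21.9]},
        {"target": [24.4, 24.3, 24.5], "ref": [22.2, 22.1, 22.3]},
        {"target": [23.8, 23.9, 23.7], "ref": [21.9, 22.0, 21.8]},
    ]
    values = [normalized_expression(r["target"], r["ref"]) for r in biological_replicates]
    avg, se = mean_and_se(values)
    print(f"normalized expression = {avg:.3f} +/- {se:.3f} (SE, n = {len(values)})")
```

With equal efficiencies of 2, a target gene whose Ct is two cycles higher than that of the reference yields a normalized expression of 0.25. Measured primer efficiencies, where available, would replace the assumed value of 2.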

### RESULTS AND DISCUSSION

#### Class-Specific Expression Levels

Grimplet et al. (2016) performed a genome-wide analysis of the whole set of MADS-box genes in grapevine based on the v1 and v2 predictions (CRIBI, University of Padua, Padua, Italy) of the 12X PN40024 grapevine reference genome (Jaillon et al., 2007). All the 90 MADS-box genes were named according to the Super-Nomenclature Committee for Grape Genome Annotation (sNCGGa) (Grimplet et al., 2014) and include all those TFs belonging to the ABCDE model. We utilized this new classification to select 16 genes known for their involvement in flower organ identity in Arabidopsis and, for some of them, in grapevine (Boss et al., 2003; Carmona et al., 2007b) and screened their behavior in seven stages of flower development over a range of different tissues/organs. Inflorescences from three different plants of V. vinifera cv Pinot noir were collected starting from the late G-stage (Baggiolini scale, Baggiolini and Keller, 1954) at 14 (I), 11 (II), 8 (III), 6 (IV), and 1 (V) days before flowering, whereas postanthesis samples were collected 6 (VI) and 8 (VII) days after flowering, when fecundation had already occurred (J-stage, Baggiolini scale) (**Figure 1**). Subsequently, flowers were dissected into the calyx, ovary, anthers, anther filaments, and cap in preanthesis (stages I–V), and the calyx, ovary, and embryos in postanthesis (stages VI and VII) and screened for the expression of selected ABCDE genes. The nomenclature of the selected genes was based on that of Grimplet et al. (2016); the orthologs in Arabidopsis, the PN40024 12X v1 IDs, and some additional information are reported in **Table 1**.

FIGURE 1 | Vitis vinifera cv Pinot noir inflorescences sampled at 14 (stage I), 11 (stage II), 8 (stage III), 6 (stage IV), and 1 (stage V) days prior to anthesis and at 6 (stage VI) and 8 (stage VII) days after anthesis. Samples were collected in three propagations of the same clone.

#### A-Class Genes

The first class of genes considered in the present study is represented by Class-A, which encompasses four major TFs, namely, APETALA1 (AP1), FRUITFULL1 (FUL1), FRUITFULL-like, and an additional class-A gene designated AP2 (Theißen et al., 2016). The latter was not considered in the present study since it does not encode a MADS-box TF. APETALA1 (AP1) has two main functions: the first is to specify the identity of sepals (first whorl), when expressed alone, and petals (second whorl), when expressed in combination with the B-class genes APETALA3 (AP3) and PISTILLATA (PI; Jack, 2004); the second function is to repress the C-class genes in these whorls (Coito et al., 2018). The putative ortholog of Arabidopsis AP1 was isolated in grapevine (VAP1) (Calonje et al., 2004) and corresponds to the gene VviAP1 (VIT\_01s0011g00100) based on a recent MADS-box classification (Grimplet et al., 2016). Based on previous results, its expression pattern seemed to differ substantially from that of AP1 in Arabidopsis. In fact, VAP1 was found to be expressed predominantly in stamens and developing carpels and to be excluded from the sepal-forming region soon after the meristem determination, which is not consistent with a function in sepal identity specification (Calonje et al., 2004; Carmona et al., 2008). These observations, together with other evidence obtained in other plant species, such as A. majus (Huijser et al., 1992) and G. hybrida (Yu et al., 1999), led to questions about the role of these genes in the specification of sepal identity and provided arguments to revise the involvement of AP1 in class-A gene action (Litt and Irish, 2003). In contrast, our results showed a high level of expression of VviAP1 in both the calyx and cap compared to all other tissues analyzed (**Figure 2**), in agreement with previous observations in other plant species, including Arabidopsis (Bowman et al., 1993), Camellia japonica (Sun et al., 2014) and Medicago truncatula (Roque et al., 2018). 
Concerning the calyx, a high level of transcripts was detected at all time points considered, including the postanthesis stages, with the highest accumulation detected at 6 days prior to anthesis. A significant transcript level was also found in the cap (CA) organ, whose identity is specified by the joint action of VviAP1 and B-class genes. In stamens (anthers and filaments) and in the carpel, VviAP1 was almost absent, which is consistent with the mutually repressive activity of the A- and C-class genes (Jack, 2004). Our results partially contrast with those of Coito et al. (2018), who evaluated the spatiotemporal expression of the same gene in male, female, and perfect grapevine flowers. Surprisingly, in that study, VviAP1 expression was barely detected in the sepal and petal regions and was restricted to the inner part of the flower. We believe the incongruence between these observations and the role of A-class genes arose because the phenological stages considered were too early (from the B to G stage based on the Baggiolini scale) with respect to those considered in the present study (starting from the late G stage to J stage).

Another MADS-box gene belonging to the A-class is FRUITFULL (FUL), which was shown to play a role in carpel and fruit development contributing to the normal development of the gynoecium valve (Gu et al., 1998) and floral meristem identity redundantly with AP1 (Ferrándiz et al., 2000). Based on the grapevine MADS-Box classification proposed by Grimplet et al. (2016), two AtFUL orthologs were selected: VviFUL1 (VIT\_17s0000g04990) and VviFUL2 (VIT\_14s0083g01030). The


For each gene, the corresponding nomenclature based on Grimplet et al. (2016) and previous literature, together with the PN40024 12X v1 ID and the relative Arabidopsis orthologs are indicated. The asterisk refers to gene nomenclature based on Grimplet et al. (2016).

expression level of VviFUL2 was consistent with that in the previous literature. In fact, the ovary was the organ showing the highest accumulation of this transcript, especially at 14 and 11 days prior to anthesis (stages I and II). VviFUL2 corresponded to VFUL-L identified by Calonje et al. (2004), who observed, in agreement with our results, that the expression of VviAP1 (VAP1 in Calonje et al., 2004) and VviFUL2 (VFUL-L in Calonje et al., 2004) was only partially overlapping. In fact, very early in flower development, the expression of VFUL-L becomes restricted to the carpel-forming region at the central part of the flower meristem and continues to be expressed at high levels through the early stages of fruit development (Calonje et al., 2004). More recently, the expression of FUL-like genes in carpels was also described in Aquilegia coerulea (Pabón-Mora et al., 2013) and C. japonica (Sun et al., 2014).

More intriguing is the expression pattern of the other grapevine paralog, namely, VviFUL1, whose expression was generally lower in terms of normalized transcript level compared to VviFUL2 but which also showed significant expression in the filament and, to a lesser extent, in the ovary at all stages prior to blooming. Interestingly, the expression of this gene overlaps significantly with the expression of VAP1 described by Calonje et al. (2004).

#### B-Class Genes

B-class genes specify petal and stamen identity through joint expression with A- and C-class genes, respectively (Coen and Meyerowitz, 1991). In Arabidopsis, the B-class group is constituted by APETALA3 (AP3) and PISTILLATA (PI), and their expression is in agreement with their role in the ontogenesis of the two aforementioned whorls. In grapevine, three orthologs of B-function genes have been characterized and correspond to VviPI (VIT\_18s0001g01760), VviAP3a (VIT\_18s0001g13460), and VviAP3b (VIT\_04s0023g02820) according to Grimplet et al. (2016). In reality, based on phylogenetic analyses (Poupin et al., 2007; Coito et al., 2018), these genes cluster in three different clades: PI, AP3, and TM6, respectively. While VviPI, corresponding to VvMADS9 and/or VvPI (Sreekantan et al., 2006; Poupin et al., 2007), forms a clearly separate clade, namely, the PI clade (Poupin et al., 2007; Coito et al., 2018), VviAP3a and VviAP3b, previously designated VvAP3 and VvTM6, respectively (Poupin et al., 2007; **Table 1**),

present differences at the level of specific C-terminal motifs (Vandenbussche et al., 2003).

The role of these genes in petal and stamen identity was confirmed in our analysis (**Figure 3**). The highest VviPI, VviAP3a, and VviAP3b transcript levels were detected in the cap and stamen (anther and filament) tissues. Regarding VviAP3a, previous studies in a hermaphroditic variety of V. vinifera (Poupin et al., 2007), showed that the expression of this gene is restricted to petals and stamens, whereas in situ hybridization experiments indicated that the VviAP3a transcripts also localize in the carpels but are limited to early stages of flower development (Coito et al., 2018). In reality, a discrete expression of VviAP3a was also detected in carpels and embryos in our analysis. Similar results were observed in Paeonia lactiflora, where the B-class genes PLAP3-1 and PLAP3-2 were found to be strongly expressed in petals, stamen, and carpels (Gong et al., 2017).

The increase in VviAP3a expression in embryos (stages VI and VII, corresponding to 6 and 8 days postanthesis) (**Figure 3**) could be ascribed to the close relationship between B- and B-sister class genes, which are hypothesized to be redundantly involved in ovule identity specification and proven to reach high transcript levels in embryo integuments (Theißen and Melzer, 2006).

TM6 is considered to be a B-class homeotic gene, and although an Arabidopsis ortholog was not identified, it was detected in other plant species, including Solanum lycopersicum flowers (de Martino et al., 2006) and P. hybrida (Rijpkema et al., 2006). It is worth noting that VviAP3b in situ hybridization detected this transcript not only in petals and stamens but also in the ovary. This observation is in agreement with our results indicating a discrete transcript accumulation in carpels in the first stage considered (I).

Concerning PISTILLATA, in Arabidopsis, this gene is expressed in cells that will give rise to petals, stamens and carpel primordia in the early stages of flower development (Goto and Meyerowitz, 1994; Sundström et al., 2006). Coito et al. (2018) observed an accumulation of the VviPI transcript in Vitis flower development in the center of the flower meristem, at early stages of flower development, but when sepals start to emerge, the highest expression was observed in cells that will develop into petals and stamens, remaining confined to the second and third whorls during the later stages of flower development. Our results, which considered even later stages of flower development, confirmed the high accumulation in the cap and stamens, but, interestingly, the highest level of transcript accumulation was detected in the filament tissue with respect

to the anthers. Overall, as flower development proceeded and anthesis drew nearer, the expression of VviPI declined in all tissues considered.

#### B-Sister Genes

There is another homeotic gene category closely related to the B-class genes: the B-sister class. MADS-box genes belonging to this cluster are phylogenetically close to the B-class genes and have been identified in all angiosperms and gymnosperms investigated so far, showing highly conserved ovular expression (Becker et al., 2002; Chen et al., 2012). Genes belonging to this class, in contrast to those of the B-class, are expressed exclusively in female flower structures, in particular, in the integument tissues surrounding the ovules (Theißen and Melzer, 2006). Arabidopsis B sister (ABS) belongs to this gene class and, beyond having a proven function in seed pigmentation (Nesi et al., 2002), is hypothesized to have a role in ovule formation, acting in a complex and redundant interaction network. In rapeseed canola (Brassica napus), the ABS ortholog (BnTT161-4) was found to be involved in embryo and seed development, whereas in rice (Oryza sativa), OsMADS29 is mainly expressed in the ovule and regulates the expression of pivotal genes involved in programmed cell death in the nucellar region of developing seeds (Yin and Xue, 2012; Chen et al., 2013). An exhaustive study of the transcriptional behavior of B-sister genes in grapevine has yet to be conducted. Based on the PN40024 12X genome prediction, three MADS-box genes are phylogenetically related to AtABS in grapevine: VviABS1 (VIT\_10s0042g00820), VviABS2 (VIT\_01s0011g01560), and VviABS3 (VIT\_02s0025g02350) (Grimplet et al., 2016). Considering that ovaries sampled from stages I to V also included ovules, and with the exception of VviABS3, which did not show a clear expression pattern in any of the tissues analyzed, the expression of VviABS1 and VviABS2 matched what was expected (**Figure 4**). In fact, the ovary in preanthesis (all stages) and the embryo were the only tissues where transcripts accumulated (**Figure 4**). 
It is likely that the high level of VviABS1 and VviABS2 transcripts in preanthesis ovaries is fully attributable to the presence of the ovules within the carpels.

#### C-Class Genes

The C-class gene AGAMOUS (AG) specifies carpel identity in model species, whereas in combination with AP3 and PI, the gene specifies stamen identity (Coen and Meyerowitz, 1991).

the expression of UBIQUITIN gene and plotted as normalized transcript expression. Bars indicate the SE of three biological replicates each one composed of three technical replicates.

Two putative orthologs of the AG gene subfamily were identified in grapevine (Grimplet et al., 2016) and were designated VviAG1 (VIT\_12s0142g00360) and VviAG2 (VIT\_10s003g02070). VviAG1, which was previously cloned by Boss et al. (2001), now appears to be closely related to Arabidopsis SHP1, a D-class gene (Theißen et al., 2016; Coito et al., 2018). In contrast, VviAG2 has not been considered in previous studies and seems to be the best candidate ortholog of the Arabidopsis AGAMOUS gene (AT4G18960) based on a recent reannotation of the TAIR database. Thus, we believe this study is the first one to consider the expression of grapevine AGAMOUS orthologs during flower development. Although they have different functions in the determination of floral identity, the C- and D-class genes form a monophyletic MADS-box clade, known as the AG subfamily of MADS-box genes. Whereas phylogenetic analyses did not clearly assign VviAG1 and VviAG2 to class C or class D, expression analyses showed a high level of transcript accumulation in stamens (anthers and filaments) and ovaries for both of them, suggesting their role as class-C factors (**Figure 5**). For this reason, we decided to consider VviAG1 and VviAG2 together.

A more detailed examination of VviAG2 transcript accumulation across stages showed a marked peak in the ovary 14 days prior to anthesis (stage I), and its transcription tended to decrease in this organ as anthesis approached. A relatively high level of expression was also detected in anthers and filaments in the earlier stages considered here (stages I and II, corresponding to 14 and 11 days to anthesis). VviAG1 showed a much higher transcript level compared to its paralog, peaking in the ovary (stages I and V) and in the filament at stage II. Moreover, a moderate transcript level was maintained in the anthers throughout the whole developmental time course. VviAG1 showed increasing expression in both the ovary and embryo during postanthesis, which is biologically consistent with a putative role in berry development (Carmona et al., 2008). It is worth noting that the VviAG1 sequence corresponding to VvMADS1 was identified by Boss et al. (2001), who reported the expression of this gene in the two inner whorls, as well as during berry development. An interesting aspect emerging from our results is that VviAG1 is moderately expressed in sepals (stages I and IV). This observation is not in agreement with the role of C-class genes. Nevertheless, similar results

had already been observed in P. lactiflora where the PLAG ortholog of VviAG1 was expressed in sepals (Gong et al., 2017). Moreover, Boss et al. (2002) showed that the overexpression of this gene in grapevine is associated with an altered sepal morphology raising the question of whether this gene could have a role in determination of this whorl or in the interaction with A-class genes.

The last two class-C genes considered were VviAGL6a (VIT\_15s0048g01270) and VviAGL6b (VIT\_16s0022g02330). Both VviAGL6a and VviAGL6b show homology to Arabidopsis AGL6 and AGL13, whose function in flower development still remains to be elucidated (Boss et al., 2003; Ohmori et al., 2009; Schauer et al., 2009), although it was proven that AGL6 and AGL13 bind to AG, and for this reason, they may be involved in determining AG functional specificity (Fan et al., 1997; Hsu et al., 2014). Moreover, yeast two-hybrid studies in several plant species revealed that AGL6 and AGL13 proteins can interact with AP1/FUL-like, B-class, D-class, and SEP-like MADS-box proteins (Hsu et al., 2003; de Folter et al., 2005). Recently, it was suggested by Hsu et al. (2014) that AGL6 could act in a regulatory feedback as a repressor of AGL13 involved in male and female gametophyte morphogenesis.

It's worth pointing out that Hsu et al. (2014) recently described AGL13 as a possible ancestor of the E-class genes. Considering the close phylogenetic relationship between AGL6 and AGL13, we cannot rule out the possibility that these genes belong to the E-class rather than the C-class.

VviAGL6a corresponds to the VvMADS3 gene (Boss et al., 2002), whose expression was first detected in late inflorescence development with greater transcript levels in petals compared to the inner two whorls present. In agreement with this observation, VviAGL6a was expressed in the cap at the first stages considered, with a decrease until anthesis. It's worth noting that, together with petals, the filament also showed a high accumulation of this transcript, reaching a peak at 11 days prior to anthesis (stage II). The expression of this gene in the filament has not been described before, but it must be considered that this is the first study examining the specific expression of this gene in this tissue. However, Rijpkema et al. (2009) already described the accumulation of the VviAGL6a ortholog PhAGL6 in Petunia anthers. VviAGL6b, whose expression was never evaluated in a kinetic study of flower development, exhibited a pattern of expression, which is comparable to the one observed for VviSEP2 (see "E-Class Genes" section), especially concerning the calyx. In this organ, the expression level of VviAGL6b decreased approaching anthesis and then increased again after fertilization. This is in agreement with the redundant and overlapping functionality of E-class genes.

#### D-Class Genes

The D-class gene SEEDSTICK (STK), together with AG, is involved in ovule identity specification within the carpel (Favaro et al., 2003; Carmona et al., 2008; Vasconcelos et al., 2009). The closest accession in grapevine was designated VviAG3 (VIT\_18s0041g01880). **Figure 6** clearly shows that VviAG3 is totally switched off in all whorls considered except for the ovary, where it is poorly expressed. Conversely, VviAG3 appeared to be strongly induced with an increase in the embryo, in agreement with that reported by Boss et al. (2002), who observed a strong expression of this gene (namely, VvMADS5) in mature carpels, developing seeds and pre- and postveraison berries. Recently, targeted resequencing of this gene in a collection of 124 grapevine cultivars showed that a point variation causing the arginine-197-to-leucine substitution was fully linked to stenospermocarpy (Royo et al., 2018).

#### E-Class Genes

The involvement of E-class genes in flower identity was discovered relatively recently as a consequence of their high genetic redundancy and overlapping functionality (Vasconcelos et al., 2009). The discovery of the importance of the SEPALLATA (SEP) genes, namely, SEP1, SEP2, SEP3, and SEP4, led to a revision of the first ABC model (Honma and Goto, 2001; Theißen, 2001). The SEP genes play a crucial role in petal, stamen, and carpel formation. In fact, all the flower whorls of the sep1/sep2/sep3 triple mutant develop into sepals, and the flowers become indeterminate (Pelaz et al., 2000). Vegetative leaves, rather than sepals, are formed in the sep1/sep2/sep3/sep4 quadruple mutants (Ditta et al., 2004). In grapevine, based on the new classification of Grimplet et al. (2016), four gene predictions were associated with the E-class group, namely, VIT\_14s0083g01050, designated VviSEP1; VIT\_17s0000g05000, designated VviSEP2; VIT\_01s0010g03900, designated VviSEP3; and finally VIT\_01s0011g00110, corresponding to VviSEP4.

The ortholog of Arabidopsis SEP1, namely, VviSEP1 (VIT\_14s0083g01050), designated MADS2 based on Boss et al. (2002), was previously found to be expressed early during flower development until anthesis in all the whorls except for sepals. As shown in **Figure 7**, the expression of the VviSEP1 gene is detectable in all the floral whorls, at least for the first stages considered (stages I and II), an observation consistent with the fact that the Arabidopsis orthologs SEP1-4 have been postulated to specify all whorl identities by complexing with A-, B-, C-, and D-class proteins (Vandenbussche et al., 2003; Castillejo et al., 2005). Its expression is totally shut down after fertilization (stages VI and VII). The ovary is the organ showing the highest expression of VviSEP1 in all preanthesis stages (from I to V), in agreement with that observed by Boss et al. (2002) when examining the expression of VvMADS2.

Similar to VviSEP1, VviSEP2 showed a generalized expression in all organs in the first stages except for the anthers (stages I– III), with a decrease in all tissues except for the calyx, whose expression lasted over fertilization. The persistence of VviSEP2 expression in the calyx over the whole kinetic process described here is surprising, considering these genes are mainly required for the activity of the B- and C-class floral homeotic genes, and the triple sep1/2/3 mutants produce sepals in the place of all floral organs (Castillejo et al., 2005).

SEP3 is involved in sepal, petal, stamen, carpel, and ovule development, and its ectopic expression is sufficient to activate AtAP3 and AtAG (Pelaz et al., 2000; Favaro et al., 2003; Ditta et al., 2004). In Vitis, the expression of VviSEP3 prior to anthesis was slightly confusing, with transcript accumulation detected in filaments, the ovary, and the calyx depending on the stage. An interesting aspect is the increasing transcript accumulation observed in petals from stages I to V (except for stage IV). This expression pattern could suggest a possible role of this gene in the development and maturation of these organs. Something similar was hypothesized for another SEP gene, namely, FaMADS9 in strawberry (Seymour et al., 2011). Of special interest is the postanthesis expression detected in ovary and embryo tissues, which is in agreement with previous observations by Boss et al. (2002) on this gene (named VvMADS4 in that study). These observations are not unexpected considering that SEP orthologs were found to be pivotal for the normal ripening of berries, such as strawberry (Seymour et al., 2011) and tomato (S. lycopersicum; Ampomah-Dwamena et al., 2002).

Finally, VviSEP4, whose function was not previously investigated, showed its highest expression in the ovary at stages I and V, although a basal and constant accumulation was also detected in the calyx and filaments. As a general observation, it must be noted that the expression of VviSEP3 and VviSEP4 appeared to be 1–2 orders of magnitude lower compared to those of VviSEP1 and VviSEP2. In Arabidopsis, the closest orthologous gene, AtSEP2, was functionally demonstrated to determine the identity of carpels and stamens (Pelaz et al., 2000).

#### The Grapevine ABCDE Model

The "ABCDE" model maintains that class A+E genes specify sepals, A+B+E specify petals, B+C+E specify stamens, C+E specify carpels, and C+D+E specify ovules. According to the recent nomenclature of MADS-box TFs performed by Grimplet et al. (2016), we selected 18 genes belonging to these different classes and determined their expression in different organs and time points. **Figure 8** illustrates the mean standardized expression value of all the MADS-box genes in the six different whorls considered in the present study. The standardization on the mean expression of each single gene among all the time points and tissues allowed better comprehension of the contribution of different classes to the development of a particular organ. Based on our findings, a complex of three class-A genes, namely, VviFUL1, VviFUL2, and VviAP1, together with the four class-E genes, determines calyx identity in the grapevine flower (**Figure 8**). It is worth noting that two other genes, namely, VviAGL6b and VviABS3, belonging to class-C and the class B-sister, respectively, were significantly expressed in sepals throughout the whole kinetic process considered. Based on the Arabidopsis model, these genes should not be involved in the identity of the sepals. Nevertheless, regarding VviAGL6b, it must be considered that several authors questioned its membership in class-E rather than its function as a class-C MADS box (Hsu et al., 2014).
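The combinatorial rules stated at the start of this section can be sketched as a small lookup table. The following is only an illustration of the model's logic, not part of the study's analysis; the names `ABCDE_RULES`, `organ_identity`, and `unexpected_classes` are hypothetical, and the B-sister class is deliberately left out because the rules quoted above do not include it.

```python
# Illustrative sketch of the ABCDE combinatorial rules quoted in the text.
# All names here are hypothetical helpers, not taken from the study.

ABCDE_RULES = {
    frozenset("AE"): "sepal",
    frozenset("ABE"): "petal",
    frozenset("BCE"): "stamen",
    frozenset("CE"): "carpel",
    frozenset("CDE"): "ovule",
}

# Reverse mapping: organ -> set of gene classes the model predicts there.
ORGAN_CLASSES = {organ: classes for classes, organ in ABCDE_RULES.items()}


def organ_identity(active_classes):
    """Return the organ specified by a set of active classes, or None."""
    return ABCDE_RULES.get(frozenset(active_classes))


def unexpected_classes(organ, detected_classes):
    """Return the detected classes that the model does not predict for an organ."""
    return sorted(set(detected_classes) - ORGAN_CLASSES[organ])


print(organ_identity("ABE"))               # petal
print(unexpected_classes("sepal", "ACE"))  # ['C']
```

The second call mirrors the situation described above, where a gene annotated as class-C is detected in sepals although the model predicts only classes A and E in that whorl.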

The three class-A genes, VviFUL1 and VviAP1 and, to a lesser extent, VviFUL2, were coexpressed with the B-class genes VviAP3a, VviAP3b, and VviPI in petals. Together with these genes, a significant transcript accumulation was also detected for the class-E genes VviSEP1-3 and the C-class genes VviAGL6a and VviAGL6b. Also in this case, the expression pattern of the latter genes suggests their putative role as class-E genes rather than as class-C ones (**Figure 8**).

Concerning the third whorl, a complex of three class-B genes, i.e., VviAP3a, VviAP3b, and VviPI, plus the two class-C genes, VviAG1 and VviAG2, were expressed in both filaments and anthers, whereas VviAGL6a and VviAGL6b were detected only in filaments (**Figure 8**). Concerning the class-E genes, all 4 VviSEP genes showed a discrete expression in the filament, while only VviSEP1 was detected in anthers. It is worth mentioning the ectopic expression of the class-A gene VviFUL1 and class-B-sister gene VviABS3 in the filament, which has never been reported in the previous literature (**Figure 8**).

In the inner whorl, the carpel, together with the expected expression of class-C (VviAG1 and VviAG2) and class-E genes (VviSEP1, VviSEP3, and VviSEP4), a high level of expression for class-A gene VviFUL2 and class-B-sister genes (VviABS1 and VviABS2) was detected. The involvement of the latter two genes in the ovary is probably because, during the first five stages (I–V), ovules were not separated from ovaries; thus, the expression of ovule-specific genes in the ovary is a consequence of the copresence of both organs in the samples used for transcriptional analyses.

Finally, looking at gene expression in embryos, we found the genes belonging to the class-B-sisters VviABS1 and VviABS2 and the class-D gene VviAG3, as expected based on the ABCDE model.

Expression data obtained over a developmental kinetic process and represented in **Figure 8** enabled us to develop a grapevine model adapting our transcriptional evidence to the ABCDE model of organ identity determination in A. thaliana. **Figure 9** shows our proposed ABCDE model for grapevine. What is remarkable, although with several slight differences, is its robustness. In fact, it seems that the overall logic of the process is also conserved in grapevine. Identifying all the MADS-box genes involved in flower identity and analyzing their behavior in different whorls and developmental stages represents a first step in the understanding of grapevine flower genetics and biology. We believe that the present study could provide a starting point for a better comprehension of many aspects directly or indirectly related to flower and berry biology. Among these aspects are the determinants of large seasonal variation in grape yield, sex determination in monoecious and dioecious species, and phenomena related to seed set, such as apireny and stenospermocarpy, or to fruit set, such as millerandage or the gradient in berry maturation within the cluster. Some of these aspects, such as those related to berry maturation and quality, have been investigated more thoroughly, given their economic importance for the wine industry and table grape production, while some have only recently been taken into account, such as flower sex determination (Coito et al., 2018) and stenospermocarpy (Royo et al., 2018). Others have been largely ignored.
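The standardization on the per-gene mean mentioned for **Figure 8** can be illustrated with a minimal sketch. The exact formula used in the study is not stated in this section, so dividing each value by the gene's mean over all samples is an assumption, and the gene names, sample labels, and values below are invented for illustration only.

```python
# Hypothetical sketch: standardize each gene on its own mean expression
# across all tissue/stage samples. The formula (value / per-gene mean) is an
# assumption; the toy values are invented and are not data from the study.

def standardize_on_gene_mean(expression):
    """Divide every value of a gene by that gene's mean over all samples."""
    standardized = {}
    for gene, samples in expression.items():
        mean = sum(samples.values()) / len(samples)
        standardized[gene] = {
            sample: (value / mean if mean else 0.0)
            for sample, value in samples.items()
        }
    return standardized


toy = {
    "geneX": {"calyx_I": 2.0, "ovary_I": 6.0, "ovary_V": 4.0},
    "geneY": {"calyx_I": 200.0, "ovary_I": 600.0, "ovary_V": 400.0},
}
result = standardize_on_gene_mean(toy)
print(result["geneX"])
print(result["geneY"])
```

In this toy input the two genes differ by two orders of magnitude in absolute level but share the same relative profile, so their standardized values coincide. This is the property that makes genes with very different expression levels comparable within one organ.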

#### CONCLUSION

The aim of this work was to translate the ABCDE model, verifying the coherence between Arabidopsis and grapevine in terms of flower identity gene expression. The results revealed that the majority of genes investigated follow the expression pattern expected based on the model, being detected, alone or in combination, in the floral whorls where they were supposed to be accumulated. Some of the genes considered here had already been functionally characterized, but many merit investigation to ascertain their functional conservation with their Arabidopsis MADS-box orthologs. In particular, it will be very interesting to focus on the B-sister genes and E-class genes, whose expression and functionality merit further investigation. We believe that, starting from this robust transcriptional evidence, the functional genomic tools now available will greatly contribute to a deeper understanding of the contribution of single and combined TFs to the determination of flower identity in grapevine.

### AUTHOR CONTRIBUTIONS

GB and ML designed the research. GM and FP conducted and controlled the experiments and analyzed the data. AV carried out the bioinformatics analyses. FP and AV wrote the manuscript. All authors contributed to editing the manuscript.

### FUNDING

This study was supported by the Starting Grants 2015 CARIPARO project Cod. VANN\_START16\_01.

#### ACKNOWLEDGMENTS

The authors wish to thank Sara Sgubin, who helped with collecting and dissecting grapevine flowers, and Francesca Toso, who kindly contributed to the drafting of **Figure 9**.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00316/full#supplementary-material




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Palumbo, Vannozzi, Magon, Lucchin and Barcaccia. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Rise of Apomixis in Natural Plant Populations

#### Diego Hojsgaard\* † and Elvira Hörandl\* †

Department of Systematics, Biodiversity and Evolution of Plants (with Herbarium), Albrecht-von-Haller Institute for Plant Sciences, University of Göttingen, Göttingen, Germany

Apomixis, the asexual reproduction via seed, has many potential applications for plant breeding by maintaining desirable genotypes over generations. Since most major crops do not express natural apomixis, it is useful to understand the origin and maintenance of apomixis in natural plant systems. Here, we review the state of knowledge on origin, establishment and maintenance of natural apomixis. Many studies suggest that hybridization, either on diploid or polyploid cytotypes, is a major trigger for the formation of unreduced female gametophytes, which represents the first step toward apomixis and must be combined with parthenogenesis, the development of an unfertilized egg cell. Nevertheless, fertilization of endosperm is still needed for most apomictic plants. Coupling of these three steps appears to be a major constraint for shifts to natural apomixis. Adventitious embryony is another developmental pathway toward apomixis. Establishment of a newly arisen apomictic lineage is often fostered by side effects of polyploidy. Polyploidy creates an immediate reproductive barrier against the diploid parental and progenitor populations; it can cause a breakdown of genetic self-incompatibility (SI) systems, which is needed to establish self-fertility of pseudogamous apomictic lineages; and finally, polyploidy could indirectly help to establish an apomictic cytotype in a novel ecological niche by increasing adaptive potentials of the plants. This step may be followed by a phase of diversification and range expansion, mostly described as geographical parthenogenesis. The utilization of apomixis in crops must consider the potential risks of pollen transfer and introgression into sexual crop fields, which might be overcome by using pollen-sterile or cleistogamous variants. Another risk is the escape into natural vegetation and potential invasiveness of apomictic plants, which needs careful management and consideration of ecological conditions.

Keywords: apomictic crops, grass cultivars, polyploidy, reproductive assurance, sexuality, speciation, triploid bridge

#### Edited by:

Emidio Albertini, University of Perugia, Italy

#### Reviewed by:

Alfred Huo, University of Florida, United States Nobutaka Mitsuda, National Institute of Advanced Industrial Science and Technology (AIST), Japan Joann Acciai Conner, University of Georgia, United States

#### \*Correspondence:

Diego Hojsgaard Diego.Hojsgaard@biologie.uni-goettingen.de Elvira Hörandl Elvira.Hoerandl@biologie.uni-goettingen.de

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 14 November 2018 Accepted: 07 March 2019 Published: 02 April 2019

#### Citation:

Hojsgaard D and Hörandl E (2019) The Rise of Apomixis in Natural Plant Populations. Front. Plant Sci. 10:358. doi: 10.3389/fpls.2019.00358

**Abbreviations:** BIII hybrid, offspring produced by fertilization of unreduced egg cells; CitRWP, a RWP-RK domain-containing protein found in citrus; FCSS, flow cytometric seed screen; MC, megaspore; MMC, megaspore mother cell; NC, nucellus cell; PHERES1, a transcription factor encoded by a MADS-box gene; SC, self-compatibility.

#### INTRODUCTION

Sexuality is well entrenched in all seed-producing plants. Seeds are an integral part of diaspores that enhance plant dispersal and store all nutrients needed to start the new generation. Therefore, a wide variety of insects, other animals and humans have found food resources in many plant seeds. The development of human societies is tightly linked to the domestication and improvement of crop species through artificial selection and genetic breeding (see e.g., Gupta, 2004). Both traditional and molecular plant breeding techniques are designed to modify and exploit sexuality to create new heterozygous seed varieties with desired allele combinations for high yield, resistance to different environmental stressors, or nutritionally enriched seeds (e.g., golden rice; see Dirks et al., 2009). However, the same sexual mechanisms that are manipulated to improve plant varieties (e.g., engineering meiosis by reverse breeding) are simultaneously the ones responsible for diminishing heterozygosity and segregating successful gene combinations (Lambing et al., 2017).

The formation of a seed involves a number of complex, highly regulated and coordinated developmental steps that are still not well understood (Bradford and Nonogaki, 2007). Sexual seed development is initiated by the process of double fertilization, which involves the fusion of reduced female and male gametes and leads to the development of the embryo and the endosperm (**Figure 1**). The hormone auxin has a crucial role during the initial development of seed structures and as a trigger of fertilization-independent seed development (Figueiredo and Kohler, 2018), a condition that occurs naturally in (apomictic) plants at low frequencies. Apomicts have evolved mechanisms that circumvent sexual pathways (**Figure 1**) by forming functional female gametophytes without meiosis (apomeiosis), developing embryos without fertilization (parthenogenesis), and a functional endosperm. Unreduced gametophytes can develop via two main developmental pathways: (1) two unreduced MCs are formed via restitutional meiosis or via mitotic division (diplospory); (2) a somatic, unreduced cell of the nucellus develops into an embryo sac (apospory). Although gamete fusion is a strict requirement for initiation of seed development in nature, apomictic plants can produce seeds through a single fertilization of the polar nuclei (pseudogamy) or without fertilization (autonomously) (**Figure 1**). Therefore, seed development without fertilization represents a trait of high economic relevance to exploit heterosis and preserve superior allele combinations (Koltunow and Grossniklaus, 2003). Synthetic clonal seed production has been demonstrated in both Arabidopsis and rice, aiming at introducing apomixis-like features into crops (Marimuthu et al., 2011; Mieulet et al., 2016).
However, while cultivated crops are expected to be genetically uniform in a similar way to apomictic clonemates, the introduction of apomixis in crop fields might bring new ecological threats derived from the biological advantages apomictic plants show compared to sexual ones (e.g., uniparental reproduction, unidirectional gene transfer; Hörandl, 2006). The escape of an apomixis gene into a wild relative may provide immediate invasive-like features to the recipient individual, but also other unintended (i.e., pleiotropic) benefits like increased fitness or pathogen resistance already observed in cases of crop x wild hybridizations (Chapman and Burke, 2006).
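The seed-formation pathways described above differ in the ploidy of the embryo and endosperm they produce, which can be made concrete with a short arithmetic sketch. It rests on assumptions that are not stated in the text: an embryo sac contributing two polar nuclei to the endosperm, reduced pollen, one sperm per fertilization event, and even parental ploidy levels. The function and dictionary names are hypothetical.

```python
# Illustrative ploidy arithmetic for the seed-formation pathways in the text.
# Assumptions (not from the source): two polar nuclei per embryo sac, reduced
# pollen, one sperm per fertilization event, even parental ploidy levels.

def seed_ploidy(mother_ploidy, father_ploidy, egg_reduced,
                egg_fertilized, endosperm_fertilized):
    """Return (embryo_ploidy, endosperm_ploidy) in multiples of x."""
    female_nucleus = mother_ploidy // 2 if egg_reduced else mother_ploidy
    sperm = father_ploidy // 2
    embryo = female_nucleus + (sperm if egg_fertilized else 0)
    endosperm = 2 * female_nucleus + (sperm if endosperm_fertilized else 0)
    return embryo, endosperm


PATHWAYS = {
    "sexual": dict(egg_reduced=True, egg_fertilized=True,
                   endosperm_fertilized=True),
    "pseudogamous apomixis": dict(egg_reduced=False, egg_fertilized=False,
                                  endosperm_fertilized=True),
    "autonomous apomixis": dict(egg_reduced=False, egg_fertilized=False,
                                endosperm_fertilized=False),
    "BIII hybrid": dict(egg_reduced=False, egg_fertilized=True,
                        endosperm_fertilized=True),
}

# Example: tetraploid (4x) mother pollinated by a tetraploid (4x) father.
for name, flags in PATHWAYS.items():
    print(name, seed_ploidy(4, 4, **flags))
```

Under these assumptions each pathway leaves a distinct embryo-to-endosperm ploidy ratio in the seed. That is the kind of signal a flow cytometric seed screen (FCSS) relies on to tell the pathways apart.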

Before speculating about the biosafety and ecology of a potential apomictic crop, we can gain valuable comparative information from observations in natural apomictic plant populations. Apomicts exhibit a variety of developmental alternatives to bypass sexual pathways and produce clonal seeds (**Figure 1**). In single ovules, apomicts might use both sexual and apomictic seed development alternately (only one pathway proceeds; **Figures 1a,b**) or even simultaneously (either both pathways proceed or they are combined, forming a BIII hybrid; **Figures 1d,e**). Understanding the dynamics of apomixis in natural populations can provide useful information on how an apomictic crop may behave in natural fields and help to visualize potential ecological threats. In recent years, the use of different technologies has enlarged our understanding of the genetic and developmental basis of apomixis in different plant species (Ozias-Akins and van Dijk, 2007; Conner et al., 2015; Hojsgaard and Hörandl, 2015) and has shed new light on the initial steps and dynamics during the foundation and spread of new apomictic populations. Here we review the main findings about the rise and dynamics of apomixis in natural plant populations.

### THE FOUNDATIONAL PHASE: THE EMERGENCE OF AN APOMICTIC INDIVIDUAL

Despite decades of research, it is still unclear how apomixis originates de novo in natural populations. Two main possibilities can be envisioned: either seeds are dispersed from an apomictic source population, and the seedlings would found a new apomictic population; or a spontaneous shift to apomixis happens in an otherwise sexually reproducing individual. The first case is difficult to trace in plants, as neither seed dispersal nor pollen dispersal can be easily documented in natural populations. Establishment of an apomictic newcomer in an otherwise sexual population is also hampered by minority effects (see below) and reduced fecundity (Hörandl and Temsch, 2009). Of course, the first scenario merely shifts the natural origin of apomixis to another source population.

For the spontaneous de novo emergence of apomixis in natural populations, different hypotheses have been proposed. Traditionally, hybridization was regarded as a main trigger for emergence of apomixis (Ernst, 1918; Asker and Jerling, 1992; Carman, 1997). Evidence for hybrid origin of natural apomictic taxa is available in an increasing number of molecular studies (e.g., Koch et al., 2003; Paun et al., 2006; Lo et al., 2010; Beck et al., 2012; Šarhanová et al., 2017). The emergence of apomixis in hybrids has been confirmed even for diploid species. In fact, almost all natural diploid apomictic species in the genus Boechera are hybrids (Kantama et al., 2007; Aliyu et al., 2010; Beck et al., 2012). Synthetic diploid F<sub>1</sub> hybrids in the Ranunculus auricomus complex of diploid, obligate sexual parental species showed spontaneous emergence of apospory in the first hybrid generation (Hojsgaard et al., 2014a), and increased frequencies of apospory and first functional apomictic seeds in the diploid F<sub>2</sub> (Barke et al., 2018). These studies also shed light on the open question of why only a few plant hybrid combinations would express spontaneous apomixis: hybridization appears to affect only one component of apomixis, i.e., the formation of an unreduced embryo sac from a diplosporous or aposporous initial cell. The other steps of apomictic seed formation, namely parthenogenesis and endosperm formation, are apparently not influenced by hybridization (Barke et al., 2018).

Other authors focused on polyploidization, following the observation that almost all natural apomictic plant populations are polyploids. Polyploidy could result in a "genomic shock" and genome-wide changes of gene expression (Koltunow and Grossniklaus, 2003). Carman (1997) developed the most comprehensive theory for polyploidization being the trigger for natural apomixis: climatic fluctuations during the Pleistocene would have caused range shifts and secondary contact hybridization of different ecotypes; the timing of gene expression in the cascade of megasporogenesis-megagametogenesis would subsequently be changed so that the megasporogenesis phase would be skipped, resulting in suppression of sexuality and expression of apomixis. In principle this could also happen after autopolyploidization in duplicated genes. Developmental and transcriptomic studies in fact revealed signs of asynchrony of gene expression in apomictic development (e.g., Polegri et al., 2010; Sharbel et al., 2010). In Paspalum notatum, artificial polyploidization led to the expression of apomixis in two synthetic autotetraploids while a third induced autopolyploid remained sexual (Quarin et al., 2001). Likewise, other autopolyploid Paspalum species, e.g., P. plicatulum and P. simplex, remained sexual after artificial polyploidization (Sartor et al., 2009).

In natural systems, the effects of polyploidy for the functionality of apomixis are not yet clear. A positive effect of autopolyploidization on establishing higher frequencies of apomictic seed formation has been observed in polyploidized Paspalum rufum (Delgado et al., 2014). Allele dosage effects of apospory or diplospory-specific genomic regions in polyploids on frequencies of apospory/diplospory have been observed in different model systems (Ozias-Akins and van Dijk, 2007). Dosage effects may further enhance development of unreduced embryo sac formation compared to meiotic reduced ones (Sharbel et al., 2010; Hojsgaard et al., 2013). A classical model by Nogler (1984) suggested that the apospory-controlling factors would have lethal effects in haploid gametes, thereby requiring diploid gametes for inheritance. However, this hypothesis was rejected by findings of Barke et al. (2018) that apospory can be inherited by haploid gametes in diploid R. auricomus hybrids.

Moreover, some model systems appear to express apomixis without any signs of hybridity or polyploidy. In Paspalum, many diploid species exhibit development of unreduced female gametophytes at low frequencies (reviewed in Ortiz et al., 2013), some of which seem to be capable of apomictic seed formation (Siena et al., 2008; Ortiz et al., 2013; Delgado et al., 2014, 2016). In the alpine species Ranunculus kuepferi, large-scale FCSS screenings revealed spontaneous apomictic seed formation at low frequencies in otherwise sexual, diploid wild populations in the Alps (Schinkel et al., 2016). These diploid populations are not hybrids; they are geographically distant and isolated from each other and from apomictic tetraploids, and no apparent dispersal or gene flow could be traced between them in population genetic studies (Cosendai et al., 2013). A more detailed FCSS study further contradicted the hypothesis of a contagious origin of apomixis in diploids via pollination from tetraploid apomicts, but rather suggested a female triploid bridge of rare BIII hybrid formation, via female unreduced gametes produced by diploid plants (Schinkel et al., 2017; **Figure 2**). Among tetraploids of R. kuepferi, not even a single tetraploid obligate sexual population or individual could be found in the whole range of the species, which contradicts the idea that the shift to apomixis happened after polyploidization. Experimental studies rather suggested that cold shocks and frost treatments during development can increase frequencies of apomictic and also BIII seed formation in diploid R. kuepferi (Klatt et al., 2018). Although the frequencies of these events are low, they might be effective in evolutionary time periods.
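The ploidy bookkeeping behind a female triploid bridge can be written out as a short worked example. The two-step route below is an assumed, simplified reading of the scenario (unreduced eggs fertilized by reduced pollen from diploid donors in both steps); the source does not spell out the individual steps, and the helper `fertilize` is hypothetical.

```python
# Hypothetical worked example of a female triploid bridge.
# Assumption: in both steps an unreduced egg is fertilized by reduced (1x)
# pollen from a diploid donor. The exact route is not given in the source.

def fertilize(egg_ploidy, sperm_ploidy):
    """Ploidy of the embryo formed by fusing an egg cell with one sperm."""
    return egg_ploidy + sperm_ploidy


diploid = 2
reduced_pollen = diploid // 2  # 1x pollen from a diploid plant

# Step 1: an unreduced (2x) egg of a diploid plant is fertilized,
# giving a triploid BIII hybrid.
triploid = fertilize(diploid, reduced_pollen)

# Step 2: an unreduced (3x) egg of that triploid is fertilized,
# giving a tetraploid.
tetraploid = fertilize(triploid, reduced_pollen)

print(triploid, tetraploid)  # 3 4
```

The point of the sketch is only that tetraploids can arise from diploids through unreduced female gametes, without any pollen contribution from existing tetraploid apomicts.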

What actually might trigger unreduced embryo sac formation under natural conditions? The appearance of apospory in otherwise sexual diploid plant species has been reported from Paspalum (reviewed by Ortiz et al., 2013), and in many Asteraceae genera that were otherwise not apomictic (Noyes, 2007). Aposporous initials act as surrogate cells for the meiotic products, or the spores, and ectopic, aposporous cell formation depends on production of a meiotic tetrad in Hieracium (Koltunow et al., 2011). This may not be the case in other apomictic systems like Paspalum, where the MMC often initiates abortion before entering the meiotic division (e.g., Hojsgaard et al., 2008). Cell-to-cell communication and/or direct contact between the emerging aposporous initial cell and the MC appears to take place before the former suppresses development of the latter (Schmidt et al., 2014, 2015; Juranic et al., 2018). Also, restitutional female meiosis, the process resulting in diplospory, is relatively widespread in plants. In Taraxacum, a model with first division restitution, the DIPLOSPORY (DIP) locus could be characterized and located on one NOR chromosome (Vijverberg et al., 2010; Vasut et al., 2014). Restitutional meiosis, however, can also be triggered by extreme temperatures and other environmental factors (De Storme and Geelen, 2013; Mirzaghaderi and Hörandl, 2016). In female development, unreduced MCs just have to develop into unreduced embryo sacs to produce unreduced female gametes. This capacity for unreduced gamete formation fits the hypothesis discussed by Carman (1997) and Hojsgaard et al. (2014b) that all angiosperms may have an inherent potential for shifting to apomixis.

Comparative transcriptomic studies on sexual and apomictic plants suggested that many stress-associated genes are differentially regulated in the premeiotic to gametophytic stage (Sharbel et al., 2010; Schmidt et al., 2014, 2015; Shah et al., 2016; Rodrigo et al., 2017). Prolonged photoperiods triggered increased sexual MC formation in facultative apomictic plants and resulted in reprogramming of secondary metabolite profiles (Klatt et al., 2016). Hence, natural apomeiosis has to be seen in the context of the physiological condition of the plant. This fits to a more general hypothesis that sexuality would have evolved and established early in eukaryote evolution as a DNA repair tool after oxidative stress conditions (Hörandl and Hadacek, 2013; Hörandl and Speijer, 2018). Fluctuating environmental conditions and stress response are hypothesized to be the major natural triggers for expressing the meiotic pathway. This aspect also sheds a new light on the putative role of polyploidy on expression of apomixis. Polyploids in general better regulate environmentally induced stress conditions, showing homeostatic maintenance of reproductive output under elevated abiotic stress, and therefore have a fitness advantage over diploids in climatically variable or extreme habitats (Schoenfelder and Fox, 2015). Hence, low stress conditions in polyploid reproductive tissues would stimulate the sexual meiotic pathway in archesporial cells to a lesser extent, thereby releasing the inherent potential of plants for apomeiosis (Hörandl and Hadacek, 2013).

In wild populations, the spontaneous appearance of the fully functional apomictic pathway is probably limited by the failure to connect unreduced gamete formation to parthenogenesis and endosperm formation. Parthenogenesis itself is a process that may also occur spontaneously in natural populations. The totipotency of plant cells allows embryogenesis to start from different cell types, be it fertilized or unfertilized egg cells or somatic cells, and it can even be induced in other cell types such as microspores (Soriano et al., 2013). In Paspalum, somatic embryogenesis is associated with SOMATIC EMBRYOGENESIS RECEPTOR-LIKE KINASE (SERK) genes, and altered temporal and spatial expression of SERK gene copies appears to be associated with apomixis (Podio et al., 2014b). In apomictic Paspalum plants, cytosine methylation inactivates genes that otherwise repress parthenogenesis (Podio et al., 2014a). In Pennisetum and in Brachiaria, the ASGR-BABY BOOM-like (ASGR-BBML) gene was identified as controlling parthenogenesis (Conner et al., 2015; Worthington et al., 2019). In rice, ectopic expression of the BABY BOOM1 (BBM1) gene results in parthenogenesis (Khanday et al., 2019). In Hieracium subg. Pilosella, LOSS OF APOMEIOSIS (LOA) and LOSS OF PARTHENOGENESIS (LOP) loci control apomixis, whereby gametophytic expression of LOP is required for both parthenogenesis and endosperm formation (Koltunow et al., 2011). The endosperm and parthenogenesis loci are linked but separate (Ogawa et al., 2013). In apomictic Boechera, genomic imprinting appears to be involved in the expression of parthenogenesis (Kirioukhova et al., 2018). In sexual Boechera species, paternal and maternal alleles are expressed for embryogenesis, while in parthenogenetic taxa, maternal expression of the PHERES1 gene is drastically increased compared to sexual species. The changes in expression are probably due to altered DNA methylation. Reduced expression of Methyltransferase1 (MET1) and increased expression of Domains Rearranged Methyltransferase 2 (DRM2) will cause cytosine demethylation, resulting in the observed high expression levels of maternal PHERES1 alleles (Kirioukhova et al., 2018).

In natural populations, haploid parthenogenesis has been reported from many plants as a rare event that is seldom successful (Asker and Jerling, 1992). Haploid embryos probably suffer too much from having just one chromosome set to establish haploid progeny in natural environments. Polyhaploid progeny, however, were obtained at higher frequencies from 7x or 8x mother plants in Hieracium (Rosenbaumova et al., 2012). The authors suggested precocious embryogenesis controlled by gametophytes as a putative mechanism. Frequencies of polyhaploids in apomictic plant seed progenies are often very low (<5%) and they usually represent the smallest proportion of all developmental pathways (Bicknell et al., 2003; Kaushal et al., 2008; Krahulcova et al., 2011; Schinkel et al., 2017). Timing of pollination appears to be important for the expression of parthenogenesis. In many apomictic species, early proembryos have been observed at blooming (e.g., Cooper and Brink, 1949; Burson and Bennett, 1971; Hojsgaard et al., 2008), indicating accelerated parthenogenetic development in some ovules. Early pollinations in facultative apomictic P. notatum were able to unlock the recalcitrant nature of unreduced egg cells to fertilization, increasing the formation of BIII progeny (Martinez et al., 1994). In a similar experiment using different plant materials, Espinoza et al. (2002) showed in P. notatum that early pollination (before anthesis) and also late pollination (after anthesis) increased frequencies of apomictic offspring formation, while pollination during anthesis resulted in higher frequencies of sexual seeds. In natural plant populations, pollination during full anthesis is probably the most frequent "default" situation, because insects will be attracted by full floral displays, and wind-pollination is also most efficient in fully opened spikelets. Hence, pollination during anthesis in apomicts would maximize fertilization of reduced egg cells.

The overall data suggest that differential penetrance of parthenogenesis among ovules, carrying reduced and unreduced female gametophytes, might play a relevant role in creating the observed diversity of seed formation pathways. Under this context, a shift in timing of pollination in natural diploids, presenting low proportions of unreduced gametophytes, can significantly increase the relative success of unreduced gametophyte against reduced ones and favor the formation of asexual seeds. This may explain why the above-mentioned diploids of P. rufum (Delgado et al., 2014) or alpine R. kuepferi produced some fully apomictic seeds under wild conditions (Schinkel et al., 2016) and in experiments (Klatt et al., 2018). Accelerated flower development is a common feature of alpine plants, and a putative adaptation to short vegetation periods in alpine environments, especially in early flowering plants (Körner, 2003). Diploid R. kuepferi flowers directly after snow melting, a time when many insect pollinators are not yet available as a pollen vector. Hence we suppose that delayed pollination can easily happen under natural conditions, favoring occasional parthenogenetic development of unreduced egg cells. Further experimental work will be needed to understand the appearance of parthenogenesis under natural conditions.

Endosperm formation is under a different genetic or epigenetic control and is dependent on fertilization of polar nuclei (pseudogamy) in most natural apomicts. Therefore the endosperm and parthenogenesis loci are linked but separate. In Asteraceae, tissues in the ovule other than endosperm appear to provide sufficient nutrients for the embryo (Cooper and Brink, 1949). Likewise, plant families without endosperm formation in the seeds, i.e., Melastomataceae and Orchidaceae, can apparently express autonomous apomixis (Renner, 1989; Teppner, 1996; Zhang and Gao, 2018). Hence autonomous apomixis might evolve when the selective force for endosperm formation is weak. Pseudogamy, however, is predominant in most other families (Mogie, 1992) and is an important constraint for successful seed formation. Some species are sensitive to deviations from a 2 maternal : 1 paternal genome contribution in the endosperm while others are more tolerant (Talent and Dickinson, 2007). Precocious embryo development combined with late pollination would probably indirectly favor double fertilization of polar nuclei, as no receptive egg cells would be available when pollen tubes reach the micropyle. Both sperm nuclei would be directed to fertilize polar nuclei, which has, in Polygonum type embryo sacs, positive effects on endosperm development by maintaining 2 maternal : 1 paternal genome ratios (see above).
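The genome-ratio argument in the preceding paragraph can be followed as simple arithmetic. The sketch below is only an illustration: it assumes a Polygonum type embryo sac with two polar nuclei, counts genome copies in units of the reduced (n) genome, and uses a hypothetical helper name, `endosperm_ratio`, that does not come from the cited studies.

```python
from fractions import Fraction

def endosperm_ratio(polar_nucleus_copies, sperm_copies, n_sperm=1, n_polar=2):
    """Return (maternal, paternal, maternal/paternal) genome copies in the endosperm.

    Copies are counted in units of the reduced (n) genome. The ratio is None
    when no sperm contributes (autonomous endosperm).
    """
    maternal = n_polar * polar_nucleus_copies
    paternal = n_sperm * sperm_copies
    ratio = Fraction(maternal, paternal) if paternal else None
    return maternal, paternal, ratio

# Sexual reference case: two reduced polar nuclei plus one reduced sperm.
sexual = endosperm_ratio(polar_nucleus_copies=1, sperm_copies=1)

# Pseudogamous apomict: unreduced polar nuclei, one reduced sperm (ratio deviates).
one_sperm = endosperm_ratio(polar_nucleus_copies=2, sperm_copies=1)

# Same embryo sac, but both sperm nuclei fertilize the polar nuclei.
two_sperm = endosperm_ratio(polar_nucleus_copies=2, sperm_copies=1, n_sperm=2)

for label, result in [("sexual", sexual), ("one sperm", one_sperm), ("two sperm", two_sperm)]:
    print(label, result)
```

Under these assumptions, an unreduced embryo sac fertilized by a single reduced sperm gives a 4:1 ratio, whereas the contribution of both sperm nuclei restores the 2:1 balance of the sexual case.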

Taken together, the coupling of three developmental steps for functional apomictic seed formation is probably realized infrequently in natural populations. The need to combine mutations for at least three developmental steps makes a mutagenic origin of apomixis in nature very unlikely, as each single component would be selected against (Van Dijk and Vijverberg, 2005). It rather seems that a coincidence of environmental conditions might alter developmental gene expression patterns, resulting in rare apomictic seed formation. Since apomeiosis avoids meiotic "resetting" of DNA methylation patterns (see Paszkowski and Grossniklaus, 2011), altered epigenetic states might be inherited in clonal seeds and may establish apomictic progeny.

Sporophytic apomixis, also called adventitious embryony, includes embryogenesis out of somatic tissues of the nucellus or the integuments (Naumova, 1992). Apomictic embryos often develop from several initial cells in parallel or after sexual embryogenesis, resulting in more than one seedling within a seed (polyembryony). Although adventitious embryony is taxonomically the most widespread developmental pathway of apomixis (Hojsgaard et al., 2014b), the genetic control mechanisms are less well studied than in gametophytic apomixis. Because adventitious embryos arise without disturbing the sexual program, the genetic basis of this pathway is expected to be less complex and a single mutation could initiate somatic embryogenesis. Genomic and transcriptomic analysis of Citrus species revealed 11 candidate loci associated with apomixis. An insertion in the promoter region of CitRWP is associated with polyembryony (Wang et al., 2017). As in gametophytic apomixis, adventitious embryony often appears in polyploids and/or hybrids (Alves et al., 2016; Mendes et al., 2018), but also in diploids or paleopolyploids (Carman, 1997; Whitton et al., 2008).

#### THE ESTABLISHMENT PHASE: THE FORMATION OF AN APOMICTIC POPULATION

During this phase, the uncoupled expression of the developmental steps of apomixis mentioned above is expected to be instrumental in the establishment of a polyploid apomictic population, which is required for the survival of the lineage. Uncoupled activation of apomeiosis and parthenogenesis in a diploid cytotype would drive an increase in ploidy and a shift in dosage that can help to stabilize the coordinated expression of apomixis elements and the formation of a number of polyploid individuals producing clonal seeds. In natural conditions, this mostly happens through a triploid intermediary that facilitates the formation of even-numbered polyploids, like in sexual systems. However, the presence of partial apomixis and uncoupled parthenogenesis can have different outcomes (**Figure 2**) and foster the establishment of new polyploid populations (Hojsgaard, 2018).

#### Indirect Effects of Polyploidy

Polyploidization might have manifold indirect, positive effects on establishing apomictic individuals: first, polyploidy creates an immediate reproductive barrier against the diploid parental and progenitor population; second, polyploidy may cause a breakdown of genetic self-incompatibility (SI) systems which is needed to establish self-fertility of pseudogamous apomicts; and third, polyploidy could indirectly help to establish an apomictic cytotype in a novel ecological niche by changing the overall physiological features and adaptive potentials of the plants.

The interactions of cytotypes in populations with mixed cytotypes will likely lead to polyploidization of the offspring rather than increasing frequencies of occasional diploid apomictic individuals in a population. The following process can be envisioned: the appearance of a diploid apomeiotic individual within an otherwise diploid sexual, self-incompatible population will initially lead to a minority cytotype disadvantage (Levin, 1975), because mostly haploid pollen from the majority of surrounding sexual plants will be transferred to its stigmas (**Figure 2a**). The apomictic diploid pioneer producing unreduced embryo sacs will probably be mostly cross-fertilized and produce triploid BIII hybrid offspring, which means that hardly any diploid apomictic progeny can be formed (**Figure 2b**). In natural diploid populations, identification of apomictic progeny is difficult but feasible using the appropriate molecular approaches (e.g., Siena et al., 2008) or flow cytometric seed screenings (Schinkel et al., 2016). The experimental evidence supports the idea of constraints on the formation of diploid apomictic progeny in nature (Siena et al., 2008; Hojsgaard et al., 2014a; Barke et al., 2018). An exceptional case is represented by apomictic diploids from Boechera. Different species within Boechera show a complex evolutionary history of hybridization and polyploidy, in which, besides the occurrence of apomictic triploids, diploid cytotypes can be sexual or apomictic, the latter being able to produce apomictic seeds recurrently (Aliyu et al., 2010). For details about the possible origin of the patterns of reproductive mode and ploidy variation observed across Boechera see Lovell et al. (2013). Once the mentioned BIII triploids are produced, apomeiosis may be more successful for female unreduced gamete formation as it circumvents meiosis dysfunction and the formation of aneuploid gametes. 
Microsporogenesis and pollen formation, however, will mostly fail, producing an array of genetically and chromosomally unbalanced gametes and rendering fertilizations unsuccessful, as observed in sexual triploids (e.g., Duszynska et al., 2013). Only parthenogenetic eutriploid embryos would develop, further skipping the molecular consequences of unbalanced genes and chromosomes observed in aneuploid embryos (Birchler and Veitia, 2012). When pollen is not essential for endosperm formation, as is the case in most Asteraceae, then a triploid, pollen-sterile, highly obligate apomictic lineage would rapidly become established by selection against the sexual pathway (**Figure 2c**). This scenario is confirmed by the occurrence of different natural populations of triploid apomicts showing autonomous endosperm development in Erigeron (Noyes and Rieseberg, 2000), Hieracium (Bicknell et al., 2000), Taraxacum (Tas and van Dijk, 1999), and by a mathematical model for origins of 3x Taraxacum clones (Muralidhar and Haig, 2017). Recurrent formation of novel 3x dandelion clones can happen in mixed sexual/apomictic populations (Martonfiova, 2015). In pseudogamous diploid apomicts, triploid BIII cytotypes would probably not readily establish a population unless requirements for parental genomic contributions are relaxed, because pollen formation will be heavily disturbed in triploids and hamper proper endosperm formation. However, fertilization of unreduced 3x egg cells with well-developed haploid pollen from surrounding diploid sexuals can result in tetraploid plants in the next generation, as experimentally observed in most apomicts (Martínez et al., 2007; Hojsgaard et al., 2014a). In tetraploids, meiosis and pollen production are expected to be more stable, and diploid pollen will be available for pseudogamy. 
When the capacity for apomeiosis was inherited from the triploid mother, and coupling to parthenogenesis is successful, an apomictic tetraploid offspring could originate via a female triploid bridge (**Figure 2d**), as observed in R. kuepferi (Schinkel et al., 2017), P. simplex (Urbani et al., 2002) and likely in all apomictic systems where occasional triploids had been recorded in natural populations. Alternatively, if during the phase of establishment of the new population, apomixis cannot be stabilized in the new tetraploids but instead meiosis is re-installed and coupled to syngamy (**Figure 2e**), then sexual polyploidization could be the consequence of this transient BIII hybrid (Hojsgaard, 2018).
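The ploidy bookkeeping behind the female triploid bridge described above can be traced step by step. This is a minimal sketch under simplifying assumptions: ploidy is tracked as a whole number of chromosome sets (x), meiosis in odd-ploidy mothers is not modeled, and the function name `offspring_ploidy` is hypothetical.

```python
def offspring_ploidy(mother, pollen, apomeiosis, parthenogenesis):
    """Embryo ploidy (in x) for one seed.

    mother          -- ploidy of the seed parent
    pollen          -- ploidy of the sperm nucleus available for fertilization
    apomeiosis      -- True if the egg cell is unreduced
    parthenogenesis -- True if the egg develops without fertilization
    """
    if apomeiosis:
        egg = mother
    else:
        if mother % 2:
            raise ValueError("reduced gametes of odd-ploidy mothers are not modeled")
        egg = mother // 2
    return egg if parthenogenesis else egg + pollen

# Diploid apomeiotic pioneer fertilized by haploid pollen: triploid BIII hybrid.
bridge = offspring_ploidy(mother=2, pollen=1, apomeiosis=True, parthenogenesis=False)

# Unreduced 3x egg cell fertilized by haploid pollen: tetraploid.
tetraploid = offspring_ploidy(mother=bridge, pollen=1, apomeiosis=True, parthenogenesis=False)

# Apomeiosis coupled to parthenogenesis: clonal tetraploid seed.
clonal = offspring_ploidy(mother=tetraploid, pollen=2, apomeiosis=True, parthenogenesis=True)

# Reduced egg developing parthenogenetically: polyhaploid.
polyhaploid = offspring_ploidy(mother=4, pollen=2, apomeiosis=False, parthenogenesis=True)

print(bridge, tetraploid, clonal, polyhaploid)
```

The four calls correspond to the BIII hybrid, the tetraploid derived from it, a clonal seed once apomeiosis and parthenogenesis are coupled, and a polyhaploid arising from uncoupled parthenogenesis.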

In some apomictic systems, sexual cytotypes are polyploid and no evidence of occurrence of diploid cytotypes is found in nature. Here we might consider two alternative explanations. In one case, sexual diploids could first undergo sexual polyploidization via unreduced male gametes fertilizing reduced egg cells (male triploid bridge) (De Storme and Geelen, 2013), and then become extinct. The second possibility is that a polyploid apomict in an agamic complex reverts to sexuality, while diploid sexuals become extinct, and then it produces higher ploidy apomictic cytotypes by repeating the cycle. In both cases, among sexual tetraploids, a similar mechanism of rare apomictic seed formation as in diploids might start from predominantly sexual tetraploid populations, like in Potentilla puberula, where just single individuals showed some apomixis, whereas predominant apomixis occurred in cytotypes with higher ploidies (5x to 8x cytotypes; Dobes et al., 2013). Once the high polyploid apomictic lineage is established, heteroploid cross-fertilizations will rather negatively influence fertility of the lower-ploid sexuals, but not the fitness of the higher ploid apomictic plants (Dobes et al., 2018). Similar cytotype distributions and reproductive features were observed, for example, in H. pilosella (=Pilosella officinarum; Mráz et al., 2008), in Paspalum durifolium or in P. ionanthum (Ortiz et al., 2013). Seed abortion in sexuals after heteroploid cross-fertilization versus high female fitness after homoploid crosses, but also induced selfing (mentor effects), can contribute to maintenance of diploid sexual populations (Hörandl and Temsch, 2009).

An important side-effect of polyploidization is the breakdown of self-incompatibility (SI) systems, resulting in self-fertility, as is also well known from sexual polyploids (Comai, 2005; Hörandl, 2010). SI systems act in the stigma and in the style, and have a genetic control independent from embryo sac development by S-alleles (de Nettancourt, 2001). Nevertheless, an important selective mechanism can help to establish polyploid, pseudogamous, self-compatible (SC) clonal lineages: a self-incompatible apomictic plant can neither use its own pollen, nor the pollen of genetically identical clone-mates around, because the S allele configuration will be the same in surrounding clone-mates. In contrast, an apomictic self-compatible pioneer plant can not only use self-pollen for pseudogamy, it can also use pollen of surrounding clone-mates with identical genotypes for seed production (Hörandl, 2010). In this way, the newly formed self-compatible polyploid clone becomes completely independent from pollen of surrounding sexual progenitors (**Figure 2d**). By using self-pollen, an appropriate endosperm balance can also be more easily achieved. Self-fertility with pseudogamy further avoids the negative effects of sexual selfing, namely loss of heterozygosity and inbreeding depression (Hörandl, 2010). Self-fertility is further beneficial for founding new populations by single or few founders, even after long distance dispersal of seeds (Baker, 1967; Hörandl, 2006; Cosendai et al., 2013).
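The clone-mate argument above can be made concrete with a toy rule. The sketch assumes a strongly simplified, gametophytic-style SI system in which pollen is rejected whenever it shares an S allele with the pistil; the allele labels and the function `pollen_accepted` are hypothetical, and real SI systems (especially in polyploids) are more complex.

```python
def pollen_accepted(pollen_alleles, pistil_alleles, self_compatible=False):
    """Toy SI rule: reject pollen that shares any S allele with the pistil.

    A self-compatible plant accepts all pollen regardless of S alleles.
    """
    if self_compatible:
        return True
    return not (set(pollen_alleles) & set(pistil_alleles))

# All members of an apomictic clone carry the same S-allele configuration.
clone_genotype = {"S1", "S2"}
clone_mate_pollen = {"S1"}      # pollen from the plant itself or a clone-mate
sexual_donor_pollen = {"S3"}    # pollen from a genetically distinct sexual plant

# Self-incompatible clone: only pollen from outside the clone is usable.
print(pollen_accepted(clone_mate_pollen, clone_genotype))
print(pollen_accepted(sexual_donor_pollen, clone_genotype))

# Self-compatible clone: pollen of clone-mates suffices for pseudogamy.
print(pollen_accepted(clone_mate_pollen, clone_genotype, self_compatible=True))
```

Under this rule a self-incompatible clone remains dependent on pollen from genetically distinct donors, whereas a self-compatible clone can complete pseudogamy with its own pollen or that of clone-mates.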

Taken together, a couple of internal and external factors have to coincide to combine the different steps of apomixis. Under natural conditions, functional apomictic seed formation in diploids probably requires certain altered ecological conditions, and successful polyploidization for establishment. The rarity of events, which have to be combined, may also cause the low actual frequencies of natural apomixis in angiosperms (see Hojsgaard et al., 2014b for a recent review).

#### THE DIVERSIFICATION PHASE: RANGE EXPANSION AND SPECIATION OF APOMICTS

Once an apomictic polyploid population is established, its survival will depend upon the neopolyploids' capacity to either outcompete parental diploids, or move into another habitat. The occupation of a novel ecological niche is in many cases a side-effect of polyploidy. As discussed above, polyploidy per se alters many physiological and cellular features, which may be advantageous in a novel environment. This aspect has attracted much attention in the past, and some authors have seen the ecological potential of polyploidy as the main factor for the wide distribution of some apomicts (Bierzychudek, 1985). Niche shifts of polyploids compared to diploids have been documented in allopolyploid Crataegus (Coughlan et al., 2017), but also in the autopolyploids R. kuepferi (Kirchheimer et al., 2016, 2018) and Paspalum intermedium (Karunarathne et al., 2018). These non-hybrid apomicts do not show a pronounced genetic diversification, but nevertheless managed to occupy habitats outside the ecological range of the diploids. The evidence indicates that polyploid apomicts behave as generalists and are less competitive than specialist diploids in their native range, but they are more competitive in the peripheral areas of parental diploids (Karunarathne et al., 2018), a condition that may be a prelude to ecological differentiation between cytotypes. Together with inherent biological features of apomixis (i.e., reproductive assurance and clonality), ecological niche shift is another important factor for geographical range expansions and diversification of apomicts. If the new apomictic polyploids cannot adapt to novel environmental conditions, their evolutionary potential would likely be restricted. Mau et al. (2015) found widespread niche conservatism between sexuals and apomicts, and found that ploidy is a stronger driver for niche divergence compared to reproductive mode in diploid-triploid cytotypes of Boechera, but substantial variation in directions of niche differentiation was found among species. Thus, according to Mau et al. (2015), homoploid apomicts in Boechera are trapped in the ecological niches of their sexual ancestors.

### Reproductive Assurance and Clonality

Self-fertility and apomixis enable plants to form new populations starting with a single individual (Baker, 1955). Thus, plants benefit twice from apomixis. On the one hand, apomixis allows for founder events and spread of species' populations after seed dispersal. On the other hand, apomixis creates clonal offspring and hence it multiplies genotypes, and it is likely that the fittest ones locally will be established by differential seed sets and better competing aptitudes. This double advantage provides plants with better colonizing abilities and both have a relevant impact on genetic variation at population level and in the biogeographic distribution of cytotypes. These combined features contribute to observed patterns of geographical parthenogenesis (e.g., Kearney, 2005; Hörandl, 2006).

#### Geographical Parthenogenesis

Asexual animals and plants often have larger distribution areas than their sexual relatives (Kearney, 2005). In plants, the strong co-occurrence of apomixis and polyploidy made it difficult to disentangle their effects on colonization patterns (Bierzychudek, 1985; Hörandl, 2006). In the exceptional case of the Boechera holboellii complex, Mau et al. (2015) found stronger evidence for ploidy driven ecological-niche divergence rather than for reproductive systems. Despite the fact that this observation contradicts general and well-supported patterns of geographic parthenogenesis, all other studies suggest apomixis might speed up dispersal and facilitate range expansions in those polyploids. In the alpine species R. kuepferi, diploid cytotypes remained in their refugial area in the southwestern Alps, while tetraploids colonized the whole Alps, the Apennines, and Corsica (Cosendai and Hörandl, 2010). Only tetraploids managed to adapt to higher elevations and a colder climatic niche (Kirchheimer et al., 2016; Schinkel et al., 2017). A simulation study of recolonization of the Alps revealed strong combinational effects of niche differentiation and mode of reproduction for tetraploids (Kirchheimer et al., 2018). Similarly, in prairies and grasslands that were not associated with ice-sheet covers during the last glaciation, species also depict patterns of unequal geographic distribution between sexual and apomictic cytotypes. In the grass species P. intermedium, diploid cytotypes are less geographically expanded than tetraploid cytotypes and located in northern, climatically milder areas within the distribution of the species in South America (Karunarathne et al., 2018). Tetraploids occupied southern areas by better coping with less productive and harsher environmental variation (Karunarathne et al., 2018). Thus, in most plant systems, apomixis promotes range expansions by exploiting the advantages of clonality and polyploidy.

#### Population Differentiation and Speciation

As apomixis freezes genetic variation and reduces genotype variability, populations are expected to evolve independently by reduced gene flow. With time they may evolve into populations holding gene pools that are differentiated enough to avoid hybridization via pre- and postzygotic barriers, and geographically distant (isolated) populations may become subspecies or new species (e.g., Arrigo and Barker, 2012). In fact, morphological differentiation is observed within apomictic complexes, whereby facultative apomixis in hybrid species can promote (slow) divergent selection and formation of microspecies (i.e., a genetically homogeneous apomictic lineage with a particular morphology) (e.g., Burgess et al., 2014), causing severe problems in taxonomy. For a detailed analysis of how to treat apomictic taxa and which species concepts to apply, refer to Haveman (2013), Majeský et al. (2017), and Hörandl (2018). An alternative would be to have a reversal to sexuality in one of those widespread populations (Hörandl and Hojsgaard, 2012). A shift back to sexual reproduction in a population distant from the parental sexual species of the polyploid apomicts would also allow for independent evolution, the accumulation of genetic and morphological changes and the acquisition of pre- and postzygotic barriers to gene flow (Hojsgaard and Hörandl, 2015).

#### NATURAL VS. CULTIVATED APOMICTIC PLANT SYSTEMS

Apomictic plant systems produce clones from seeds, a trait that offers enormous potential for the development of cultivars specifically suited to livestock pastures. Natural apomictic populations are dynamic. Evidence suggests that they are often founded by a single individual that multiplies that genotype to establish a small local population. Over time, likely as a response to local edaphic conditions and environmental variation, the population acquires variation mainly due to residual sexuality and spontaneous mutations. Thus, natural apomictic populations are genotypically diverse (e.g., Daurelio et al., 2004; Paun et al., 2006).

Cultivated crops consist of genetically highly uniform individuals, usually derived from crosses between inbred lines or highly selected heterozygous plant materials. In this sense, apomictic natural populations with genetically uniform clone-mates and pedigree cultivars of apomictic forage grasses are qualitatively similar, and to some extent comparable to cultivated crop fields, with variable levels of heterozygosity and a contrasting type of reproduction involved in the formation of the next generation of seeds. Plant reproductive processes, flowering time, formation of floral organs, pollen viability, etc., are strongly influenced by climatic conditions and can affect seed yields and crop performance (Hampton et al., 2016). Climatic factors govern crop growth and development and are subject to spatial changes in both direction and magnitude, making fine-scale analyses necessary to discern the differential spatial responses of crops to climate variability and their impacts on crop yields (Kukal and Irmak, 2018). Therefore, genetically pure sexual crops and apomictic cultivars are expected to have similar responses to variable environmental conditions, which might help us understand the short-term effects of ecological and climatic changes on seed production and crop yields, and estimate potential climate-influenced crop yield gain/loss expected from the transfer of apomixis to main crops. Since most apomicts are facultative, in both natural clonal lineages and commercial forage cultivars, the formation of low proportions of recombinant progeny is expected to increase genetic heterogeneity and inbreeding, which may decrease forage productivity. Therefore, at least in forage breeding, a proper reproductive characterization of the selected material using multiple experimental approaches can benefit both breeding programs, by knowing potential rates of hybridization and trait introgression, and field management strategies, by estimating proportions of non-maternal offspring expected per generation (Hojsgaard et al., 2016).

Most natural apomicts maintain the pollen function for pseudogamy. For apomictic crops, this means that they could act as pollen donors and introgress into adjacent sexual crop stands (van Dijk and van Damme, 2000). To avoid introgression of apomictic variants into sexual ones, apomictic crops would have to be pollen-sterile. However, natural systems show us that the pollen function can only be abandoned with autonomous apomixis, which occurs mostly in plants with weak or no endosperm formation (see above). Hence, breeding strategies for pollen-sterile apomictic cultivars may be most useful for forage plants where the major interest of the farmer is in vegetative growth. But for crop plants which are mostly cultivated for their seed yield, the need of pseudogamy for proper endosperm formation requires functional pollen. Breeding strategies aiming at enforced autogamy (e.g., within cleistogamous spikelets or flowers) in crop variants may help to overcome this problem.

Most domesticated crops are highly dependent on use of high inputs (e.g., fertilizers and herbicides), and therefore an escape from cultivation is less likely than an introgression event to a wild relative species (e.g., Arnaud et al., 2003; Uwimana et al., 2012). In contrast, apomictic forage crops are less dependent on inputs and can both introgress a wild relative and escape from cultivation. The same features that make apomictic crops suitable for development of cultivars also make them better invaders, posing a potential threat to biodiversity and an environmental risk. In Latin America, for example, superior Brachiaria grasses for livestock production have been widely adopted, covering approximately 25 million hectares<sup>1</sup>, and have been recorded as invasive in many areas of Brazil (Almeida-Neto et al., 2010; under the name Urochloa spp. in Zenni and Ziller, 2011). Similar cases are recognized worldwide but little is known about the adverse impacts of such invasions on biodiversity. A comparative study on sexual and apomictic invasive plants showed that the latter have similar abilities for ecological niche shifts and establishment in novel, invaded areas as sexual species (Dellinger et al., 2016). A few studied cases show apomictic grasses dominating invaded habitats and displacing native grasslands (Marshall et al., 2012; Dennhardt et al., 2016). Thus, a better understanding of the origin and dynamics of natural apomictic populations, as well as the variation in the expression of residual sexuality and other sources of genetic variation, can help identify and target effective management actions for apomictic crops, which might currently contribute to the sustainable intensification of forage-based systems.

<sup>1</sup>https://ciat.cgiar.org/

#### CONCLUSION AND OUTLOOK FOR FUTURE STUDIES

Apomixis is a complex, developmental trait expected to have an enormous impact in plant breeding if introduced into main crops, both shortening the time required to develop a new variety and increasing revenues. Currently, apomixis is exploited for the creation of forage cultivars but reports documenting, e.g., lack of genetic homogeneity and genetic erosion in those cultivars are not available. Although eclipsed for a long time, the mechanisms responsible for the rise and dynamics of apomixis in natural plant populations, the potential outcomes of apomixis in natural systems and its role in plant evolution are starting to be disentangled. The knowledge about the genetic and ecological factors governing developmental interactions between meiotic and apomictic pathways, as well as the population dynamics at local and regional scales can not only help us to decipher the strategies plants use to respond and adapt to the environment, but it also provides valuable information to use on apomictic crop management and production practices.

#### DATA AVAILABILITY

No original datasets were produced for this article.

#### AUTHOR CONTRIBUTIONS

All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.

#### FUNDING

Basic research to this manuscript was funded by the German Research Foundation Deutsche Forschungsgemeinschaft (DFG) (projects Ho 4395/4-1, Ho 4395/1-2 to EH and project Ho5462/1-1 to DH).

#### ACKNOWLEDGMENTS

We thank the series editors Emidio Albertini and Fulvio Pupilli for the invitation to present this review article, and reviewers for their valuable comments on the manuscript. We acknowledge support by the Open Access Publication Funds of the Göttingen University.





**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Hojsgaard and Hörandl. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Boechera Genus as a Resource for Apomixis Research

Vladimir Brukhin<sup>1,2</sup>\*, Jaroslaw V. Osadtchiy<sup>2</sup>, Ana Marcela Florez-Rueda<sup>3</sup>, Dmitry Smetanin<sup>3</sup>, Evgeny Bakin<sup>1,4</sup>, Margarida Sofia Nobre<sup>3</sup> and Ueli Grossniklaus<sup>3</sup>

<sup>1</sup> Theodosius Dobzhansky Center for Genome Bioinformatics, St. Petersburg State University, Saint Petersburg, Russia, <sup>2</sup> Department of Plant Embryology and Reproductive Biology, Komarov Botanical Institute RAS, Saint Petersburg, Russia, <sup>3</sup> Department of Plant and Microbial Biology, Zürich-Basel Plant Science Center, University of Zurich, Zurich, Switzerland, <sup>4</sup> Bioinformatics Institute, Saint Petersburg, Russia

The genera Boechera (A. Löve et D. Löve) and Arabidopsis, the latter containing the model plant Arabidopsis thaliana, belong to the same clade within the Brassicaceae family. Boechera is the only genus among the more than 370 genera in the Brassicaceae in which apomixis is well documented. Apomixis refers to asexual reproduction through seed, and a better understanding of the underlying mechanisms has great potential for applications in agriculture. The Boechera genus currently includes 110 species (of which 38 are reported to be triploid and thus apomictic), which are distributed mostly in North America. The apomictic lineages of Boechera occur at both the diploid and triploid level and show signs of a hybridogenic origin, resulting in a modification of their chromosome structure, as reflected by alloploidy, aneuploidy, substitutions of homeologous chromosomes, and the presence of aberrant chromosomes. In this review, we discuss the advantages of the Boechera genus for the study of apomixis, and consider its modes of reproduction as well as the inheritance and possible mechanisms controlling apomixis. We also consider population genetic aspects and a possible role of hybridization at the origin of apomixis in Boechera. The molecular tools available to study Boechera, such as transformation techniques, laser capture microdissection, and transcriptome analysis, are also discussed. We survey available genome assemblies of Boechera spp. and point out the challenges of assembling the highly heterozygous genomes of apomictic species. Because of these challenges, we argue for the application of an alternative, reference-free method for the comparative analysis of such genomes, provide an overview of genomic sequencing data in the genus Boechera suitable for such analysis, and provide examples of its application.

Keywords: genome assembly, Boechera, apomixis, apomeiosis, diplospory, pseudogamy, genomics, heterozygosity

### GAMETOPHYTIC APOMIXIS AND ITS RELEVANCE TO AGRICULTURE

Apomixis is defined as the asexual reproduction through seeds and results in the formation of genetically uniform progeny (Nogler, 1984a; Asker and Jerling, 1992; Grossniklaus, 2001; Bicknell and Koltunow, 2004; Van Dijk, 2009; Kotani et al., 2014). During sexual reproduction, egg and central cell – the gametes of the reduced female gametophyte (embryo sac) – each get fertilized by one sperm cell to produce the embryo and endosperm, respectively (Dresselhaus et al., 2016).

#### Edited by:

Fulvio Pupilli, Italian National Research Council (CNR), Italy

#### Reviewed by:

Andrea Mazzucato, Università degli Studi della Tuscia, Italy; Sònia Garcia, Spanish National Research Council (CSIC), Spain; Amal Joseph Johnston, Universität Heidelberg, Germany

> \*Correspondence: Vladimir Brukhin vbrukhin@gmail.com

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 29 November 2018; Accepted: 14 March 2019; Published: 02 April 2019

#### Citation:

Brukhin V, Osadtchiy JV, Florez-Rueda AM, Smetanin D, Bakin E, Nobre MS and Grossniklaus U (2019) The Boechera Genus as a Resource for Apomixis Research. Front. Plant Sci. 10:392. doi: 10.3389/fpls.2019.00392

In contrast, apomictic embryos are not the result of a fusion of male and female gametes but develop clonally from unreduced maternal cell lineages in the ovule (matroclinous inheritance). Characteristic components of gametophytic apomixis are (i) avoidance of meiosis (apomeiosis), (ii) development of the embryo from an unreduced egg cell without fertilization (parthenogenesis), and (iii) formation of functional endosperm either autonomously or by fertilization of the central cell (pseudogamy) (Koltunow, 1993; Grossniklaus, 2001; Koltunow and Grossniklaus, 2003; Hand and Koltunow, 2014).

A drawback of sexual propagation is the segregation of advantageous traits in subsequent generations, such that progeny can lose the advantageous gene combinations of their parents (Spillane et al., 2004; Brukhin, 2017). The study of apomixis has drawn greater interest over the last two decades because of its potential to fix agriculturally valuable characteristics over many generations. The introduction of apomixis into crop plants would allow the long-term fixation of complex genotypes, including those of F1 hybrids often used in agriculture. This would facilitate crop breeding and hybrid seed production and could greatly benefit subsistence farmers by providing them access to high-yielding hybrid crops (Grossniklaus et al., 1998; Spillane et al., 2004; Conner and Ozias-Akins, 2017). Many possible uses of apomixis in agriculture have been proposed and its importance for sustainability and food security has been recognized (Karpechenko, 1935; Jefferson, 1994; Toennissen, 2001; Grossniklaus et al., 2003; Spillane et al., 2004; Conner and Ozias-Akins, 2017). Unfortunately, almost no natural gametophytic apomicts have been found among major crop cultivars and the introgression of apomixis from wild apomictic relatives has so far been unsuccessful (Savidan, 2000).

As an alternative to introgression, genes relevant to apomixis could be identified either in sexual model systems, by finding mutants displaying components of apomixis, or by isolating the relevant genes from an apomictic species (e.g., Grossniklaus, 2001; Pupilli and Barcaccia, 2012; Rodriguez-Leal and Vielle-Calzada, 2012; Barcaccia and Albertini, 2013; Conner and Ozias-Akins, 2017). For the latter approach, an important question is how to choose a convenient apomictic model plant that will allow the molecular mechanisms underlying the components of apomixis to be deciphered. Three apomictic genera that have been studied in depth, Hieracium, Paspalum and Pennisetum, have large genomes and are polyploid, which is true for the vast majority of currently known apomicts (Asker and Jerling, 1992; Carman, 1997; Hojsgaard et al., 2014). Although these features complicate molecular genetic studies, research in these natural apomicts has greatly contributed to progress in the field (reviewed in e.g., Ortiz et al., 2013; Bicknell et al., 2016; Conner and Ozias-Akins, 2017). In contrast to the natural apomicts mentioned above, Boechera spp. have a relatively small genome (∼170–230 Mb) and Boechera is the only known genus where apomixis is found at the diploid level in the wild (Böcher, 1951; Dobeš et al., 2004b; Sharbel et al., 2005; Voigt-Zielinski et al., 2012). In addition, Boechera spp. are close relatives of the model plant Arabidopsis thaliana, which is very well studied in terms of molecular genetics and functional gene annotation. However, the genomes of apomictic accessions of Boechera are characterized by extremely high heterozygosity, accompanied by alloploidy and aneuploidy that resulted from hybridization events (e.g., Koch et al., 2003; Schranz et al., 2005; Mandáková et al., 2015). This poses challenges for the phased assembly and detailed annotation (reviewed in Hirsch and Buell, 2013) of the genomes of apomictic Boechera accessions.

In this review, we will present the particularities of phylogeny, reproduction, and genetics of the Boechera genus, and discuss strategies for assembly and annotation of the genomes of apomictic Boechera accessions.

### TAXONOMY AND HABITATS OF THE MOST IMPORTANT Boechera SPECIES

The genus Boechera comprises mainly North American species of biennial and perennial herbaceous crucifers, characterized by a base chromosome number of n = 7. Previously, these species were included in the genus Arabis L., from which they were excluded based on the difference in the base chromosome number (Löve and Löve, 1976), which is n = 8 in Arabis spp. Molecular genetic data confirmed the difference between the two genera. It was shown that the similarity between them is convergent and that they represent two evolutionarily independent lineages in Brassicaceae (Al-Shehbaz, 2003). Recently, the taxonomy of the genus Boechera has been further developed using molecular markers. Currently, 110 species have been described within the genus; 71 of them are diploid and presumably sexual, although diploid apomicts have also been described, and 38 are reported to be apomictic triploids of hybridogenic origin (Windham and Al-Shehbaz, 2006, 2007a,b). Thus, Boechera is the fifth largest genus within the Brassicaceae.

Most studies on the reproductive biology of Boechera involve just a small number of species. These are the widely distributed sexual diploid Boechera stricta (A. Gray) A. Löve & D. Löve (**Figure 1**), the sexual and apomictic plants previously known under the name Boechera holboellii (sensu lato, s. l.) (Hornem.) A. Löve & D. Löve, and apomicts of a hybridogenic origin previously referred to as Boechera divaricarpa (A. Nelson) A. Löve & D. Löve (Windham and Al-Shehbaz, 2007b). Recently, several studies also used Boechera gunnisoniana (Rollins) W.A. Weber (Taşkin et al., 2003, 2004, 2009a; Schmidt et al., 2014; Kirioukhova et al., 2018). The remaining species were mainly included in studies that examined particular aspects of apomixis across a large, geographically broad set of species (Aliyu et al., 2010; Corral et al., 2013; Mau et al., 2013, 2015).

Until recently, the Pleistocene relict B. holboellii (Arabis holboellii) was treated in a broad sense as a species with a scattered range (i.e., consisting of several geographically isolated areas due to reasons of historical nature) (Böcher, 1951). However, recent taxonomic studies using molecular markers showed that it is a polyphyletic, artificial taxon, including a number of distinct species (Windham and Al-Shehbaz, 2006; Alexander et al., 2013). At present, B. holboellii is considered in the narrow sense as plants growing in Greenland. It includes sexual and apomictic diploid and triploid forms, and the latter, unlike the North American species, appear to be autotriploids (Windham and Al-Shehbaz, 2006).

FIGURE 1 | Boechera stricta grown in the greenhouse of the Department of Plant and Microbial Biology of the University of Zurich.

The continental North American accessions, which previously were included in B. holboellii, are distinguished as a series of individual species that form an agamic complex (Stebbins, 1950). The basis of this complex consists of four diploid sexual species, in which, however, the presence of facultative apomixis cannot be excluded: Boechera collinsii (Fernald) A. Löve & D. Löve, Boechera pendulocarpa (A. Nelson) Windham & Al-Shehbaz, Boechera polyantha (Greene) Windham & Al-Shehbaz, and Boechera retrofracta (Graham) A. Löve & D. Löve. The remaining species are triploid apomicts of hybridogenic origin that are morphologically very similar to the parental sexual species: B. consanguinea (retrofracta × fendleri), B. goodrichii (retrofracta × gracilipes), Boechera grahamii (stricta × collinsii), B. pauciflora (sparsiflora × retrofracta), B. pinetorum (rectissima × retrofracta × sparsiflora), Boechera quebecensis (holboellii × stricta), and B. tularensis (retrofracta × rectissima × stricta) (Windham and Al-Shehbaz, 2007a,b). B. quebecensis is distributed in isolated areas of northeastern North America, implying the presence of one of its putative parents (Greenlandic B. holboellii s. s.) on the North American continent in the past.

The apomictic B. divaricarpa is probably the most problematic species in the genus from a taxonomic viewpoint. Traditionally, a large diversity of hybrids involving B. stricta as one of the parents (including B. stricta × B. holboellii s. l.) were referred to as B. divaricarpa in many articles on the reproductive biology of the genus Boechera. Such careless use of the name is a potential source of confusion. As Windham and Al-Shehbaz (2007b) state, the correct use of the name B. divaricarpa should be restricted to plants containing genomes of B. stricta and B. sparsiflora. For hybrids of B. stricta × B. collinsii, the name B. grahamii should be used. The hybrids of B. stricta × B. holboellii s. s. should be referred to as B. quebecensis. In cases where the second parent of the hybrid is uncertain, the name "B. divaricarpa" should be avoided and replaced by "B. stricta hybrid."

In terms of prospective models for the study of apomixis, B. gunnisoniana deserves attention. It is a triploid species of presumably hybridogenic origin with the diploid sexual species B. oxylobula and B. thompsonii (=B. pallidifolia) as parents (Mateo de Arias, 2015). It is characterized by almost obligate pseudogamous apomixis (Roy, 1995; Taşkin et al., 2004; Schmidt et al., 2014), a small plant size, and relatively fast development (approximately 4 months from planting to seed).

Although the vast majority of species of the genus Boechera grow in North America, the occurrence of two putative Boechera species in the Russian Far East has been reported, representing an example of East Asian/North American floristic disjunction. B. falcata (Turcz.) Al-Shehbaz from the Russian Far East is closely related to the well-known North American apomicts (Boechera s. s.) based on molecular markers (Al-Shehbaz, 2005; Kiefer et al., 2009; Alexander et al., 2013), and a more detailed study of this species with respect to the potential presence of apomixis is of great interest. The other species, Borodinia (=Boechera?) macrophylla (Turcz.) German, is endemic to the Baikal region and the Russian Far East. Recent molecular genetic studies showed its close relationship with seven Boechera species from the Eastern United States (Al-Shehbaz and German, 2010; Alexander et al., 2013).

### ADVANTAGES OF THE Boechera GENUS FOR THE STUDY OF APOMIXIS

Over the last decade, various species of the genus Boechera have been adopted as a model to study the molecular basis of apomixis, in addition to its well-established role as a study system in evolutionary ecology (reviewed in Rushworth et al., 2011). Among the advantages of Boechera spp. as a model are:

(i) Its close relationship to the model plant A. thaliana (L.) Heynh. (Huang et al., 2016), for which extensive molecular genetic resources are available, whose genome is fully sequenced and very well annotated, and in which many genes required for reproduction are known, facilitating the search for genes involved in the control of apomixis in Boechera spp.;


### CYTO-EMBRYOLOGICAL STUDIES IN THE GENUS Boechera

The first detailed cyto-embryological studies of apomixis in the genus Boechera were undertaken by the Danish botanist Tyge W. Böcher (1947, 1951, 1954, 1969). He discovered the presence of apomixis in B. holboellii s. l. (referred to by him as Arabis holboellii) in diploid and triploid plants, and described megasporogenesis and microsporogenesis in a number of sexual and apomictic Boechera accessions from Greenland and North America. Particularly remarkable was his description of forms with varying degrees of chromosome synapsis in meiotic prophase. He also noted the presence of plants with different ploidy levels (mainly 2n and 3n, rarely 4n, 5n, and 6n) and aneuploids (2n = 16, 22, 23, and 30), and assumed a hybrid nature for the latter (Böcher, 1954).

Nearly 50 years later, a Dutch-Russian team performed embryological studies of cleared specimens of B. holboellii s. l. (accessions from Greenland and Colorado) using differential interference contrast microscopy (DIC) together with a flow cytometric seed screen (FCSS) analysis (Naumova et al., 2001). The presence of meiotic and apomeiotic events during megasporogenesis was demonstrated (**Figures 2–4**). By screening a large number of cleared ovules, the formation of an unreduced embryo sac through diplospory and parthenogenetic development of the embryo in apomictic Boechera accessions were confirmed. In sexual accessions, embryo sac development follows the Polygonum type (Maheshwari, 1950; **Figures 2**, **4**): the diploid megaspore mother cell (MMC) undergoes meiosis to produce a tetrad of haploid megaspores, three of which degenerate while the functional megaspore undergoes three mitotic divisions to give rise to an eight-nucleate, seven-celled embryo sac comprising an egg cell, two synergids, three antipodal cells, and two polar nuclei that fuse to form the homo-diploid nucleus of the central cell. In the anthers, pollen grains develop from the pollen mother cells (PMCs) that undergo meiosis to produce tetrads of microspores. After their separation, each microspore divides asymmetrically into a large vegetative and a smaller generative cell. After pollination, the vegetative cell germinates producing a pollen tube that transports the sperm cells to the embryo sac, while the generative cell divides once more to form two sperm cells that will later fertilize the egg and central cell, giving rise to a 2n embryo and a 3n endosperm, respectively (**Figures 2**, **4**).

In apomictic Boechera accessions, diplosporous apomeiosis of the Taraxacum type (Crane, 2001) is the most common (Böcher, 1951; Naumova et al., 2001; Schmidt et al., 2014; Mateo de Arias, 2015; Windham et al., 2015; **Figures 3**, **4C**). In meiotic diplospory, the embryo sac originates from an MMC that undergoes an aberrant meiosis without chromosome segregation, resulting in the formation of a dyad of unreduced megaspores. A characteristic feature of the MMC is the lack of callose deposition around the cell (Rodkiewicz, 1970; Nogler, 1984a; Carman et al., 1991). In contrast, callose was observed in the cell wall between the two cells of the dyad (**Figure 3D**). Dyad formation is most commonly observed (Roy, 1995; Taşkin et al., 2004; Schmidt et al., 2014); however, rare triads and even tetrads can be found (Schmidt et al., 2014). The chalazal dyad cell undergoes three rounds of mitosis, producing an unreduced eight-nucleate embryo sac that is morphologically similar to the Polygonum type (Crane, 2001; Rojek et al., 2018).

Aposporous apomeiosis was previously thought to be uncommon in Boechera spp., but recent reports described its occurrence in several species. The overwhelming majority of aposporous embryo sacs developed according to the Hieracium type (Crane, 2001). The sexual MMC in this case might degenerate or undergo meiosis, as was observed in rare instances in B. microphylla (Carman, 2007; Mateo de Arias, 2015; Carman<sup>1</sup>). In B. retrofracta × stricta hybrids, diplospory of the Antennaria type (Crane, 2001) was also observed, although rarely (Carman, see footnote 1). In a FCSS, it was also shown that the percentage of mature sexual seeds (in relation to apomictic seeds) in all studied apomicts was significantly lower than the percentage of morphologically normal meiotic tetrads (in relation to apomeiotic events). This implies that meiosis proceeds abnormally in most cases, resulting in inviable seeds, while apomixis serves as an "escape from sterility" (Mateo de Arias, 2015).

<sup>1</sup>https://reeis.usda.gov/web/crisprojectpages/1000552-cytological-andmolecular-characterizations-of-reproduction-in-sexual-and-apomicticboechera-brassicaceae.html

FIGURE 3 | Apomeiotic megasporogenesis in B. holboellii s. l. (A) megaspore mother cell (MMC); (B) diplosporous dyad (Dy); (C) uninucleate diplosporous embryo sac (DES) with remnants of the "megaspore" ("MS"); (D) callose in the cell wall of a diplosporous dyad (B,C – Naumova et al., 2001; A,D – Osadtchiy et al., 2017).

In most apomicts, pollen development is unaffected. In Boechera spp., however, apomeiosis also occurs during microsporogenesis (Böcher, 1951; Taşkin et al., 2009a). In triploid apomicts, meiosis I fails as the chromosomes are unable to pair correctly at pachytene. The chromosomes migrate to opposing poles of the PMC and decondense. After cytokinesis, the dyad, unlike the meiotic tetrad, is enclosed by a callose wall. Chromosome synapsis with the formation of bi- and trivalents at metaphase I occurs in B. holboellii s. l., whereas apomeiosis in B. gunnisoniana is completely asynaptic. Investigation of microsporogenesis in apomictic triploid B. holboellii s. l. and B. gunnisoniana showed that, in the triploids, the majority of pollen grains are unreduced and formed through apomeiotic dyads (98% in B. holboellii s. l., 90% in B. gunnisoniana), while the rest of the pollen was formed through (partially abnormal) meiosis, resulting in tetrads or triads of microspores (Taşkin et al., 2009a), or sometimes even in monads in triploid B. holboellii s. l. (Böcher, 1951). In diploid apomicts, variability in apomeiosis is higher. Within the B. holboellii complex, pairing and cross-over events can occur normally at pachytene in some accessions, resulting in reduced pollen, whereas trivalents and even quadrivalents can be formed in others. Some accessions showed mostly diploid pollen formation, while others displayed evidence of haploid and diploid pollen (Kantama et al., 2007; Taşkin et al., 2009a; Kirioukhova et al., 2018). In fact, it has been observed that both diploid and triploid apomicts can produce reduced and unreduced pollen in varying proportions (Böcher, 1951; Voigt et al., 2007; Aliyu et al., 2010; Voigt-Zielinski et al., 2012).

In sexually reproducing Boechera spp., double fertilization occurs, while in apomicts only fertilization of the polar nuclei takes place. However, fully autonomous apomixis can also be found in rare cases (Matzk et al., 2000; Naumova et al., 2001); it is more often observed in triploids (at frequencies of up to 15%) than in diploids (1.33% at most) (Aliyu et al., 2010). The formation of embryos with a doubled set of chromosomes as a result of the fusion of unreduced male and female gametes can also occur as a rare event (Naumova et al., 2001). An important conclusion of the FCSS-based study was that, in apomicts, all mature seeds were derived from unreduced female and male gametes (Naumova et al., 2001; **Table 1**).

TABLE 1 | The reproductive modes in B. holboellii s. l. accessions (based on cyto-embryological investigation of ovules and FCSS) (from Naumova et al., 2001).


<sup>a</sup>Unreduced embryo sac (ES), parthenogenetic embryo, and pseudogamous endosperm development. <sup>b</sup>Unreduced ES, parthenogenetic embryo, and autonomous endosperm development. <sup>c</sup>Unreduced ES and double fertilization by unreduced pollen.

In apomicts, endosperm ploidy varies according to the ploidy of the sperm cells, although the most common outcome is a 2 maternal:1 paternal (2m:1p) genome ratio. Exceptions, although at lower frequencies, do exist, indicating that there is some degree of flexibility or, at least, that the system is "leaky" (Aliyu et al., 2010).
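The ploidy expectations that underlie FCSS-based inference can be made explicit with a small calculation. The following Python sketch is illustrative only and is not taken from the cited studies. It assumes that the central cell contains two polar nuclei of the same ploidy as the egg cell, that a single sperm contributes to each fertilized cell, and that ploidy is counted in multiples of the base chromosome set (x). All function and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SeedPloidy:
    embryo: int      # embryo ploidy, in multiples of x
    endosperm: int   # endosperm ploidy, in multiples of x
    maternal: int    # maternal genome copies in the endosperm
    paternal: int    # paternal genome copies in the endosperm


def expected_seed_ploidy(mother_ploidy: int, sperm_ploidy: int,
                         apomeiotic: bool, egg_fertilized: bool,
                         central_cell_fertilized: bool) -> SeedPloidy:
    """Expected embryo and endosperm ploidy under a simplified model."""
    if not apomeiotic and mother_ploidy % 2:
        # Simplification: regular reduction is only modelled for even ploidy.
        raise ValueError("reduced gametophytes are only modelled for even ploidy")
    # Apomeiosis keeps the maternal ploidy; meiosis halves it.
    egg = mother_ploidy if apomeiotic else mother_ploidy // 2
    # Assumption: two polar nuclei fuse to form the central cell nucleus.
    central_cell = 2 * egg
    embryo = egg + (sperm_ploidy if egg_fertilized else 0)
    paternal = sperm_ploidy if central_cell_fertilized else 0
    return SeedPloidy(embryo, central_cell + paternal, central_cell, paternal)


# Sexual diploid: reduced embryo sac, double fertilization by reduced sperm.
sexual = expected_seed_ploidy(2, 1, apomeiotic=False,
                              egg_fertilized=True, central_cell_fertilized=True)
# Pseudogamous diploid apomict pollinated with unreduced (2x) pollen.
pseudogamous = expected_seed_ploidy(2, 2, apomeiotic=True,
                                    egg_fertilized=False, central_cell_fertilized=True)
# Autonomous endosperm development: neither cell is fertilized.
autonomous = expected_seed_ploidy(2, 2, apomeiotic=True,
                                  egg_fertilized=False, central_cell_fertilized=False)
```

Under these assumptions, a sexual diploid yields a 2x embryo with a 3x endosperm, whereas a pseudogamous diploid apomict fertilized by unreduced sperm yields a 2x embryo with a 6x endosperm, which retains the 2m:1p ratio. Fertilization of the same central cell by a reduced sperm would instead give a 5x endosperm and a deviating 4m:1p ratio.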

A large-scale FCSS covering 16 Boechera spp. revealed a wide variability in reproductive mode within diploid genotypes, ranging from obligate sexual to nearly obligate apomictic. By assessing the percentage of apomeiosis, sexual and parthenogenetic embryo formation, as well as sexual, pseudogamous, and autonomous endosperm development, it was shown that all investigated facultative apomicts of the same genotype had either a very low (1–3%) or a very high (87–99%) percentage of apomeiosis, and individuals with intermediate frequencies were not observed. Furthermore, all triploids were found to be obligately apomeiotic. A genotype-specific correlation between apomeiosis on the one hand, and parthenogenesis combined with pseudogamous or autonomous endosperm development on the other hand, showed that frequencies of the latter never exceeded the frequency of apomeiosis (Aliyu et al., 2010). This may indicate a close relationship of their genetic control and a key role of apomeiosis for all subsequent stages of apomictic development.

#### POPULATION GENETIC STUDIES IN THE GENUS Boechera WITH RESPECT TO APOMIXIS

Apomictic and sexual lineages within the genus Boechera can inter-cross (Schranz et al., 2005; Alexander et al., 2013). Distinct evolutionary forces are expected to drive the evolution of lineages that differ in their reproductive modes. In sexual lineages, recombination increases the probability of elimination of deleterious mutations (Hill-Robertson effect, Hill and Robertson, 1966; Felsenstein, 1976). Apomictic lineages, in contrast, reproduce asexually and do not undergo recombination; thus they cannot recover adaptive alleles once deleterious mutations occur within these alleles (Charlesworth, 2008). Therefore, one expects the accumulation of deleterious alleles, a phenomenon known as Muller's ratchet (Muller, 1964; Charlesworth and Charlesworth, 1997). Recent comparisons of apomictic and sexual lineages in Boechera spp. have supported these population genetic expectations (Lovell et al., 2013, 2014, 2017). Lovell et al. (2017) used apomictic and sexual populations of B. spatifolia and investigated patterns of nucleotide variation across both reproductive modes through whole-genome sequencing. They found an elevated sequence diversity and heterozygosity, together with an increased mutation accumulation, in apomictic populations (Lovell et al., 2017). Likewise, in a larger survey of 37 natural populations of four Boechera spp. (B. stricta, B. retrofracta, B. polyantha, and B. pendulocarpa), microsatellite markers showed the same trend (Lovell et al., 2013): higher levels of heterozygosity were found in apomicts compared to sexuals, independent of the ploidy level of the apomict.
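The logic of Muller's ratchet can be illustrated with a toy simulation. The Python sketch below is not derived from the Boechera studies cited here. It assumes a strictly clonal population of constant size, one new deleterious mutation per offspring with a fixed probability, no back mutation, and multiplicative fitness effects; all parameter names and values are arbitrary illustrations.

```python
import random


def simulate_ratchet(pop_size=200, generations=300, mutation_rate=0.3,
                     selection=0.02, seed=1):
    """Track the mutation load of the least-loaded clone over time.

    Each individual is represented only by its number of deleterious
    mutations. Reproduction is clonal, so offspring inherit the full
    load of a single parent.
    """
    rng = random.Random(seed)
    loads = [0] * pop_size
    min_load_history = []
    for _ in range(generations):
        # Fitness declines multiplicatively with each mutation carried.
        weights = [(1.0 - selection) ** k for k in loads]
        # Parents are sampled with replacement in proportion to fitness.
        parents = rng.choices(loads, weights=weights, k=pop_size)
        # Offspring may gain one new mutation; none can be removed
        # because there is neither recombination nor back mutation.
        loads = [k + (1 if rng.random() < mutation_rate else 0) for k in parents]
        min_load_history.append(min(loads))
    return min_load_history


history = simulate_ratchet()
```

Because no offspring can carry fewer mutations than its parent, the minimum load in `history` can only stay constant or increase. Each increase corresponds to one "click" of the ratchet, that is, the irreversible loss of the least-loaded class.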

In apomictic lineages, evolution occurs due to both genetic drift and natural selection (Charlesworth and Wright, 2001; Glémin et al., 2006; Brukhin and Baskar, 2019). The lower efficiency of selection expected in apomictic lineages would lead to an increased extinction risk of apomicts through the accumulation of deleterious alleles and an incapacity to adapt to environmental changes (Darlington, 1958; Muller, 1964; Bengtsson, 2009; Brukhin and Baskar, 2019). Contrary to these expectations, the genus Boechera is highly diverse, including several apomictic lineages (Alexander et al., 2013). A likely explanation for the survival of apomictic Boechera lineages is intra- and interspecific gene flow within the genus (Böcher, 1951; Sharbel and Mitchell-Olds, 2001; Dobeš et al., 2004a,b; Schranz et al., 2005; Beck et al., 2012; Lovell et al., 2013, 2017; Schilling et al., 2018). Gene flow mainly occurs from sexuals to apomicts, while apomicts are able to produce reduced pollen that can pollinate sexual lineages and transfer the dominant factor(s) conferring apomixis. Mutual gene flow between apomictic and sexual lineages may allow introgression of adaptive alleles from sexual into apomictic lineages, as posited by Van Dijk et al. (2009) for Taraxacum spp. Moreover, apomixis in Boechera spp. is facultative, such that individuals may produce both sexual and apomictic offspring (Schranz et al., 2005; Aliyu et al., 2010). These probable instances of sexuality in apomicts may suffice to purge deleterious mutations and restore the fitness of apomictic lineages, securing their evolutionary survival (Van Dijk et al., 2009).

Differences in the strength of natural selection acting on sexual versus apomictic B. spatifolia populations were described by Lovell et al. (2014). The authors analyzed selection gradients by correlating genotypic trait means with relative fitness measurements, and found a reduction in the strength of adaptive evolution in apomictic relative to sexual lineages. Apomictic lineages showed relatively fewer quantitative and molecular genetic differences between populations than sexual lineages. Also, divergence between apomictic populations was not correlated with environmental variation; in contrast, genomic structure and quantitative traits of sexual lineages were highly correlated with latitude, climatic variables, and elevation (Lovell et al., 2014). A common garden experiment revealed that flowering time was under strong selection in high-altitude sites. This is in agreement with studies in B. stricta (Anderson et al., 2011), which showed flowering time to be under directional selection in a study using recombinant inbred lines subjected to lab and field experiments.

Several studies assessed genetic dynamics and the extent of natural selection in Boechera populations. Earlier work on the population dynamics of the sexual species B. fecunda (Song and Mitchell-Olds, 2007) and B. stricta (Song et al., 2009), using sequence data from several nuclear loci and microsatellites, revealed similar levels of polymorphism and population differentiation in both species, regardless of the marked difference between the widespread B. stricta and the endangered, range-restricted B. fecunda. Similarly, studies comparing the widespread species B. stricta and B. latifolia with the rare species B. crandallii and B. vivariensis did not find strong associations between species range size and within-population genetic diversity (Lovell and McKay, 2015). However, the more widespread species exhibited higher phenotypic plasticity and quantitative trait structure (Qst), while the rare species contained stronger signatures of selection, evidenced by higher Qst:Fst ratios, with Fst referring to the fixation index (Lovell and McKay, 2015). Extending the work of Song and Mitchell-Olds (2011) on B. fecunda, Leamy et al. (2014) found regional adaptation through extensive quantitative characterization of populations in Montana (United States) using microsatellite markers. Their analyses of genetic (Fst) and quantitative trait differentiation (Qst) showed evidence for divergent selection acting on water use efficiency and a contribution of the regional environmental conditions to local adaptation. Likewise, Lee and Mitchell-Olds (2011), using microsatellite markers and phenotypic quantitative analyses, demonstrated that water availability was the key environmental variable explaining genetic differentiation between two major genetic groups of B. stricta in Eastern and Western North America.
All of these studies relied on microsatellite data and Fst estimations and should be interpreted with caution as microsatellite markers are not ideal for measuring population differentiation (Balloux and Lugon-Moulin, 2002; Putman and Carbone, 2014). Likewise, the use of Fst as a measure of population differentiation has been criticized (Jost, 2008; Meirmans and Hedrick, 2010; Whitlock, 2011; Jakobsson et al., 2013).
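For readers less familiar with these statistics, the following Python sketch shows how Fst and Qst are defined in their classical textbook forms. It is a minimal illustration and does not reproduce the estimators used in the cited studies. It assumes a single locus, equally weighted populations, known allele frequencies (no sampling correction), and, for Qst, the usual expression for diploid outcrossing organisms. All function names are hypothetical.

```python
def expected_heterozygosity(freqs):
    """Expected heterozygosity 1 - sum(p_i^2) for one set of allele frequencies."""
    return 1.0 - sum(p * p for p in freqs)


def nei_fst(pop_allele_freqs):
    """Fst = (H_T - H_S) / H_T for one locus.

    pop_allele_freqs holds one list of allele frequencies per population;
    every list uses the same allele order and sums to 1.
    """
    n_pops = len(pop_allele_freqs)
    n_alleles = len(pop_allele_freqs[0])
    # H_S: mean expected heterozygosity within populations.
    h_s = sum(expected_heterozygosity(f) for f in pop_allele_freqs) / n_pops
    # H_T: expected heterozygosity of the pooled (mean) allele frequencies.
    mean_freqs = [sum(f[i] for f in pop_allele_freqs) / n_pops
                  for i in range(n_alleles)]
    h_t = expected_heterozygosity(mean_freqs)
    if h_t == 0.0:
        raise ValueError("locus is monomorphic; Fst is undefined")
    return (h_t - h_s) / h_t


def qst(var_between, var_within):
    """Qst = V_between / (V_between + 2 * V_within), additive genetic variances."""
    return var_between / (var_between + 2.0 * var_within)


# Two populations with moderately diverged frequencies at a biallelic locus.
fst_example = nei_fst([[0.8, 0.2], [0.2, 0.8]])
```

A Qst value that clearly exceeds Fst for the same set of populations is conventionally read as a signature of divergent selection on the trait, which is the reasoning behind the Qst:Fst ratios mentioned above. The caveats about Fst raised in the text apply equally to this simplified calculation.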

Whole-genome sequencing and chromosome painting on the same two major genetic groups of B. stricta investigated by Lee and Mitchell-Olds (2011) identified an inversion in Linkage Group 1 of the B. stricta genome (Lee et al., 2017). Populations carrying the inversion had lower polymorphism in Linkage Group 1, lower Tajima's D, and more linkage disequilibrium than populations without the inversion. Furthermore, the inversion had a strong effect on flowering time in near-isogenic lines under greenhouse conditions. These results showed that this inversion has important ecological impacts on the species and that natural selection is driving the differentiation of B. stricta populations in North America (Lee et al., 2017).
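For readers unfamiliar with Tajima's D, the statistic compares two estimates of diversity: the mean number of pairwise differences and the number of segregating sites scaled by sample size. The following sketch computes it from aligned haplotypes using the standard formula of Tajima (1989). The input format (equal-length strings, one per sampled chromosome) and the toy data are assumptions for illustration only and do not reproduce the analysis of Lee et al. (2017).

```python
from itertools import combinations
from math import sqrt


def tajimas_d(haplotypes: list[str]) -> float:
    """Tajima's D for a list of aligned, equal-length haplotype strings."""
    n = len(haplotypes)
    if n < 4:
        raise ValueError("at least four haplotypes are needed")
    if len({len(h) for h in haplotypes}) != 1:
        raise ValueError("haplotypes must be aligned to the same length")

    # Number of segregating sites: alignment columns with more than one state.
    seg_sites = sum(1 for column in zip(*haplotypes) if len(set(column)) > 1)
    if seg_sites == 0:
        raise ValueError("Tajima's D is undefined without segregating sites")

    # Mean number of pairwise differences (pi).
    pairs = list(combinations(haplotypes, 2))
    pi = sum(sum(a != b for a, b in zip(x, y)) for x, y in pairs) / len(pairs)

    # Constants from Tajima (1989).
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i**2 for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1**2
    e1 = c1 / a1
    e2 = c2 / (a1**2 + a2)

    variance = e1 * seg_sites + e2 * seg_sites * (seg_sites - 1)
    return (pi - seg_sites / a1) / sqrt(variance)


if __name__ == "__main__":
    # Toy data: only rare (singleton) variants, which yields a negative D.
    rare_variants = ["1000", "0100", "0010", "0001", "0000"]
    print(tajimas_d(rare_variants))
```

Negative values reflect an excess of rare variants, as expected after a selective sweep or population expansion, whereas positive values reflect an excess of intermediate-frequency variants.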

Hybridization is common between members of the genus Boechera (Böcher, 1951; Sharbel and Mitchell-Olds, 2001; Dobeš et al., 2004a,b; Windham and Al-Shehbaz, 2007a,b). It was reported that hybridization occurs across the whole genus and happened repeatedly and independently (Schranz et al., 2005; Alexander et al., 2013). The earliest molecular evidence supporting hybridization comes from analysis of ITS and chloroplast sequence data, and gene flow between species now known as B. stricta and B. retrofracta was inferred by phylogeographic analyses (Dobeš et al., 2004a,b). It should be noted that conclusions on the hybrid nature of individuals, which are based on a single locus or an organellar genome, may not accurately reflect the history of a clade or population (Doyle, 1992; Maddison et al., 2006). However, Schranz et al. (2005) performed extensive crossing experiments, showing that successful crosses are possible among several members of the genus. This indicates a lack of intrinsic reproductive isolation barriers, and thus the possibility for extensive gene flow among different Boechera species.

Thus far, microsatellite markers have been central in identifying species and putative hybrids in the Boechera genus (Li et al., 2017). Beck et al. (2012) studied Boechera individuals using a set of 13 microsatellites. Hybrids between B. fendleri × B. stricta and B. retrofracta × B. stricta were confirmed using this methodology (Beck et al., 2012). Using similar methods, Lovell et al. (2013) studied 231 individuals from 37 natural populations of four Boechera species (B. stricta, B. retrofracta, B. polyantha, and B. pendulocarpa). They concluded that all triploid individuals were apomictic hybrids. This was not the case for diploid apomictic accessions, which behaved as true species rather than hybrid individuals. Based on these results, it was concluded that hybridization is an indirect correlate of apomixis in the genus Boechera.

With the advent of next generation sequencing technologies, the identification of hybrids is now more refined and precise. By using whole-genome sequencing in B. spatifolia, Lovell et al. (2017) investigated whether apomictic populations had a hybrid origin or not. Analysis of 22,000 haplotype trees across the genome indicated a hybrid origin of the apomictic B. spatifolia accessions. In another study using genotyping-by-sequencing methods, Schilling et al. (2018) assessed genomic variation in 79 individuals of eight Boechera species. Admixture analyses allowed the precise identification of hybrid individuals. This study provided evidence of recent and ancient admixture and variation across species.

#### INHERITANCE AND GENETIC ASPECTS OF APOMIXIS IN THE GENUS Boechera

The seminal work of Nogler in the 1970s had shown that apomixis is genetically controlled (summarized in Nogler, 1984a). Subsequent crossing experiments of apomictic individuals as pollen donors with sexual maternal plants showed that apomixis is inherited as a dominant trait in many species (Grossniklaus et al., 2001). Early studies had indicated that apomixis is inherited as a single dominant locus, for instance in Ranunculus auricomus and Panicum maximum, where apomeiosis and parthenogenesis were found to cosegregate (Savidan, 1982; Nogler, 1984b). However, later studies found that different loci control the developmental components of apomixis, i.e., apomeiosis, parthenogenesis, and formation of functional endosperm, in most apomicts. It was also found that the genomic regions conferring apomixis or apomeiosis exhibit suppressed recombination (reviewed in Grossniklaus, 2001; Grossniklaus et al., 2001; Bicknell and Koltunow, 2004; van Dijk and Vijverberg, 2005; Barcaccia and Albertini, 2013; Hand and Koltunow, 2014; Hand et al., 2015; Brukhin, 2017). Apomixis is also frequently associated with hybridization and resulting polyploidy (Koltunow and Grossniklaus, 2003). The duplicated genomic load might be the cause of the deregulation, in space and time, of genes associated with sexual reproduction (Grimanelli et al., 2001; Grossniklaus, 2001; Spillane et al., 2001; Koltunow and Grossniklaus, 2003; Barcaccia and Albertini, 2013), as the newly formed polyploid hybrid faces the asynchronous expression of genes involved in reproduction (Carman, 1997, 2007; Grimanelli et al., 2001; Grossniklaus, 2001; Van Dijk, 2009). Apomixis, as an escape from sterility, has been speculated to be a transitional period in the evolution of neopolyploids, especially when facultative (Hörandl and Hojsgaard, 2012; Hojsgaard et al., 2014).
Recent data indicate that apomixis is associated with increased diversity (Hojsgaard et al., 2014), suggesting that apomixis may actually contribute to the establishment of new polyploids (Hojsgaard, 2018) and to the diversification of angiosperms (reviewed in Brukhin and Baskar, 2019).

One of the unique features of apomixis in the genus Boechera is that it can occur at the diploid level. Diploid Boechera apomicts are highly heterozygous hybrids (Beck et al., 2012), and recent cytogenetic and population studies of the sexual and apomictic Boechera spp. have shown that these diploid genomes can be complex. Based on marker and ploidy analysis in diverse Boechera species, the emerging model proposes that, first, genetic factors for apomeiosis would arise independently. Such an individual, apomeiotic on the female side only, would stably generate seeds with a 2C embryo and 5C endosperm by self-pollination with reduced pollen. Other individuals might be apomeiotic on the male side only, generating seeds with a variety of ploidies. Reduced pollen from female-apomeiotic individuals would allow crossing with sexual individuals, thereby disseminating the phenotype. Over time, female- and male-apomeiotic individuals would eventually cross, and the resulting seeds with a 2C embryo and 6C endosperm would become stable diploids with unreduced male and female gametes. These diploid apomicts, as they also produce fertile unreduced pollen, are then capable of fertilizing sexual diploids, which could result in triploid apomicts (Lovell et al., 2013).
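The embryo and endosperm C values quoted in this model follow from simple arithmetic, sketched below. The sketch assumes pseudogamous seed formation: the central cell carries two polar nuclei with the same C value as the egg cell and is fertilized by one sperm, while the embryo either arises parthenogenetically from the egg or from fertilization. The function and parameter names are hypothetical and serve only to make the bookkeeping explicit.

```python
def seed_ploidy(egg_c: int, sperm_c: int, parthenogenesis: bool) -> tuple[int, int]:
    """Return (embryo C value, endosperm C value) for one seed.

    Assumes a central cell with two polar nuclei, each with the egg's C value,
    that is fertilized by a single sperm (pseudogamy).
    """
    embryo = egg_c if parthenogenesis else egg_c + sperm_c
    endosperm = 2 * egg_c + sperm_c
    return embryo, endosperm


if __name__ == "__main__":
    # Sexual diploid: reduced egg and reduced sperm.
    print(seed_ploidy(egg_c=1, sperm_c=1, parthenogenesis=False))  # (2, 3)
    # Female-side apomeiosis only: unreduced egg, reduced pollen.
    print(seed_ploidy(egg_c=2, sperm_c=1, parthenogenesis=True))   # (2, 5)
    # Female- and male-side apomeiosis: unreduced egg and unreduced pollen.
    print(seed_ploidy(egg_c=2, sperm_c=2, parthenogenesis=True))   # (2, 6)
```

Under these assumptions, the 2C:5C and 2C:6C embryo-to-endosperm ratios distinguish the two apomeiotic scenarios from the 2C:3C ratio of a sexual diploid seed.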

Metaphase chromosome painting by genomic in situ hybridization demonstrated that all investigated apomictic lineages showed signs of a hybridogenic origin. All were found to be alloploid with a varying number of chromosomes inherited from either B. holboellii s. l. or B. stricta. The structure of their chromosomes was strongly affected by the consequences of hybridization, resulting in aneuploidy, and the replacement of homeologous chromosomes (Kantama et al., 2007). Therefore, these apomictic Boechera spp. are not univocal diploids, rather they have a polyhaploid origin (Sokolov et al., 2011). It should be noted that these cytogenetic data do not exclude the possible existence of true diploids among Boechera spp., although they cast doubt on it. Inheritance of apomixis-related traits has been proposed to be associated with the heterochromatic chromosomes Het, Het', and Del found in apomictic diploids (Kantama et al., 2007). According to Kantama et al. (2007), all diploid apomictic accessions examined had at least four B. stricta chromosomes, including Het and Del, and the combination of these chromosomes might be important for the manifestation of apomixis.

Recent studies have shown that the Het chromosome is the altered homolog of the first chromosome of B. stricta, which underwent an accumulation of pericentromeric heterochromatin, while the Het' + Del pair is the result of Het breakage followed by a pericentric inversion in the Het' chromosome (Mandáková et al., 2015). According to earlier data, in some lineages Del could have resulted from a translocation fusing the proximal segment of the B. stricta chromosome to the distal segment of the B. holboellii s. l. chromosome (Kantama et al., 2007). However, hybridizing sexual and apomictic Boechera accessions failed to produce apomictic progeny, despite the inheritance of the Het chromosome (Schranz et al., 2006). When crossing sexual B. stricta diploids with apomictic B. divaricarpa allodiploids carrying the Het chromosome, the F1 offspring were triploid and had low fertility but were not apomictic despite carrying the Het chromosome. The F2 population displayed an array of ploidy levels and chromosome numbers, and an equally low fertility. The few F3 individuals seemed to maintain the high ploidy of their parents and fertility increased relative to their F1 and F2 ancestors, but did not reach the levels of the individuals used in the original cross. In any case, there were no apomictic progeny produced. Thus, the genetic control of apomixis in Boechera spp. is not limited to the inheritance of aberrant chromosomes (Schranz et al., 2006).

Chromosomal regions with suppressed recombination around apomixis-related genes, often in a hemizygous state and enriched with repeat sequences and transposons, have been found in many phylogenetically distant apomicts, both dicots and monocots (Grossniklaus et al., 2001; Ozias-Akins et al., 2003; Van Dijk et al., 2009; Okada et al., 2011; Kotani et al., 2014). It is assumed that such a recombinationally inert region can contain several linked genes with different functions, the synergistic effects of which could lead to apomictic development. However, in many apomicts, the loci controlling the different components of apomixis are in distinct regions of the genome. In Boechera spp., the most likely candidates to carry such loci are the aberrant chromosomes Het, Het', and Del. Taking into account the hybridogenic nature of Boechera apomicts as a mechanism that triggered the emergence and subsequent evolution of such recombinationally inert blocks bearing apomixis-related genes, hybridization of species with incomplete chromosomal homology may have resulted in the formation of nonrecombinant, hemizygous regions from which such blocks evolved (Sharbel et al., 2010).

Several theories speculate on the mechanisms that control apomixis. Gene mutations have the appeal of the master regulator hypothesis, in which the mutation of a gene upstream of a regulatory cascade would lead to apomeiosis, parthenogenesis, and/or autonomous endosperm development (Koltunow and Grossniklaus, 2003), or the accumulation of mutations in low-recombining regions for each aspect of apomixis, as evidenced in some of the aforementioned apomictic species. There have been various mutants identified in Arabidopsis that lead to apomeiotic phenotypes (Schmidt et al., 2015). Interestingly, they seem divided between cell-cycle regulators/core meiotic genes (Ravi et al., 2008; d'Erfurth et al., 2009, 2010; Zhao et al., 2017), and genes involved in small RNA (sRNA) pathways (Olmedo-Monfil et al., 2010; Schmidt et al., 2011).

While no mutants have yet been studied in Boechera, two loci have been identified which correlate with female and male apomeiosis. The APOmixis Linked Locus (APOLLO), which encodes an Asp-Glu-Asp-Asp-His exonuclease, is downregulated in sexual ovules when they enter meiosis and upregulated in apomeiotic ovules (Corral et al., 2013). APOLLO shows biallelic inheritance with "apo-" and "sex-" alleles. These alleles differ in a 20-nucleotide polymorphism in the 5′ untranslated region of the exonuclease gene. All tested apomictic Boechera accessions were heterozygous for the APOLLO alleles, having at least one apo-allele and one sex-allele, while all sexual genotypes are homozygous for sex-alleles (Corral et al., 2013). APOLLO's male counterpart is the Unreduced Pollen GRAin Development2 (UPGRADE2) locus, which is exclusively expressed in pollen mother cells (PMCs) of apomictic species. It encodes a chimeric long non-coding RNA (lncRNA) with the potential to form stable secondary structures. UPGRADE2 arose from duplication of UPGRADE1, followed by insertion of a functional gene and subsequent exonization, which made it transcriptionally active (Mau et al., 2013). There is a high correlation between the presence of these apomixis-associated loci and the apomictic mode of reproduction (98.4% for APOLLO, 96% for UPGRADE2 in 275 Boechera accessions from 22 species), although it was also found that, in sexuals, 2.27% had the APOLLO apo-allele and 34.48% UPGRADE2. Although APOLLO is thus the most suitable diagnostic indicator of apomixis in different Boechera species and accessions (Mau et al., 2015), its function during reproduction has not yet been elucidated. The independence of APOLLO and UPGRADE2 is consistent with population genetic studies, which showed that male and female apomeiosis are inherited independently, although they usually correlate with each other at the population level (Lovell et al., 2013).

Kliver et al. (2018) found two additional, more distant copies of APOLLO, which may indicate past duplication events. An examination of apo- and sex-alleles of APOLLO indicates that they arose after the separation of the Boechera genus and form two separate clades. Given that B. retrofracta and B. stricta are sexual species, it was not surprising that they carried sex-alleles of APOLLO (Kliver et al., 2018). The authors suggest an evolutionary scenario where, after triplication that likely took place before the separation of the Brassicaceae, one of the APOLLO copies might have acquired a novel function in the common ancestor of Boechera spp., leading to the separation of the apomictic lineages. The Ka/Ks ratio of APOLLO alleles indicates that the branch leading to the apo-alleles is under positive selection (Ka/Ks = 1.4646), which is typical for paralogs that acquired a novel function.
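The conventional reading of a Ka/Ks ratio can be summarized in a few lines. In the sketch below, Ka and Ks are the nonsynonymous and synonymous substitution rates per site. The individual rates used in the example are hypothetical values chosen only to reproduce the reported ratio, and the fixed tolerance around 1 is an illustrative assumption; in practice, departures from neutrality are usually assessed with likelihood-ratio tests.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Ratio of nonsynonymous (Ka) to synonymous (Ks) substitution rates."""
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    return ka / ks


def interpret_ka_ks(omega: float, tolerance: float = 0.1) -> str:
    """Textbook interpretation; the tolerance is an arbitrary illustrative cutoff."""
    if omega > 1 + tolerance:
        return "positive selection"
    if omega < 1 - tolerance:
        return "purifying selection"
    return "approximately neutral"


if __name__ == "__main__":
    # Hypothetical rates that give the ratio reported for the apo-allele branch.
    omega = ka_ks_ratio(ka=0.029292, ks=0.02)
    print(round(omega, 4), interpret_ka_ks(omega))
```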

Epigenetic changes in gene regulation have also been proposed to lie at the origin of apomixis. It has been demonstrated that in A. thaliana polyploidization following interspecific hybridization leads to dramatic changes in gene expression (Lee and Chen, 2001), making it a suitable unifier of both the hybridization and gene mutation hypotheses, whereby epialleles rather than mutant alleles would play an initial role in deregulating reproductive genes in space and time (Grimanelli et al., 2001; Grossniklaus, 2001; Spillane et al., 2001; Koltunow and Grossniklaus, 2003). sRNAs have been implicated in epigenetic reprogramming during gametogenesis and post-fertilization events (Martinez and Köhler, 2017), and of the genes involved in the sRNA pathway, AGO9 has been shown to interact with 24-nucleotide sRNAs derived from transposable elements in ovules. It is not clear if the apomeiotic phenotype of ago9 mutants is due to the lack of silencing of transposable elements, or a consequence of other sRNAs that interact with AGO9 (Vielle-Calzada et al., 2012). In Boechera, sRNA expression profiling revealed Boechera-specific conserved sRNAs and microsatellite-like RNAs (misRNA), many of which have potential binding sites in exonic regions, the majority of their targets being regulatory factors. The quantitative variation in misRNA target binding was hypothesized to result from microsatellite-length polymorphisms either in their precursors or target genes, which could account for transcriptome-wide shifts in gene regulation between sexuals and apomicts (Amiteye et al., 2013).
Such a shift has, in fact, been observed between apomictic and sexual Boechera: the apomictic ovule shows a significant overrepresentation of transcription factor activity (Sharbel et al., 2010; Schmidt et al., 2014), as well as significantly different regulation of core cell cycle and sRNA pathway genes (Schmidt et al., 2014), and heterochronic differences in imprinted genes (Sharbel et al., 2010).

Although the genes controlling the components of apomixis in Boechera spp. are not yet identified, the data on the inheritance of apomixis and on apomixis-associated loci provide valuable entry points for further studies. Ultimately, functional studies of candidate genes will be required and the experimental tools required for such analysis need to be further developed.

### MOLECULAR TOOLS FOR THE GENUS Boechera: TRANSFORMATION, LASER CAPTURE MICRODISSECTION, AND TRANSCRIPTOMICS

One of the most useful tools to study molecular mechanisms is transformation for stable transgene expression. Agrobacterium-mediated transformation has been the method of choice whenever possible, as any DNA sequence contained between the two tumor-inducing (Ti) borders of the plasmid can efficiently be introduced into a plant genome (Gelvin, 2003). While the "floral dip" method is standard for stable transformation of A. thaliana (Clough and Bent, 1998), most plant species require more elaborate transformation procedures. The most widely used method is co-cultivation of explants with Agrobacterium, which then transform into callus tissue and subsequently undergo organogenesis to regenerate a transformant plant. With this method, even recalcitrant cultivars of various crops can be successfully transformed (reviewed in Altpeter et al., 2016). Several sexual and apomictic Boechera species have been investigated for their potential to be transformed by Agrobacterium, and it was reported to be possible to regenerate shoots from hypocotyl-derived calli of sexual B. stricta and apomictic B. gunnisoniana and B. holboellii (Taşkin et al., 2003, 2015). Somatic embryos derived from immature cotyledons of the apomict B. divaricarpa (Taşkin et al., 2009b) and, as a proof-of-concept, stable transformants of B. gunnisoniana were also generated (Taşkin et al., 2003). These advances open up the genus Boechera to the possibilities offered by the study of transgenic lines.

Major advances have also been made in the characterization of transcriptomes in sexual and apomictic Boechera spp. Microarrays were first used to describe transcriptomes in Boechera spp., followed by various sequence-based approaches, including SuperSAGE using Sanger sequencing (Matsumura et al., 2006). Currently, RNA-sequencing (RNA-seq), based on next generation sequencing technologies, is the method of choice to study transcriptomes (Wang et al., 2009). Continuous methodological improvements now allow high precision and high throughput studies of single cell (Picelli et al., 2014), live (Lovatt et al., 2014), and low input (Schmidt et al., 2012; Florez-Rueda et al., 2016) transcriptomes. Not surprisingly, most transcriptomic studies on Boechera spp. focused on differences in expression between apomictic and sexual accessions (Sharbel et al., 2009, 2010; Amiteye et al., 2011, 2013; Aliyu et al., 2013; Schmidt et al., 2014; Shah et al., 2016), while a minority focused on the ecological interactions of plants with their environment (Cano et al., 2013; Gill et al., 2016; Kannan et al., 2018).

The first transcriptomic studies in Boechera spp. were performed by Sharbel et al. (2009, 2010) using the SuperSAGE technique (Matsumura et al., 2006). They quantified gene expression in manually dissected ovules at the MMC stage of two sexual (B. stricta and B. holboellii) and two apomictic accessions (both B. divaricarpa). Additionally, two cDNA libraries representing apomictic and sexual accessions were sequenced using Roche's 454 technology (Sharbel et al., 2009). These were the first reference transcriptomes for the genus Boechera and formed the basis for future studies (see below). In a second study, Sharbel et al. (2010) quantified gene expression between a single apomict and a single sexual Boechera individual at four different developmental time points but without biological or technical replication. Stage-specific and heterochronic patterns of gene expression were identified (Sharbel et al., 2010). Because these first transcriptomic studies (Sharbel et al., 2009, 2010) used single libraries from single individuals without biological replication, they cannot account for variation between individuals and lack the statistical power for a robust identification of genes that are differentially expressed in sexuals versus apomicts (Lee et al., 2000; Meyers, 2004; Conesa et al., 2016).

The sRNA fraction of the Boechera transcriptome (Amiteye et al., 2011) was identified by a reanalysis of transcriptome data (Sharbel et al., 2009) and the sequencing of two sRNA libraries (Amiteye et al., 2013). Using a Boechera-specific microarray based on the sexual and apomictic reference transcriptomes (Sharbel et al., 2009), an analysis of copy number variation (CNV) in transcriptionally active regions of 10 sexual and 10 apomictic Boechera accessions was performed (Aliyu et al., 2013). The gene ontology classes found enriched in apomictic CNVs (e.g., pollen-pistil interaction) led to the hypothesis that CNV in these gene classes serves to buffer the effects of deleterious mutations.

The first attempt to compare sexual and apomictic development in Boechera spp. at the cellular level was made by Schmidt et al. (2014). While previous studies used whole ovules (Sharbel et al., 2009, 2010), they used laser-assisted microdissection (LAM) to isolate the apomictic initial cell (AIC), nucellus, egg, central, and synergid cells of the triploid apomict B. gunnisoniana (Schmidt et al., 2014). After LAM, cDNA libraries were produced and sequenced using SOLiD technology. A reference transcriptome from pooled floral tissues of B. gunnisoniana was sequenced with Illumina technology (Schmidt et al., 2014). An analysis of gene expression and gene ontology enrichment uncovered the upregulation of spermidine metabolism and patterns of altered expression in the AIC. Comparison to female gametophyte cell-specific transcriptomes of A. thaliana (Wuest et al., 2010) identified regulatory pathways that differ between sexual and apomictic germlines, including hormonal, epigenetic, cell cycle control, and transcriptional regulatory pathways (Schmidt et al., 2014). Likewise, comparison of egg cell-specific transcriptomes of B. gunnisoniana and A. thaliana identified genes expressed only in the apomictic egg cell (Florez-Rueda et al., 2016). Future studies that exploit single-cell transcriptomics by comparing apomictic and sexual Boechera spp. are expected to shed light onto the molecular basis of apomixis in the genus Boechera.

A study of apomictic and sexual Boechera seedlings focused on the response to abiotic and biotic conditions and stress-specific changes that might underlie apomixis (Shah et al., 2016). A relationship between apomixis and environmental conditions is also supported by a phenomenon known as "geographical parthenogenesis" (reviewed in Hörandl, 2006). This concept is based on the observation that apomictic lineages have larger distributional ranges than their sexual relatives. In Boechera spp., niche differentiation was found to be driven by ploidy rather than reproductive mode (Mau et al., 2015), indicating low support for geographical parthenogenesis in the genus. Nevertheless, the great variation in ploidy level and reproductive mode and its relationship with niche differentiation make the genus Boechera a good model for the study of plant-environment interactions (Rushworth et al., 2011). From this perspective, the transcriptomes of an obligate triploid apomict and a diploid sexual, both isolated from a drought-prone habitat, were compared. Specific meiotic genes were found to be down-regulated and stress-related transcription factors and chaperones upregulated in apomictic seedlings (Shah et al., 2016), but the relevance of these findings for reproduction is unknown.

<sup>1</sup>Annotated proteins were used for BUSCO benchmarking (Waterhouse et al., 2017). C, complete; S, complete single-copy; D, complete duplicated; F, fragmented; M, missing BUSCO groups.

FIGURE 5 | The B. divaricarpa genome is extremely heterozygous. Distribution of the number of 19-mer occurrences in B. stricta, which has only one peak, indicating a highly homozygous genome, and B. divaricarpa, which has two peaks with a pronounced difference in height. Note that the number of occurrences of different K-mers in the first peak (ca. 90) is half of that in the second peak (ca. 180), suggesting that the first and second peaks represent the heterozygous and homozygous parts of the genome, respectively. N50 values are based on the assembly of Illumina paired-end reads (100× coverage) by Platanus. Heterozygosity has an immediate effect on the contiguity of the genome assembly.

#### GENOMIC RESOURCES FOR THE GENUS Boechera AND CHALLENGES TO GENOME ANALYSIS

The advent of next generation sequencing along with progress in bioinformatics tools opened a new chapter in the study of apomixis, allowing the search for apomixis-associated loci and the comparison, genotyping, and phylogenetic analysis of Boechera species and accessions using whole-genome sequencing.

Currently, the genomes of only two Boechera spp. have been assembled and published (**Table 2**). Both B. stricta and B. retrofracta are self-pollinating, diploid sexuals and have largely homozygous genomes, which are straightforward to assemble. Notably, repeats in the genome of B. retrofracta occupy almost 40% of the genome space. Nearly half of them are long terminal repeats (LTRs) (18.27%) (Kliver et al., 2018). In contrast, only 20% of the B. stricta genome is annotated as repeats (Lee et al., 2017, assembly v1.2). The difference in repeat content correlates with the difference in genome size between the two species (**Table 2**). In some apomictic species, the apomixis loci are associated with heterochromatin and/or substantial repetitive sequences (Hand and Koltunow, 2014). The chromosomes carrying the LOSS-OF-APOMEIOSIS (LOA) locus in Hieracium praelatum and the APOSPORY-SPECIFIC GENOMIC REGION (ASGR) in Pennisetum squamulatum are characterized by extensive repetitive sequences and transposon-rich regions (Okada et al., 2011). In apomictic Paspalum simplex, the region containing apomixis-related loci has undergone large-scale rearrangements due to transposable elements (Calderini et al., 2006). These similarities in repetitive, heterochromatic regions in the genomes of apomicts have led to the hypothesis that these regions might serve as a sink to sequester factors involved in sexual reproduction, triggering apomixis (Grossniklaus, 2001; Koltunow and Grossniklaus, 2003). In line with this idea, some Boechera apomicts have largely heterochromatic chromosomes (Kantama et al., 2007) and some transposon families were found enriched in an apomictic Boechera lineage (Aliyu et al., 2013). Due to extensive hybridization within the Boechera genus, however, the repeat content of the genome might not reflect the mode of reproduction but rather its phylogeographic history.

The final genome annotation of B. stricta and B. retrofracta encompassed about 27,000 genes in both species. The presence of a slightly greater number of predicted transcripts in B. stricta can be explained by lack of gene expression data for B. retrofracta, which resulted in a less complete gene annotation overall, as confirmed by BUSCO benchmarking (**Table 2**).

TABLE 3 | Publicly available genomic sequencing data<sup>1</sup> of the genus Boechera suitable for reference-free (k-mer based) analysis.

<sup>1</sup>Illumina paired-end libraries only. <sup>2</sup>Multiples of the typical Boechera genome size (220 Mb). <sup>3</sup>Reproductive mode assignment is based on k-mer profile.

Assembling the genomes of diploid apomictic Boechera species is difficult because they exhibit high levels of heterozygosity (**Figure 5**), which results from the combination of disparate genomes as a consequence of their hybridogenic origin (Beck et al., 2012). For example, the genome heterozygosity rate of B. divaricarpa is around 2.5% as estimated by GenomeScope (Vurture et al., 2017). For all the reasons mentioned above, sequencing and de novo assembly of such a plant genome can result in a highly fragmented genome draft. Annotation of the protein coding genes may not always be correct, considering that nearly identical genes are notoriously difficult to assemble. Thus, a mosaic sequence can be formed that does not represent any member of the gene family. The high level of fragmentation and mis-assembly could impair our ability to draw valid conclusions about the evolution of apomixis-associated loci and the molecular mechanisms underlying this interesting phenomenon (Claros et al., 2012).
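A back-of-the-envelope calculation illustrates why a heterozygosity rate of about 2.5%, as cited above for B. divaricarpa, is so disruptive for short-read assembly. The sketch below assumes that heterozygous sites are independent and uniformly distributed along the genome, which is a simplification; under this assumption, the probability that a K-mer overlaps at least one heterozygous site is 1 − (1 − r)^K. The function name is hypothetical.

```python
def heterozygous_kmer_fraction(het_rate: float, k: int) -> float:
    """Expected fraction of k-mers spanning at least one heterozygous site.

    Assumes independent, uniformly distributed heterozygous sites
    (a simplifying assumption, not a property of real genomes).
    """
    if not 0.0 <= het_rate <= 1.0:
        raise ValueError("het_rate must be a probability")
    if k < 1:
        raise ValueError("k must be a positive integer")
    return 1.0 - (1.0 - het_rate) ** k


if __name__ == "__main__":
    for k in (19, 27):
        fraction = heterozygous_kmer_fraction(0.025, k)
        print(f"K = {k}: {fraction:.1%} of K-mers are haplotype-specific")
```

Under these assumptions, well over a third of all 19-mers differ between the two haplotypes, so an assembler frequently encounters two alternative paths instead of one.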

A key challenge is the assembly of the short reads into contiguous sequences (contigs), which then are assembled into chromosome-scale scaffolds. Another complication is the assignment of genetic variants to the correct homeologous chromosome, a process known as haplotyping (Korbel and Lee, 2013).

Only recently, approaches have been developed that are capable of solving the problem of heterozygous genome assembly. Pacific Biosciences long-read sequencing technology and FALCON/FALCON-Unzip algorithms were used to assemble heterozygous genomes including an F1 hybrid of A. thaliana and the widely cultivated Vitis vinifera cv. Cabernet Sauvignon (Chin et al., 2016). Further development of this assembler resulted in FALCON-Phase, a new method that reconstructs contig-length phase blocks using Hi-C short reads and is able to produce true diploid assemblies (Kronenberg et al., 2018). Linked-Read sequencing technology (10x Genomics) has recently been successfully employed for a de novo assembly of the heterozygous F1 diploid pepper (Capsicum annuum) hybrid genome (Hulse-Kemp et al., 2018).

Recently, methods for a haplotype-aware, phased assembly of polyploid genomes were developed for cases where the parental species are known (Akama et al., 2014; Kyriakidou et al., 2018). However, speciation in the genus Boechera has a very complex history, where sexual diploids gave rise to multiple apomictic species through hybridization-associated polyploidy, alloploidy, and aneuploidy. This complexity of speciation resulted in the unprecedented genome diversity of the apomictic species in this genus, which was further exaggerated by mutation accumulation (Lovell et al., 2017) and elevated transposon activity in the apomicts (Ferreira de Carvalho et al., 2016). It is thus often not clear what the parental ancestors of polyploid Boechera spp. are. Genomic analysis based on a haploid reference genome might not reflect the reality, especially for apomicts with highly heterozygous diploid/polyploid genomes. In such a situation, many loci might be completely absent in the reference genome because even in the ideal case, it represents only a consensus genome. Bearing in mind also the problems in producing genome assemblies for apomictic species, we would like to outline the potential of an alternative, reference-free approach for comparative genomic analyses of sexual and apomictic species in the genus Boechera.

A reference-free (or, more generally, alignment-free) approach to sequence comparison does not rely on alignment and is therefore especially valuable for analyzing genomes of organisms that do not have a reference (Zielezinski et al., 2017). The K-mer, or word frequency, method is one of the most popular alignment-free methods for comparative genome analysis, but its successful application can be hindered by insufficient sequencing depth and biases of genome sampling. Illumina paired-end (PE) sequencing of random-primed libraries produces the most suitable data for processing by this approach. **Table 3** provides a selection of next generation sequencing data from Boechera spp. that satisfy the requirements for alignment-free methods.

To compare sexual and apomictic accessions using the K-mer method, we reanalyzed B. spatifolia sequencing data (Lovell et al., 2017). **Figure 6** shows K-mer (K = 27) profiles of sequencing reads from the eight sympatric pairs of sexual and apomictic B. spatifolia genotypes (Lovell et al., 2017). The K-mer profile provides an estimate of effective sequence coverage and reflects the rate of genome heterozygosity and the amount of sequencing errors, along with errors of sample preparation and sequencing data processing (Supplementary Note 1 in Vurture et al., 2017). As seen in **Figure 6**, profiles of apomictic individuals are clearly distinguishable from profiles of sexual individuals, even for a K-mer coverage as low as 10. The K-mer profiles of the apomictic individuals are shaped by a higher level of heterozygosity compared to sexual individuals (Li et al., 2017) and, in the case of high sequence coverage, the profile contains two peaks, which represent the heterozygous and homozygous parts of the genome.
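The K-mer profile discussed above is a histogram of how many distinct K-mers occur at each multiplicity in the reads. The following is a minimal sketch of that idea, not the pipeline used for Figure 6; the function names `canonical`, `kmer_counts`, and `kmer_profile` are illustrative assumptions, and production tools designed for whole-genome data would be needed for K = 27 on real read sets.

```python
from collections import Counter

# Illustrative helper names; not taken from any published tool.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
_BASES = frozenset("ACGT")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a K-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def kmer_counts(reads, k: int) -> Counter:
    """Count canonical K-mers over all reads, skipping K-mers with ambiguous bases."""
    counts = Counter()
    for read in reads:
        read = read.upper()
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            if _BASES.issuperset(kmer):
                counts[canonical(kmer)] += 1
    return counts


def kmer_profile(counts: Counter) -> dict:
    """Map each multiplicity to the number of distinct K-mers observed that often."""
    return dict(sorted(Counter(counts.values()).items()))


if __name__ == "__main__":
    toy_reads = ["ACGTA", "ACGTA"]  # toy input, far too small for real inference
    print(kmer_profile(kmer_counts(toy_reads, k=3)))
```

In a profile built this way from deep sequencing data, a heterozygous genome would be expected to show one peak near half the homozygous coverage and a second peak at the full coverage, which is the two-peak pattern described in the text.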

We also used an alignment-free method to investigate genetic relatedness in the B. spatifolia individuals analyzed by Lovell et al. (2017). **Figure 7** shows a genetic variation analysis analogous to the one described by Lovell et al. (2017, **Figure 1B**) but using the Mash software for genetic distance estimation (Ondov et al., 2016). In contrast to Lovell et al. (2017), who used A. lyrata as the reference for the alignment of B. spatifolia sequencing reads, this approach is based solely on the data contained in the reads. Nevertheless, the resulting tree is rather similar.
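As an illustration of how an alignment-free distance can be derived from K-mer content alone, the sketch below computes the Jaccard index of two K-mer sets and converts it with the Mash distance formula, D = −(1/k) ln(2j/(1 + j)). It is a simplification under stated assumptions: Mash estimates the Jaccard index from MinHash sketches, whereas this example uses the exact index on complete K-mer sets, and the function names are hypothetical.

```python
import math


def kmer_set(sequence: str, k: int) -> set:
    """Return the set of all K-mers in a sequence (no canonicalization, for brevity)."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Exact Jaccard index; Mash would estimate this from MinHash sketches."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mash_distance(j: float, k: int) -> float:
    """Convert a Jaccard index j into the Mash distance, clamped to [0, 1]."""
    if j <= 0.0:
        return 1.0
    distance = -math.log(2.0 * j / (1.0 + j)) / k
    return max(0.0, min(1.0, distance))


if __name__ == "__main__":
    k = 3  # toy value; real analyses use much larger K
    a = kmer_set("ACGTACGTAC", k)
    b = kmer_set("ACGTACGAAC", k)
    print(mash_distance(jaccard(a, b), k))
```

A matrix of such pairwise distances can then be passed to any standard tree-building method, which is the general route by which a tree comparable to an alignment-based one can be obtained from reads alone.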

### CONCLUSION

Apomixis produces progeny that is genetically identical to the mother plant, a trait of great agronomical importance. Unfortunately, the molecular mechanisms underlying apomixis are only poorly understood. A better understanding of the genetic networks that control the components of apomixis is crucial for its introduction into crop plants. Apomicts of the genus Boechera represent a convenient model to study apomixis because it also occurs at the diploid level and the genomes of Boechera spp. are comparatively small. Despite these advantages, genome assembly and annotation of the apomictic Boechera lineages are complicated by the high level of heterozygosity of their genomes, which results from chromosome rearrangements accompanied by alloploidy, aneuploidy, and substitutions of homeologous chromosomes occurring during hybridization events. The use of next generation sequencing and novel bioinformatic approaches should help to overcome these challenges and facilitate generating the first comprehensive genome of an apomictic plant in the near future. Attempts to apply reference-free methods for the assembly and comparative analysis of such genomes are currently underway. A combination of systems biology approaches to analyze RNAseq and genomic data from sexual and apomictic Boechera species, as well as functional approaches in transgenic plants, will facilitate the disentanglement of the genetic control of apomixis at the molecular level. This is a prerequisite for engineering self-sustaining, apomictic hybrids in sexual crop plants.

#### DATA AVAILABILITY


All datasets generated for this study are included in the manuscript and/or the supplementary files.

#### AUTHOR CONTRIBUTIONS

VB and UG designed and directed the study, wrote the introduction and conclusion, and advantages of the Boechera genus for the study of apomixis. JO and VB analyzed the systematic position of Boechera and habitats. JO, MSN, and VB carried out cyto-embryological studies in the Boechera genus. AF-R and UG worked on population genetics of Boechera with respect to apomixis. VB and MSN performed the inheritance and genetic aspects of apomixis in Boechera. MSN and AF-R performed the molecular experiments in Boechera: transformation, laser capture microdissection, and transcriptomics. DS and EB analyzed the available NGS data and genomic resources for Boechera. EB inquired the transcriptomic investigations for analysis of apomictic plants. All authors read and approved the final manuscript.

#### FUNDING

This work was supported by the University of Zurich and grants 16-54-21014 from the Russian Foundation for Basic Research and 1.52.1647.2016 from St. Petersburg State University (to VB), as well as the Marie Curie Action IDP BRIDGES from the European Union and grant IZLRZ3\_163885 from the Swiss National Science Foundation (to UG) through the Scientific & Technological Cooperation Programme Switzerland-Russia (STCPSR). Research in St. Petersburg was carried out within the framework of the state assignment No. AAAA-A18-118051590112-8 to BIN RAS.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The reviewer AJ declared a past co-authorship with one of the authors UG to the handling Editor.

Copyright © 2019 Brukhin, Osadtchiy, Florez-Rueda, Smetanin, Bakin, Nobre and Grossniklaus. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Loss-of-Function of a Tomato Receptor-Like Kinase Impairs Male Fertility and Induces Parthenocarpic Fruit Set

Hitomi Takei<sup>1,2†</sup>, Yoshihito Shinozaki<sup>1,2†</sup>, Ryoichi Yano<sup>1</sup>, Sachiko Kashojiya<sup>1</sup>, Michel Hernould<sup>3,4</sup>, Christian Chevalier<sup>3</sup>, Hiroshi Ezura<sup>1,5</sup> and Tohru Ariizumi<sup>1,5</sup>\*

<sup>1</sup> Graduate School of Life and Environmental Sciences, University of Tsukuba, Tsukuba, Japan, <sup>2</sup> Japan Society for the Promotion of Science (JSPS), Kôjimachi, Japan, <sup>3</sup> UMR1332 BFP, Institut National de la Recherche Agronomique (INRA), Villenave-d'Ornon, France, <sup>4</sup> UMR1332 BFP, University of Bordeaux, Bordeaux, France, <sup>5</sup> Tsukuba-Plant Innovation Research Center, University of Tsukuba, Tsukuba, Japan

#### Edited by:

Andrea Mazzucato, Università degli Studi della Tuscia, Italy

#### Reviewed by:

Mariola Plazas, Instituto de Biología Molecular y Celular de Plantas (IBMCP), Spain

Giuseppe Leonardo Rotino, Council for Agricultural Research and Economics, Italy

#### \*Correspondence:

Tohru Ariizumi, ariizumi.toru.ge@u.tsukuba.ac.jp

†These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 15 November 2018 Accepted: 18 March 2019 Published: 16 April 2019

#### Citation:

Takei H, Shinozaki Y, Yano R, Kashojiya S, Hernould M, Chevalier C, Ezura H and Ariizumi T (2019) Loss-of-Function of a Tomato Receptor-Like Kinase Impairs Male Fertility and Induces Parthenocarpic Fruit Set. Front. Plant Sci. 10:403. doi: 10.3389/fpls.2019.00403

Parthenocarpy arises when an ovary develops into fruit without pollination/fertilization. The mechanisms involved in genetic parthenocarpy have attracted attention because of their potential application in plant breeding and also for their elucidation of the mechanisms involved in early fruit development. We have isolated and characterized a novel small parthenocarpic fruit and flower (spff) mutant in the tomato (Solanum lycopersicum) cultivar Micro-Tom. This plant showed both vegetative and reproductive phenotypes including dwarfism of floral organs, male sterility, delayed flowering, altered axillary shoot development, and parthenocarpic production of small fruits. Genome-wide single nucleotide polymorphism array analysis coupled with mapping-by-sequencing using next generation sequencing-based high-throughput approaches resulted in the identification of a candidate locus responsible for the spff mutant phenotype. Subsequent linkage analysis and RNA interference-based silencing indicated that these phenotypes were caused by a loss-of-function mutation of a single gene (Solyc04g077010), which encodes a receptor-like protein kinase that was expressed in vascular bundles in young buds. Cytological and transcriptomic analyses suggested that parthenocarpy in the spff mutant was associated with enlarged ovarian cells and with elevated expression of the gibberellin metabolism gene, GA20ox1. Taken together, our results suggest a role for Solyc04g077010 in male organ development and indicate that loss of this receptor-like protein kinase activity could result in parthenocarpy.

Keywords: Solanum lycopersicum, fruit set, male sterility, in situ hybridization, next generation sequencing, gene mapping

**Abbreviations:** BC1F2, Single backcross and second filial generation; BC3F2, Backcross three times and second filial generation; BCIP, 5-Bromo-4-chloro-3′-indolyl phosphate; BSA, Bovine Serum Albumin; F1, First filial generation; F2, Second filial generation; FAA, Formalin-Acetic acid-Alcohol; MS, Murashige and Skoog medium; NBT, Nitroblue Tetrazolium; PBS, Phosphate buffered saline; RNAi, RNA interference; SGN, Sol Genomics Network; SNP, Single Nucleotide Polymorphism; spff, small parthenocarpic fruit and flower; SSPE, Sodium Chloride-Sodium Phosphate-EDTA.

### INTRODUCTION


The flower-to-fruit transition, also known as "fruit set," corresponds to a major developmental shift that transforms an ovary into a fruit (Gillaspy et al., 1993). This genetically programmed process is coordinated by a complex network of signaling pathways that are activated by interacting endogenous and exogenous cues, although the genetic and molecular factors that control the flower-to-fruit transition remain poorly understood (Ariizumi et al., 2013). The development of parthenocarpic fruit has been observed under some conditions; this pollination-independent seedless fruit can arise when fertilization is inefficient, mainly due to male sterility. Several naturally occurring sources of genetic parthenocarpy have been identified in tomato, and the corresponding parthenocarpic mutants have been designated pat, pat-2, and Pat-k/SlAGL6 (Shinozaki and Ezura, 2016; Klap et al., 2017; Takisawa et al., 2018). The pat mutant is characterized by short anthers, partial male sterility, and the production of small fruits (Mazzucato et al., 1998). The locus of the gene responsible for pat phenotypes was narrowed down to chromosome 3 (Beraldi et al., 2004). In addition, the gene encoding SlGA20ox1, the key enzyme for gibberellin (GA) accumulation in the pollinated tomato ovary, is highly expressed in pat ovaries; this is likely to activate GA metabolism and increase GA levels in the unpollinated ovaries, thus triggering parthenocarpy (Olimpieri et al., 2007). The pat-2 phenotype appears to be caused by a recessive mutation at a single locus on chromosome 4, in a gene encoding a zinc finger homeodomain protein (Nunome, 2016); GA also accumulates at high levels in unpollinated pat-2 ovaries (Fos et al., 2000). Furthermore, it has been shown that fruit set initiation through both pollination-dependent and -independent processes occurs concomitantly with the down-regulation of a family of floral homeotic MADS-box genes, which regulate floral organ identities (Wang et al., 2009; Tang et al., 2015). Indeed, the loss of function of several MADS-box genes can cause tomato parthenocarpy. For instance, the loss of function of tomato MADS-box 29, tomato MADS-box 5, or DEFICIENS/TOMATO APETALA3/STAMENLESS results in parthenocarpy, together with abnormal stamen differentiation (Pnueli et al., 1994; Ampomah-Dwamena et al., 2002; Mazzucato et al., 2008; Quinet et al., 2014; Okabe et al., 2019). Moreover, parthenocarpy was induced in tomatoes that were genetically transformed in order to inhibit stamen development at an early stage of differentiation via the expression of the BARNASE ribonuclease gene under a stamen-specific promoter (Medina et al., 2013). Although the mechanisms underlying the role of the stamen in parthenocarpy have not yet been fully characterized, it has been hypothesized that stamens could counteract fruit set initiation before pollination in tomato plants, and this may be associated in part with elevated levels of GA (Okabe et al., 2019).

Flowers and fruits are considered to represent sink organs because their development requires high levels of nutrients such as sucrose, as a carbon source (Osorio et al., 2014). The vasculature within flowers, fruits, and their pedicels is therefore of major importance because it transports nutrients and water to these organs (Rančić et al., 2010). XYLEM INTERMIXED WITH PHLOEM1 (XIP1) is one of the proteins with a key role in the organization of vasculature in Arabidopsis (Shiu and Bleecker, 2001). This protein is a leucine-rich repeat receptor-like kinase (RLK) that belongs to a large family with at least 216 members encoded in the Arabidopsis genome. A loss of XIP1 resulted in modification of vascular bundle organization and abnormal lignification of phloem cells, transforming them to xylem cells (Bryan et al., 2012).

To identify key regulators of parthenocarpy, the present study characterized a novel tomato parthenocarpic mutant known as small parthenocarpic fruit and flower (spff), which was isolated from a population where mutations were introduced via exposure to γ-ray irradiation (Saito et al., 2011). The spff mutant exhibits small flower formation, male sterility, and increased transcription of GA20ox1 in young ovaries. Furthermore, a rapid high-throughput approach followed by functional validation using RNA interference (RNAi) resulted in the identification of a gene encoding a novel RLK protein.

### MATERIALS AND METHODS

#### Plant Material and Growth Conditions

Tomato wild-type (WT) plants, Solanum lycopersicum "Ailsa-Craig" and "Micro-Tom," and spff mutant plants were grown in pots and irrigated daily with Otsuka first and Otsuka second fertilizer solutions under greenhouse conditions in Tsukuba, Japan. The greenhouse was maintained at the ambient temperature and light photoperiod in July and August. WT S. lycopersicum "Micro-Tom" and spff plants for RNA sequencing (RNA-seq) analysis, RNAi experiments, and histological analyses were grown in rockwool and irrigated daily with Otsuka first and Otsuka second under vertical farm conditions at 25°C with a 16/8 h light/dark cycle.

#### Histological Analysis

Histological analysis of flower tissues was performed as described by Hao et al. (2017). Wax-embedded floral buds were cut into 10-µm cross-sections, layered onto glass slides, and dried overnight at 42°C. The cell size and the number of cell layers were evaluated, and the significance of group differences was analyzed using Student's t-test.

#### Pollen Number and Germination Assay

Pollen grains were obtained from anthers at the anthesis stage and germinated in 1 mL of pollen germination medium (0.52 M sucrose, 1.6 mM boric acid, 1 mM CaCl2, 1 mM Ca(NO3)2, 1 mM MgSO4, and 0.01 mM Tris–HCl, pH 7.0). After incubation for 16 h at room temperature, pollen grains were observed under a light microscope. The pollen germination ratio was calculated by dividing the number of germinated pollen grains (those with a pollen tube at least twice as long as the diameter of the pollen grain) by the total number of pollen grains, defined as the number of pollen grains observed within one microscopic field. The determinations were made for three replicate biological experiments.

#### High-Density Genetic Mapping


For genetic mapping by an Infinium assay (Illumina) using the SolCAP single-nucleotide polymorphism (SNP) array<sup>1</sup>, an F<sub>2</sub> population was derived from a cross between the spff mutant (Micro-Tom background) and WT plants (Ailsa-Craig background). Genomic DNA of 44 F<sub>2</sub> plants (43 with spff mutant phenotypes and one with the WT phenotype), together with F<sub>1</sub>, WT Micro-Tom and parental plants of each genotype, was extracted from fresh leaves using Maxwell 16 DNA purification kits, according to the manufacturer's protocol (Promega). A total of 48 DNA samples were then used for the SolCAP analysis, using the method described by Sim et al. (2012). Of the 7600 markers analyzed, 1956 markers showed polymorphisms that distinguished between Micro-Tom and Ailsa-Craig; these were used for genotyping. SNPs were obtained from the Kazusa Marker Database<sup>2</sup>. For the linkage analysis, we examined the genotypes at position 59,966,064 bp on chromosome 4 with the tomInf4732 SNP marker, with sequences of AAGCTT and AAGATT in Micro-Tom and Ailsa-Craig, respectively. Each genotype was discriminated using the primers listed in **Supplementary Table S1**, followed by restriction digestion with HindIII for 8 h at 37°C.
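The tomInf4732 marker works as a cleaved amplified polymorphic sequence: HindIII recognizes AAGCTT (cutting after the first A), so the Micro-Tom allele is digested while the Ailsa-Craig allele (AAGATT) stays intact. The sketch below illustrates how band patterns map to genotypes. The flanking sequences and resulting fragment sizes are invented placeholders, since the real amplicon is defined by primers in the supplementary material, and the function names are hypothetical.

```python
HINDIII_SITE = "AAGCTT"
CUT_OFFSET = 1  # HindIII cuts A^AGCTT, i.e., one base into the site


def digest(amplicon: str) -> list:
    """Return fragment lengths after complete HindIII digestion of one strand."""
    fragments, start = [], 0
    position = amplicon.find(HINDIII_SITE)
    while position != -1:
        cut = position + CUT_OFFSET
        fragments.append(cut - start)
        start = cut
        position = amplicon.find(HINDIII_SITE, position + 1)
    fragments.append(len(amplicon) - start)
    return fragments


def call_genotype(bands, uncut_length: int, cut_lengths) -> str:
    """Classify a plant from the set of band sizes observed on the gel."""
    observed = set(bands)
    has_uncut = uncut_length in observed
    has_cut = set(cut_lengths) <= observed
    if has_cut and has_uncut:
        return "heterozygous"
    if has_cut:
        return "Micro-Tom homozygous"
    if has_uncut:
        return "Ailsa-Craig homozygous"
    return "undetermined"


if __name__ == "__main__":
    # Placeholder flanks: NOT the real sequence around tomInf4732.
    flank5, flank3 = "G" * 40, "C" * 60
    micro_tom = flank5 + "AAGCTT" + flank3
    ailsa_craig = flank5 + "AAGATT" + flank3
    print(digest(micro_tom), digest(ailsa_craig))
```

With these placeholder flanks, the Micro-Tom amplicon yields two fragments and the Ailsa-Craig amplicon one, so a plant showing all three bands would be scored as heterozygous.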

#### Mapping-By-Sequencing

For further fine mapping based on the mapping-by-sequencing approach (Abe et al., 2012; Garcia et al., 2016), an F<sub>2</sub> population was constructed by crossing the spff mutant and WT, in the Micro-Tom background (**Supplementary Figure S1**). Genomic DNA was extracted from fresh leaves of F<sub>2</sub> plants that exhibited the spff mutant phenotype, as described above. Equal amounts of DNA extracted from 20 individual plants were pooled and sequenced by 100 bp paired-end sequencing (HiSeq 2000; Illumina). Mutation or variant information was obtained using the Bowtie2-Samtools-GATK (Genome Analysis Toolkit) pipeline (Li et al., 2009; McKenna et al., 2010; Langmead and Salzberg, 2012). Briefly, Illumina short reads were aligned onto the tomato genome reference SL2.40 by Bowtie2 version 2.2.1<sup>3</sup> with default parameters. Mutations or variants including SNPs or insertion-deletions (Indels) were then detected by GATK version 3.5 (McKenna et al., 2010). SNPs and Indels that might cause a nonsynonymous amino acid substitution, a premature stop codon, or a frameshift were identified using HaplotypeCaller, as described previously (McKenna et al., 2010; Pulungan et al., 2018). Allele frequency datasets were also obtained using GATK. Because the Micro-Tom cultivar is not inbred and relatively many intra-cultivar variations are present between individuals, we subtracted such intra-cultivar variants from the SNP/Indel datasets using next generation sequencing datasets of several WT Micro-Tom individuals (Pulungan et al., 2018). Candidate genes with a high SNP/Indel index and reliable read numbers (≥10) were then identified. In this analysis, the SNP/Indel index was calculated as the proportion of sequenced reads that included mutant allele SNPs or Indels, in relation to the WT allele.
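The candidate filter described above can be sketched as follows. This is an illustration under assumptions rather than the authors' script: the index is taken here as mutant reads divided by all reads covering the site, the 0.9 cutoff for a "high" index is a placeholder (the text gives no numerical threshold), and the `Variant` record and gene labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str        # hypothetical label, not a real gene identifier
    ref_reads: int   # reads supporting the WT allele
    alt_reads: int   # reads supporting the mutant SNP or Indel

    @property
    def depth(self) -> int:
        return self.ref_reads + self.alt_reads


def snp_index(variant: Variant) -> float:
    """Fraction of reads at the site that carry the mutant allele (assumed definition)."""
    return variant.alt_reads / variant.depth if variant.depth else 0.0


def candidate_variants(variants, min_depth: int = 10, min_index: float = 0.9):
    """Keep variants with reliable depth (>= 10 reads, as in the text) and a high index.

    The min_index value is an assumed placeholder, not a published cutoff.
    """
    return [
        v for v in variants
        if v.depth >= min_depth and snp_index(v) >= min_index
    ]


if __name__ == "__main__":
    pool = [
        Variant("geneA", ref_reads=0, alt_reads=24),   # fixed in the mutant pool
        Variant("geneB", ref_reads=9, alt_reads=11),   # still segregating
        Variant("geneC", ref_reads=0, alt_reads=6),    # too few reads
    ]
    print([v.gene for v in candidate_variants(pool)])
```

In a pool of F<sub>2</sub> plants selected for a recessive phenotype, the causal variant and tightly linked variants are expected to approach an index of 1, while unlinked variants segregate at intermediate values.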

#### Linkage Analysis of the spff Locus

The spff mutant was backcrossed four times with Micro-Tom WT in order to purify the responsible mutation and finally obtain BC<sub>4</sub>F<sub>2</sub> plants (**Supplementary Figure S1**). Linkage analysis was performed using DNA extracted from F<sub>2</sub>, BC<sub>2</sub>F<sub>2</sub>, BC<sub>3</sub>F<sub>2</sub>, and BC<sub>4</sub>F<sub>2</sub> populations (**Supplementary Table S1**). Genomic DNA was extracted with the DNeasy Miniprep kit (QIAGEN) and amplified by PCR with TaKaRa Ex Taq (TAKARA) and the primer set shown in **Supplementary Table S2**. The PCR products were purified with the Illustra ExoStar kit (GE Healthcare) and then sent to Eurofins Genomics for sequencing.

#### Construction of the RNAi Plasmid

The RNAi construct was designed using Gateway technology (Invitrogen). Total RNA was extracted from WT ovaries using the RNeasy Plant Mini Kit (QIAGEN), followed by the removal of genomic DNA using RNA Clean & Concentrator (ZYMO RESEARCH). cDNAs were then synthesized using the SuperScript VILO cDNA Synthesis Kit (Thermo Fisher Scientific). A 521 bp fragment of the Solyc04g077010 transcript was amplified using the KOD Plus kit (TOYOBO); the cDNA was used as the template, and SlXIPRNAiF1 and SlXIPRNAiR1 were the primers (**Supplementary Table S2**). The amplicon was then cloned into the donor pBI-sense, antisense-GW vector (INPLANTA INNOVATIONS INC., Japan), allowing expression under the control of the constitutive 35S promoter. The resulting plasmid was introduced into WT Micro-Tom by Agrobacterium-mediated transformation using A. tumefaciens GV2260 (Sun et al., 2006). Transgenic lines were selected on Murashige and Skoog (MS) agar plates containing kanamycin (100 mg L<sup>−1</sup>).

#### RNA Sequencing

Ovaries were collected from flowers at anthesis, separated into three replicates (15–17 ovaries in each replicate) and ground in liquid nitrogen. Total RNA extraction from the ovaries and subsequent cDNA synthesis were performed as described above. Genome-wide RNA expression levels were analyzed by HiSeq (Illumina) with 100 bp single-end reads. The raw reads were subjected to quality filtering before employing the TopHat2-Cufflinks pipeline to count reads and calculate expression levels as reads per kb of transcript per million mapped reads (RPKM), as described previously (Yano et al., 2018). Comprehensive data were analyzed using multiple t tests (p < 0.05), followed by the Bonferroni correction method, with false discovery rate analysis. Genes with mean RPKM values of ≥1 (three replicates) were considered to be expressed. Genes were considered differentially expressed if the log2 fold ratios were ≥ 1.0 or ≤ −1.0, with false discovery rate adjusted p values (q values) of < 0.05.
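The expression criteria above can be made concrete with a short sketch. It assumes the standard RPKM definition (read count × 10<sup>9</sup> divided by transcript length in bp and by the total number of mapped reads) and applies the stated thresholds; the function names and example numbers are illustrative, and q values are taken as given rather than computed.

```python
import math


def rpkm(read_count: int, transcript_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kb of transcript per million mapped reads (standard definition)."""
    return read_count * 1e9 / (transcript_length_bp * total_mapped_reads)


def is_expressed(replicate_rpkms, threshold: float = 1.0) -> bool:
    """A gene counts as expressed if its mean RPKM over the replicates is >= 1."""
    return sum(replicate_rpkms) / len(replicate_rpkms) >= threshold


def is_differentially_expressed(mean_wt: float, mean_mutant: float, q_value: float) -> bool:
    """Apply |log2 fold ratio| >= 1.0 and q < 0.05; both means must be positive."""
    log2_ratio = math.log2(mean_mutant / mean_wt)
    return abs(log2_ratio) >= 1.0 and q_value < 0.05


if __name__ == "__main__":
    # Made-up numbers for illustration only.
    print(rpkm(500, 2000, 10_000_000))
    print(is_differentially_expressed(10.0, 20.0, q_value=0.01))
```

Taking the logarithm of a ratio requires nonzero means in both genotypes; how genes with zero expression in one genotype were handled is not stated in the text, so this sketch leaves that case to the caller.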

<sup>1</sup>http://solcap.msu.edu

<sup>2</sup>http://marker.kazusa.or.jp/Tomato/

<sup>3</sup>https://solgenomics.net/organism/Solanum\_lycopersicum/genome

#### Expression Analysis by Quantitative Reverse Transcription PCR (qRT-PCR) and RT-PCR

For qRT-PCR analysis, the leaves were ground to a fine powder in liquid nitrogen. Total RNA extraction from the samples and subsequent reverse transcription reactions were performed as described above. PCRs were carried out on the CFX96 system (Bio-Rad), using the SYBR Premix Ex Taq kit (TaKaRa) and the appropriate gene-specific primers (**Supplementary Table S2**) according to previously described procedures (Shinozaki et al., 2015). Technical triplicates were performed for each sample, with biological triplicates. The expression levels were calculated using the delta-delta CT method (Pfaffl, 2001), with normalization to the expression of the reference gene, SAND (Expósito-Rodríguez et al., 2008). For RT-PCR analysis, cDNA synthesis was performed as described above and equal amounts of cDNA were used as template to observe the level of SPFF mRNA in various tissues.
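For readers unfamiliar with the delta-delta CT calculation, a minimal sketch follows. It assumes an amplification efficiency of 2 (perfect doubling per cycle) for both the target and the reference gene, which is the simplest form of the method; the efficiency-corrected model of Pfaffl (2001) generalizes this. All CT values shown are invented, and the function names are illustrative.

```python
from statistics import mean


def mean_ct(technical_replicates) -> float:
    """Average the CT values of technical replicates for one sample and gene."""
    return mean(technical_replicates)


def relative_expression(
    ct_target_sample: float,
    ct_reference_sample: float,
    ct_target_control: float,
    ct_reference_control: float,
    efficiency: float = 2.0,  # assumption: perfect doubling per cycle
) -> float:
    """Fold change of the target gene in a sample relative to a control sample."""
    delta_sample = ct_target_sample - ct_reference_sample
    delta_control = ct_target_control - ct_reference_control
    delta_delta = delta_sample - delta_control
    return efficiency ** (-delta_delta)


if __name__ == "__main__":
    # Invented CT values; the reference gene in the study is SAND.
    target_mutant = mean_ct([22.0, 22.0, 22.0])
    target_wt = mean_ct([24.0, 24.0, 24.0])
    print(relative_expression(target_mutant, 20.0, target_wt, 20.0))
```

A lower CT means earlier amplification, so a target gene that crosses the threshold two cycles earlier in the sample than in the control, at equal reference CT, corresponds to a fourfold higher relative expression under this assumption.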

#### In situ Hybridization


The riboprobes used to detect spff transcripts were made from a 775 bp fragment amplified from tomato root cDNAs by PCR using the ishF2-ishR1 primer set. The PCR product was used for subsequent PCR using the ishT7F2-ishR1 primer set for sense, and the ishF2-ishT7R1 primer set for antisense, riboprobes; this introduced the T7 RNA polymerase promoter at the 5′ and 3′ ends, respectively. Labeled riboprobes were synthesized by in vitro transcription in the presence of digoxigenin-UTP (DIG RNA Labeling kit, SP6/T7; Roche) and used for in situ hybridization. The plant tissue processing and in situ RNA hybridization experiments were performed following the protocol described by Sicard et al. (2008). Primer sequences used in this study are shown in **Supplementary Table S2**. For the comparative analysis between WT and the spff mutant, both WT and spff mutant samples were mounted on the same glass slides to allow direct comparison under the same conditions.

### RESULTS

#### Identification of the Single Recessive Parthenocarpic spff Mutant

FIGURE 1 | Parthenocarpic fruit and small flower production in the spff mutant of cultivar Micro-Tom. (A) Representative fruit of the spff mutant. Spontaneously obtained parthenocarpic fruit (left) and seeded fruit from manual pollination of the flower with WT pollen (right) are shown. (B) Comparison of WT and spff flowers at anthesis stage. Bars are 1 cm.

A visual screening of tomato M<sub>3</sub> populations obtained after γ-ray irradiation-induced mutagenesis in the genetic background of Micro-Tom, a dwarf and rapid growth variety (Matsukura et al., 2007; Saito et al., 2011), resulted in the isolation of a mutant line (TOMJPG4121) that produced small seedless parthenocarpic fruit (**Figure 1A**). These plants also produced smaller flowers than the WT plant, particularly due to their narrower petals and shorter anthers (**Figure 1B**). We therefore called this line the spff mutant. Although the spff mutant did not produce seeded fruits by practical self-pollination, crossing WT pollen to the spff stigma did result in seeded fruits (**Figure 1A**); these F<sub>1</sub> seeds germinated normally, suggesting that spff is male-sterile, with the ovary retaining substantial fertility. Furthermore, all of the resulting six F<sub>1</sub> plants exhibited normal flower morphology, with no evidence of parthenocarpic ability, indicating that these mutant phenotypes were recessive. Thirty-three out of 109 F<sub>2</sub> progenies obtained through crossing with the WT cultivar Micro-Tom, and 43 out of 186 F<sub>2</sub> progenies obtained through crossing with the WT cultivar Ailsa-Craig, exhibited the spff mutant flower morphology and parthenocarpy phenotypes (**Table 1** and **Supplementary Figure S2**). These segregation ratios corresponded to the 3:1 ratio expected for a single recessive gene (Chi-squared = 1.62 for the Micro-Tom and 0.35 for the Ailsa-Craig background; not significant at p < 0.05). These data suggested the presence of a monogenic recessive mutation in the spff line. In the F<sub>2</sub> populations derived from crosses between spff and the WT cultivar Micro-Tom or Ailsa-Craig, anthesis of the first flower was delayed in the plants with the spff phenotype by 19 or 15 days, respectively, as compared to plants with the WT phenotype (**Supplementary Figure S3**); this indicated that the flowering delay trait was tightly associated with the spff flower morphology and parthenocarpy phenotypes.
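The segregation test can be reproduced from the counts given in the text. The sketch below computes the goodness-of-fit chi-squared statistic against a 3:1 (WT:mutant) expectation and compares it with 3.841, the critical value for one degree of freedom at the 0.05 level; the function name is illustrative.

```python
CRITICAL_VALUE_DF1_P05 = 3.841  # chi-squared critical value, 1 d.f., alpha = 0.05


def chi_squared_3_to_1(n_wild_type: int, n_mutant: int) -> float:
    """Goodness-of-fit statistic for a 3:1 WT:mutant segregation."""
    total = n_wild_type + n_mutant
    expected_wt, expected_mutant = 0.75 * total, 0.25 * total
    return (
        (n_wild_type - expected_wt) ** 2 / expected_wt
        + (n_mutant - expected_mutant) ** 2 / expected_mutant
    )


if __name__ == "__main__":
    # Counts from the text: 33 mutants of 109 (Micro-Tom), 43 of 186 (Ailsa-Craig).
    for background, n_wt, n_mut in [("Micro-Tom", 76, 33), ("Ailsa-Craig", 143, 43)]:
        statistic = chi_squared_3_to_1(n_wt, n_mut)
        fits = statistic < CRITICAL_VALUE_DF1_P05
        print(f"{background}: chi-squared = {statistic:.2f}, consistent with 3:1: {fits}")
    # Micro-Tom: chi-squared = 1.62, consistent with 3:1: True
    # Ailsa-Craig: chi-squared = 0.35, consistent with 3:1: True
```

Both statistics fall well below the critical value, so the 3:1 hypothesis is not rejected in either background, in agreement with the values reported above.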

#### Characterization of the Pleiotropic Mutant Phenotypes in spff

TABLE 1 | Segregation test of spff mutant traits in the F<sub>2</sub> progeny of a self-pollinated F<sub>1</sub> plant.

The spff mutant phenotypes were evident as small flower and parthenocarpic fruit formation.

<sup>a</sup>Chi-squared test (p < 0.05). ns, not significant.

For detailed phenotypic characterizations, the spff mutant in the M<sub>3</sub> population was backcrossed four times with WT cultivar Micro-Tom pollen to reduce mutagen-induced background mutations (**Supplementary Figure S1**). The resulting BC<sub>4</sub>F<sub>2</sub> plants that exhibited spff phenotypes were analyzed. First, we examined the parthenocarpic phenotype in the spff mutant. The spff mutant yielded obligate parthenocarpic fruit under spontaneous production, and this was not observed in WT plants (**Figure 2A**). Compared to the pollinated WT fruits, the spff parthenocarpic fruits were smaller and lighter (**Figures 2B–E**). For cytological characterization of parthenocarpy at the early developmental stage, we prepared cross-sections of the ovaries at anthesis and examined the number of cell layers and cell size within the pericarp (**Figure 3**). The spff mutant cells were significantly larger than the WT cells, by approximately 1.3-fold (WT = 202 ± 17 µm<sup>2</sup>, spff = 272 ± 11 µm<sup>2</sup>), and spff had fewer cell layers. This suggested that spff parthenocarpy was associated with cell enlargement, rather than active cell division.

Further, the smaller flowers produced in spff reflected the presence of smaller constitutive tissues, including the petals, style, and anthers; the clearly defective anther may explain the male sterility of this mutant (**Figures 4A–H**). To evaluate the male fertility of spff, cross-sections of the WT and spff anthers at the anthesis stage were compared. The oval-shaped WT anther locules included pollen grains that showed a germination rate of approximately 60 ± 5% (**Figures 4D,I,K,L**). In contrast, the spff anther locules were shrunken and contained very few pollen grains, which were unable to germinate (**Figures 4H,J–L**); this indicated that the spff mutant was fully male sterile. In addition, histological observations of the spff and WT ovaries at a bud length of 4 mm indicated the presence of equivalent internal structures, except for their size (**Figures 4M–O**), consistent with the fact that spff retained substantial female fertility (**Table 1**).

We also found that spff affected plant architecture, with an altered pattern of axillary shoot development (**Supplementary Figures S4A–C**). The lateral branches of spff showed increased sympodial growth, in which vegetative and inflorescence stems were more actively developed from the individual first axillary buds, leading to a bushy plant morphology. These data characterizing the phenotypes of spff indicated that the mutation conferred pleiotropic effects on both reproductive and vegetative morphology in tomato plants.

We next compared the yield potential of WT and the spff mutant. Because the spff mutant showed a significant growth delay compared to WT, leading to late fruit production (**Figures 5A,B**), a direct comparison of yield at a single time point was difficult. WT and spff mutant plants were therefore grown in a greenhouse for 112 and 173 days, respectively, until they had nearly completed vegetative growth, and the yield of ripe red fruits as well as the total number of fruits per plant were determined. The yield (total weight) of ripe red fruit in the spff mutant was reduced to 28% of that of WT despite the longer growth period and the higher number of fruits per plant, suggesting that the mutation has little potential for improving yield (**Figures 5C–F**).

#### Identification of the Gene Associated With the spff Phenotype

The spff mutation was mapped using an F<sub>2</sub> population obtained by crossing spff mutants (Micro-Tom background) with WT plants (Ailsa-Craig background) by an Illumina SNP Infinium analysis with the SolCAP array (Sim et al., 2012). We generated a total of 186 F<sub>2</sub> plants, consisting of 143 plants showing WT phenotypes and 43 plants showing spff mutant phenotypes (**Table 1**). Genotyping of the 43 plants with the spff mutant phenotype using 1956 SNP markers pointed to a 2.6 Mbp region flanked by two SNP markers (solcap\_snp\_sl\_36809 and solcap\_snp\_sl\_3746) on chromosome 4. Based on the tomato genome release SL2.40, these two markers are located at positions 57,939,715 bp and 60,553,996 bp, respectively (**Figure 6A** and **Supplementary Figure S5**). No other possible candidate loci were detected, consistent with the fact that the spff phenotypes reflected a monogenic recessive mutation (**Table 1**). According to the Kazusa Marker Database<sup>3</sup> (based on SL2.40), this candidate region included 267 protein-coding genes. The tomInf4732 SNP, which discriminated between Ailsa-Craig and Micro-Tom alleles within the candidate region using primer set F4-R4 (**Supplementary Table S2**), was used to further genotype 73 F<sub>2</sub> plants. These included 43 plants with spff phenotypes and 30 with WT phenotypes, allowing us to narrow down the region of interest to 2.0 Mbp, which included 205 genes.

We next employed mapping-by-sequencing (Abe et al., 2012; Garcia et al., 2016) of an F<sup>2</sup> population derived by crossing spff with WT in a Micro-Tom background. DNA from 20 individual mutant phenotype F<sup>2</sup> plants was sequenced by Illumina HiSeq, and cleaned reads were mapped onto the cultivar Micro-Tom reference genome; polymorphisms were substituted against the cultivar Heinz reference genome version SL2.40 (Kobayashi et al., 2014). The Bowtie2-Samtools-GATK pipeline identified and calculated the frequencies of potential spff -specific SNPs and Indels. This analysis identified 77 mutant homozygous SNPs and Indels within the region narrowed down by SNP Infinium analysis (**Supplementary Table S3**). These 77 mutations were present in the coding regions of 46 genes, which were considered to represent candidate genes for

structure of Solyc04g077010. Chromosome 4 is indicated by the green bar. The gray boxes indicate the untranslated regions, orange boxes indicate the exons, separated by the introns. (B) Nucleotide and amino acid sequences of the WT and the spff mutant around the mutation point. The upper black letters indicate nucleotides, and the lower red letters indicate putative amino acids. The 2 bp deletion indicated by hyphens introduces a premature stop codon in the first exon in the spff mutant. (C) The putative amino acid length and domain structure of the RLK encoded by Solyc04g077010, in WT and the spff mutant. In B, C <sup>∗</sup> indicate the stop codon.

the spff phenotype (**Supplementary Table S4**). Five of these candidates (Solyc04g076020, Solyc04g076100, Solyc04g076250, Solyc04g076320, and Solyc04g077010) were chosen for further linkage analysis. These were selected because of their relatively high expression levels in flowers and fruits, according to the tomato eFP browser (Winter et al., 2007; The Tomato Genome Consortium, 2012), and because of the predicted impact of the mutation on the encoded protein. Their linkage with the spff phenotypes was analyzed using marker-based approaches in the F<sub>2</sub> and backcross populations listed in **Supplementary Table S1**, with the five primer sets shown in **Supplementary Table S2**. The F18-R18 marker for a 2 bp deletion in the Solyc04g077010 gene (**Figure 6B**), which encodes an RLK, showed perfect co-segregation with the spff phenotypes. All of the 83 mutant-phenotype plants, and none of the 80 non-parthenocarpic plants, were homozygous for this mutation; the non-parthenocarpic plants were either heterozygous or azygous for it. In contrast, the four other mutations were not perfectly linked with the spff phenotypes (**Supplementary Table S5**). We noted that the gene model of Solyc04g077010 in the tomato gene annotation ITAG2.3/SL2.40 differed from that in the latest ITAG3.2/SL3.0<sup>4</sup>, in which Solyc04g077010 consists of two exons spanning 2871 bp and encoding 957 amino acids. The mutation identified in the present study was located in the first exon and led to a frame shift, which

introduced a premature stop codon at position 494 and therefore generated a truncated protein composed of 493 amino acids (**Figure 6C**). RLK proteins are structurally characterized by three conserved domains: a receptor domain containing a varying number of leucine-rich repeats; a transmembrane domain; and a kinase domain that transduces the downstream signal via autophosphorylation (Shiu and Bleecker, 2001). The RLK protein encoded by Solyc04g077010 harbors a single transmembrane domain between amino acids 505 and 524. This suggested that the mutation would cause a loss-of-function of this protein, thus resulting in the spff mutant phenotypes.
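The effect of a 2 bp deletion on the reading frame can be illustrated with a short sketch. The coding sequence below is an invented toy sequence (`TOY_WT`), not the Solyc04g077010 sequence, and the deletion position is chosen arbitrarily so that the shifted frame runs into a stop codon; only the standard genetic code and the protein coordinates quoted in the text (493 residues, transmembrane domain at 505–524) are taken as given.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds: str) -> str:
    """Translate a coding sequence up to (not including) the first stop codon."""
    protein = []
    for pos in range(0, len(cds) - 2, 3):
        residue = CODON_TABLE[cds[pos:pos + 3]]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


# Hypothetical toy sequence: ATG GCC TTA GAT GAC TGG AAA TAA
TOY_WT = "ATGGCCTTAGATGACTGGAAATAA"
# Remove two bases (indices 4 and 5) to mimic a 2 bp deletion.
TOY_MUT = TOY_WT[:4] + TOY_WT[6:]

wt_protein = translate(TOY_WT)
mut_protein = translate(TOY_MUT)
print(wt_protein, mut_protein)

# Coordinates quoted in the text for the real protein.
TRUNCATED_LENGTH = 493
TM_START, TM_END = 505, 524
retains_tm_domain = TRUNCATED_LENGTH >= TM_END
print("truncated protein retains transmembrane domain:", retains_tm_domain)
```

In the toy case the frame shift changes the residues after the deletion and terminates translation early. For the real protein, a product of 493 residues ends before the transmembrane domain begins at residue 505, so both the membrane anchor and the downstream kinase domain would be missing.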

To confirm this, RNAi was used to reduce Solyc04g077010 expression. The RNAi vector targeted the first exon of this gene, which encodes a receptor domain that a BLAST search indicated is unlikely to be conserved in other tomato RLK-encoding genes. The RNAi vector was introduced into Micro-Tom plants and three transgenic lines were obtained; these showed significantly reduced mRNA expression of the target gene (**Figure 7A**). These three independent transgenic lines resembled the spff mutant, producing small flowers and parthenocarpic fruits (**Figures 7B–H**). Further, the RNAi lines showed complete male sterility, while pollination with WT pollen gave rise to mature viable seeds, as observed in the spff mutant. These

<sup>4</sup>https://solgenomics.net/organism/Solanum\_lycopersicum/genome

analyses demonstrated that the spff phenotypes resulted from a loss-of-function of this RLK protein-encoding gene.

### Vasculature-Specific Expression of SPFF Gene in Flower Receptacle

The in silico expression profile obtained by RNA-seq and RT-PCR analyses (Winter et al., 2007; The Tomato Genome Consortium, 2012) revealed that Solyc04g077010 was expressed in various plant organs, including roots, leaves, buds, and flowers (**Supplementary Figures S7A**, **S8**). Previously published transcriptome data (Ezura et al., 2017) indicated that this gene was expressed in floral organs both before and after anthesis, and transcripts were observed in individual floral organs including the ovary/pistil, anther, petal, and sepal, with the highest expression observed in the ovary/pistil at 1 day before anthesis (**Supplementary Figure S7B**). Interestingly, a spatiotemporal analysis of the transcriptome of developing tomato fruits (Fernandez-Pozo et al., 2017; Shinozaki et al., 2018b) revealed vasculature-specific expression of Solyc04g077010 in the fruit pericarp throughout development (**Supplementary Figure S7C**). Consistent with this, predominant expression of this gene was also found in fruit internal tissues, columella and placenta (**Supplementary Figure S7D**), with a high abundance in thick vascular bundles.

To unravel the spatio-temporal expression pattern of Solyc04g077010 during flower development, in situ mRNA hybridization was performed in WT floral buds at different stages of development. In the early developing 1.1 mm bud, the transcript signal was exclusively observed in the vasculature tissues of the receptacle (**Figures 8A,D**). As development proceeded, the SPFF transcripts were also detected in the vasculature of the pedicel (2.9 mm bud) (**Figures 8B,E**), and in the vasculature of the columella tissue (4.5 mm bud) (**Figures 8C,F**). We also observed reduced SPFF transcripts in receptacle and leaves of spff compared to WT (**Supplementary Figure S6A**), indicating that the spff mutation influences both transcript abundance and protein function.

#### Solyc04g077010 Mutation May Affect Hormonal Regulation at the Transcriptional Level

To obtain insights into the molecular mechanisms underlying parthenocarpy in the spff mutant plant, the ovarian transcriptome at the anthesis stage, corresponding to the flower-to-fruit transition, was compared to that of WT plants. Our RNA-seq analysis identified a total of 25 differentially expressed genes; 13 of these were significantly up-regulated in spff plants (log2 fold-change > 1) and 12 were significantly down-regulated (log2 fold-change < −1) (q values < 0.05 for the comparison with WT, **Supplementary Table S6**). Notably, the up-regulated genes in the spff ovary included SlGA20ox1 (Solyc03g006880), which encodes a key GA biosynthetic enzyme that is induced by pollination and is also highly expressed during parthenocarpy in the pat mutant (Olimpieri et al., 2007; Serrani et al., 2007b). In spff, the expression level of SlGA20ox1 was more than 10-fold that observed in the WT plant. This result suggested that GA

buds. Distribution of Solyc04g077010 expression was detected by an antisense probe (A–C) and a low background signal was detected by a control sense probe (D–F) in 1.1 mm (A,D), 2.9 mm (B,E), and 4.5 mm (C,F) buds. Black arrows indicate signals detected in vasculature bundles and their distribution in the receptacle, pedicel, and/or columella. Bars are 100 µm.

is involved in the parthenocarpic early transition from flower to fruit exhibited by the spff mutant. To gain further insights into this, we compared our differentially expressed genes with previously published transcriptomic data obtained from GA-treated and untreated unfertilized ovaries (Tang et al., 2015). One of our 13 up-regulated genes (SlGA20ox1) and three of our 12 down-regulated genes [Solyc02g078150 (Plant-specific domain TIGR01615 family protein), Solyc12g094620 (catalase), and Solyc05g005150 (F-box/Kelch repeat-containing F-box family protein)] were found in the list of genes that were up- and down-regulated by GA treatment, respectively.
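The thresholds used to call differentially expressed genes, and the subsequent comparison with GA-responsive gene lists, amount to a filter followed by set intersections. The sketch below shows that logic under stated assumptions: the gene names and values in `TOY_RESULTS`, `TOY_GA_UP`, and `TOY_GA_DOWN` are invented placeholders, not data from this study or from Tang et al. (2015); only the cutoffs (|log2 fold-change| > 1, q < 0.05) come from the text.

```python
import math

LOG2FC_CUTOFF = 1.0  # |log2 fold-change| threshold stated in the text
Q_CUTOFF = 0.05      # q-value threshold stated in the text


def classify(log2fc: float, q_value: float) -> str:
    """Label a gene as 'up', 'down', or 'ns' (not significant)."""
    if q_value >= Q_CUTOFF:
        return "ns"
    if log2fc > LOG2FC_CUTOFF:
        return "up"
    if log2fc < -LOG2FC_CUTOFF:
        return "down"
    return "ns"


# Hypothetical toy results: gene -> (log2 fold-change mutant/WT, q-value).
TOY_RESULTS = {
    "geneA": (3.5, 0.001),   # strong, significant increase
    "geneB": (-1.8, 0.010),  # significant decrease
    "geneC": (2.2, 0.200),   # large change but not significant
    "geneD": (0.4, 0.001),   # significant but below the fold-change cutoff
}

labels = {gene: classify(fc, q) for gene, (fc, q) in TOY_RESULTS.items()}
up_genes = {gene for gene, label in labels.items() if label == "up"}
down_genes = {gene for gene, label in labels.items() if label == "down"}

# Hypothetical GA-responsive gene sets standing in for a published list.
TOY_GA_UP = {"geneA", "geneX"}
TOY_GA_DOWN = {"geneB", "geneY"}

shared_up = up_genes & TOY_GA_UP
shared_down = down_genes & TOY_GA_DOWN

# A 10-fold change corresponds to log2(10), well above the cutoff of 1.
log2_of_tenfold = math.log2(10)
print(labels, shared_up, shared_down, round(log2_of_tenfold, 2))
```

Requiring both criteria at once keeps genes such as the toy `geneC` and `geneD` out of the list, which is why a comparatively short list of 25 genes can emerge from a whole-transcriptome comparison.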

#### Flower Receptacle Development Is Not Likely to Be Affected in the spff Mutant

A BLASTP database search showed that the closest homolog of tomato SPFF is the Arabidopsis protein encoded by XYLEM INTERMIXED WITH PHLOEM1 (XIP1) (GenBank accession no. BAC42540.1), with 63% amino acid identity (E-value 0, score 1153 bits, and 77% positives). Arabidopsis xip1 loss-of-function mutants showed excessive

anthocyanin accumulation in the leaves and severe defects in plant growth, while fertility was not affected (Bryan et al., 2012). Here, the spff mutant did not show excessive anthocyanin accumulation in the leaves and showed severe male sterility (**Figure 4** and **Supplementary Figure S4D**). Nevertheless, the fact that the xip1 mutants altered plant vascular development, represented by intermixed xylem with phloem, suggests a similar function for the SPFF protein, whose expression was indeed localized to vasculature in the fruit and inflorescence tissues (**Figure 8** and **Supplementary Figure S7C**). To unravel this, we compared xylem-phloem distribution patterns between WT and spff mutant receptacles. Cross-sections of receptacle were stained with Safranin O and Astra blue to visualize lignified (seen as red) and unlignified (seen as blue) tissues. **Supplementary Figure S9** shows that the stained receptacle cross-sections did not reveal significant xylem-phloem intermixing in the spff mutant.

### DISCUSSION

#### The Gene Associated With the spff Phenotype Encodes a Putative RLK Involved in Flower and Fruit Development

This study aimed to identify and characterize the gene underlying a newly isolated tomato mutant, named spff, which showed parthenocarpy and floral organ dwarfism as its major phenotypes (**Figures 1**–**4**). A high-throughput approach combining high-density genetic mapping (**Supplementary Figure S5**) and mapping-by-sequencing, followed by conventional genetic linkage analysis (**Supplementary Tables S3**–**S5**), allowed the rapid identification of a potential causal mutation in a gene located on chromosome 4, Solyc04g077010 (**Figure 6**). This gene encodes a potential RLK that appeared to be mainly expressed in the receptacle of young floral buds (**Figure 8** and **Supplementary Figure S6**). A 2 bp deletion mutation was identified, which introduced a premature stop codon that leads to the production of a truncated RLK protein (**Figure 6**) as well as to reduced transcript abundance (**Supplementary Figure S6**). Using an RNAi approach, we confirmed that the spff phenotypes could be reproduced by silencing Solyc04g077010 (**Figure 7**), and thus concluded that this is the causative gene for the spff mutant.

The Solyc04g077010 homolog in Arabidopsis, XIP1, was reported to be involved in vascular bundle differentiation (Bryan et al., 2012). The xip1 mutant shows aberrant xylem-like cells within the phloem in inflorescence stems. Although Solyc04g077010 appeared to be expressed in close vicinity to the vascular bundle (**Figure 8** and **Supplementary Figures S6**, **S7**), xylem-like cells were not present within the phloem (**Supplementary Figure S9**). Moreover, fertility was not affected in Arabidopsis xip1 mutant plants, where the inflorescence stems are shorter than those of the Col-0 accession plants, and the cotyledons and rosette leaves show a purple color, indicative of anthocyanin accumulation. Since these phenotypes were not observed in the present spff mutant (**Figures 1**, **4** and **Supplementary Figure S4**), Solyc04g077010 does not seem to be a functionally conserved ortholog of XIP1. It is more likely to be a novel gene that has possibly acquired a specific function in tomato, although further analyses are needed to confirm this functional dissimilarity with the Arabidopsis XIP1 gene.

### Hypothesis for How the spff Mutant Induces Parthenocarpy

Parthenocarpy can mimic the molecular mechanisms underlying pollination-dependent ovary growth (Li et al., 2014). Fruit set initiation and parthenocarpy are regulated by complex hormone networks. Molecular genetic studies of many mutants/genotypes and transcriptome analyses of early fruit development have suggested that parthenocarpy is in part induced through a hierarchical scheme of temporal regulation by multiple hormones, initiated by the accumulation of auxin; this induces intense cell division, with the subsequent induction of GA metabolism triggering active cell expansion (Martí et al., 2007; Serrani et al., 2007a, 2008). Thus, GA should act as the downstream signal and cell expansion most likely plays a crucial role for fruit set initiation in tomato (Serrani et al., 2008; Shinozaki et al., 2015). The present study revealed that the spff mutant exhibited higher levels of GA20ox1 than WT plants (**Supplementary Table S6**); this is one of the key factors involved in GA biosynthesis in tomato ovaries (Olimpieri et al., 2007; Serrani et al., 2007b). Further, three GA-down regulated genes (Solyc02g078150, Solyc12g094620, and Solyc05g005150) were found in the list of differentially expressed genes identified by the RNA-seq analysis in the unfertilized ovary of spff mutant (**Supplementary Table S6**). In addition, the small parthenocarpic fruits produced by the spff mutant were characterized by enlarged cells, rather than an increased number of cell layers in the ovary pericarp, most likely due to a lack of intense cell division (**Figure 3**). This was consistent with the characteristics of parthenocarpic fruit induced by increased GA sensitivity (Martí et al., 2007). In contrast, auxin-induced parthenocarpy is associated with intensive cell division in the pericarp, resulting in an increased number of cell layers (Wang et al., 2009). 
The spff mutant also showed reduced pollen fertility (**Figures 4I–L**), which could reflect an increased GA response (Livne et al., 2015). These results suggest that the RLK encoded by Solyc04g077010 functions to repress the GA response in reproductive organs, and that spff parthenocarpy may result in part from an increased GA response.

Additionally, the association of parthenocarpy with early male organ developmental abnormality has been observed in tomato plants. Mutations or genetic suppressions of MADS-box genes, which inhibit functional stamen development by causing homeotic conversions, can induce parthenocarpy (Pnueli et al., 1994; Ampomah-Dwamena et al., 2002; Mazzucato et al., 2008; Quinet et al., 2014; Okabe et al., 2019). Furthermore, the over-accumulation of BARNASE mRNA under a stamen-specific promoter triggers early anther ablation and parthenocarpy (Medina et al., 2013), while loss of function of SEXUAL STERILITY/HYDRA results in complete male sterility and parthenocarpy (Hao et al., 2017; Rojas-Gracia et al., 2017). Recently, a

tap3 mutant has also been described in which stamens are converted into a carpelloid structure and GA over-accumulates in unfertilized ovaries, most likely due to the overexpression of GA metabolism genes such as GA20ox1 (Okabe et al., 2019). Taken together with the fact that the spff mutant shows male sterility and GA20ox1 is highly expressed in the unfertilized ovary of the spff mutant (**Supplementary Table S6**), it is possible that parthenocarpy in the spff mutant involves increased levels of GA20ox1 transcripts through the association with male sterility. Since our transcriptome analysis revealed no differential expression of MADS-box genes between WT and spff mutants (**Supplementary Table S6**), and no homeotic conversion phenotypes were observed in the spff mutant (**Figures 1**, **4**), the association of floral homeotic genes with the Solyc04g077010 gene, and the mechanisms involved in GA20ox1 gene regulation, require further elucidation.

The in situ mRNA analysis showed that Solyc04g077010 was strongly expressed in vascular bundle cells of the floral receptacle and pedicel (**Figure 8**). Vascular systems in inflorescence stems are important for nutrient and signal transportation during developmental events in the reproductive organs (Rančić et al., 2010). We therefore hypothesize that the RLK encoded by Solyc04g077010 may be involved in the transportation of molecular substances essential for normal floral organ development, and that loss-of-function mutations of this gene may disrupt the integrity of such a system, which may then cause anther abortion. Since we identified little cytological evidence for structural differences between the vascular bundles observed in WT and spff mutant plants (**Supplementary Figure S9**), future studies are required to investigate this possibility in more detail.

Although the role of RLK family proteins in the regulation of fruit development has yet to be fully delineated, a cell-type-specific transcriptome study of tomato ovaries showed that several genes encoding RLKs were enriched in the cluster that is mainly expressed in the funiculus of the developing seed. These included a homolog of the Arabidopsis HAESA gene, which is involved in specifying seed abscission zones, suggesting that the tomato homolog may possess a similar function (Pattison et al., 2015). Furthermore, silencing of an invertase inhibitor gene in the SlINVINH1-RNAi line, causing increased cell wall invertase activity, was associated with an overall reduction in the transcription of RLK family members in young ovaries, suggesting that RLKs may play a role in sensing the modification of cell wall components, thereby regulating downstream gene expression (Ru et al., 2017). Elucidation of RLK activities, including the identification of ligands and kinase domain target proteins, would provide valuable insights into the involvement of RLK proteins in the regulation of fruit development.

#### CONCLUSION

In conclusion, this study identified a novel parthenocarpic tomato mutant whose phenotype is caused by loss of function of a gene encoding a receptor kinase, designated SPFF. Parthenocarpic varieties can potentially show improved fruit productivity due to increased fruit set efficiency (Shinozaki et al., 2018a); however, the spff mutant showed delayed growth, smaller mature fruits, and reduced yield compared to WT (**Figure 5**). Such unfavorable traits render this mutant less attractive for breeding applications, but it would be interesting to identify hypomorphic (weaker) alleles of spff with less detrimental phenotypes, by screening TILLING populations or by genome editing (Okabe et al., 2011; Shimatani et al., 2017), and to investigate their potential for breeding applications.

### AUTHOR CONTRIBUTIONS

YS and TA contributed to the mutant screening. HT, YS, RY, and TA contributed to genetic mapping and transcriptomic analysis. HT and YS performed phenotypic characterizations of mutant plants. SK and HT contributed to expression analysis. HT, MH, and CC contributed to histological analysis and in situ hybridization assays. HT, YS, MH, CC, HE, and TA wrote the manuscript. All authors reviewed and approved the final manuscript.

### FUNDING

This work was supported by JSPS KAKENHI, grant no. 15KK0273, Program to Disseminate Tenure Tracking System, and JSPS bilateral program to TA, Science and Technology Research Promotion Program for Agriculture, Forestry, Fisheries and Food Industry, Japan (Grant No. 26013A) to HE, a grant from the Japan Society for the Promotion of Science to YS (16J00582), HT (18J20505), and SK (18J00528).

### ACKNOWLEDGMENTS

We thank all members in our lab for critical comments on this research. Seeds of Micro-Tom WT (TOMJPF00001), Ailsa-Craig WT (TOMJPF00004), and spff (TOMJPG4121) were obtained from the National BioResource Project (NBRP) of the Japan Agency for Medical Research and Development (AMED). This research was partly supported by the "Sustainable Food Security Research Project" in the form of an operational grant from the National University Corporation.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00403/full#supplementary-material

### REFERENCES



Micro-Tom of tomato. J. Plant Growth Regul. 26, 211–221. doi: 10.1007/s00344-007-9014-7


**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Takei, Shinozaki, Yano, Kashojiya, Hernould, Chevalier, Ezura and Ariizumi. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Finding a Compatible Partner: Self-Incompatibility in European Pear (*Pyrus communis*); Molecular Control, Genetic Determination, and Impact on Fertilization and Fruit Set

#### *Hanne Claessen<sup>1</sup>, Wannes Keulemans<sup>1</sup>, Bram Van de Poel<sup>2</sup> and Nico De Storme<sup>1</sup>\**

*<sup>1</sup> Laboratory for Plant Genetics and Crop Improvement, Division of Crop Biotechnics, Department of Biosystems, KU Leuven, Leuven, Belgium, <sup>2</sup> Laboratory for Molecular Plant Hormone Physiology, Division of Crop Biotechnics, Department of Biosystems, KU Leuven, Leuven, Belgium*

#### *Edited by:*

*Emidio Albertini, University of Perugia, Italy*

#### *Reviewed by:*

*Irene Serrano, University of Göttingen, Germany; Luigi Russi, University of Perugia, Italy; Celia M. Cantin, Fundacion Agencia Aragonesa para la Investigacion y el Desarrollo, Spain*

*\*Correspondence:* 

*Nico De Storme nico.destorme@kuleuven.be*

#### *Specialty section:*

*This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science*

*Received: 15 November 2018; Accepted: 18 March 2019; Published: 16 April 2019*

#### *Citation:*

*Claessen H, Keulemans W, Van de Poel B and De Storme N (2019) Finding a Compatible Partner: Self-Incompatibility in European Pear (Pyrus communis); Molecular Control, Genetic Determination, and Impact on Fertilization and Fruit Set. Front. Plant Sci. 10:407. doi: 10.3389/fpls.2019.00407*

*Pyrus* species display a gametophytic self-incompatibility (GSI) system that actively prevents fertilization by self-pollen. The GSI mechanism in *Pyrus* is genetically controlled by a single locus, i.e., the S-locus, which includes at least two polymorphic and strongly linked S-determinant genes: a pistil-expressed *S-RNase* gene and a number of pollen-expressed *SFBB* genes (S-locus F-Box Brothers). Both the molecular basis of the SI mechanism and its functional expression have been widely studied in many Rosaceae fruit tree species, with a particular focus on the characterization of the elusive *SFBB* genes and S-RNase alleles of economically important cultivars. Here, we discuss recent advances in the understanding of GSI in *Pyrus* and provide new insights into the mechanisms of GSI breakdown leading to self-fertilization and fruit set. Molecular analysis of S-genes in several self-compatible *Pyrus* cultivars has revealed mutations in either the pistil- or the pollen-specific part that cause breakdown of self-incompatibility. This has significantly contributed to our understanding of the molecular and genetic mechanisms that underpin self-incompatibility. Moreover, the existence and development of self-compatible mutants open new perspectives for pear production and breeding. In this framework, possible consequences of self-fertilization on fruit set, development, and quality in pear are also reviewed.

#### Keywords: *Pyrus communis*, gametophytic self-incompatibility, fertilization, fruit set, S-RNase, SFBB

## INTRODUCTION

Self-incompatibility (SI) refers to all genetic mechanisms in flowering plants that prevent self-fertilization through the recognition and rejection of self-pollen by the style of a flower (DeNettancourt, 1977). SI is generally classified into two types: heteromorphic and homomorphic SI. The heteromorphic SI system includes distyly and tristyly and inhibits self-fertilization through the production of more than one morphological flower type. In contrast, homomorphic SI inhibits self-fertilization through genetic or biochemical mechanisms that operate regardless of flower morphology (Charlesworth, 2010; Orević et al., 2014). There are two main types of homomorphic SI: gametophytic self-incompatibility (GSI) and sporophytic self-incompatibility (SSI) (Kao and Huang, 1994) (**Figure 1**). In GSI, the genotype of the haploid pollen itself (gametophyte) determines its incompatibility type, while in SSI, the genotype of the diploid parental plant (sporophyte) that acts as the pollen donor determines the incompatibility type (Hiscock and Tabah, 2003). GSI is considered the most prevalent SI system in the plant kingdom and occurs in Solanaceae, Rosaceae, and Plantaginaceae (Franklin-Tong and Franklin, 2003a).
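The distinction between gametophytic and sporophytic control can be made concrete with a small sketch. It is a deliberately simplified model and rests on labeled assumptions: S-haplotypes are represented as plain strings, the function names are hypothetical, and the SSI rule assumes codominance of both donor alleles, whereas real sporophytic systems often show dominance relationships between alleles.

```python
def gsi_accepted_pollen(donor: tuple[str, str], pistil: tuple[str, str]) -> set[str]:
    """GSI: each haploid pollen grain is judged by its own S-haplotype.

    Returns the donor haplotypes whose pollen is accepted by the pistil.
    """
    return {hap for hap in donor if hap not in pistil}


def ssi_pollen_accepted(donor: tuple[str, str], pistil: tuple[str, str]) -> bool:
    """SSI (codominance assumed): all pollen carries the phenotype of the
    diploid donor, so one shared haplotype is enough to reject every grain."""
    return not (set(donor) & set(pistil))


# Hypothetical genotypes: the donor shares S1 with the pistil parent.
donor = ("S1", "S2")
pistil = ("S1", "S3")

gsi_result = gsi_accepted_pollen(donor, pistil)
ssi_result = ssi_pollen_accepted(donor, pistil)
print("GSI accepted haplotypes:", gsi_result)
print("SSI pollen accepted:", ssi_result)
```

Under these assumptions the same cross is partially successful under GSI, because the S2 pollen grains pass, but fails entirely under SSI, because every grain inherits the S1S2 phenotype of its parent.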

European pear (*Pyrus communis*) exhibits an RNase-based gametophytic self-incompatibility system (Sassa et al., 1992). This system is genetically controlled by a single locus, named the S-locus, which includes at least two polymorphic genes that are tightly linked: a pistil-expressed gene and one or several pollen-expressed genes (Entani et al., 2003; Kao and Tsukamoto, 2004). The pistil S-gene encodes an S-RNase, which is highly expressed in the style and catalyzes degradation of RNA (Sassa et al., 1992, 1996; Zhou et al., 2016). In *Pyrus* species, the pollen S-determinant is proposed to consist of multiple *F-box* genes, called *SFBBs* (S-locus F-Box Brothers) (Sassa et al., 2007). Across all *Pyrus communis* genotypes, there exists great variability in the S-locus haplotype, i.e., as reflected by the allelic variability in both the pollen- and pistil-expressed genes. For fertilization to take place, the S-haplotype of the pollen grain must differ from the two S-haplotypes of the diploid pistil, otherwise the growth of the pollen tube is arrested after it has migrated through about one-third of the length of the style (DeNettancourt, 1977; Franklin-Tong and Franklin, 2003b). The GSI mechanism therefore inhibits specific hybridizations between *Pyrus communis* genotypes that carry the same S-haplotypes.
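The rule that the pollen S-haplotype must differ from both S-haplotypes of the pistil leads to three possible outcomes for a cross between two diploid genotypes. The sketch below classifies crosses accordingly; the cultivar names and S-genotypes in `TOY_GENOTYPES` are hypothetical placeholders rather than published S-genotypes, and the labels "semi-compatible" and "fully compatible" are used here only as convenient descriptions of the share of accepted pollen.

```python
def classify_cross(pollen_parent: tuple[str, str], pistil_parent: tuple[str, str]) -> str:
    """Classify a cross under GSI from the S-haplotypes of both parents."""
    pollen_types = set(pollen_parent)
    # Pollen is accepted only if its haplotype matches neither pistil haplotype.
    accepted = pollen_types - set(pistil_parent)
    if not accepted:
        return "incompatible"
    if len(accepted) < len(pollen_types):
        return "semi-compatible"
    return "fully compatible"


# Hypothetical cultivars and S-genotypes, for illustration only.
TOY_GENOTYPES = {
    "cultivar_A": ("S1", "S2"),
    "cultivar_B": ("S1", "S3"),
    "cultivar_C": ("S3", "S4"),
}


def pollen_donors_for(target: str) -> dict[str, str]:
    """Evaluate every cultivar, including the target itself, as a pollen donor."""
    pistil = TOY_GENOTYPES[target]
    return {
        name: classify_cross(genotype, pistil)
        for name, genotype in TOY_GENOTYPES.items()
    }


donors_for_a = pollen_donors_for("cultivar_A")
for name, outcome in donors_for_a.items():
    print(f"{name} -> cultivar_A: {outcome}")
```

Self-pollination of the toy `cultivar_A` is classified as incompatible, which is the situation that makes planting a pollinizer with a different S-genotype necessary in an orchard.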

Self-incompatibility and other reproductive strategies that promote outcrossing, such as dioecy, dichogamy, and male sterility, are considered to have played an important role in the success of angiosperms. By stimulating outbreeding, SI promotes gene flow and associated genetic diversity on which selection can act (DeNettancourt, 1977). However, in crop cultivation, SI often forms a major obstacle. In monocultures, for example, where compatible mates and, therefore, cross-pollination events are limited, SI leads to a reduced production of seeds and/or fruits (Miller and Gross, 2011). In addition to this, cross-pollination is often strongly hampered by adverse weather conditions and a low attractiveness of flowers to insect pollinators (Quinet et al., 2016), causing great year-to-year variation in pollination efficiency. For pear and other fruit trees, this can lead to unpredictable fruit set and financial insecurity for the grower, even when compatible pollinizers have been planted in the orchard. In plant breeding, SI strongly limits the range of possible mating combinations and hybridization events. In outcrossing species, it is very difficult to combine desirable traits of two incompatible parents through simple cross-pollination. Because of this, introduction of self-compatibility has become a major objective in many fruit tree species. On the one hand, the use of self-compatible lines will broaden the available options for increasing genetic variability in crop improvement. On the other hand, at the level of crop production, it will avoid the need for specific pollinizer cultivars, and result in a more uniform fruit set which is less influenced by environmental fluctuations (Tehrani and Brown, 2010; Cachi and Wünsch, 2014).

Spontaneous induction of self-compatibility in SI species has been found to occur in nature, as, for example, shown by the broad variety of self-compatible genotypes in *Prunus* (Sawamura et al., 2013; Cachi and Wünsch, 2014). In *Pyrus*, however, only a small number of spontaneous self-compatible mutants have been identified (Sawamura et al., 2013; Wu et al., 2013), most of which belong to the species *Pyrus pyrifolia* (Japanese pear – syn: *Pyrus serotina*). These self-compatible *Pyrus* genotypes are most often the result of a pistil-part mutation, which either leads to a non-functional S-RNase or reduces its expression in the style (Li et al., 2009; Sawamura et al., 2013; Wu et al., 2013).

In this review, we present the latest insights into the mechanism of SI in *Pyrus* and specifically in *Pyrus communis*. We present recent advances on the genetic determination and molecular control of SI, and additionally discuss breakdown of self-incompatibility and its impact on fruit development and final fruit set in pear.

### THE GENETIC CONTROL OF GSI IN *PYRUS*

#### The S-locus

Genetic control of GSI in *Pyrus* is mainly situated at and regulated by the S-locus. In addition, however, some "modifier" genes that are not linked to the S-locus are also known to play a role in the functioning of GSI. The products of these "modifier" genes, such as the SKK1 protein, often interact with the proteins encoded by the S-locus (Wu et al., 2013; Xu et al., 2013). The *Pyrus* S-locus consists of a single *S-RNase* gene surrounded by multiple *F-box* genes, which are referred to as *SFBB* genes (Sassa et al., 2007). A schematic overview of the S-locus in *Pyrus*, *Prunus*, and Solanaceae is presented in **Figure 2**. In *Pyrus*, the S-locus is positioned at the subtelomeric region of chromosome 17 (Maliepaard et al., 1998; Yamamoto et al., 2002). A first prediction of the genomic structure of the S-locus of pear was made in Japanese pear (*Pyrus pyrifolia*) using BAC cloning and sequencing (Sassa et al., 2007). This work revealed for the first time the presence of multiple *SFBB* genes that surround the *S-RNase*. Sequence comparison of the genomic regions surrounding the *S2*- and *S4-RNases* revealed that the S-haplotypes in pear can show significant variation in the position and orientation of the *SFBB* genes relative to the *S-RNase* gene (**Figure 2**) (Okada et al., 2011). It is still unclear how many *SFBB* genes are involved in the GSI system. However, based on findings in apple (*Malus domestica*), the currently estimated number of *SFBB* genes is 17–19 (Pratas et al., 2018). In addition, the S-locus sequence in pear was also found to contain numerous transposon-like sequences which are proposed to generate polymorphisms among S-haplotypes, and which likely contribute to the suppression of meiotic recombination between the *S-RNase* and *SFBBs* (Okada et al., 2011). In contrast, the S-locus of *Prunus* species consists of a single *S-RNase* gene and a single *SFB* (S-haplotype specific F-box) gene, which determines pollen specificity.
The specific S-genes are surrounded by three *SLFL* (S-locus F-box-like) genes, whose function is still not clarified (**Figure 2**) (Entani et al., 2003; Romero et al., 2004; Matsumoto et al., 2008).

Recombination suppression in the S-locus region is essential because the pistil S- and pollen S-genes must be inherited as a single unit in order to maintain the functionality of the SI system (Roalson and McCubbin, 2003). This recombination-suppressed region is predicted to be much larger in *Malus* and *Pyrus* compared to that of *Prunus* species (Matsumoto and Tao, 2016a). The size prediction of the S-locus of *Prunus* was based on the observation that the region around the *S-RNase* gene exhibits extreme sequence diversity and contains transposable elements, which is in contrast to the high colinearity and presence of conserved genes outside this region (Entani et al., 2003). The size of this region was estimated to be at least 1 Mb in *Malus* and at least 649 kb in *Pyrus* species, compared to merely 70 kb in *Prunus* species (Entani et al., 2003; Okada et al., 2011; Matsumoto and Tao, 2016a).

#### The *S-RNase* Gene

The pistil determinant of GSI in Rosaceae was first identified in Japanese pear (*Pyrus pyrifolia*) as a stylar RNase which shows high similarity with the previously identified S-RNase in Solanaceae (Sassa et al., 1992). In Solanaceae, a stylar glycoprotein with ribonuclease activity acts as the female S-determinant of GSI, because it is abundantly expressed in the style of self-incompatible *Nicotiana alata* and co-segregates with the observed S-phenotypes (Bredemeijer and Blaas, 1981; Anderson et al., 1986; McClure et al., 1989). This RNase belongs to the RNase T2 family and is therefore named S-RNase (McClure et al., 1989). Members of the RNase T2 family have a wide range of cytotoxic functions: from rRNA degradation to direct induction of cell death (Luhtala and Parker, 2010). In *Pyrus,* functional analysis of self-compatible genotypes carrying

FIGURE 2 | Putative S-locus structure of *Pyrus*, *Prunus*, and Solanaceae species. In all cases, the S-locus contains an *S-RNase* gene (purple arrow), which acts as the pistil S-determinant. For *Pyrus* and Solanaceae species, this *S-RNase* gene is surrounded by a large number of *SFBB/SLF* genes (blue arrows) which are proposed to make up the pollen S-determinant. For *Pyrus*, the expected number of *F-box* genes (*SFBB* genes) is approximately 18–20, which is comparable to the observed number of F-box genes (*SLF* genes) in *Petunia* (Solanaceae). It is expected that the size, orientation, and position of these *F-box* genes relative to the *S-RNase* gene are variable between S-haplotypes. In *Prunus*, the pollen S-determinant is the *SFB* gene (green arrow) that is located closest to the *S-RNase*. The three surrounding *SLFL* genes (dark blue arrows) are relatively closely related to the *SFBB* and *SLF* genes of *Pyrus* and Solanaceae, respectively. It is suggested that they function as the general inhibitor in the *Prunus* SI system. Figure based on DeFranceschi et al. (2012).

spontaneous pistil-part mutations confirmed that the S-RNase is indeed the female S-determinant (Huang et al., 1994; Murfett et al., 1994; Sassa et al., 1997; Sanzol, 2009a). Importantly, studies in Solanaceae have shown that the RNase activity of the S-RNase is essential for pistil S-function (Huang et al., 1994; Kowyama et al., 1994; Matsumoto and Tao, 2016a). Based on this, the mechanism of GSI in *Pyrus* was suggested to act *via* RNase-based degradation of the cellular RNA in germinating pollen tubes, thereby causing inhibition of pollen tube growth (Huang et al., 1994; Murfett et al., 1994). However, more recent evidence in *Pyrus* revealed that incompatible pollen tubes exhibit several typical characteristics of programmed cell death (PCD) during SI reaction (Liu et al., 2007; Wang et al., 2009, 2010; DeFranceschi et al., 2012). Also, a recent study demonstrated that the S-RNase of *Pyrus bretschneideri* (PbrS-RNase) directly interacts with PbrActin1 and induces crosslinkage between actin filaments of incompatible pollen tubes (Chen et al., 2018b). Similar actin-binding properties are observed for other members of the T2-RNase family, for example for ACTIBIND, a T2-RNase produced by *Aspergillus niger* B1 (CMI CC 324626) (Roiz et al., 2000). This interaction, however, was shown to be non-S-allele specific and independent of RNase activity. Overall, these observations demonstrate that RNA degradation might not be the only process involved in the inhibition of pollen tube growth and indicate that the S-RNase may have other targets than cellular RNA in the pollen tube (Chen et al., 2018b).

The *S-RNase* gene of *Pyrus* consists of five small, conserved regions (C1 to C5) and one hypervariable region (the Rosaceae hypervariable region, RHV), which contains the single intron that is highly polymorphic in length. The hypervariable region of the Rosaceae *S-RNase* gene corresponds to one of the two hypervariable regions of the Solanaceous *S-RNase*, namely HVa (Matsuura et al., 2001). Specifically in the Maloideae, a highly conserved hexapeptide region (IIWPNV) is located immediately downstream of the RHV region. The DNA sequence encoding this hexapeptide has frequently been used for the development of consensus primers for PCR-based S-genotyping (DeFranceschi et al., 2012). S-RNase genotyping by PCR is commonly used in Rosaceae species to determine incompatibility relations between cultivars, often in combination with field-controlled pollination assays (Zuccherelli et al., 2002; Larsen et al., 2016; Herrera et al., 2018). **Figure 3** displays a schematic representation of the protein sequence of the S-RNase of *Pyrus*, *Prunus*, and Solanaceae. All conserved regions in the Rosaceae S-RNases show high sequence similarity with the conserved regions of the S-RNase in Solanaceae, except for C4, which was therefore renamed "Rosaceae Conserved Region 4" (RC4) (Zisovich et al., 2004). Conserved regions C1, RC4, and C5 are thought to be involved in the stabilization of the enzyme structure due to their high number of hydrophobic amino acids (Ida et al., 2001; Matsuura et al., 2001; Zisovich et al., 2004). RC4, and more specifically the proline at position 156, has been suggested to be responsible for the interaction with actin in *Pyrus bretschneideri* (Chen et al., 2018b). Analogous protein domain structures were also found in ACTIBIND and RNASET2, members of the RNase T2 family in fungi and humans that also bind actin to induce cross-links between actin filaments (Roiz et al., 2000; Chen et al., 2018b). 
The C2 and C3 S-RNase regions contain conserved catalytic histidine residues that play an important role in RNase activity (Horiuchi et al., 1988; Kao and Huang, 1994). The hypervariable RHV region between C2 and C3 is located at the protein surface and was therefore long thought to underpin selective interaction between the S-RNase and the pollen S-determinant (Matton et al., 1997). However, studies in European pear (*Pyrus communis*) revealed that PcS106 pollen tube growth is not inhibited by the PcS116-RNase, although the PcS106 and PcS116 S-RNases have identical deduced amino acid sequences in their RHV region. This suggests that the RHV region is not sufficient for selective interaction with the pollen S-protein (Zisovich et al., 2004). Four other protein regions (PS1–PS4) might be more important for the selective interaction with the pollen S-determinants. These regions show an excess of non-synonymous over synonymous substitutions (Ka/Ks), suggesting that they are subject to positive selection (Ishimizu et al., 1998; Zisovich et al., 2004).
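
To illustrate why the conserved IIWPNV hexapeptide is a convenient anchor for consensus primers, the sketch below back-translates it into an IUPAC-degenerate oligonucleotide using the standard genetic code. It is an illustration only: the function names are hypothetical, and primers used in published S-genotyping assays may differ (for example, they may be reverse-complemented, extended, or less degenerate).

```python
# Degenerate codons (standard genetic code) for the residues in IIWPNV.
# Each entry covers all synonymous codons of that amino acid.
DEGENERATE_CODON = {
    "I": "ATH",  # ATT, ATC, ATA
    "W": "TGG",
    "P": "CCN",  # CCA, CCC, CCG, CCT
    "N": "AAY",  # AAT, AAC
    "V": "GTN",  # GTA, GTC, GTG, GTT
}

# Bases represented by each IUPAC symbol used above.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "Y": "CT", "H": "ACT", "N": "ACGT"}


def back_translate(peptide):
    """Return the degenerate coding-strand DNA for a peptide."""
    return "".join(DEGENERATE_CODON[aa] for aa in peptide)


def degeneracy(oligo):
    """Number of distinct sequences represented by a degenerate oligo."""
    count = 1
    for base in oligo:
        count *= len(IUPAC[base])
    return count


oligo = back_translate("IIWPNV")
print(oligo, degeneracy(oligo))  # ATHATHTGGCCNAAYGTN 288
```

The 18-nt motif represents 288 possible coding sequences, so a practical consensus primer would typically fix some positions based on aligned *S-RNase* alleles rather than use the fully degenerate form.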

Since a single protein is unlikely to interact with all four PS regions, the common hypothesis is that multiple proteins simultaneously interact with the S-RNase to determine the SI response in pear (Matsuura et al., 2001; Vieira et al., 2010).

#### The Pollen-S Gene(s)

The pollen S-determinant of GSI was identified almost 15 years after the discovery of the pistil S-determinant. The first promising candidate was identified in *Antirrhinum hispanicum*, a member of the Plantaginaceae. Sequencing of a 64-kb region around the *S2-RNase* gene revealed the presence of an F-box gene, named *AhSLF-S2* (*A. hispanicum* S-locus F-box of S2-haplotype) (Lai et al., 2002; Zhou et al., 2003). F-box proteins contain at least one F-box domain and are one of three components of the SCF complex (Skp1, Cullin, and F-box protein complex), which mediates targeted protein degradation *via* the ubiquitin-26S proteasome pathway. In this complex, the F-box protein specifically recognizes the target protein and thereby contributes to the specificity of SCF (Kipreos and Pagano, 2000; Smalle and Vierstra, 2004). As predicted for the pollen S-determinant, *AhSLF-S2* is polymorphic, linked to the S-locus, and specifically expressed in pollen (Qiao et al., 2004). Moreover, the *Ah*SLF-S2 F-box protein was found to physically interact with S-RNases in a non-allele-specific way, confirming its role as pollen S-determinant (Qiao et al., 2004). In Rosaceae, S-linked F-box genes were first reported as candidates for the pollen S-determinant in *Prunus* (Entani et al., 2003; Ushijima et al., 2003; Sonneveld et al., 2005). In *Prunus mume* (Japanese apricot), the genomic region surrounding the *S-RNase* gene contains at least four F-box genes, but only the F-box gene closest to the *S-RNase* gene was found to encode the pollen S-determinant (Entani et al., 2003). This specific F-box gene was termed *SFB* (S-haplotype-specific F-box), whereas the other three were named *SLFLs* (S-locus F-box-like) (**Figure 2**) (Entani et al., 2003).

In *Pyrus* and *Malus*, emerging evidence suggests the presence of multiple related F-box genes within the S-locus, which are referred to as *SFBB*s (S-locus F-box brothers) (Zhou et al., 2003). Using a BAC library from the apple cultivar Florina, two *SFBB* genes were identified in the 317-kb sequence around the *S9-RNase*: *MdSFBB*9-α and *MdSFBB*9-β (Sassa et al., 2007). Using primers derived from these *MdSFBB* sequences, the same study isolated six cDNA sequences from pollen of the Japanese pear (*Pyrus pyrifolia*) cultivar Kosui (*Pp*S4/*Pp*S5): *PpSFBB*4-α, *PpSFBB*4-β, *PpSFBB*4-γ, *PpSFBB*5-α, *PpSFBB*5-β, and *PpSFBB*5-γ. Cleaved amplified polymorphic sequence (CAPS) analysis of these *PpSFBB* genes confirmed their linkage with the *S-RNase* gene and their pollen-specific expression (Sassa et al., 2007). This study therefore showed that the S-haplotypes of apple (*M. domestica*) and Japanese pear (*Pyrus pyrifolia*) contain multiple copies of the *SFBB* gene and postulated that these *SFBB* genes are convincing candidates to act as the pollen S-determinant in *Malus* and *Pyrus* (Sassa et al., 2007). The *PpSFBB*<sup>γ</sup> gene was further characterized in other S-haplotypes of Japanese pear and, based on its inherent variability, used for the development of a molecular S-genotyping assay (Kakui et al., 2007). A later study in Japanese pear, in which a 240-kb region surrounding the *S4-RNase* was sequenced, resulted in the identification of a new pollen-specific F-box gene, named *S4F-box0* (S4-haplotype F-box protein gene), that differs from the previously identified *PpSFBB*4-α, -β, and -γ (Okada et al., 2008). The self-compatible (SC) "Osa Nijisseiki" (*Pp*S2/*Pp*S4sm) is a natural stylar-part mutant (sm) derived from "Nijisseiki" (*Pp*S2/*Pp*S4) that lacks pistil S function but retains pollen S function (Okada et al., 2011). 
The S4sm-haplotype of this mutant has a 236-kb deletion in the S-locus, causing loss-of-function of both the *S4-RNase* and *S4F-box0* genes. These findings suggest that the pollen S-determinant for the S4-haplotype is not conferred by *S4F-box0*, and should be located outside the region spanning 48 kb upstream to 188 kb downstream of *S4-RNase* (Okada et al., 2008). However, an earlier study showed that S4sm pollen is not only rejected by styles carrying the S4-haplotype, but also by S1-haplotype styles, while still being compatible with styles of other non-self S-haplotypes. This suggests that S4F-box0 specifically recognizes S1-RNase (Okada et al., 2011). This observation is consistent with the non-self-recognition model proposed for the GSI system in *Pyrus* (Kubo et al., 2010; DeFranceschi et al., 2012). In this model, each SFBB F-box protein specifically recognizes only one or a few S-RNases, and multiple SFBBs work together to recognize non-self S-RNases and mark them for degradation. Okada et al. (2011) sequenced an additional 6 *SFBB* genes in a 378-kb region around *S2-RNase* (*PpSFBB*4-u1–u4, 4-d1–d2) and 10 *SFBB* genes in a 649-kb region around the *S4-RNase* (*PpSFBB*2-u1–u5, 2-d1–d5) of Japanese pear. Among these, *PpSFBB*4-d1 was found to correspond to the previously identified *S4F-box0*. Similarly, in European pear (*Pyrus communis*), multiple S-locus F-box genes have been identified (Sassa et al., 2007). Six polymorphic sequences were obtained from "Abbé Fetel" (*Pc*S104-2/*Pc*S105) and ten from "Max Red Bartlett" (*Pc*S101/*Pc*S102). Of these, *SFBB*<sup>α</sup>, *SFBB*<sup>β</sup>, and *SFBB*<sup>γ</sup> appeared highly homologous to *PpSFBB*<sup>α</sup>, *PpSFBB*<sup>β</sup>, and *PpSFBB*<sup>γ</sup>, respectively. Also, two additional *SFBB* groups were defined, *SFBB*<sup>δ</sup> and *SFBB*<sup>ε</sup>, which showed strong homology with the *MdSFBB*3-β and *MdSFBB*9-β genes of apple, respectively (DeFranceschi et al., 2011). 
Similarly to *Pyrus*, a multitude of *SFBB* sequences has been identified in the genomic region surrounding the *S-RNase* of other Rosaceae species. In *Prunus*, multiple *SLFL* genes have been identified at the S-locus, of which three show specific expression in pollen: *SLFL1*, *SLFL2*, and *SLFL3* (Matsumoto et al., 2008). In *M. domestica*, 10 S-haplotypes were screened for the presence of *SFBB* genes using transcriptome sequencing of anthers. For a given S-haplotype, this resulted in the identification of 17–19 *SFBB* genes (Pratas et al., 2018). A similar number of *SLF* genes (namely 16–20 *SLF* genes, classified into 18 types) was observed in *Petunia*, which also exhibits a non-self-recognition type GSI system (Matsumoto and Tao, 2016a; Pratas et al., 2018). Several SLFs even appeared to target the same S-RNase (Sun and Kao, 2013; Matsumoto and Tao, 2016a). These findings suggest that the actual number of *SFBB* genes in *Pyrus* might also be around 17–20, although the exact number is still unknown. The number of pollen determinant genes currently identified in both *Petunia* and *Pyrus* is lower than the number of known *S-RNase* alleles (40 or more in *Petunia*, approximately 30 in *Pyrus communis*) (Goldway et al., 2009; Sanzol, 2009a,b; Kubo et al., 2015; Larsen et al., 2016). It is therefore suggested that some SLF types in *Petunia* interact with multiple S-RNase allelic variants, while some S-RNases may be recognized by multiple SLF types. Based on empirical data on SLF and S-RNase interaction in *Petunia* and using Monte Carlo simulation, it was estimated that 16–20 SLFs in each haplotype are adequate to recognize the vast majority of S-RNase targets (Kubo et al., 2015).
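
The logic behind such a simulation can be sketched with a deliberately simple toy model. This is not the procedure of Kubo et al. (2015): the per-pair recognition probability `p_recognize`, the number of non-self S-RNases, and the function name are all hypothetical values chosen for illustration. The sketch only shows how the expected fraction of non-self S-RNases covered by a haplotype saturates as the number of SLF types grows.

```python
import random


def mean_coverage(n_slf, n_nonself=39, p_recognize=0.15, trials=2000, seed=0):
    """Monte Carlo estimate of the fraction of non-self S-RNases that are
    recognized by at least one of n_slf SLF types in a haplotype.

    Assumption (toy model): every SLF type recognizes every S-RNase
    independently with the same probability p_recognize.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        covered = 0
        for _ in range(n_nonself):
            # An S-RNase counts as detoxified if any SLF type recognizes it.
            if any(rng.random() < p_recognize for _ in range(n_slf)):
                covered += 1
        total += covered / n_nonself
    return total / trials


for n in (4, 8, 16, 20):
    simulated = mean_coverage(n)
    analytic = 1 - (1 - 0.15) ** n  # closed form for this toy model
    print(n, round(simulated, 3), round(analytic, 3))
```

Under these assumptions the coverage rises steeply at first and then levels off, which mirrors the qualitative conclusion that a limited repertoire of SLFs per haplotype can suffice even when S-RNase alleles outnumber SLF types.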

Altogether, these findings show that the S-locus of *Pyrus* contains multiple *SFBB* genes and thus shows strong similarity with the *Solanaceae* S-locus, which contains multiple *SLF* genes (**Figure 2**). This indicates that the SI mechanism of *Pyrus* is more similar to the SI mechanism of Solanaceae than to that of *Prunus*. However, it is currently still unknown how many *SFBB* genes in *Pyrus* are linked to the S-locus and which of these actually function as pollen S-determinant.

### THE MOLECULAR MECHANISMS OF SELF-RECOGNITION AND REJECTION IN GSI OF *PYRUS*

#### The Mechanism of Incompatible Pollen Tube Rejection in *Pyrus*: Cellular and Biochemical Aspects

Selfing or self-pollination in pears triggers a self-incompatibility (SI) reaction in the incompatible pollen tubes. This encompasses multiple cellular and biochemical changes, including alterations in the actin cytoskeleton (Liu et al., 2007), swelling of mitochondria, collapse of the mitochondrial membrane potential, leakage of cytochrome c into the cytosol (Wang et al., 2009), disruption of tip-localized reactive oxygen species (ROS) and Ca2+ (Wang et al., 2010), and degradation of nuclear DNA (Wang et al., 2009, 2010). Many of these structural and biochemical changes are characteristic of programmed cell death (PCD), which also occurs during self-pollen rejection in Papaveraceae (Bosch and Franklin-Tong, 2008; DeFranceschi et al., 2012). However, so far there is no evidence of a direct link between these processes and basic S-RNase function. Therefore, it has generally been accepted that the S-RNase causes pollen tube lethality in *Pyrus* solely by degrading pollen tube RNA (McClure et al., 1990; Huang et al., 1994; DeFranceschi et al., 2012). Nevertheless, increasing evidence indicates that the S-RNase may act as a trigger for biochemical processes that eventually lead to pollen tube rejection instead of directly causing it *via* RNA degradation (Wang et al., 2010; Wang and Zhang, 2011; Qu et al., 2017). Several studies in *Pyrus* and related species showed that the S-RNase interacts with (1) F-actin (Liu et al., 2007), (2) phospholipase C (PLC) (Qu et al., 2017), and (3) pyrophosphatase (PPa) (Li et al., 2018). These interactions raise the question of whether and how overall S-RNase-based RNA degradation in selfed pollen tubes causes growth arrest. **Figure 4** provides an overview of the biochemical processes linked to pollen rejection that are described below.

A first essential process in the inhibition of pollen tube growth in *Pyrus* seems to be the destruction of the Ca2+ gradient inside the pollen tube (**Figure 4**) (Qu et al., 2017). The maintenance of a tip-focused cytoplasmic Ca2+ gradient inside the pollen tube is essential for pollen tube growth (Holdaway-Clarke and Hepler, 2003; Qu et al., 2017). This gradient is maintained by the influx of Ca2+ from the style apoplast *via* membrane channels, such as the tip-localized hyperpolarization-activated Ca2+ channels. These Ca2+ channels are induced by D-myo-inositol-1,4,5-trisphosphate (IP3), which is formed by phospholipase C (PLC) in the PLC-IP3 pathway (Kost et al., 1999; Qu et al., 2017). PLC cleaves phosphatidylinositol 4,5-bisphosphate (PIP2) to IP3 to stimulate influx of extracellular free Ca2+ through Ca2+ channels. In *Pyrus pyrifolia*, S-RNase interacts with PLC in an allele-specific manner. In the presence of self S-RNase, this interaction results in a severe inhibition of PLC, whereas non-self S-RNase at the same concentration causes only minimal inhibition. When the activity of pollen tube PLC is blocked by self S-RNase, the concentration of IP3 decreases at the tube apex and Ca2+ influx is lowered. This lowered Ca2+ influx eventually disrupts the internal Ca2+ gradient and hence leads to a reduced growth rate of the incompatible pollen tube (Qu et al., 2017).

A second mechanism causing a reduction of pollen tube growth may be the change in inorganic pyrophosphate (PPi) concentration in the pollen tube (**Figure 4**). A recent study in apple revealed that S2-RNase physically interacts with soluble inorganic pyrophosphatase (MdPPa), resulting in a non-competitive inhibition of its activity in self (incompatible) pollen tubes. MdPPa functions in the removal of excess PPi; hence, the S-RNase-based decrease of MdPPa activity leads to elevated PPi levels. This is associated with an inhibition of tRNA aminoacylation, resulting in accumulation of uncharged tRNA (Li et al., 2018). Uncharged tRNAs regulate global gene expression in response to changes in amino acid pools, as evidenced in bacterial cells, and hence act as effector molecules that stall protein synthesis (Raina and Ibba, 2014). Considering the continuous release of PPi during pollen tube growth, it is now assumed that the S-RNase-based decrease of PPa activity leads to an excess of PPi and uncharged tRNA. This in turn causes cessation of pollen tube elongation (DeGraaf et al., 2006) due to inhibition of the overall cell metabolism (Chen et al., 1990; Li et al., 2018).
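
The term non-competitive inhibition has a standard kinetic meaning that can be written out explicitly. As a sketch only, assume simple Michaelis–Menten behavior of the pyrophosphatase with PPi as substrate, and treat the self S-RNase as a pure non-competitive inhibitor $I$ with a hypothetical inhibition constant $K_i$ (these symbols are illustrative and are not taken from Li et al., 2018):

$$
v = \frac{V_{\max}\,[\mathrm{PP_i}]}{\left(K_m + [\mathrm{PP_i}]\right)\left(1 + \dfrac{[I]}{K_i}\right)}
$$

Under this assumption the apparent maximal rate drops to $V_{\max}/(1 + [I]/K_i)$ while $K_m$ is unchanged, so a rising PPi concentration cannot restore the hydrolysis rate. This is consistent with PPi accumulating in self pollen tubes.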

A third cellular process occurring during SI-based pollen tube growth inhibition is the depolymerization of filamentous actin in the pollen tube and formation of punctate actin foci (**Figure 4**) (Liu et al., 2007). S-RNases of *Pyrus bretschneideri* directly interact with filamentous actin (F-actin) in a non-allele-specific manner and depolymerize the actin cytoskeleton in pollen tubes independently of their RNase activity (Chen et al., 2018b). In contrast, apple S-RNases do not directly interact with F-actin. Instead, they inhibit an actin-binding protein complex that contains myosin, villin, and GRAM (MdMVG), and that directly binds to and severs F-actin (Yang et al., 2018). This subtle difference indicates that the actual cellular mechanism of self-pollen tube rejection might differ somewhat between *Malus* and *Pyrus* species. The pollen tube cytoskeleton is essential for both pollen tube growth and transport of sperm cells (Ushijima et al., 2003). Depolymerization of F-actin may

are allowed to interact with multiple targets inside the pollen tube. (2) Self S-RNase interacts with and inhibits phospholipase C (PLC), leading to a decreased production of IP3, which in turn reduces Ca2+ import through Ca2+ channels. This reduced Ca2+ uptake leads to the dissipation of the Ca2+ gradient in the pollen tube tip, inhibiting pollen tube growth. (3) Self S-RNases stimulate the expression of phospholipase D (PLD), which stimulates production of phosphatidic acid (PPA). PPA can temporarily delay actin depolymerization in pollen tubes, providing a first defensive mechanism against pollen tube growth inhibition. (4) However, self S-RNases can also interact directly with F-actin, causing actin depolymerization and leading to pollen tube growth inhibition. (5) Self S-RNases can physically interact with pyrophosphatases (PPases), and thereby inhibit their activity. This leads to the accumulation of inorganic pyrophosphate (PPi), which also causes reduced pollen tube growth. (6) Upon challenge with self S-RNases, the mitochondrial membrane potential collapses, causing leakage of cytochrome c into the cytosol and a cessation of H2O2 production. The presence of self S-RNases in the pollen tube also reduces NADPH levels, causing a decrease in plasma membrane ROS formation. As a result, tip-localized ROS accumulation is disrupted, providing another trigger for induction of PCD. (7) Challenge with self S-RNases has also been shown to cause RNA degradation and nuclear DNA degradation, two processes that are also linked to programmed cell death (PCD). Question marks denote processes in which the exact role of the S-RNase still needs to be elucidated.

therefore in itself be sufficient to inhibit pollen tube growth and even cause induction of PCD (Thomas et al., 2006). In the early stages of the self-incompatibility reaction, F-actin depolymerization may be slowed down by the action of phosphatidic acid (PPA) (Chen et al., 2018b). This is supported by a study in *Pyrus bretschneideri* that revealed an increased expression of phospholipase D (*Pbr*PLDδ1) upon challenge by self S-RNases (Chen et al., 2018b). Specific knockdown of *Pbr*PLDδ1 accelerated pollen tube death during the early stages of SI, and this acceleration was alleviated by the addition of exogenous PPA during SI. It was concluded that an increase in *Pbr*PLDδ1-derived PPA temporarily delays actin depolymerization in the pollen tube and provides a protective mechanism against PCD signaling until sufficient accumulation of incompatible S-RNase ultimately triggers induction of PCD. Following F-actin depolymerization, intracellular transglutaminases (TGases) are proposed to induce aberrant reorganization of F-actin in incompatible pollen tubes to form the observed actin foci (Poulter et al., 2011). In *Pyrus*, the activity of these intracellular TGases has been found to increase in the pollen tubes during the SI response (Iorio et al., 2012).

The processes described above are all linked to the disruption of tip-localized ROS which typically occurs in incompatible pollen tubes following challenge with self S-RNase (Wang et al., 2010). In compatible pollination events, ROS generated by NADPH oxidases in the mitochondria and plasma membrane accumulate at the pollen tube tip to form a gradient that promotes pollen tube elongation (Potocký et al., 2007). In line with this, compatible pollen tubes show an accumulation of mitochondria together with an accumulation of H2O2 in the mitochondria and the cell wall of the subapical region of the pollen tube. Upon challenge with self S-RNase, the mitochondrial membrane potential collapses, causing cytosolic leakage of cytochrome c and disruption of ROS production in incompatible pollen tubes (**Figure 4**). In line with this, self S-RNase-treated pollen tubes in *Pyrus pyrifolia* showed a complete lack of H2O2 in the mitochondria and cytosol (Wang et al., 2009, 2010). Additionally, the presence of self S-RNase in the pollen tube was found to substantially reduce NADPH levels, causing a decrease in ROS formation at the plasma membrane (**Figure 4**) (Wang et al., 2010). These events were all observed *in vitro* and occur immediately after addition of self S-RNase to the *Pyrus pyrifolia* pollen tubes. Moreover, when an NADPH oxidase inhibitor (diphenylene iodonium chloride, DPI) and a ROS scavenger (TMPP) are used to mimic ROS disruption, the same events occur as observed in the presence of S-RNase: decreased Ca2+ currents, depolymerized actin cytoskeleton, and induction of nuclear DNA degradation.

These results indicate that tip-localized ROS disruption occurs very early in the SI response in *Pyrus*, and putatively acts as a central trigger for pollen tube growth inhibition and PCD (Wang et al., 2010). However, as described above, S-RNase also directly interacts with several other targets, causing decreased Ca2+ currents, increased PPi levels, and actin depolymerization independently of ROS. It is therefore still unclear which sequence of events makes up the SI response. Moreover, as most of these analyses were performed using *in vitro* systems, the described observations and their timing may be artificial or differ somewhat from the actual events occurring during the SI response in the pollen tube.

#### Compatible Pollen in *Pyrus* GSI: Recognition of Non-self S-RNases and Their Degradation

The first step in the SI response of *Pyrus* is the uptake of S-RNase protein by the pollen tube from the transmitting tissue of the style in a non-allele-specific way. Both self and non-self S-RNases enter the pollen tube (Luu et al., 2000; Goldraij et al., 2006; Meng et al., 2014b), indicating that the self-recognition process happens inside the pollen tube and that there is a mechanism that inhibits the activity of non-self S-RNases, but not of self S-RNases (**Figure 4**) (DeFranceschi et al., 2012). S-RNase uptake by the pollen tube has been proposed to occur *via* two routes: by endocytosis (Luu et al., 2000; Goldraij et al., 2006; Meng et al., 2014b) or by membrane transporters (Meng et al., 2014a; Williams et al., 2015). In apple, evidence for both processes exists (Meng et al., 2014a,b). *In vitro* tests showed that S-RNase uptake by the pollen tube depends on Golgi vesicle trafficking and additionally relies on an intact and dynamic cytoskeleton (Meng et al., 2014b). In parallel, the membrane-bound ATP-binding cassette transporter MdABCF was found to promote transport of S-RNase into the apple pollen tube (Meng et al., 2014a). MdABCF is thought to interact coordinately with the cytoskeleton to support S-RNase import into the pollen tube. After uptake, the S-RNases are recognized as being self or non-self. For GSI in *Pyrus*, a recognition model was proposed based on two important findings. The first finding was the identification of an F-box gene as the pollen S-determinant in the Rosaceae (Lai et al., 2002; Entani et al., 2003; Qiao et al., 2004), suggesting that S-RNases are recognized by an F-box protein in the pollen tube, marking them for degradation by the 26S proteasome (Zhang et al., 2009; DeFranceschi et al., 2012). 
However, the large number of *S-RNase* allele variants and their high degree of sequence diversity raised the question of how one single F-box protein can specifically mark all non-self S-RNases, but leave the self S-RNase intact (Williams et al., 2015). This was explained by a second important finding, i.e., the concomitant presence of multiple *SFBB* genes at the S-locus in both *Pyrus* and *Malus* species (Sassa et al., 2007). Each of these SFBB proteins could recognize a single or multiple non-self S-RNases, and multiple SFBB proteins can work together to interact with a single non-self S-RNase. The SFBB protein that specifically recognizes the self S-RNase would in that case never be present in the S-haplotype, leaving the self S-RNase untargeted and free to inhibit growth of the self (incompatible) pollen tube. Importantly, a phenomenon observed earlier in both Rosaceae and Solanaceae supports this model, i.e., competitive interaction. Competitive interaction occurs when a single pollen grain is universally self- and cross-compatible because it carries two different S-locus haplotypes (heteroallelic pollen). This occurs, for example, in diploid pollen produced by polyploid varieties. Because all pollen grains in this situation carry two different haplotypes (e.g., S1-S2), no incompatibility is possible: the pollen can degrade both types of self S-RNases. More specifically, the S1-RNase of the style is degraded by the action of the SFBB proteins encoded by the S2-haplotype, while the S2-RNase is targeted by the SFBB proteins of the S1-haplotype (DeFranceschi et al., 2012). In contrast, pollen carrying two copies of the same S-haplotype (homoallelic pollen) remains self-incompatible (Crane and Lewis, 1942; Lewis, 1947; Brewbaker, 2010; Qi et al., 2011b). Remarkably, *Prunus* only harbors one *F-box* gene as pollen S-determinant and shows full absence of competitive interaction (Hauck et al., 2005; Nunes et al., 2006). 
This has led to the assumption that there exist two different regulatory models for GSI in the Rosaceae, namely "self-recognition by a single factor" in the *Prunus* genus and "non-self-recognition by multiple factors" in the *Pyrus* and *Malus* genera (Kao and McCubbin, 1996; Sassa et al., 2007; Kubo et al., 2010; DeFranceschi et al., 2012).
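
The contrast between the two models, including the behavior of heteroallelic pollen, can be captured in a few lines. The sketch below is a simplified illustration: the haplotype names, the `DETOXIFIES` table, and the function names are hypothetical, and it assumes that the SFBBs of each haplotype collectively cover every S-RNase except their own.

```python
def accepted_nonself_model(pollen_haplotypes, style_haplotypes, detoxifies):
    """Non-self recognition by multiple factors (Pyrus/Malus type).

    Pollen is accepted only if every S-RNase in the style is detoxified by
    the combined SFBB repertoire of the haplotypes carried by the pollen.
    """
    covered = set()
    for haplotype in pollen_haplotypes:
        covered |= detoxifies[haplotype]
    return set(style_haplotypes) <= covered


def accepted_self_model(pollen_haplotypes, style_haplotypes):
    """Self recognition by a single factor (Prunus type).

    Pollen is rejected as soon as one of its haplotypes matches an
    S-RNase present in the style.
    """
    return not (set(pollen_haplotypes) & set(style_haplotypes))


HAPLOTYPES = {"S1", "S2", "S3"}
# Assumption: each haplotype detoxifies all S-RNases except its own.
DETOXIFIES = {h: HAPLOTYPES - {h} for h in HAPLOTYPES}

style = ("S1", "S2")
for pollen in [("S1",), ("S3",), ("S1", "S2"), ("S1", "S1")]:
    print(pollen,
          accepted_nonself_model(pollen, style, DETOXIFIES),
          accepted_self_model(pollen, style))
```

In this toy setting, heteroallelic S1-S2 pollen is accepted by an S1/S2 style under the non-self model (competitive interaction) but rejected under the self-recognition model, whereas homoallelic S1-S1 pollen is rejected under both.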

In short, the accepted model for *Prunus* species proposes that the pollen-expressed SFB protein protects self S-RNases from degradation by the general inhibitor (GI), which binds non-specifically to all S-RNases in the pollen tube. In case of an incompatible reaction, the SFB protein recognizes the enzyme complex consisting of the GI and the self S-RNase and polyubiquitinates the GI for degradation. This protects and releases the cytotoxic self S-RNase and eventually leads to degradation of the self (incompatible) pollen tube. In case of a compatible reaction, the SFB protein does not recognize the complex of non-self S-RNase and GI, and the non-self S-RNase remains inhibited by the GI (Ushijima et al., 2003; Tao and Iezzoni, 2010; Matsumoto and Tao, 2016a). Protein interaction analysis and *in vitro* ubiquitination assays revealed that the three SLFLs in the *Prunus* S-locus interact with both SSK1 (SLF-interacting Skp1-like protein 1) and S-RNase and can tag ubiquitin molecules onto the S-RNases (Chen et al., 2018a). This finding suggests that the three SLFLs are likely candidates for the GI (Matsumoto and Tao, 2016b; Chen et al., 2018a). The SLFLs are closely related to the *SFBB* genes of *Pyrus* (Aguiar et al., 2015; Akagi et al., 2016) and may hence degrade S-RNase in a very similar manner (Matsumoto and Tao, 2016b). However, more evidence is needed to unambiguously confirm that the SLFLs act as the GI.

In contrast, the "non-self-recognition by multiple factors" model of *Pyrus* assumes multiple pollen S-determinants (SFBBs) that function in a non-self-recognition system for S-RNase degradation (Sassa et al., 2007; Kubo et al., 2010). A functional S-haplotype is proposed to possess multiple SFBBs that each recognize and inhibit a single non-self S-RNase or a subset of non-self S-RNases, but to lack the SFBB protein(s) that recognize the self S-RNase. Self S-RNases are therefore not inhibited and hence cause the rejection of incompatible pollen tubes (Kubo et al., 2010). In this respect, the GSI model for *Pyrus* is much more similar to the model for Solanaceae than to that for *Prunus* (DeFranceschi et al., 2012).

Recently, molecular studies provided more insights into the specific interaction of SFBBs with non-self S-RNases. In *Pyrus*, recognition of self/non-self S-RNase is proposed to happen through direct interaction with one or several SFBBs in the cytosol of the pollen tube (Li et al., 2016). In *Pyrus* and *Malus*, the RHV and possibly four other protein regions (PS1–PS4) of the S-RNase play an important role in this specific recognition (Vieira et al., 2007; Li et al., 2016).

#### Degradation of Non-self S-RNases in *Pyrus* GSI: Mechanistic Insights

The actual detoxification of non-self S-RNases during a compatibility reaction in *Pyrus* is currently described by two models: the "Protein Degradation Model" and the "Compartmentalization Model" (McClure et al., 2011). The "Protein Degradation Model" proposes ubiquitination and subsequent degradation by the 26S proteasome of all non-self S-RNases after interaction with one or multiple SFBB proteins (Hua et al., 2008; McClure et al., 2011). S-RNase degradation is thereby mediated by an SCF complex, which generally consists of four components: an F-box protein, Skp1, Cullin1, and Rbx1 (Xu et al., 2013). The F-box protein determines substrate specificity, and Skp1 connects the F-box protein to Cul1, which together with Rbx1 transfers a ubiquitin moiety from the ubiquitin-charged E2 enzyme to the substrate (Matsumoto and Tao, 2016a). In *Pyrus bretschneideri*, the complex is referred to as the SLF-containing SCF complex (SCFSLF complex) and contains the SFBB protein, a pollen-specific SSK1 protein (SLF-interacting Skp1-like protein 1), a pollen-specific Cullin1, and Rbx1, as illustrated in **Figure 5** (Zhao et al., 2010; Xu et al., 2013; Williams et al., 2015). In Solanaceae, Plantaginaceae, and *Maloideae*, SSK1 interacts with the pollen-specific F-box protein to form an SCF complex (Huang et al., 2006; Zhao et al., 2010; Xu et al., 2013; Li et al., 2014; Minamikawa et al., 2014; Yuan et al., 2014; Matsumoto and Tao, 2016a), supporting a similar mechanism of S-RNase ubiquitination by the SCF complex in *Pyrus*. However, in *Petunia inflata*, another E3 ubiquitin ligase and SCF-like complex, i.e., containing S-RNase-binding protein 1 (SBP1) instead of Skp1 and Rbx1, also seems to be involved in GSI. SBP1 thereby interacts with SLF, another Cul1, as well as with S-RNase, although in a non-allele-specific manner (**Figure 5**) (Hua and Kao, 2006, 2008; Minamikawa et al., 2014). These interactions, however, are not nearly as strong as those with SSK1.
The SBP1-containing SCF-like complex therefore probably mediates merely a basal level of S-RNase degradation (Williams et al., 2015). In addition, ubiquitination by the SBP1-containing complex *in vitro* does not occur in an S-allele-specific manner, nor does it show specificity for S-RNases (Hua and Kao, 2006). Interestingly, apple also contains a similar pollen-expressed homolog of SBP1, namely MdSBP1 (Yuan et al., 2014). Both MdSBP1 and MdSSK1 interact with MdSFBB *in vitro*, but MdSSK1 interacts more strongly with MdSFBB and its transcript level is over 100 times higher than that of MdSBP1 (Minamikawa et al., 2014; Williams et al., 2015), suggesting that in apple also, SFBB-mediated degradation of S-RNase predominantly involves MdSSK1. In support of this, MdSBP1 is not specifically expressed in pollen, indicating that it is involved not solely in pollen tube rejection but also in other processes (Yuan et al., 2014).

The "Compartmentalization Model," on the other hand, is based on findings in tobacco (*Nicotiana alata*) and explains an alternative mechanism for S-RNase detoxification (Goldraij et al., 2006). In this model, both self and non-self S-RNases are taken up by the pollen tube and immediately stored in vacuoles, preventing cytotoxic activity and allowing pollen tube growth through the style. In case of incompatibility, these vacuoles are ruptured and S-RNase is released into the cytoplasm, resulting in inhibition of pollen tube growth. This vacuolar rupture is proposed to happen *via* the action of a non-S-RNase named HT-B after the clone HT from which it was purified (McClure et al., 1999). HT-B is a pistil-specific non-S-factor that was identified in *Nicotiana* and *Solanum*, and that is stabilized by a complex of self S-RNase and SLF (Goldraij et al., 2006). Evidence of sequestration of S-RNases into the pollen tube vacuoles was also found in *M. domestica* (Meng et al., 2014b). It is thereby assumed that S-RNase proteins entering the pollen tube are enveloped by Golgi-derived vesicles which subsequently transport them *via* actin and microtubule filaments to the vacuole for sequestration. However, strong evidence supporting this is as yet lacking.

FIGURE 5 | Two SCF-type complexes are proposed to operate concomitantly in the recognition of S-RNases in the pollen tube of Rosaceae and Solanaceae: the SCFSLF complex (A,B) and the SBP1-containing complex (C). (A) The SCFSLF complex is considered the main agent in self/non-self S-RNase discrimination in both Rosaceae and Solanaceae. It consists of an F-box protein (SFBB in Rosaceae), which determines the allele-specific interaction with the S-RNase, a pollen-specific Cullin1, SSK1, and Rbx1. When a non-self S-RNase is recognized by the F-box protein, the S-RNase is ubiquitinated by the E2-conjugating enzymes, marking it for degradation by the 26S proteasome. (B) When the S-RNase is not recognized, no interaction occurs and therefore no ubiquitination, leaving the self S-RNases intact. (C) The SBP1-containing complex is proposed to mediate a basal level of S-RNase degradation. This complex acts in a non-S-allele-specific manner and contains S-RNase-binding protein 1 (SBP1) instead of Skp1 and Rbx1, together with a different Cullin1 protein. In this complex, SBP1 is suggested to replace the function of Rbx1 and SSK1.

It is possible that a combination of S-RNase "protein degradation" and "compartmentalization" describes the actual sequence of events in pollen recognition in *Pyrus*, as is proposed in Solanaceae (Williams et al., 2015). In that case, the SCFSLF complex, described above, may mono-ubiquitinate non-self S-RNases in the cytosol to mark them for transport to the vacuole (Shenoy, 2016), while it may also poly-ubiquitinate non-self S-RNases for direct degradation. In case of self S-RNase, the SCF-like complex cannot bind, but the non-S-RNase-specific SCF-like complex carrying SBP1 may mediate S-RNase polyubiquitination for baseline degradation or mono-ubiquitination for vacuolar sequestration. However, in the latter case, the majority of the self S-RNases are expected to remain intact and, when the concentration of self S-RNase increases, the tolerance of the pollen tube will be exceeded, leading to abortion of pollen tube growth (Williams et al., 2015). In Solanaceae, the unification of the two models might explain two contradicting observations: (1) the requirement of SCFSLF complexes in the cytosol of the pollen tube for a compatible reaction and (2) the sequestration of the S-RNases into the vacuole (Williams et al., 2015). In contrast, in *Pyrus*, the degradation model is more readily accepted as the sole model for pollen tube recognition.

#### Breakdown of Self-Incompatibility Allows Self-Fertilization

Transition to self-compatibility is commonly observed in self-incompatible species. This suggests that the benefits of producing outbred offspring of higher genetic quality do not always outweigh the drawbacks of a reduced progeny in cases where pollination or the availability of compatible pollen donors is limited (Vallejo-Marín, 2007; Igic et al., 2008; Baldwin and Schoen, 2017). Induction of self-compatibility (SC) in otherwise self-incompatible (SI) plants may result from either physiological or genetic changes, including mutations (DeNettancourt, 1977). Physiological changes leading to self-compatibility are always temporary and are collectively referred to as pseudo-compatibility (PC). In most species, individuals are either SC or SI; however, the situation seems to be more complex in *Pyrus* (Sanzol and Herrero, 2007). European pear cultivars have been classified as either completely SI, weakly SC, or completely SC, albeit with strong dependency on environmental conditions (e.g., temperature) and year-to-year variations. This has resulted in contradictory reports for the same cultivar (Griggs and Iwakiri, 1954; Callan and Lombard, 1978; Sanzol and Herrero, 2007; Moriya et al., 2009). Both in pear and apple, the strength of the SI response varies depending on different intrinsic and extrinsic parameters, including tree or flower age, flower quality, ambient temperature, and application of plant hormones (DeNettancourt, 1997). Self-incompatibility in pear and apple can also be overcome by specific pollination techniques, such as the use of mentor pollen (i.e., a mixture of compatible and incompatible pollen) or pioneer pollen (i.e., pollination with compatible, but sterilized pollen followed by self-pollination) (Visser, 1981). Finally, cultivar differences in self-(in)compatibility may also be caused

Induction of self-compatibility in SI species can also originate from alterations at the genetic level. Three distinct types of genetic changes are relevant, namely: induction of polyploidy, modification of S-locus incompatibility gene(s), and genetic modification of non-S-locus factors.

Polyploidy in *Pyrus* and *Malus* results in pollen-determined self-compatibility through competitive interaction in its S-heteroallelic diploid pollen, as described above (Crane and Lewis, 1942; Adachi et al., 2009). Breakdown of SI due to alterations in the DNA sequence of incompatibility genes can be classified into two types: modifications in the S-RNase and modifications in the pollen S-gene. Several *Pyrus* genotypes exhibit breakdown of SI due to a mutation in the pistil S-gene. The Japanese pear variety "Osa-Nijisseiki" is a naturally occurring self-compatible mutant. This mutant variety harbors two different S-haplotypes (*Pp*S2-*Pp*S4); however, it lacks the complete *PpS4-RNase* gene due to a deletion of more than 4 kb spanning the entire length of the *PpS4-RNase* gene (Sassa et al., 1997). In Chinese pear (*Pyrus bretschneideri*), the variety "Yan Zhuang," a self-compatible sport of the "Ya Li" variety, was identified as a pistil-part mutant that is caused by a point mutation in the 182nd nucleotide of the *PbrS21-RNase* sequence. The resulting Gly-to-Val substitution significantly affects the stability of the S-RNase, leading to self-compatibility (Li et al., 2009). In European pear (*Pyrus communis*), genotyping of the *S-RNase* alleles of the self-compatible varieties "Abugo" and "Ceremeno" led to the identification of a mutation in *PcS121*, which is referred to as *PcS121*\*. This new allele has a 561-nt retrotransposon insertion within the intron together with two indels of 2 and 30 bp in the 3'UTR region, which could explain the absence of *PcS121*\* gene expression in styles of both "Abugo" and "Ceremeno" (Sanzol, 2009a). Mutations in the female S-locus determinant that do not lead to SC but give rise to distinct, although functionally identical, variants of the same *S-RNase* allele are also possible. Genomic analysis of *S-RNase* sequences of 28 European pear cultivars led to the identification of two distinct variants of the *PcS104*-allele. 
These variants differ at five nucleotide positions, but do not confer functional difference as in both cases self-incompatibility is maintained. Interestingly, two of these SNPs lead to an alteration of the predicted protein sequence, without affecting the corresponding pollen or pistil SI function (Sanzol, 2010). These results suggest that the different *S-RNase* sequences, i.e., referred to as *PcS104–1* and *PcS104–2*, represent transitional states in the process of generating new *S-RNase* alleles (Sanzol, 2010).

While many self-compatible pollen-part mutants are known in *Prunus* (Ushijima et al., 2004; Beppu et al., 2005; Sonneveld et al., 2005; Vilanova et al., 2006), breakdown of SI due to genetic defects in the pollen S-determinant is not expected to occur in *Pyrus* due to the non-self-recognition system of SI. Mutations leading to non-functional *SFBB* genes inherently imply a full absence of S-RNase degradation (both in self and non-self interactions), leading to the targeted degradation of otherwise compatible pollen tubes and cross-incompatibility with all other S-genotypes. For example, the previously mentioned pistil-compatible mutant "Osa-Nijisseiki" has a large deletion that not only spans the *PpS4-RNase* but also includes the *SFBB* gene immediately upstream of *PpS4-RNase*. Deletion of this *SFBB* gene not only renders the pollen self-incompatible (inhibited growth on PpS4-styles), but also cross-incompatible with PpS1 genotypes (inhibited growth on PpS1-styles) (Okada et al., 2008). For a long time, the only known true pollen-part self-compatible mutants (PPMs) in *Pyrus* were polyploid, because of the occurrence of competitive interaction. However, several years ago, the first pollen-part mutant (PPM) in a diploid *Pyrus* variety was identified, namely the variety 415–1 of Japanese pear (*Pyrus pyrifolia*). This line was produced by fertilizing "Kosui" with pollen from gamma-irradiated "Kosui" with *S-RNase* genotype *Pp*S4*Pp*S5 (Sawamura et al., 2013). Although being diploid, this variety exhibits a segmental duplication that encompasses the complete S5-haplotype block. As a result, the *Pyrus pyrifolia* variety 415–1 produces S-heteroallelic pollen containing both S4- and S5-haplotype blocks which can penetrate S4-S5 styles because of competitive interaction (Mase et al., 2014).

Breakdown of SI in *Pyrus* has also been reported to occur in the absence of mutations in the *S-RNase* or *SFBB* genes. This suggests that induction of SC can also be caused by the alteration of non-S-locus factors. In Chinese pear (*Pyrus bretschneideri*), the self-compatible cultivar "Zaoguan" (*Pbr*S4-*Pbr*S34) accepts self-pollen in its styles. However, the S-locus genes are free of genetic defects and the pollen is rejected in a normal manner on styles of other incompatible pear cultivars. Transcriptional analysis revealed a full absence of *PbrS34-RNase* expression in "Zaoguan" pistils, indicating that the self-compatibility is caused by a yet unknown alteration in the transcriptional regulation of *PbrS34-RNase* (Qi et al., 2011a). However, loss of *S-RNase* expression may also be caused by epigenetic alterations in the promoter or open reading frame (ORF) of the *S-RNase* sequence. Two other examples of self-compatibility caused by unknown non-S-locus factors were found in the Chinese pear (*Pyrus bretschneideri*) variety "Jin Zhui," i.e., another self-compatible sport of "Ya Li," and in the Japanese pear (*Pyrus pyrifolia*) variety "XinXue" (Li et al., 2009; Wu et al., 2013; Shi et al., 2018). The unknown non-S-locus factor underpinning self-compatibility in "Jin Zhui" was recently suggested to be the *PLC* gene. As previously described, the S-RNase interacts in an allele-specific way with PLC to inhibit its activity. This action results in a decreased activity of Ca2+ channels at the pollen tube tip and thus disrupts the internal Ca2+ gradient in the pollen tube. The *PLC* gene of "Jin Zhui" shows a 26-amino acid insertion and no longer interacts with self S-RNase, suggesting that self-compatibility in the "Jin Zhui" variety is attributed to functional loss of PLC (Qu et al., 2017).

#### PRACTICAL ASPECTS OF GSI IN *PYRUS*: POLLINATION AND FRUIT SET

#### Implications of Self-Compatibility for Pear Fruit Development

It is considered most favorable for the production of high-quality fruit that all ovules of the flower are fertilized. Developing seeds release plant hormones, such as auxins, that cause ovary expansion, so that the fruit mainly grows and expands at regions where fertilized seeds are located (Devoghalaere et al., 2012; Orcheski and Brown, 2012). If only a fraction of the ovules is fertilized, the resulting fruits can be small (Westwood, 1993; Goldway et al., 2008). Pollination must therefore lead to fertilization. In theory, incompatible pollen tubes are inhibited while compatible pollen tubes are allowed to grow through the style. In practice, however, inhibition of pollen tubes is not only genetically determined, but is also influenced by several external factors, such as the environment and the number of pollination events. Studies in pear and apple have described the positive effect of multiple pollination events on selfed seed set in self-incompatible lines (Visser and Marcucci, 1984). The application of two consecutive pollination events generally leads to an increased seed set, and this increase seems to vary depending on the compatibility of each pollination. Self-pollination before cross-pollination (S/C) produces more seeds in incompatible pear varieties than cross-pollination before self-pollination (C/S), and both produce more seeds than a single cross-pollination event (Visser and Verhaegh, 1980; Visser, 1981; Visser et al., 1983; Zimmerman, 1988). This phenomenon is known as the "pioneer pollen effect," in which a previous pollination event facilitates pollen tube growth during a second pollination event (Visser, 1981). A similar phenomenon is known as "mentor pollen," where an equal mixture of self- and cross-pollen produces a fair amount of selfed seed (Visser, 1981; Montalti and Filiti, 1984). 
In practice, open pollination in pear generally results in a very low number of pollen grains deposited on the style due to low abundance and activity of insect pollinators (Konarska et al., 2005; Jacquemart et al., 2006). Repeated pollination events are rare under natural conditions because insect pollinators are unlikely to revisit a flower (Giurfa, 1996; Witjes and Eltz, 2007; Wilms and Eltz, 2008). Moreover, in pear cultivation, the pollen mixture that reaches the stigma of *Pyrus* SI species mainly consists of self-pollen, particularly considering the fact that insect pollinators in pear orchards typically visit multiple flowers of the same tree (Visser and Marcucci, 1984; Pannell and Labouche, 2013). It is therefore expected that pollen interactions such as mentor or pioneer pollen do not frequently occur under natural conditions in *Pyrus*. Interestingly, when self-pollen does fertilize the ovule, ovule abortion is higher than in case of cross-pollination, suggesting the additional presence of one or more post-zygotic reproductive barriers that block formation of seed upon selfing (Martin and Lee, 1993). Seed abortion is influenced by the pollen source and in case of self-pollen, this abortion may be due to homozygous recessive lethal alleles resulting from selfing (Martin and Lee, 1993). It is therefore proposed that selfed seeds with non-lethal, but inferior allele combinations are more prone to abortion (Martin and Lee, 1993; Pannell and Labouche, 2013). However, it cannot be excluded that selfed seeds have lower sink strength compared to those resulting from an outcrossing event, and hence show a higher level of seed and/or fruit abortion due to competition for energy acquisition.

#### Self-Incompatibility in Pear Production and Breeding

The SI mechanism in pear dictates that fruit set in most commercial cultivars strongly depends on successful cross-pollination and fertilization, hence posing major implications for both commercial pear production and breeding. In order to guarantee fruit set, commercial pear orchards need to contain at least two cross-compatible cultivars or combine the commercial cultivar with a pollen donor variety, such as a wild pear species. Moreover, both cultivars involved should exhibit overlapping flowering periods to enable effective cross-fertilization, i.e., to enable seed set, which in turn stimulates fruit development (Goldway et al., 2012). The identification and knowledge of the exact S-genotype of different pear varieties is hence crucial for many practical applications, including orchard design and the success of hybridization crosses in pear breeding programs (Sanzol and Robbins, 2008).

Overall, there are three scenarios for (in)compatibility of diploid cultivars of *Pyrus*: (1) when two different parent cultivars carry identical S-haplotypes, they are fully incompatible; (2) when they share only one of their S-haplotypes, they are semi-compatible; and (3) when they differ in both S-haplotypes, they are fully compatible. Although semi-compatibility does not affect fruit set rate in hand-pollination experiments, it can cause significant reductions in fruit yield when environmental conditions are suboptimal for pollination (Schneider et al., 2005; Zisovich et al., 2005; Goldway et al., 2008; Sapir et al., 2008). From a practical point of view, however, a complete lack of cross-pollination is not entirely problematic in some pear varieties. Firstly, several pear varieties, such as "Conference," exhibit a natural potency of parthenocarpy and hence do not require pollination for induction of fruit set (Nishitani et al., 2012). Such varieties do not require fertilization and hence produce seedless fruits. Alternatively, parthenocarpy can also be induced by application of hormones, such as gibberellins. However, despite the promising role of pollination-independent fruit set, parthenocarpic pear varieties often produce fruit that is smaller compared to that resulting from cross-pollination, making them less suitable for commercial fruit production (Nishitani et al., 2012). Secondly, self-fertilization due to pseudo-compatibility has repeatedly been documented in several *Pyrus* varieties. However, successful self-fertilization in these cases is expected to vary considerably between seasons and cultivars (Williams et al., 1994). Moreover, as in parthenocarpic varieties, pear fruit resulting from self-fertilization is generally smaller and is more likely to abscise early compared to that resulting from cross-pollination events (Atwell et al., 1999).
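The three scenarios for diploid cultivars can be expressed as a simple rule on the number of shared S-haplotypes. The sketch below is only an illustration under stated assumptions: each cultivar is taken to be heterozygous for two distinct S-haplotypes, the haplotype labels are placeholders, and effects such as pseudo-compatibility, polyploidy, or parthenocarpy are ignored.

```python
def classify_cross(cultivar_a, cultivar_b):
    """Classify a cross between two diploid cultivars.

    Each cultivar is given as a pair of distinct S-haplotype labels
    (assumption: SI diploids are heterozygous at the S-locus).
    """
    shared = set(cultivar_a) & set(cultivar_b)
    if len(shared) == 2:
        return "fully incompatible"
    if len(shared) == 1:
        return "semi-compatible"
    return "fully compatible"


# Placeholder genotypes, not real cultivar data.
print(classify_cross(("S1", "S2"), ("S1", "S2")))  # identical haplotypes
print(classify_cross(("S1", "S2"), ("S1", "S3")))  # one shared haplotype
print(classify_cross(("S1", "S2"), ("S3", "S4")))  # no shared haplotype
```

A lookup of this kind is one way in which known S-genotypes could inform the choice of pollinizer cultivars in orchard design.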

In pear breeding applications, the intercrossing of two fully incompatible varieties is impossible as all pollen is rejected. Two incompatible pear varieties can only be intercrossed and hybridized *via* the use of specific techniques, such as mentor pollen, gamma irradiation of pollen, cut style techniques, or polyploidization (Atwell et al., 1999). In a similar way, crosses between two semi-compatible varieties may also cause problems. In such crossing events, all pollen with a specific S-genotype is rejected, leading to a limited number of possible S-genotype combinations in the offspring. This "artificial selection" has a significant impact on the diversity of S-alleles in commonly grown cultivars, and leads to a reduced genetic and biological diversity in cultivated pears. Specific cultivars, such as "Williams Bon Chrétien" (or "Bartlett"), are frequently used as a parent for the development of new cultivars. As a consequence, their corresponding S-alleles are overrepresented in newly developed commercial cultivars (Sanzol and Robbins, 2008; Orcheski and Brown, 2012). Interestingly, "Williams Bon Chrétien" carries the S-alleles *Pc*S101 and *Pc*S102, while most of its selected descendants carry the *Pc*S101-allele and not the *Pc*S102-allele. This shows that the *Pc*S101-allele is favored during selection, suggesting that the S-locus is linked to one or more genes that underpin important traits for pear cultivation or fruit quality (Sanzol and Robbins, 2008). This intrinsically means that interesting traits may be lost in semi-compatible crosses because pollen carrying the common S-allele will be rejected. On the other hand, self-incompatibility can also have advantages. For example, SI is convenient when making crosses, because female parents do not need to be emasculated before being pollinated by the desired male parent (Denna, 1971).
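The restriction of offspring S-genotypes in semi-compatible crosses can be made explicit with a small enumeration. This is a simplified sketch under the gametophytic rule that pollen carrying an S-haplotype also present in the style is rejected. The function name and the haplotype labels are illustrative assumptions, and the sketch ignores pseudo-compatibility and the special techniques mentioned above.

```python
from itertools import product


def offspring_s_genotypes(female, male):
    """List the S-genotypes expected when `female` is pollinated by `male`.

    Both parents are pairs of S-haplotype labels. Pollen whose haplotype
    matches either haplotype of the female parent is rejected in the style.
    """
    viable_pollen = [h for h in male if h not in female]
    genotypes = {
        tuple(sorted((egg, pollen)))
        for egg, pollen in product(female, viable_pollen)
    }
    return sorted(genotypes)


# Semi-compatible cross: only pollen carrying the non-shared haplotype
# contributes, so two of the four possible genotypes are produced.
print(offspring_s_genotypes(("S1", "S2"), ("S1", "S3")))
# Fully compatible cross: all four combinations are possible.
print(offspring_s_genotypes(("S1", "S2"), ("S3", "S4")))
# Fully incompatible cross: no offspring are expected.
print(offspring_s_genotypes(("S1", "S2"), ("S1", "S2")))
```

In the semi-compatible example, no offspring inherit the shared haplotype from the pollen parent, so any trait linked to that haplotype can only be transmitted through the female side.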

### CONCLUSION AND FUTURE PERSPECTIVE

Over the years, many studies have provided insights into the self-incompatibility mechanism of *Pyrus*. For example, the identification and molecular characterization of the pollen-part S-determinant has been a major focus during recent years. Furthermore, the identification of multiple *SFBB* genes, which are linked to the S-locus, has strengthened the hypothesis that pollen tubes are recognized according to the "non-self-recognition" mechanism with multiple factors, similar to some Solanaceae species. However, it is still unclear how many *SFBB* genes are present and which of the identified SFBBs are actually involved in the non-self-recognition system. In addition, much is still unknown about the structure of the S-locus, more specifically, the positioning of the *SFBB* genes around the *S-RNase*, and its variability between varieties or different pear species. The characterization of natural, self-compatible mutants contributes further to our knowledge of the GSI mechanism in *Pyrus*, especially those mutants that confer self-compatibility through still uncharacterized, but S-locus-linked factors.

The unraveling of the molecular mechanism(s) underlying pollen tube rejection in *Pyrus* GSI has gained increasing attention in fundamental research. The discovery that the S-RNase can interact with multiple targets besides the pollen S-determinant has revealed a multifactorial role for the S-RNase in the self-incompatibility reaction, with several other functions besides RNA degradation. Based on growing evidence, it is likely that a much more complex mechanism underpins the rejection of self-pollen tubes in *Pyrus*. Finally, the characterization of natural, self-compatible mutants contributes to our knowledge of the genetic control and molecular regulation of GSI in *Pyrus*. In particular, mutants conferring SC through uncharacterized, but S-locus-linked factors may provide new insights into the complex regulation of GSI in *Pyrus*. Such knowledge can be useful, for example, in the development of self-compatible varieties through conventional breeding or by using gene editing techniques, such as CRISPR-Cas9.

As self-incompatibility affects fertilization, seed set, and fruit quality in pear orchards and has important implications for pear production and breeding, it is essential that research keeps exploring its underlying mechanisms. Ultimately, new insights into pear self-incompatibility can result in new and targeted applications that may facilitate pear production and breeding.

#### REFERENCES


#### AUTHOR CONTRIBUTIONS

NDS, WK, BvDP and HC contributed to the main conceptual ideas and manuscript outline. HC wrote the manuscript with inputs, corrections and critical feedback from the other authors.

#### FUNDING

This work was supported by the Research Foundation Flanders (FWO) *via* the Doctoral (PhD) Strategic Basic Research Grant (Grant number: 1S02817N) and by the KU Leuven Special Research Fund Start-up Grants (STGBF/16/005) and (STG/18/043).




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Claessen, Keulemans, Van de Poel and De Storme. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Ovule Gene Expression Analysis in Sexual and Aposporous Apomictic Hypericum perforatum L. (Hypericaceae) Accessions

Giulio Galla<sup>1</sup> \*, Andrea Basso<sup>1</sup> , Simone Grisan<sup>2</sup> , Michele Bellucci<sup>2</sup> , Fulvio Pupilli<sup>2</sup> and Gianni Barcaccia<sup>1</sup>

<sup>1</sup> Laboratory of Genetics and Genomics, Dipartimento di Agronomia, Animali, Alimenti, Risorse Naturali e Ambiente, University of Padova, Padua, Italy, <sup>2</sup> Institute of Biosciences and Bioresources, Research Division of Perugia, National Research Council, Perugia, Italy

#### Edited by:

Petr Smýkal, Palacký University Olomouc, Czechia

#### Reviewed by:

Elvira Hörandl, University of Göttingen, Germany Martina Juranic, Commonwealth Scientific and Industrial Research Organisation (CSIRO), Australia

> \*Correspondence: Giulio Galla giulio.galla@unipd.it

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 15 December 2018 Accepted: 01 May 2019 Published: 24 May 2019

#### Citation:

Galla G, Basso A, Grisan S, Bellucci M, Pupilli F and Barcaccia G (2019) Ovule Gene Expression Analysis in Sexual and Aposporous Apomictic Hypericum perforatum L. (Hypericaceae) Accessions. Front. Plant Sci. 10:654. doi: 10.3389/fpls.2019.00654

Hypericum perforatum L. (2n = 4x = 32) is an attractive model system for the study of aposporous apomixis. The earliest phenotypic features of aposporous apomixis in this species are the mitotic formation of unreduced embryo sacs from a somatic cell of the ovule nucellus and the avoidance of meiosis. In this research we addressed gene expression variation in sexual and apomictic plants by focusing on the ovule nucellus, which is the cellular domain primarily involved in the differentiation of meiocyte precursors and aposporous embryo sacs, at a pre-meiotic developmental stage. Gene expression analyses performed by RNAseq identified 396 differentially expressed genes (DEGs) and 1834 transcripts displaying phenotype-specific expression. Furthermore, the sequencing and assembly of the genome from a diploid sexual accession allowed us to annotate a 50 kb sequence portion located upstream of the HAPPY locus and to address the extent to which single transcripts were assembled in multiple variants and their co-expression levels. About one third of the identified DEGs and phenotype-specific transcripts were associated with transcript variants with alternative expression patterns. Additionally, considering DEGs and phenotype-specific transcripts, the co-expression level was estimated at about two transcripts per locus. Our gene expression study shows massive differences in the expression of several genes encoding transposable elements. Transcriptional differences in the ovule nucellus and pistil terminal developmental stages were also found for a subset of genes encoding potentially interacting proteins involved in pre-mRNA splicing. Furthermore, the sexual and aposporous ovule transcriptomes were characterized by differential expression of genes operating in RNA silencing, RNA-mediated DNA methylation (RdDM) and histone and chromatin modifications. These findings are consistent with a role of these processes in regulating cell fate determination in the ovule, as indicated by forward genetic studies in sexual model species. The association between aposporous apomixis, pre-mRNA splicing and DNA methylation mediated by sRNAs, which is supported by expression data and by the enrichment in GO terms related to these processes, is consistent with the massive differential expression of multiple transposon-related sequences observed in ovules collected from both sexual and aposporous apomictic accessions. Overall, our data suggest that phenotypic expression of aposporous apomixis is concomitant with the modulation of key genes involved in two interconnected processes: RNA splicing and RNA-directed DNA methylation.

Keywords: Hypericum perforatum L., aposporous apomixis, LCM, RNAseq, ovule, RdDM

### INTRODUCTION

fpls-10-00654 May 22, 2019 Time: 19:19 # 2

Apomixis defines a subset of plant reproductive strategies which, unlike sexual reproduction, allow the inheritance of the maternal genome through seeds without genetic recombination and syngamy (Nogler, 1984). This asexual mode of seed formation has enormous economic and social potential in agriculture, as genetically fixing highly heterozygous genotypes through apomixis would have tremendous advantages for those crops which are commercialized as hybrid F1 seed (Vielle-Calzada et al., 1996a). Although apomixis is relatively common in nature, this trait has never been reported for the major crop species (Carman, 1997).

Known apomictic developmental pathways are traditionally classified into two main categories: sporophytic and gametophytic (Koltunow and Grossniklaus, 2003). In sporophytic apomicts, somatic cells from the ovule nucellus differentiate to generate multiple embryos, which coexist with the sexually-formed sibling and share its endosperm. Conversely, plants reproducing via gametophytic apomixis (Nogler, 1984) possess the ability to differentiate viable unreduced embryo sacs from unreduced cells of the ovule (i.e., unreduced meiocytes, somatic cells).

Both hybridization and polyploidy have wide-ranging effects at the chromosomal (Sanchez-Moran et al., 2001; Kantama et al., 2007), genomic (Soltis and Soltis, 1999; Pikaard, 2001), and transcriptomic (Adams et al., 2004; Adams, 2007) levels and as such have been suggested as possible mechanisms able to elicit apomictic reproduction (Carman, 1997; Grossniklaus et al., 2001). Accordingly, heterochronic expression of genes involved in sexual reproduction has been associated with the apomictic developmental pathway by several authors (Grimanelli et al., 2003; Sharbel et al., 2010).

The key biological features of apomixis, with particular reference to the aposporous type of gametophytic apomixis, are: (i) embryo sac development from a somatic cell of the ovule nucellus; (ii) parthenogenesis (i.e., fertilization-free embryo development); and (iii) the formation of viable endosperm either via fertilization-independent means or following fertilization with a sperm cell (Koltunow and Grossniklaus, 2003). These features deviate from sexuality, in which the capability to develop an embryo sac is strictly restricted to the reduced functional megaspore (FM) deriving from meiosis, and failure of the meiotic program in obligate sexual species is not accompanied by the initiation of embryo sac development from cell lineages of the ovule other than the FM. The ovule of higher plants is a multicellular structure composed of at least four cellular domains that are morphologically distinguishable along the proximal–distal axis: the funiculus, which attaches the ovule to the placenta; the chalaza, which forms the integuments; the nucellus, from which the progenitors of the megaspore mother cell (MMC) are specified; and the integuments, which surround the nucellus. The cellular domain from which female gametes are generated is composed of a relatively small group of cells embedded within the ovule nucellus. While MMC founder cells are typically specified from the two innermost layers of the nucellus, a single FM is non-randomly specified among the four meiocytes deriving from the MMC through meiosis. The progenitors of unreduced embryo sacs (aposporous initials, AI) found in ovules of aposporous apomictic plants also differentiate from the nucellus (Mártonfi et al., 1996; Galla et al., 2011).

Despite their prime importance for plant reproduction, the molecular mechanisms that control the somatic-to-reproductive transition are still poorly understood. Among the biological macroprocesses potentially involved in this transition, the epigenetic regulation of gene expression and hormonal homeostasis seem to play fundamental roles. Epigenetic regulation of gene expression by either DNA methylation or post-transcriptional gene silencing (PTGS) is required for proper ovule development (Garcia-Aguilar et al., 2010). Specification of gametophyte precursors in Arabidopsis thaliana is controlled by the expression of AGO9, which restricts the specification of gametophyte precursors in a dosage-dependent, non-cell-autonomous manner (Olmedo-Monfil et al., 2010; Armenta-Medina et al., 2011; Singh et al., 2011; Tucker et al., 2012). DNA methylation mediated by small RNAs (RdDM) is essential for gametophyte development, and loss-of-function mutants for genes involved in this pathway produce aberrant cell fate establishment within the ovule, with phenotypes that are strikingly reminiscent of apomictic development (Hernandez-Lagana et al., 2016). These findings suggest that the expression of genes involved in establishing the sexual-to-aposporous dichotomy is epigenetically regulated.

Hypericum perforatum L., an invasive perennial herb widely distributed in a variety of habitats, is regarded as a serious weed in many countries (Robson, 2002; Nürk et al., 2013). The mode of reproduction in H. perforatum is highly dynamic, and biotypes span from almost complete sexuality to nearly obligate apomixis (Noack, 1939; Davis, 1967; Mártonfi et al., 1996; Matzk et al., 2001; Barcaccia et al., 2007; Galla et al., 2011, 2015). From an evolutionary point of view, H. perforatum most likely originated by autopolyploidization or secondary hybridization and introgression between distinct gene pools within H. perforatum and with the sister species H. maculatum (Scheriau et al., 2017). Apospory in H. perforatum is inherited as a dominant simplex locus (HAPPY), which segregates from the genetic factors controlling parthenogenesis (Schallau et al., 2010). This plant species is regarded as a useful model for apomixis research due to a number of favorable traits, including: (i) the versatile and dynamic mode of reproduction; (ii) the relatively small genome size (0.325 pg/1Cx); (iii) the short generation time (i.e., one flowering stage per year in wild-type populations and up to three flowering stages per year in the greenhouse); (iv) the availability of a large number of morphologically distinct ecotypes; and (v) the self-compatibility and high seed set (Matzk et al., 2001; Barcaccia et al., 2007).

Since the earliest study of Vielle-Calzada in sexual and apomictic ovaries of Pennisetum (Vielle-Calzada et al., 1996b), comparative transcriptomics has been carried out in several aposporous apomictic systems including Poa pratensis (Albertini et al., 2004), Pennisetum ciliare (Singh et al., 2007), Panicum maximum (Yamada-Akiyama et al., 2009), Paspalum simplex (Polegri et al., 2010), Hieracium praealtum (Okada et al., 2013), Ranunculus auricomus (Pellino et al., 2013), and Boechera gunnisoniana (Schmidt et al., 2014). Although a few comparative transcriptomic studies have already been performed in H. perforatum (Galla et al., 2013, 2015, 2017), these investigations focused on the organ level (e.g., the pistil), implying a relatively low resolution in detecting transcriptional differences occurring within the ovule nucellus. Also, as previous studies focused on flower developmental stages spanning female sporogenesis and gametogenesis (Galla et al., 2017), the transcriptional changes occurring in the ovule nucellus before the failure of meiosis and the differentiation of aposporous initials in H. perforatum are largely unexplored. To gain additional insight into these transcriptomic changes and define the biological processes potentially leading to the aposporous developmental program, we performed an RNAseq gene expression analysis on the ovule nucellus collected from sexual and aposporous plant accessions by Laser Capture Microdissection (LCM).

#### MATERIALS AND METHODS

### Plant Materials

Whole genome sequencing was performed using an H. perforatum L. diploid (2n = 2x = 16) sexual accession (Acc ID: OBUPD-D1) kindly provided by the Padova Botanical Garden<sup>1</sup> (**Supplementary Table S1**). Naturally occurring H. perforatum tetraploids (2n = 4x = 32) were selected from two local populations (Acc ID: HP1 and HP4) collected in Northern Italy, province of Belluno (Barcaccia et al., 2006). H. perforatum L. induced tetraploids (2n = 4x = 32) were kindly donated by Dr. T. F. Sharbel (IPK-Gatersleben). Induced tetraploids were generated by colchicine application as described by Schallau et al. (2010). The reproductive mode of all H. perforatum accessions was estimated by flow cytometric screening of 48 single seeds as described by Matzk et al. (2001) and Schallau et al. (2010). The phenotype of an investigated plant accession was defined as sexual when all investigated seeds displayed 4:6 embryo:endosperm C-values. For naturally occurring H. perforatum tetraploid accessions, only plant individuals with >96% apomixis were considered. For the purposes of this research, BIII hybrids characterized by 6:8 or 6:10 embryo:endosperm C-values were classified as apomictic. The frequency of BIII hybrids ranged from 0 to 44%, with a median value of 14%. Sexual plants adopted for the RNAseq analysis were selected among induced tetraploids, whereas apomicts were selected from both natural and induced tetraploid accessions (**Supplementary Table S1**). RNAseq analyses were performed by using nine sexual and nine apomictic accessions (**Supplementary Table S1**).
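The seed-scoring rules above can be illustrated with a minimal Python sketch. It is not the authors' screening software: the function names are hypothetical, the 4:8 class for autonomous apomixis is taken from the Figure 1 legend, and any other C-value pair is simply left unclassified.

```python
def classify_seed(embryo_c: int, endosperm_c: int) -> str:
    """Map an embryo:endosperm C-value pair to a seed origin class."""
    ratio = (embryo_c, endosperm_c)
    if ratio == (4, 6):
        return "sexual"
    if ratio == (4, 8):
        return "apomictic"  # autonomous apomixis, as in the Figure 1 legend
    if ratio in {(6, 8), (6, 10)}:
        return "BIII"  # scored as apomictic in this study
    return "unclassified"  # ratios not covered by the text are left open


def apomixis_fraction(seeds) -> float:
    """Fraction of seeds scored as apomictic, counting BIII hybrids as apomictic."""
    classes = [classify_seed(embryo, endosperm) for embryo, endosperm in seeds]
    apomictic = sum(c in ("apomictic", "BIII") for c in classes)
    return apomictic / len(classes)


# Hypothetical screen of 48 single seeds from one plant.
seeds = [(4, 8)] * 40 + [(6, 8)] * 7 + [(4, 6)]
print(f"{apomixis_fraction(seeds):.1%} apomictic seeds")
```

With 48 seeds per plant, a single sexual seed still leaves a plant above the 96% threshold, whereas two sexual seeds do not.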

#### Genome Sequencing and Assembly

Genomic DNA was extracted using the CTAB protocol (Doyle and Doyle, 1990) and quantified with the Qubit dsDNA BR Assay kit (Life Technologies). DNA purity and integrity were assessed with a Nanodrop 1000 spectrophotometer (Thermo Scientific) and by capillary electrophoresis on a 2200 TapeStation (Agilent Technologies), respectively. For genome assembly, the presence of high molecular weight DNA was verified using field-inversion gel electrophoresis (FIGE), and DNA fragments >40 kb were subsequently enriched using BluePippin (Sage Science).

The Chromium Gel Bead and Library Kit (10X Genomics) and the Chromium instrument (10X Genomics) were used to prepare linked-read WGS libraries from 0.625 ng of high molecular weight DNA of the sexual/diploid sample. The generated barcoded library was sequenced on an Illumina HiSeqX Ten using 151 nt paired-end reads, for a total of 41.7 Gbp (138170412 fragments). Genomic linked reads were assembled using the Supernova assembler version 2.1.0 (Weisenfeld et al., 2017) with default parameters, except for –max-reads, which was set to 170000000. Final scaffolds were produced using the Supernova mkoutput pseudohap option, discarding scaffolds shorter than 300 bp. Genomic contigs matching the HAPPY locus were identified by BLASTN, using the gene sequences annotated in contig HM061166 to query the genome assembly. The newly identified genomic contigs were annotated with BLASTX against the nr database<sup>2</sup>.
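As a quick consistency check, the reported sequencing yield follows directly from the fragment count and read length. The sketch below reproduces that arithmetic; the depth estimate is only a rough illustration and assumes the ~350 Mb draft assembly size reported in the Results as the genome size.

```python
FRAGMENTS = 138_170_412   # paired-end fragments reported in the text
READ_LENGTH = 151         # nt per read
READS_PER_FRAGMENT = 2    # paired-end sequencing

total_bases = FRAGMENTS * READS_PER_FRAGMENT * READ_LENGTH
print(f"Yield: {total_bases / 1e9:.1f} Gbp")

# Assumption: genome size approximated by the ~350 Mb draft assembly.
ASSUMED_GENOME_SIZE = 350e6
raw_depth = total_bases / ASSUMED_GENOME_SIZE
print(f"Approximate raw depth: {raw_depth:.0f}x")
```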

#### Tissue Embedding, Laser Microdissection, RNA Extraction, and Amplification

Flower buds of approximately 3.0 mm, corresponding to Arabidopsis flower stage 11 (Galla et al., 2011), were collected, immediately placed on a glass Petri dish containing ice-cold acetone and opened to pick up the ovary under a dissecting microscope. Ovaries were stored overnight at 4 °C in fresh acetone. After vacuum infiltration for 30–60 min, acetone was replaced with consecutive 20 min washes on ice using 3:1, 1:1, and 1:3 mixtures of acetone:xylene, followed by two changes of pure xylene for 20 min each. Then, paraffin infiltration was performed by moving ovaries to 60 °C and washing them at 20 min intervals in 1:3, 1:1, and 3:1 mixtures of paraffin:xylene. Samples were washed two more times for 1 h each in pure paraffin at 60 °C, lodged in square mesh cassettes, solidified at room temperature and stored at 4 °C.

Ovaries were cut into 10-µm sections with a Leica Jung Autocut 2055 microtome, placed on Zeiss MembraneSlide 1.0 PEN, and floated on diethylpyrocarbonate (DEPC)-treated water at 40 °C on a slide warming tray until the sections stretched. Sections were deparaffinized in three changes of xylene, dried at RT for 10 min and immediately dissected with a Zeiss PALM Microbeam IV equipped with an Axio Observer inverted microscope. Once 250,000 µm<sup>2</sup> of tissue sections had been collected in a single tube, 20 µl of lysis buffer from the Stratagene Absolutely RNA® Nanoprep Kit (Agilent) were added, and RNA was isolated according to the manufacturer's instructions. RNA integrity and concentration were assayed on a 2100 Bioanalyzer with the RNA 6000 Pico Kit. For cDNA synthesis and amplification, the RNAs from nine sexual and nine apomictic accessions were pooled into three sexual and three apomictic samples and processed with the AMBION MessageAmp™ II aRNA Amplification Kit. The TruSeq Stranded Total RNA Sample Prep Kit was used to prepare the libraries from 50 ng of aRNA. The generated barcoded libraries were sequenced on an Illumina NextSeq500 using 151 nt paired-end reads.

<sup>1</sup>www.ortobotanicopd.it/en

<sup>2</sup>http://www.ncbi.nlm.nih.gov/

#### Transcriptome Assembly, Annotation, and Gene Expression Analysis

De novo assembly of transcript sequences was done with the following pipeline. All reads with more than 10% of undetermined bases (Ns) or with more than 50 bases called with a Phred quality score lower than 7 (probability of wrong call > 20%) were discarded. Following read quality filtering, sequence adapters were clipped by using scythe<sup>3</sup>. The 3′ ends of reads were quality trimmed with sickle<sup>4</sup>, using a quality threshold of 20 over a window of 10 bases. Reads shorter than 20 bp were discarded. Read coverage of transcripts in the libraries was digitally normalized using a k-mer abundance approach with the insilico\_read\_normalization.pl script of the Trinity software, setting a k-mer size of 20 and a max depth of 60. Assembly of the transcripts was performed using the Velvet/Oases pipeline in multi k-mer mode, using k-mers from 19 to 95 in 4 bp steps and merging the assembled transcripts. Finally, merged transcriptome assemblies were clustered using the EvidentialGene software to obtain a high-quality reference transcriptome assembly, removing potential artifacts and clustering redundant transcripts.
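The read-level filter and the k-mer series described above can be sketched in a few lines of Python. This is an illustration of the stated thresholds only, not the scripts used in the study; the function names are hypothetical and quality values are assumed to be already decoded to integer Phred scores.

```python
def phred_error_probability(q: float) -> float:
    """Probability of a wrong base call for Phred score q."""
    return 10 ** (-q / 10)


def passes_quality_filter(seq, quals, max_n_frac=0.10, min_q=7, max_low_q=50):
    """Keep a read unless it has >10% Ns or >50 bases with Phred quality below 7."""
    if not seq:
        return False
    n_frac = seq.upper().count("N") / len(seq)
    low_q_bases = sum(q < min_q for q in quals)
    return n_frac <= max_n_frac and low_q_bases <= max_low_q


# k-mer sizes for the multi k-mer assembly: 19 to 95 in steps of 4.
KMER_SIZES = list(range(19, 96, 4))

print(f"Error probability at Q7: {phred_error_probability(7):.3f}")
print(f"{len(KMER_SIZES)} k-mer sizes: {KMER_SIZES}")
```

A Phred score of 7 corresponds to an error probability just under 0.2, which matches the 20% figure quoted in the text, and the series yields 20 odd k-mer sizes.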

The annotation of assembled sequences was performed as described in Galla et al. (2015, 2017). Briefly, to annotate all assembled unigenes, a BLASTX-based approach was used to compare the Hypericum sequences to the nr database downloaded from the NCBI<sup>2</sup>. The GI identifiers of the best BLASTX hits, with E-values ≤ 1E-09 and similarities ≥ 70%, were mapped to the UniProtKB protein database<sup>5</sup> to extract Gene Ontology (GO<sup>6</sup>) terms for further functional annotations. BLAST2GO software v1.3.3<sup>7</sup> (Conesa et al., 2005) was used to reduce the data to the GOslim level<sup>8</sup> and perform basic statistics on ontological annotations as previously reported (Galla et al., 2009). Annotations for genes involved in plant reproduction were retrieved from Galla et al. (2017). Annotations for genes involved in hormonal homeostasis were retrieved from AmiGO 2<sup>9</sup>.
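The hit-selection criteria can be expressed as a small filter. The sketch below is an assumption-laden illustration: the `Hit` record and `best_hits` function are hypothetical, thresholds are applied before choosing the best hit per query, and "best" is taken to mean lowest E-value, none of which is specified in the text.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str         # assembled unigene identifier
    subject: str       # database protein identifier
    evalue: float
    similarity: float  # percent similarity


def best_hits(hits, max_evalue=1e-9, min_similarity=70.0):
    """Return the lowest-E-value hit per query among hits passing both thresholds."""
    best = {}
    for hit in hits:
        if hit.evalue > max_evalue or hit.similarity < min_similarity:
            continue
        current = best.get(hit.query)
        if current is None or hit.evalue < current.evalue:
            best[hit.query] = hit
    return best


# Hypothetical records, for illustration only.
hits = [
    Hit("unigene_1", "protA", 1e-30, 85.0),
    Hit("unigene_1", "protB", 1e-12, 92.0),
    Hit("unigene_2", "protC", 1e-5, 95.0),   # fails the E-value threshold
    Hit("unigene_3", "protD", 1e-40, 60.0),  # fails the similarity threshold
]
selected = best_hits(hits)
print({query: hit.subject for query, hit in selected.items()})
```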

For gene expression analysis, mapping and sequence counts were performed with the software CLC Genomics Workbench V 7 (Qiagen), with default parameters and by using the de novo transcriptome as reference. Differentially expressed genes (DEGs) were identified by using the Empirical Analysis of DGE tool (Robinson and Oshlack, 2010) implemented in the CLC Genomics Workbench V 7 (Qiagen), adopting both the FDR (corrected p-value ≤ 0.05) and the Bonferroni (corrected p-value ≤ 0.05) p-value corrections. Sexual samples were adopted as reference. Principal component analyses and heat maps were generated with the software T-mev<sup>10</sup>. Principal component analyses were generated by using all transcripts with expression ≥ 20th percentile and by using DEGs (Bonferroni p-value ≤ 0.05), respectively. Heat maps were generated with the HCL algorithm, by using Manhattan distances and average linkage clustering.
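The difference in stringency between the two corrections, which explains why the FDR sets are larger than the Bonferroni sets, can be seen in a minimal sketch. It assumes the FDR correction is of the Benjamini-Hochberg type; the functions below are a generic illustration and not the CLC Genomics Workbench implementation.

```python
def bonferroni(pvalues):
    """Bonferroni-adjusted p-values: multiply by the number of tests, cap at 1."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p-values (step-up FDR procedure)."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for five transcripts.
raw = [0.001, 0.004, 0.012, 0.03, 0.5]
bonf = bonferroni(raw)
fdr = benjamini_hochberg(raw)
print("Bonferroni significant:", sum(p <= 0.05 for p in bonf))
print("FDR significant:", sum(p <= 0.05 for p in fdr))
```

For these toy values, two transcripts pass the Bonferroni threshold and four pass the FDR threshold.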

#### Expression Analysis by Real-Time qPCRs and in situ Hybridization Assays

Plant materials were selected according to the genetic and cyto-histological bases of apospory recently described for H. perforatum (Schallau et al., 2010; Galla et al., 2011). Pistils were collected separately from a minimum of five plant accessions (**Supplementary Table S1**). Total RNA was extracted from collected pistils using the Spectrum™ Plant Total RNA Kit (Sigma-Aldrich), following the protocol provided by the manufacturer. Possible genomic DNA contamination was removed with the optional DNase I (Sigma-Aldrich) treatment. The abundance and purity of RNAs were assessed using a NanoDrop 2000c UV-Vis spectrophotometer (Thermo Scientific, Pittsburgh, PA, United States). cDNA synthesis was performed using the RevertAid First Strand cDNA Synthesis Kit (Thermo Scientific), following the instructions of the supplier. Primers used in the Real-Time RT-PCR experiments are reported in **Supplementary Table S12**. Expression analyses were performed using StepOne thermal cyclers (Applied Biosystems) equipped with 96-well plate systems, with SYBR Green PCR Master Mix reagent (Applied Biosystems). The amplification efficiency was calculated from raw data using OneStep Analysis software (Life Technologies). Amplification performance expressed as fold change was calculated with the ΔΔCt method using HpTIP4 as a housekeeping gene (Pfaffl, 2001). Error bars indicate the standard error observed among the five biological replicates.

In situ hybridization assays were performed as described by Brewer et al. (2006). For the synthesis of probes, approximately 50 ng of cDNA from apomictic pistils was amplified using specific primer pairs (**Supplementary Table S12**). Amplicons were purified with the QIAquick PCR Purification Kit (QIAGEN) and sequenced on an ABI3100 automated sequencer to confirm the target specificity. Purified amplicons were then diluted 1:20 and amplified with T7- and T3-tailed primers (in separate reactions) to incorporate the T7 and T3 promoters (**Supplementary Table S12**). Both probes were labeled using a Roche DIG RNA labeling kit. Detection was performed following the Roche DIG detection kit instructions using anti-DIG AP and NBT/BCIP as substrates. Images were acquired with a Leica DM4000B digital microscope, equipped with a Leica DC300F camera and Leica Image Manager 50 software (Leica Microsystems).

<sup>3</sup>https://github.com/vsbuffalo/scythe

<sup>4</sup>https://github.com/najoshi/sickle

<sup>5</sup>http://www.uniprot.org/

<sup>6</sup>http://www.geneontology.org/

<sup>7</sup>http://www.BLAST2go.org

<sup>8</sup> goslim\_plant.obo

<sup>9</sup>http://amigo.geneontology.org/amigo

<sup>10</sup>http://mev.tm4.org/
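For readers unfamiliar with relative quantification in qPCR, a common form of the delta-delta Ct calculation is sketched below. It assumes 100% amplification efficiency (a doubling per cycle) for both the target and the housekeeping gene; the function names and Ct values are hypothetical, and efficiency-corrected variants would replace the base 2 with measured efficiencies.

```python
from math import sqrt
from statistics import stdev


def fold_change_ddct(ct_target_test, ct_ref_test, ct_target_ctrl, ct_ref_ctrl):
    """Relative expression (test vs. control) normalized to a reference gene."""
    delta_test = ct_target_test - ct_ref_test
    delta_ctrl = ct_target_ctrl - ct_ref_ctrl
    return 2 ** -(delta_test - delta_ctrl)


def standard_error(values):
    """Standard error of the mean across biological replicates."""
    return stdev(values) / sqrt(len(values))


# Hypothetical Ct values: the target appears two cycles earlier in the test
# sample relative to the reference gene, giving a 4-fold increase.
example_fold = fold_change_ddct(22.0, 18.0, 24.0, 18.0)
print(f"Fold change: {example_fold:.1f}")
```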

### CpG Methylation Analysis

Methylation analysis of TE-related sequences was performed by using the OneStep qMethyl Lite Kit (Zymo Research). Genomic DNA was extracted from single pistils collected from three sexual and three aposporous apomictic plant accessions. DNA extractions were performed by using the QIAamp DNA Investigator Kit (Qiagen), following the guidelines provided for the isolation of total DNA from tissues. The abundance and purity of DNA samples were assessed using a NanoDrop 2000c UV-Vis spectrophotometer (Thermo Scientific, Pittsburgh, PA, United States).

## Data Availability

Raw sequence files were made available for download from SRA with the following Accession Nos. SAMN10880815, SAMN10880814, SAMN10880813, SAMN10880812, SAMN10880811, SAMN10880810. The Transcriptome Shotgun Assembly project has been deposited at DDBJ/EMBL/GenBank under the Accession No. GHFN00000000. The version described in this paper is the first version, GHFN01000000. The H. perforatum genome was submitted as a WGS submission with the Accession No. SOPF00000000. The expression data discussed in this publication have been deposited in NCBI's Gene Expression Omnibus (Edgar et al., 2002) and are accessible through GEO Series Accession No. GSE128923<sup>11</sup>.

## RESULTS

### Assembly and Annotation of the H. perforatum Ovule Transcriptome

Laser Capture Microdissections (LCM) were performed on ovules at a pre-meiotic developmental stage (**Figure 1**), collected from sexual and aposporous apomictic H. perforatum accessions (**Supplementary Table S1**). Microdissections targeted the nucellus of ovules in which the integuments were already differentiated but did not yet completely overlap the nucellus (**Figure 1B**). A summary of RNAseq data is reported in **Supplementary Table S2**. Sequencing yielded, on average, 177.7 million reads per library (88.8 million read pairs). The assembly of all high-quality reads into a single reference transcriptome yielded 67022 sequence contigs. The length of the assembled sequences ranged from 201 to 6743 bp. The average and median sequence lengths were 691 and 590 bp, respectively. The N50 of the transcriptome was estimated at approximately 800 bp (**Table 1**).
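The N50 reported here is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the assembled bases. A minimal sketch of that calculation, using hypothetical contig lengths rather than the real assembly, is:

```python
from statistics import mean, median


def n50(lengths):
    """Length L such that sequences of length >= L cover at least half of all bases."""
    if not lengths:
        raise ValueError("no sequence lengths given")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths (bp), for illustration only.
contigs = [201, 350, 590, 800, 1200, 6743]
print(f"mean={mean(contigs):.0f} median={median(contigs)} N50={n50(contigs)}")
```

Because N50 is weighted by sequence length, it is usually larger than the mean or median, as with the values reported for this transcriptome.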

The polyploid nature (2n = 4x) of the investigated H. perforatum accessions, together with the large size of our transcriptome assembly, prompted us to investigate the extent to which single transcripts were assembled in multiple variants. To address this issue, we sequenced the genome of an unrelated sexual H. perforatum accession (2n = 2x = 16; 0.325 pg/1Cx) and used the assembled draft for the alignment of all DEGs and phenotype-specific transcripts (**Table 2** and **Supplementary Table S3**). In its current version, the draft sequence of the H. perforatum genome is about 350 Mb in size, with a scaffold N50 of 63 kb. The number of assembled scaffolds is close to 37,000, with an average length of about 10 kb (**Table 2**). A preliminary investigation of the completeness of the assembly was performed by using the benchmarking universal single-copy orthologs (BUSCO) analysis. Using embryophyta\_odb9 as the reference database, we estimated 85.7% completeness, with 12.8% duplicated (N: 184/1440) and about 5% fragmented (N: 75/1440) BUSCOs; 9.1% of BUSCOs (N: 131/1440) were missing from the assembled genome. By taking advantage of the H. perforatum genome draft, we could efficiently discriminate transcript variants (e.g., potential alleles), which co-align to the same genomic locus, from gene products related to orthologous or paralogous gene loci (see below).
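The BUSCO percentages can be cross-checked from the counts given in the text. In the sketch below, the number of complete BUSCOs is not stated in the source; it is derived here as the total minus the fragmented and missing counts, which is an assumption about how the categories partition.

```python
TOTAL = 1440       # BUSCO groups in embryophyta_odb9, as reported
DUPLICATED = 184   # complete and duplicated
FRAGMENTED = 75
MISSING = 131

# Assumption: complete = total - fragmented - missing (not stated in the text).
complete = TOTAL - FRAGMENTED - MISSING


def pct(count: int) -> float:
    """Percentage of all BUSCO groups, rounded to one decimal place."""
    return round(100 * count / TOTAL, 1)


print(f"complete={pct(complete)}% duplicated={pct(DUPLICATED)}% "
      f"fragmented={pct(FRAGMENTED)}% missing={pct(MISSING)}%")
```

The derived values agree with the percentages reported above.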

The annotation of the ovule transcriptome was performed by using the Arabidopsis thaliana proteome (TAIR10) as reference. As shown in **Table 1**, we annotated 51666 transcripts (77% of the assembled transcripts), matching 15795 Arabidopsis gene models (e-value cut-off: 1.0E-9). A total of 46846 transcripts (about 70% of the transcriptome) were assigned one or more GO terms (**Table 1**). According to the GO-slim nomenclature (**Supplementary Figure S1**), a large portion of the H. perforatum ovule transcriptome is devoted to metabolic and cellular processes (GO:0008152 and GO:0009987, respectively). Cellular component organization or biogenesis (GO:0071840), developmental process (GO:0032502), and response to stimulus (GO:0050896) were also highly represented in our datasets. At this ontological level (GO-slim, level 2), the GO term reproduction (GO:0000003) was associated with about 5500 transcripts, accounting for approximately 12% of the ovule transcriptome (**Supplementary Figure S1**).

### Gene Expression Analysis Reveals an Enrichment of Processes Related to RNA-Dependent DNA Biosynthesis and RNA Processing in Ovules Collected From Aposporous Accessions

Gene expression (GE) analyses were performed by using the ovule transcriptome assembled de novo as reference. The mapping of high-quality reads to this reference aligned, on average, 50% of sequence reads (**Supplementary Table S2**), corresponding to about 88 million single reads per sample. The percentage of mapped reads varied from 45 to 55%, with no obvious correlation with the reproductive behavior of the considered accessions. A principal component analysis (PCA) computed by using all expression data points separated sexual samples from the apomictic ones, with the first two principal components explaining approximately 43.7% of the overall variance (**Figure 2**).

<sup>11</sup>https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE128923

FIGURE 1 | Hypericum perforatum main reproductive strategies and tissue type investigated in this study. (A) DNA histograms displaying estimates of DNA content by using H. perforatum single seeds, as assessed by FCSS. Blue: seed originated by sexual reproduction (4:6 embryo:endosperm C-values); red: single seed originated by autonomous apomixis (4:8 embryo:endosperm C-values); green: BIII hybrid (6:8 embryo:endosperm C-values). The overlay of these three profiles is also shown. (B) Ovule developmental stage adopted for Laser-Capture Microdissections (upper panel). The remaining tissue is displayed in the bottom panel. Dissections were performed by targeting the ovule nucellus (red line). nu: nucellus; ii: internal integuments; oi: outer integuments. Scale bar: 50 µm.

TABLE 1 | Summary statistics of the sequence assembly and functional annotation for Hypericum perforatum ovule transcriptome.

BLAST matches are referred to the Arabidopsis thaliana proteome TAIR10.

As shown in **Table 3**, 396 transcripts (Bonferroni p-value ≤ 0.05) were found to be differentially expressed between the two phenotypes. In more detail, and by using the sexual samples as reference, 312 (FDR: 1129) transcripts were found up-regulated whereas 85 (FDR: 437) were found down-regulated (**Table 3** and **Supplementary Table S3**). Remarkably, 119 transcripts were exclusively present in sexual (Bonferroni p-value ≤ 0.05; N: 25) or aposporous samples (Bonferroni p-value ≤ 0.05; N: 94). By considering the FDR p-value correction or no p-value correction, our estimates of phenotype-specific transcripts (i.e., the sum of APO-specific and SEX-specific transcripts) rose to 376 and 1456, respectively (**Table 3** and **Supplementary Table S4**).

TABLE 2 | Summary of sequencing and assembly data for the H. perforatum (2n = 2x) genome draft.

The summary of sequencing data refers to the R1 and R2 reads.

FIGURE 2 | Principal component analysis (PCA) of ovule-expressed transcripts. (A) Statistical analysis performed by using all expressed transcripts. The percentage variation explained by the two axes is about 44%. (B) Statistical analysis performed by using DEGs and phenotype-specific transcripts. The percentage variation explained by the two axes is about 96%. Sexual samples (BHS1-3) are shown as red squares, while aposporous apomictic samples (BHA1-3) are shown as blue squares.

TABLE 3 | Differentially expressed genes.

For each comparison the table reports the number of differentially expressed unigenes identified in the aposporous samples (test) with respect to the sexual samples (reference). Three biological replicates were adopted for both test (t) and reference (r) samples. For each comparison the number of up- and down-regulated unigenes is also indicated for each class of DEGs. Numbers in brackets indicate the number of DEGs resulting from the adoption of the FDR p-value correction.

The in silico mapping of genes located in the HAPPY (Hypericum APOSPORY) locus associated with apospory in H. perforatum (NCBI Accession No. HM061166) identified nine contigs in the draft sequence of the H. perforatum genome (**Supplementary Table S5**). While seven contigs were shorter than 10 kb, the two contigs 62189 and 62786 accounted for 58,364 and 89,182 bp, respectively. Sequence identity between gene regions predicted from HM061166 and the corresponding genome sequences ranged from 93 to 100%, with an average value of 98%. The lowest sequence identity was scored by the ARI-T gene locus (**Supplementary Table S5**). The largest contig (62786) corresponded to the sequence portion of HM061166 bracketed by the two genes PAT1 and RINGH2, which includes the locus homologous to the A. thaliana gene ARI7. Both gene composition and gene order were perfectly conserved between contig 62786 and the corresponding sequence portion of HM061166. The only exception was the missing annotation of the transposon-related gene RT3 in contig 62786. Notably, the gene sequence corresponding to the first locus predicted from HM061166 (i.e., HK1) aligned with a terminal portion of contig 62189, ranging from positions 52192 to 58131. The annotation of contig 62189 predicted nine genes with no homology relationship with genes annotated in the BAC clone HM061166, except for HK1. Hence, we considered it likely that the 50 kb sequence portion upstream of the gene locus HK1 in contig 62189 represents the genomic sequence upstream of the HAPPY locus in the H. perforatum genome (**Supplementary Table S5**). Based on the alignment of transcript variants to the nine genomic contigs, we could verify that most predicted genes in this region (N: 24/33) are expressed in the pre-meiotic ovule nucellus. However, the transcript Locus\_2\_Transcript\_538734\_835163\_Confidence\_0.000\_Length\_588, encoding a light harvesting chlorophyll a/b-binding protein (Lhcb2), was the only DEG associated with the HAPPY locus (FDR: 4.2E-3). The ARIADNE locus predicted from contig 62786 (contig region 20985–22397 bp) matched four transcript variants (**Supplementary Table S5**) displaying no expression differences in our datasets. No transcripts were associated with the annotated region corresponding to the ARI-T locus (contig region 23718–24280 bp).

In a different approach, the H. perforatum genome sequence was exploited to efficiently discriminate transcript variants (e.g., potential alleles) co-mapping to the same genomic locus from gene products related to orthologous or paralogous gene loci. Detailed annotations for all identified DEGs (Bonferroni p-value ≤ 0.05) and phenotype-specific transcripts are reported in **Supplementary Tables S3**, **S4**. Data refer to genomic loci containing at least one modulated transcript. We therefore focused our statistics on the co-expression of multiple transcript variants for DEGs and for transcripts exclusively found in aposporous or sexual samples. In the former case, the average number of transcript variants per locus was equal to 1.69 (range: 1–8), while we estimated an average number of 1.86 transcripts/locus (range: 1–13) for phenotype-specific transcripts. In both cases, about two thirds of transcriptionally modulated transcripts were present as single variants (**Figures 3A,B**). In more detail, the number of loci associated with transcripts with multiple expression patterns (N ≥ 2) was equal to 120 (38.9%) and 25 (29.5%) for up- and down-regulated DEGs, respectively (**Figure 3C**). For APO- and SEX-specific transcripts, the number of loci associated with multiple expression patterns was equal to 516 (45.3%) and 135 (42.4%), respectively (**Figure 3D**).

The analysis of enriched GO terms provided an overview of the biological processes and molecular functions that characterize the transcriptional changes seen in ovules (**Table 4** and **Supplementary Table S6**). GO terms were filtered to the most specific ones to give a clear, non-redundant picture of over- and under-represented processes and functions. Among up-regulated transcripts, we found a significant enrichment for the GO term RNA-dependent DNA biosynthetic process (FDR p-value: 8.4E+08). This ontological term was found to be enriched in both up-regulated and down-regulated transcript sets (Bonferroni and FDR), as well as among APO-specific transcripts (FDR p-value: 1.2E-02). Other enriched terms found in up-regulated transcripts (FDR p-value ≤ 0.05) were RNA processing, glyceraldehyde-3-phosphate metabolic process, organelle organization and multicellular organism development (**Table 4** and **Supplementary Table S6**). Among down-regulated transcripts, enrichments were found for the GO terms synaptonemal complex assembly, plant-type cell wall modification and gene expression. Notably, while no enrichment was found in SEX-specific transcripts, the APO-specific ones were depleted in multiple terms, including pollen tube growth, cell cycle process, vegetative to reproductive phase transition of meristem and regulation of gene expression, epigenetic, among others (**Supplementary Table S6**).
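The text does not state which statistical test underlies the enrichment analysis. As a generic illustration, over-representation of a GO term in a transcript set is often assessed with a one-sided hypergeometric (Fisher-type) test, sketched below with hypothetical counts; the function name and all numbers are assumptions for illustration only.

```python
from math import comb


def hypergeom_enrichment_p(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k) for a set of n transcripts drawn from N, of which K carry the term."""
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Hypothetical example: 1000 annotated transcripts, 50 carrying a given GO term;
# a set of 40 modulated transcripts contains 8 of them (2 expected by chance).
p_value = hypergeom_enrichment_p(k=8, n=40, K=50, N=1000)
print(f"Enrichment p-value: {p_value:.2e}")
```

Raw p-values obtained this way for every tested term would then be subjected to a multiple-testing correction, such as the FDR or Bonferroni adjustments quoted in the text.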

Interestingly, the only over-represented GO term in down-regulated DEGs (FDR p-value: 1.8E-02) was synaptonemal complex assembly (GO:0007130). As the apomixis process is expected to be associated with transcriptional changes in reproduction-related genes, we annotated our transcriptome according to the following major classes: cell specification, sporogenesis, gametogenesis, and embryogenesis, and selected all transcripts displaying transcriptional changes in the ovule nucellus datasets (**Supplementary Table S7**). In more detail, we found five transcript variants potentially associated to cell fate specification, 11 transcript variants potentially involved in sporogenesis and 30 transcripts associated to later developmental processes, including gametogenesis (n: 12) and embryo development (n: 18). The transcript variants associated to cell fate specification were up-regulated in the aposporous dataset or found exclusively expressed in the nucellus of aposporous plant accessions. Among these, we found the orthologs of Arabidopsis BEL1-like homeodomain 1 and MYB domain protein 66 (**Supplementary Table S7**). Similarly, the datasets represented by transcripts potentially involved in gametogenesis and embryo development displayed a clear disproportion between transcripts displaying higher expression in aposporous-derived ovules (including up-regulated DEGs and APO-specific transcripts) and transcripts with the opposite expression pattern (i.e., down-regulated DEGs and SEX-specific transcripts), in favor of the former class. Among genes included in this latter category, we annotated two AGAMOUS-like genes (namely AGL11 and AGL62) and several embryo defective proteins (**Supplementary Table S7**). The dataset represented by transcripts potentially involved in meiotic entry and sporogenesis included six down-regulated DEGs or SEX-specific variants and five genes displaying the opposite expression pattern (including up-regulated DEGs and APO-specific transcripts).

associated to up-regulated and down-regulated DEGs are shown in (C), while APO-specific and SEX-specific transcripts are shown in (D).



corrected p-value. For each GO term, the number of annotated (#Annot) and non-annotated (#nonAnnot) transcripts in the test (Test) and reference datasets (Refs) is also indicated.

Taking into consideration the massive presence of transcripts annotated as RNA-dependent DNA biosynthetic process in our sequence datasets (**Table 4** and **Supplementary Table S6**), we wondered whether the differential expression of transposable element (TE)-like sequences was somehow associated to the aposporous phenotype in pre-meiotic ovules. To address this question, we aligned all DEGs (Bonferroni p-value ≤ 0.05) to the genome sequence and isolated the genomic loci providing the best alignments (**Supplementary Table S8**). Although several (n: 5) transposon-related sequences were predicted in the genomic loci matching the HAPPY locus, we did not find transcript variants matching these loci. The alignment of the entire ovule transcriptome to the genomic loci targeted by differentially expressed TEs provided a total of 42 transcript variants, including 13 up-regulated, 3 APO-specific and 3 down-regulated transcripts, together with 23 transcripts displaying no significant expression differences between sexual and aposporous samples (**Supplementary Table S8**). The annotation of these 12 genomic loci by BLASTX confirmed their TE nature. Remarkably, for as many as 8 out of 12 loci, the transcript with the lowest FDR p-value associated with gene expression displayed the highest nucleotide diversity with respect to the genome reference (**Figure 4** and **Supplementary Table S8**). Pearson correlation coefficients (PCC) ≥ 0.59 between FDR p-value and sequence identity were estimated for most of these genomic loci (N: 7 out of 8, **Supplementary Table S8**). Moving from these data, we selected four loci with significant PCC correlations and investigated their expression pattern and CpG DNA methylation at pistil level (**Figure 4**). The expression pattern observed by RNAseq assay in ovules was confirmed at the level of pistils in three cases out of four.
As for the DNA methylation pattern, although some methylation changes could be observed among the pistil developmental stages, we did not find any obvious correlation with the assessed expression data (**Figure 4**).
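
The relationship between FDR p-value and sequence identity described above was summarized per locus with a Pearson correlation coefficient. A minimal Python sketch of that calculation is given below; the input values are hypothetical and are not taken from Supplementary Table S8.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


# Hypothetical transcript variants aligned to one TE locus (toy values):
# FDR p-value of differential expression and % identity to the reference.
fdr_pvalues = [0.001, 0.01, 0.2, 0.5]
identity = [90.0, 93.0, 97.0, 99.0]
r = pearson(fdr_pvalues, identity)
print(round(r, 3))
```

A positive coefficient in this setting means that the variants with the lowest FDR p-values are also those with the lowest identity to the reference, i.e., the most divergent ones.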

#### Apomictic Ovules Display a Differential and Heterochronic Expression of Transcripts Involved in Epigenetic Regulation of Gene Expression and RNA Splicing

Following the identification of multiple TEs differentially expressed in our datasets, we annotated the ovule transcriptome for the identification of gene products potentially involved in epigenetic regulation of gene expression (Pikaard and Mittelsten Scheid, 2014). The adopted terms were RNA silencing, chromatin modification, DNA modification and RNA-directed DNA methylation. Multiple DEGs and phenotype-specific transcripts fell into these classifications (**Supplementary Table S9**). On average, 3 transcript variants per locus were identified (minimum: 1, maximum: 4).

The expression of six genes potentially involved in these epigenetic processes was tested by in situ hybridization assays (**Figure 5**). For all tested genes, hybridization signals were detected in the ovule nucellus, which is consistent with the sequencing of RNAs expressed in this cellular domain. However, hybridization signals were also detected in additional cellular domains of the ovule, including the integuments (**Figure 5**). In line with the RNAseq data, hybridization signals for HpIDN2, HpFDM2, HpFDM4, and HpRDM16 were detected in ovules collected from both phenotypes. Hybridization signals for the two genes HpTIL1 and HpLSD were faint or undetectable in ovules collected from sexual plants. Furthermore, the hybridization patterns observed with the respective complementary probes suggest that transcription of the selected genes occurs in both orientations (**Supplementary Figure S2**).

As shown in **Figure 6**, the hierarchical clustering of samples according to the expression of annotated transcripts was consistent with the two phenotypes (**Figure 6A**). At the same time, the heat map underlined the extent to which expression variation was captured in our RNA-seq assembly and GE pipelines. qPCR analyses were performed to assess the expression of selected genes over a broader range of developmental stages, from pre-meiosis (stage S1) to the end of gametogenesis (stage S4). At stage S1, the pattern recorded by qPCR in pistils for HpFDM4.1, HpNTF2, HpLSD1.1, HpTIL1.1, and HpNAP1;2.1 was in line with RNAseq data on LCM samples. Conversely, HpFDM2.1, HpIDN2.1, and HpDCL4.1 displayed similar expression levels in S1 pistils collected from sexual and aposporous plants. Nevertheless, multiple transcripts involved in RNA silencing and/or RNA-directed DNA methylation displayed heterochronic expression, with higher or comparable transcript levels in sexual pistils at stage S3 (early gametogenesis), and higher or comparable transcript levels in apomictic pistils at stages S1 and/or S2 (**Figures 6B**, **7C**). This was true for HpFDM2.1, HpFDM4.1, HpIDN2.1, HpTIL1.1, and HpDCL4.1 (**Figure 6**). Interestingly, the expression of HpNAP1;2, encoding a nucleosome assembly protein involved in chromatin formation or chromatin remodeling, was modulated before and after meiosis (stages S1 and S3). According to the qPCR assays, the expression of HpLSD1.1 and HpNTF2, encoding the orthologs of a LYSINE-SPECIFIC HISTONE DEMETHYLASE and NUCLEAR TRANSPORT FACTOR 2, was nearly exclusively detected in aposporous pistils (**Figure 6B**). Our RNAseq investigations identified two differentially expressed transcripts displaying high sequence similarity with the Pre-mRNA-splicing factor 3-like RDM16 and mapping to different genomic contigs (contig 62591 and contig 62634, respectively). We named these transcripts HpRDM16.1 and HpRDM16.3.
A third RDM16-like transcript, displaying no transcriptional changes between sexual and aposporous ovules, co-aligned with HpRDM16.1 and was named HpRDM16.2. Accordingly, HpRDM16.2 displayed no expression difference in S1 pistils of the two phenotypes, whereas the average expression level of HpRDM16.3 in pistils collected at stage S1 was higher in aposporous plants (**Figure 7C**). The expression of RDM16.1 in aposporous pistils (**Figure 7C**) was characterized by higher expression in terminal stages and by a pronounced expression difference between central developmental stages, those embracing meiosis and early gametogenesis, and the most marginal ones (e.g., S1 and S4). These findings, together with the differential expression of two RDM16-like genes and their possible role in DNA methylation (RDM16 stands for Reduced DNA Methylation 16), prompted us to ask whether transcriptional de-regulation of this gene might have a broader effect on the ovule transcriptome. In the heterologous system represented by the sexually reproducing A. thaliana, the protein RDM16 (AT1G28060) physically interacts with several proteins involved in RNA processing and splicing. We therefore searched our datasets for potential interactors of HpRDM16 involved in these processes (**Figure 7B**). The network represented in **Figure 7A** shows the potential interactors of HpRDM16 having at least one differentially expressed transcript variant in our RNAseq dataset (**Figure 7B**). The network has an average local clustering coefficient of 0.876 and a protein–protein interaction (PPI) enrichment p-value of 5.8e-5, indicating that the proteins included in this network are at least partially biologically connected, as a group. Besides the known protein–protein interactions, the network reports a significant enrichment (FDR p-value: 8.54e-06) in proteins involved in RNA splicing (KEGG map: ath03040). The clustering of the corresponding H. perforatum transcripts based on ovule RNAseq data is consistent with the conservation of their biological connection in this latter species (**Figure 7B**). In line with the RNAseq data, higher expression in aposporous pistils at stage S1 was recorded by qPCR for HpMAC5A, AT1G10320-like and AT3G47120-like. Intriguingly, genes included in this group seemed to share marked differences in gene expression at terminal stages and very low expression differences (if any) between sexual and aposporous pistils at stage S3, corresponding to the onset of gametogenesis (**Figure 7C**). The identification of a network of potentially interacting proteins involved in RNA binding and splicing and encoded by transcripts up-regulated in the aposporous ovule nucellus prompted us to extend the qPCR assays to additional transcripts annotated with these terms (**Supplementary Figure S3**). qPCRs were performed to target the expression of a putative Serine/arginine-rich SC35-like splicing factor, HpSCL28, together with HpSRP54, HpPUM3 and the F-box family protein HpSKIP22-like. In line with the RNAseq data, higher expression in aposporous S1 pistils was detected for HpSCL28, HpSRP54 and HpSKIP22-like. Differential expression for the HpPUM3-like transcript, which encodes a PUMILIO homolog 3 involved in the regulation of mRNA stability by binding the 3′-UTR of target mRNAs, was only detected in pistils at stage S4. Altogether, validation of RNAseq profiles on pistils collected at a comparable developmental stage provided consistent amplification profiles for about 70% of the investigated transcripts (16 out of 22).

gametogenesis). RUE: relative units of expression. RUM: relative units of methylation.

integuments; oi: outer integuments. Scale bar: 50 mm.
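
For readers unfamiliar with the average local clustering coefficient reported for the HpRDM16 interaction network, the following Python sketch shows how such a value can be computed for a small undirected graph. The edge list is a hypothetical toy network, not the network of Figure 7A, and nodes with fewer than two neighbors are assumed here to contribute a coefficient of zero, which may differ from the convention of the tool used to build the network.

```python
from itertools import combinations


def average_local_clustering(edges):
    """Average local clustering coefficient of an undirected graph.

    For each node, the local coefficient is the fraction of pairs of its
    neighbors that are themselves connected. Nodes with fewer than two
    neighbors are given a coefficient of 0 (an assumption of this sketch).
    """
    neighbors = {}
    for a, b in edges:
        if a == b:
            continue  # ignore self-loops
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    if not neighbors:
        raise ValueError("graph has no edges")
    coefficients = []
    for adjacent in neighbors.values():
        k = len(adjacent)
        if k < 2:
            coefficients.append(0.0)
            continue
        links = sum(1 for u, v in combinations(adjacent, 2) if v in neighbors[u])
        coefficients.append(2 * links / (k * (k - 1)))
    return sum(coefficients) / len(coefficients)


# Hypothetical toy network: a triangle (A, B, C) plus one peripheral node D.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("A", "D")]
avg_cc = average_local_clustering(toy_edges)
print(round(avg_cc, 3))
```

Values close to 1, such as the 0.876 reported here, indicate that the neighbors of most proteins in the network are themselves connected to one another.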

It is worth noting that expression of the A. thaliana orthologs of these genes is detectable in different plant structures. Within flowers, it is generally higher at early developmental stages (flower stage 9), it is enriched in pistils (referred to as carpels, flower stages 12 and 15) and, among the different pistil areas, it is more abundant in the ovary, which is the part of the pistil that bears the ovules<sup>12</sup> (**Supplementary Table S11**).

<sup>12</sup>http://bar.utoronto.ca/

### DISCUSSION

This study aimed to investigate gene expression variation potentially associated to the aposporous developmental program by focusing on the pre-meiotic ovule nucellus, which is the cellular domain primarily involved in the differentiation of precursors of meiocytes and aposporous embryo sacs. Transcriptomic investigations were performed on a wide range of aposporous accessions by adopting a bulking strategy. This experimental design was adopted to focus on transcripts and meaningful biological processes conserved across different apomictic accessions. Hence, high-coverage sequencing reactions (nearly 90M paired-end reads/library) and stringent criteria for the selection of phenotype-specific transcripts and DEGs were adopted to address the possible drawback represented by the adoption of multiple plant accessions deriving from different geographical areas. Taken together, the size (N: 67,000) and N50 of the assembled transcriptome (N50: 814 bp) suggest that our de novo transcriptome assembly retains some redundancy. Nonetheless, our gene expression analysis identified nearly 400 DEGs (Bonferroni p-value ≤ 0.05) and up to 1832 transcripts characterized by phenotype-specific expression. The expression of six selected DEGs in the ovule nucellus was confirmed by in situ hybridization assays performed on both sexual and aposporous tissues. However, hybridization signals for the assayed genes were also detected in other ovule domains (e.g., internal integuments), indicating that expression of the selected DEGs is not restricted to the nucellus.

To better exploit our transcriptome data and investigate the extent to which single loci were represented by multiple variants, we sequenced the genome of an H. perforatum diploid sexual accession. In its current version, the draft sequence of the H. perforatum genome is about 350 Mb in size, with a scaffold N50 of about 63 kb, indicating that half of the assembled sequence is contained in scaffolds of 63 kb or larger. A preliminary investigation concerning the completeness of the assembly estimated 85.7% completeness, with 12.8% duplicated and about 5% fragmented benchmarking universal single-copy orthologs (BUSCOs). About 9% of the investigated universal single-copy orthologs were missing from our assembly. The in silico mapping of genes located in the HAPPY locus associated to apospory in H. perforatum identified nine contigs sharing high sequence similarity (98% within gene regions) with the corresponding BAC clone (HM061166). The largest assembled contig (62786) corresponded to a large sequence portion of HM061166 between the two genes PAT1 and RINGH2 (**Supplementary Table S5**). As expected, synteny and collinearity were perfectly conserved between gene loci annotated in contig 62786 and the corresponding sequence portion of HM061166. Notably, the in silico mapping of the corresponding gene regions permitted the annotation of a 50 kb sequence portion upstream of the gene locus HK1 included in contig 62819, which likely represents a portion of the genome sequence upstream of the HAPPY locus in the sexual H. perforatum genome. The alignment of transcript variants to the gene loci annotated in the HAPPY locus identified 24 genes expressed in the pre-meiotic ovule nucellus. ARIADNE7 was among the predicted genes with the highest expression in sexual and aposporous datasets. However, no transcriptional difference was detected for this gene, and the only DEG annotated in this locus encoded a light harvesting chlorophyll a/b-binding protein (Lhcb2), located about 40 kb upstream of HK1 (**Supplementary Table S5**).
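
The N50 statistic quoted for the genome and transcriptome assemblies can be computed as in the following Python sketch: sequences are sorted from longest to shortest, and N50 is the length at which the running total first reaches half of the assembly size. The lengths used below are hypothetical toy values, not those of the H. perforatum assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no sequence lengths provided")


# Hypothetical scaffold lengths in kb (toy values).
toy_lengths = [2, 4, 6, 8, 10]
print(n50(toy_lengths))
```

With these toy lengths the total is 30 kb, and the two longest scaffolds (10 and 8 kb) already exceed half of it, so the function returns 8.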

The sequencing and assembly of a reference genome permitted the efficient annotation of identified DEGs and the identification of co-expressed transcript variants, defined as transcript sequences co-aligning to the same genomic sequence (**Figure 3**). Based on these investigations, we detected multiple variants with alternative expression patterns for about one third of identified DEGs and phenotype-specific transcripts, with an average number of about two transcripts per locus (DEGs: 1.8; phenotype-specific: 1.9). However, we are aware that the computational assumptions adopted for their selection might have led to an underestimation of the number of transcript variants. Detailed investigations on a subset of gene loci encoding TEs revealed that gene loci with the highest nucleotide diversity with respect to the genome reference were more likely to be differentially expressed (N: 8 loci out of 12), as indicated by the lower FDR p-value associated with gene expression variation. Although no additional investigations were performed to address the extent of nucleotide diversity among transcript variants, the co-expression of multiple transcripts with alternative expression patterns might suggest high plasticity in mRNA transcription and processing.

Our comparative GE analysis underlined a clear disproportion between transcripts displaying higher expression in the aposporous-derived ovule nucellus (including up-regulated DEGs and APO-specific transcripts) and transcripts with the opposite expression pattern (i.e., down-regulated DEGs and SEX-specific transcripts). Taken together, and without a clear correlation with technical or methodological features, it is likely that a correlation exists between the aposporous developmental pathway in H. perforatum and the observed enhanced expression in ovules of aposporous accessions. The observed disproportion between transcripts displaying higher expression in aposporous-derived ovules and transcripts with the opposite expression pattern is consistent with the hypothesis that aposporous plants are subjected to alternative transcriptional (mediated by trans-acting factors) or post-transcriptional regulation of gene expression in the ovule nucellus.

By focusing on single biological pathways, the transcriptome of the pre-meiotic aposporous nucellus was enriched in GO terms related to RNA-dependent DNA biosynthetic process (GO:0006278), while terms related to RNA processing (GO:0006396) and epigenetic regulation of gene expression (GO:0040029) were significantly depleted. Accordingly, our GE study shows massive differences in the expression of several genes encoding TEs. Enrichment of TE-related GO terms was identified in all comparisons, with the only exception of transcripts exclusively expressed in sexually-derived samples (**Supplementary Table S6**). The possible association between sequence conservation and differential expression observed for a number of TEs might lead one to speculate that the corresponding loci are located within the Hypericum apomixis controlling region, which is currently represented by the HAPPY locus (Schallau et al., 2010). However, the differentially expressed TEs did not map to the HAPPY locus and no expression was recorded for the TEs annotated within the HAPPY locus. At the same time, the distribution of transcriptionally modulated TEs across the genome appears to be widespread. As an example, the genomic area represented by considering only TEs whose expression is enriched in aposporous samples (Bonferroni p-value ≤ 0.05) accounts for 12 contigs, with an overall size of about 3.5 Mb (nearly 1% of the assembled genome). Taken together, these data suggest that the massive differential expression of TE-related genes might be associated to the misregulation of cellular processes involved in post-transcriptional regulation of TEs.

As apomixis could be related to heterochrony in the expression of reproductive processes (Grimanelli et al., 2003), we annotated all DEGs according to their possible involvement in reproductive processes. Only a few DEGs potentially associated with sporogenesis and gametogenesis were identified (**Supplementary Table S7**). The low number of DEGs potentially associated with reproduction is in line with the pre-meiotic developmental stage adopted for the laser capture microdissections. In more detail, samples collected from aposporous accessions were characterized by higher transcript levels for the orthologs of the Arabidopsis gene BEL-like homeodomain 1 (BLH1), which is involved in cell specification. Transcriptional studies on the aposporous model plant Hieracium praealtum have shown that a putative homolog of the BEL-like EOSTRE is expressed in microdissected aposporous initials and early aposporous embryo sacs (Okada et al., 2013). Interestingly, embryo sac development in aposporous H. perforatum (Galla et al., 2011) and Hieracium spp. (Okada et al., 2013) accessions shows defects in nuclear migration and cellularization that might resemble those associated to the ectopic expression of BLH1 in the A. thaliana embryo sac (Pagnussat et al., 2007). Furthermore, multiple transcript variants with antagonistic expression patterns were associated to the HpTIL1 gene, encoding the catalytic subunit of DNA pol ε, whose mutation in Arabidopsis displays pleiotropic phenotypes including a reduced number of ovules, abnormally developing ovules, and reduced fertility (Jenik et al., 2005). ATMYB66, another transcription factor displaying higher expression in aposporous samples, is involved in epidermal cell fate specification in A. thaliana, although this phenotype has been documented in roots and hypocotyls (Cheng et al., 2014).
Interestingly, results have shown that MYB66 physically interacts with several splicing factors including the protein encoded by AT2G32600, the pre-mRNA-splicing factor ISY1 (AT3G18790) and an ATP-dependent RNA helicase encoded by AT3G26560. Among these interactors, only the ortholog of ISY1 (AT3G18790) was up-regulated in aposporous samples (**Supplementary Table S4**; FDR p-value: 3.73e-3). However, several DEGs with the potential to interact at protein level and likely involved in RNA splicing were detected in our datasets (**Figure 7** and **Supplementary Table S10**). These include the orthologs of AT1G10320, AT3G47120, BUD13 (AT1G31870), CDC5 (AT1G09770), EMB14 (AT1G80070), MAC5A (AT1G07360), and RDM16 (AT1G28060). The PPI network represented by the gene products of these DEGs has a protein–protein interaction enrichment p-value of 5.8e-5, indicating that the proteins included in this network are at least partially biologically connected, as a group. Consistent with the gene annotations, the network has a significant enrichment (FDR p-value: 8.54e-06) in proteins involved in RNA splicing (KEGG map: ath03040). Pre-mRNA splicing is an essential process required for the expression of most eukaryotic genes. Alternative splicing produces multiple mRNAs from the same gene through variable selection of splice sites during pre-mRNA splicing. Notably, more than 60% of Arabidopsis intron-containing genes display alternative splicing (Marquez et al., 2012) and the percentage of alternatively spliced genes in humans is about 95% (Pan et al., 2008). Splicing is carried out by a macromolecular machinery termed the spliceosome, which senses the splicing signals and catalyzes the removal of introns from pre-mRNAs. Results in Arabidopsis and maize have shown that alternative RNA splicing is needed for cell differentiation, development, and plant viability (Fouquet et al., 2011).
Moreover, the knockout of MAC5A in Arabidopsis causes severe developmental defects including dwarfism, delayed growth, abnormal floral organs, and sterility (Monaghan et al., 2010). Regarding the expression of DEGs involved in RNA splicing before the onset of meiosis, the higher average gene expression detected in aposporous pistils by qPCR was in line with the higher expression detected with the RNAseq analysis on LCM samples. This could indicate that expression is enriched in the ovule nucellus or that regulation mechanisms for these genes are shared throughout the pistil. Furthermore, according to our time-course qPCR assays, the expression variation observed between aposporous and sexual pistils was remarkably high for terminal developmental stages and much lower at the onset of gametogenesis (stage S3). Notably, an enrichment in the expression of cell cycle related genes was also documented in Hieracium praealtum AI cells and EAE sacs (Okada et al., 2013) and Boechera gunnisoniana apomictic germlines (Schmidt et al., 2014), and an ortholog of AT1G45231 (TGS1), encoding a trimethyl guanosine synthase which has a dual role in splicing and transcription, is increasingly overexpressed in sexual plants from pre-meiosis to anthesis in Paspalum notatum (Siena et al., 2014).

In addition to pre-mRNA splicing, splicing factors might also play important roles in other biological processes, including sRNA production (Zhang et al., 2013). Hence, the Arabidopsis ortholog of HpCDC5, which was up-regulated in our aposporous samples (**Figure 7**), positively regulates post-transcriptional processing and/or transcription of primary microRNA transcripts (Lin et al., 2007; Zhang et al., 2013). Furthermore, screenings for the identification of genes involved in RNA-directed DNA methylation (Ausin et al., 2012; Dou et al., 2013; Huang et al., 2013; Du et al., 2015) demonstrated that several pre-mRNA splicing factors, including RDM16, act at different steps in the RdDM pathway. In our study, two orthologs of the Arabidopsis pre-mRNA splicing factor RDM16 were associated to transcript variants significantly up-regulated in aposporous microdissected samples and pistils at different developmental timepoints. Results have shown that RDM16 regulates the overall methylation of TEs and gene-surrounding regions, and preferentially targets Pol IV-dependent DNA methylation loci and the ROS1 target loci (Huang et al., 2013). Arabidopsis rdm16 mutants are affected in several aspects of plant development, including the viability of both female and male gametes (Huang et al., 2013). Interestingly, the small nuclear ribonucleoprotein Prp4p-related (LACHESIS), which is involved in a mechanism that prevents accessory cells from adopting gametic cell fate within the female embryo sac, is among the known interactors of the Arabidopsis RDM16<sup>13</sup>. HpRDM16.1 and HpRDM16.3 were not the only up-regulated transcripts involved in epigenetic regulation of gene expression by RdDM. Differential expression was also detected for genes operating in RNA silencing, RNA-directed DNA methylation (HpDCL4.1, HpIDN2.1, HpFDM2.1, HpFDM4.1, HpNTF2.1) and histone and chromatin modification processes (HpLSD.1 and HpNAP1;2.1).
These transcriptional changes are consistent with a role of these processes in regulating cell fate determination in the ovule, as indicated by genetic studies in A. thaliana and maize (Garcia-Aguilar et al., 2010; Olmedo-Monfil et al., 2010; Armenta-Medina et al., 2011; Singh et al., 2011). qPCR reactions performed on genes potentially associated with RNA silencing and RdDM only partially validated the RNAseq expression data. Nevertheless, as the down-regulation of HpIDN2 in aposporous H. perforatum pistils was also detected in a previous study (Galla et al., 2017), we hypothesize that validation in pistils of RNAseq data focused on the ovule nucellus might be affected by the amplification of multiple variants with contrasting expression patterns (**Supplementary Table S9**) and/or by transcriptional noise associated to the expression of these genes in other cellular domains of the pistil. Consistently, microarray data in Arabidopsis have shown that expression for these genes is higher in pistils (referred to as carpels, flower stages 12 and 15) and enriched but restricted to the ovary, which is the part of the pistil that bears the ovules. Notably, the putative homologs of Arabidopsis genes involved in chromatin function and gene silencing via small RNA pathways, including a DICER-2 like gene, were enriched in Hieracium praealtum aposporous initials (Okada et al., 2013). Our time-course GE analysis on pistils collected at multiple developmental timepoints ranging from pre-meiosis to late gametogenesis revealed heterochronic expression for several genes involved in RNA splicing, RNA silencing and RdDM, including HpFDM2, HpFDM4, HpTIL1, HpMAC5A, RDM16.2 and the orthologs of AT1G10320 and AT3G47120, among others. For these genes, the higher or comparable expression in sexual pistils at the onset of gametogenesis was accompanied by an inverted expression pattern in preceding developmental stages (**Figures 5**, **6** and **Supplementary Figure S3**).
Even though the heterochronic expression pattern is currently documented for a very narrow set of genes in this species, the occurrence of heterochronic expression of genes involved in reproductive processes in apomictic species has been documented in several model species (Grimanelli et al., 2003; Sharbel et al., 2010). Remarkably, an ortholog of the Arabidopsis NUCLEAR TRANSPORT FACTOR 2 (NTF2) family protein, which is up-regulated in the aposporous nucellus (Bonferroni p-value: 9.5E-09), displayed constitutive expression in aposporous pistils and nearly no amplification signals in sexual pistils. NTF2 encodes an RNA binding (RRM-RBD-RNP motifs) protein predicted to bind ssRNA (Allain et al., 2000) and involved in RdDM (Parida et al., 2017). Results have shown that the Arabidopsis NTF2 interacts with the methyl CpG binding domain protein MBD6, involved in RNA-mediated gene silencing (Parida et al., 2017). Other known interactors of ATMBD6 are two proteins: AGO4, which is a member of a class of PAZ/PIWI domain containing proteins involved in siRNA mediated gene silencing, and the histone deacetylase AtHDA6, which is another essential component of RNA-directed DNA methylation involved in the silencing of TEs by modulating DNA methylation and histone acetylation (Liu et al., 2009). The expression of genes encoding ARGONAUTE proteins did not vary significantly in our datasets. However, we did find significant differences in the expression of multiple genes encoding AGO4 interacting proteins. Among these, significant transcriptional differences were detected for transcripts encoding the orthologs of the XH/XS domain containing proteins IDN2, FDM2, and FDM4. The interaction of IDN2 with FDM-like proteins is thought to be required for RdDM (Ausin et al., 2009; Zheng et al., 2010).
In our current understanding of their molecular function, IDN2, FDM2 and possibly FDM4, together with AGO4, are required for the association of DOMAIN REARRANGED METHYLTRANSFERASE 2 (DRM2) with lncRNAs at RdDM targeted loci (Bohmdorfer et al., 2014). Furthermore, at chromatin level, IDN2 physically interacts with a core subunit of the SWI/SNF complex (SWI3B) and mediates transcriptional silencing by guiding the SWI/SNF complex and establishing positioned nucleosomes on specific genomic loci (Zhu et al., 2013). Interestingly, the knockout of the two Arabidopsis LYSINE-SPECIFIC HISTONE DEMETHYLASE genes LSD1 and LSD2 affects CHH methylation levels at a subset of RdDM target loci (Greenberg et al., 2013). However, any functional correlation existing between IDN2 and the transcriptionally modulated LSD1 and NUCLEOSOME ASSEMBLY PROTEIN 1;2 (NAP1;2) at chromatin level is currently not clear and deserves additional investigation. Notably, an enrichment of functions related to epigenetic regulatory pathways, including histone H3K4 demethylation and maintenance of DNA methylation, was also reported in aposporous initials of apomictic Boechera gunnisoniana (Schmidt et al., 2014), suggesting that epigenetic regulation of gene expression is a common feature of the aposporous reproductive strategy in different species.

<sup>13</sup>https://string-db.org/cgi/network.pl?taskId=CSYEOSdK3rb7

#### CONCLUSION

The pre-meiotic ovule nucellus of aposporous plants is characterized by variations in the expression of a great number of genes, including TE-related loci, which appear to be widespread across the genome. About one third of gene loci associated to DEGs and phenotype-specific transcripts were characterized by the co-expression of multiple transcript variants with antagonistic expression patterns. Our comparative gene expression analysis detected a clear disproportion between transcripts displaying higher expression in the nucellus of aposporous plants and transcripts with the opposite expression pattern. This observation is consistent with the hypothesis that aposporous nucellar cells are subjected to alternative transcriptional (mediated by trans-acting factors) or post-transcriptional regulation of gene expression in the ovule nucellus. Accordingly, DEGs were enriched or depleted in ontological terms related to RNA-dependent DNA biosynthetic process, RNA processing and epigenetic regulation of gene expression, including gene silencing by RNA and ncRNA metabolic process. The differential expression of multiple TE-related sequences in ovules of aposporous accessions is in line with a functional association between apospory and the RNA-directed DNA methylation pathways. Several genes encoding potentially interacting proteins involved in pre-mRNA splicing were differentially expressed in the ovule nucellus, as well as at terminal stages of pistil development. In addition to pre-mRNA splicing, DEGs involved in this process might also play important roles in sRNA production and RNA-directed DNA methylation. Notably, several genes involved in RNA-directed DNA methylation were also found differentially expressed in the pre-meiotic aposporous nucellus and in pistils collected at multiple developmental timepoints.
These expression differences, together with GO enrichment analysis and the massive differential expression of TEs in pre-meiotic ovules are consistent with a deregulation of small RNA mediated DNA methylation in the ovule nucellus of aposporous H. perforatum.

### AUTHOR CONTRIBUTIONS

GB, FP, and GG conceived the study. GG carried out the computational analysis and drafted the manuscript. AB performed all validations. MB, SG, and FP performed the LCM and prepared the RNA for the sequencing reactions. All authors helped to draft the manuscript and read and approved the final manuscript.

### FUNDING

This research was supported by the Ministry of Foreign Affairs and International Cooperation, Direzione Generale per la Promozione del Sistema Paese (Italy), Ufficio Relazioni Internazionali del Consiglio Nazionale delle Ricerche, Italy (Laboratori Congiunti Bilaterali Internazionali CNR, Prot. 0005651) and by the following grants: Research Project for Young Researchers of the University of Padova (year 2010), "Comparative and functional genomics for cloning and characterizing genes for apomixis" (code: GRIC101130/10), Principal investigator: GG; Academic Research Project of the University of Padova (year 2012); "Transcriptomics of reproductive organs in model species for a comparative analysis of the genetic-molecular factors characterizing sexual and apomictic processes" (code: CPDA128282/12), Principal investigator: GB. Scientific Independence of Young Researchers (SIR): "Transcriptomic analysis of ovule-specific cell lineages to unveil the genetic and molecular bases of apomictic seed production in model species" (code: RBSl14K1ON), Principal Investigator: GG.

### ACKNOWLEDGMENTS

The authors want to thank COST Action FA0903 "Harnessing Plant Reproduction for Crop Improvement."

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00654/full#supplementary-material

FIGURE S1 | Ontological annotation of the H. perforatum ovule transcriptome. Blue bars: biological processes (level 2); green bars: molecular functions (level 3). For each annotation, the number of sequences is indicated on the horizontal axis and expressed in units of 1000 sequences. Annotations were clustered by using the plants-generic GO-slim function.

FIGURE S2 | In situ RNA hybridization experiments in pistils collected from apomictic and sexual genotypes. (A,C,E,G,I,K) Sexual genotypes; (B,D,F,H,J,L) aposporous apomictic genotypes. (A,B) HpIDN2; (C,D) HpFDM2; (E,F) HpFDM4; (G,H) HpRDR16; (I,J) HpLSD1; (K,L) HpTIL1. Experiments were performed with the sense probes, to hybridize mRNAs in antisense orientation. Hybridization signals are indicated in blue. nu: nucellus; ii: internal integuments; oi: outer integuments. Scale bar: 50 µm.

FIGURE S3 | Expression of H. perforatum transcripts involved in RNA binding and protein ubiquitination. Bars represent quantitative Real-Time PCR results expressed as relative units of expression. Light gray bars: relative expression level in sexual pistils; dark gray bars: relative expression level in aposporous pistils. Pistil developmental stages: S1: pre-meiosis; S2: meiosis; S3–S4: early and late gametogenesis, respectively. Relative expression values are plotted on the vertical axis. Error bars indicate the standard error observed among the five biological replicates.

TABLE S1 | Origins of H. perforatum samples. For each plant accession, the description, the genealogy, the origin, the ploidy and the degree of apomixis are indicated. The RNA sample ID is reported only for plant accessions adopted for the RNAseq analysis. Apospory, expressed as a percentage, was determined by flow cytometric screening of 48 single seeds. For details on the origin and composition of experimental populations, please refer to Schallau et al. (2010).

TABLE S2 | Summary of mRNA-seq data. For each library, the total number of reads as well as the number of mapped and unmapped reads are reported.

TABLE S3 | Transcripts displaying differential expression in H. perforatum ovules (Bonferroni p-value correction). For each locus ID, transcript variants displaying no significant expression differences between sexual and aposporous samples are also reported. For each transcript, the locus ID in our reference genome, the alignment start/end and the number of allele variants are also reported. repBD: annotations regarding reproductive processes (Galla et al., 2017). GOrep: ontological annotations related to reproductive processes. Gene annotations were downloaded from AmiGO2 (http://amigo.geneontology.org/amigo). Epigenetic mod: annotations related to genes involved in epigenetic regulation of gene expression as reported by Pikaard and Mittelsten Scheid (2014). The average expression of each transcript in sexual (sexual – means) and aposporous (aposporous – means), together with Bonferroni and FDM p-values corrections associated to GE analyses are also reported.

TABLE S4 | Transcripts annotated as phenotype-specific. Phenotype-specific transcripts were defined as transcripts expressed in all samples of a given phenotype and absent in all samples of the opposite phenotype. For each locus ID, transcript variants displaying no significant expression differences between sexual and aposporous samples are also reported. For each transcript, the locus ID in our reference genome, the alignment start/end and the number of allele variants are also reported. repBD: annotations regarding reproductive processes (Galla et al., 2017). GOrep: ontological annotations related to reproductive processes. Gene annotations were downloaded from AmiGO2 (http://amigo.geneontology.org/amigo). Epigenetic mod: annotations related to genes involved in epigenetic regulation of gene expression as reported by Pikaard and Mittelsten Scheid (2014). The average expression of each transcript in sexual (sexual – means) and aposporous (aposporous – means), together with Bonferroni and FDM p-values corrections associated to GE analyses are also reported.

TABLE S5 | Annotation and expression of genes located in the HAPPY (Hypericum APOSPORY) locus. For each contig ID and annotated locus, the table reports the gene name, gene coordinates (locus start|end) and the ID of associated transcript variants. Genes annotated in the BAC clone HM061166 are indicated by using the bac ID followed by the gene name and percentage sequence identity with the aligned portion of the corresponding contig. Percent sequence identity is reported in brackets. For each transcript ID, the length of the alignment and the identity in the aligned portion of the sequence are indicated. The average expression of each transcript in sexual (sexual – means) and aposporous (aposporous – means), together with Bonferroni and FDM p-values corrections associated to GE analyses are also reported.

TABLE S6 | GO Enrichment analysis. For each cluster of DEGs the over/under represented GO term (GO-ID), the ontological vocabulary (Ontology), the description of the GO term, the FDR corrected p-value and the p-value are reported. The number of annotated and non-annotated sequences in each test and reference datasets is also indicated.

TABLE S7 | Annotation and expression of reproductive-related DEGs in ovules collected from sexual and aposporous accessions. For each locus ID, transcript variants displaying no significant expression differences between sexual and aposporous samples are also reported. For each transcript, the locus ID in our reference genome, the alignment start/end and the number of allele variants are also reported. repBD: annotations regarding reproductive processes (Galla et al., 2017). GOrep: ontological annotations related to reproductive processes. Gene annotations were downloaded from AmiGO2 (http://amigo.geneontology.org/amigo). Epigenetic mod: annotations related to genes involved in epigenetic regulation of gene expression as reported by Pikaard and Mittelsten Scheid (2014). The average expression of each transcript in sexual (sexual – means) and aposporous (aposporous – means), together with Bonferroni and FDM p-values corrections associated to GE analyses are also reported.

TABLE S8 | Nucleotide diversity and expression pattern of TEs expressed in H. perforatum ovules. For each locus ID, transcript variants displaying no significant expression differences between sexual and aposporous samples are also reported. For each transcript, the locus ID in our reference genome, the alignment start/end and the number of allele variants are also reported. aln\_pident: percentage sequence identity in the aligned portion of the transcript. PCC: Pearson correlation coefficients between aln\_pident and FDR p-value.

TABLE S9 | Expression pattern of transcripts potentially involved in epigenetic regulation of gene expression. For each locus ID, transcript variants displaying no significant expression differences between sexual and aposporous samples are also reported. For each transcript, the locus ID in our reference genome, the alignment start/end and the number of allele variants are also reported. repBD: annotations regarding reproductive processes (Galla et al., 2017). GOrep: ontological annotations related to reproductive processes. Gene annotations were downloaded from AmiGO2 (http://amigo.geneontology.org/amigo). Epigenetic mod: annotations related to genes involved in epigenetic regulation of gene expression as reported by Pikaard and Mittelsten Scheid (2014). The average expression of each transcript in sexual (sexual – means) and aposporous (aposporous – means), together with Bonferroni and FDM p-values corrections associated to GE analyses are also reported.

TABLE S10 | Expression pattern of transcripts annotated as RNA binding. For each locus ID, transcript variants displaying no significant expression differences between sexual and aposporous samples are also reported. For each transcript, the locus ID in our reference genome, the alignment start/end and the number of allele variants are also reported. repBD: annotations regarding reproductive processes (Galla et al., 2017). GOrep: ontological annotations related to reproductive processes. Gene annotations were downloaded from AmiGO2 (http://amigo.geneontology.org/amigo). Epigenetic mod: annotations related to genes involved in epigenetic regulation of gene expression as reported by Pikaard and Mittelsten Scheid (2014). The average expression of each transcript in sexual (sexual – means) and aposporous (aposporous – means), together with Bonferroni and FDM p-values corrections associated to GE analyses are also reported.

TABLE S11 | Graphical representation of A. thaliana log transformed clustered data. For each gene/probe ID, the table reports the unique experiment ID (UNIQID), the research area, the plant growth stage, the tissue and Arabidopsis ecotype adopted for the expression study. For each gene, references to the manuscript table and/or figure are also reported. A. thaliana gene expression data were retrieved from the BAR Expression Browser (http://bar.utoronto.ca/affydb/cgi-bin/affy\_db\_exprss\_browser\_in.cgi) (Toufighi et al., 2005). The output displays the average expression of replicate treatments relative to average of the appropriate control. Expression data were downloaded as graphical representation of log transformed clustered data.

TABLE S12 | Primers used in this study. For each target gene, the forward and reverse primers are reported. The purpose of each primer combination is also reported.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Galla, Basso, Grisan, Bellucci, Pupilli and Barcaccia. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Apospory and Diplospory in Diploid Boechera (Brassicaceae) May Facilitate Speciation by Recombination-Driven Apomixis-to-Sex Reversals

John G. Carman<sup>1</sup>\*, Mayelyn Mateo de Arias<sup>1,2</sup>, Lei Gao<sup>1</sup>, Xinghua Zhao<sup>1,3</sup>, Becky M. Kowallis<sup>4</sup>, David A. Sherwood<sup>1</sup>, Manoj K. Srivastava<sup>1,5</sup>, Krishna K. Dwivedi<sup>1,4,5</sup>, Bo J. Price<sup>1</sup>, Landon Watts<sup>1</sup> and Michael D. Windham<sup>6</sup>

#### Edited by:

Marta Adelina Mendes, University of Milan, Italy

#### Reviewed by:

Amal Joseph Johnston, Universität Heidelberg, Germany Petr Koutecký, University of South Bohemia, Czechia Diego Hojsgaard, University of Göttingen, Germany

> \*Correspondence: John G. Carman john.carman@usu.edu

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 03 January 2019 Accepted: 16 May 2019 Published: 31 May 2019

#### Citation:

Carman JG, Mateo de Arias M, Gao L, Zhao X, Kowallis BM, Sherwood DA, Srivastava MK, Dwivedi KK, Price BJ, Watts L and Windham MD (2019) Apospory and Diplospory in Diploid Boechera (Brassicaceae) May Facilitate Speciation by Recombination-Driven Apomixis-to-Sex Reversals. Front. Plant Sci. 10:724. doi: 10.3389/fpls.2019.00724

<sup>1</sup> Plants, Soils and Climate Department, Utah State University, Logan, UT, United States, <sup>2</sup> Instituto Tecnológico de Santo Domingo, Santo Domingo, Dominican Republic, <sup>3</sup> College of Desert Control Science and Engineering, Inner Mongolia Agricultural University, Hohhot, China, <sup>4</sup> Caisson Laboratories, Inc., Smithfield, UT, United States, <sup>5</sup> Crop Improvement Division, Indian Grassland and Fodder Research Institute, Jhansi, India, <sup>6</sup> Department of Biology, Duke University, Durham, NC, United States

Apomixis (asexual seed formation) in angiosperms occurs either sporophytically, through adventitious embryony, or gametophytically, where an unreduced female gametophyte (embryo sac) forms and produces an unreduced egg that develops into an embryo parthenogenetically. Multiple types of gametophytic apomixis occur, and these are differentiated based on where and when the unreduced gametophyte forms, a process referred to as apomeiosis. Apomeiotic gametophytes form directly from ameiotic megasporocytes, as in Antennaria-type diplospory, from unreduced spores derived from first division meiotic restitutions, as in Taraxacum-type diplospory, or from cells of the ovule wall, as in Hieracium-type apospory. Multiple types of apomeiosis occasionally occur in the same plant, which suggests that the different types occur in response to temporal and/or spatial shifts in termination of sexual processes and onset timing of apomeiosis processes. To better understand the origins and evolutionary implications of apomixis in Boechera (Brassicaceae), we determined apomeiosis type for 64 accessions representing 44 taxonomic units. Plants expressing apospory and diplospory were equally common, and these generally produced reduced and unreduced pollen, respectively. Apospory and diplospory occurred simultaneously in individual plants of seven taxa. In Boechera, apomixis perpetuates otherwise sterile or semisterile interspecific hybrids (allodiploids) through multiple generations. Accordingly, ample time, in these multigenerational clones, is available for rare meioses to produce haploid, intergenomically recombined male and female gametes. The fusion of such gametes could then produce segmentally autoploidized progeny. If sex re-emerges among such progeny, then new and genomically unique sexual species could evolve. Herein, we present evidence that such apomixis-facilitated speciation is occurring in Boechera, and we hypothesize that it might also be occurring in facultatively apomictic allodiploids of other angiospermous taxa.

Keywords: apomeiosis, apomixis, apomixis-to-sex reversion, apospory, recombination driven diploidization, Boechera (Brassicaceae), diplospory, reticulate evolution

### INTRODUCTION

The genus Boechera (Brassicaceae) evolved about 2.5 Myr ago (Mandakova et al., 2015) and is closely related to Arabidopsis (Bailey et al., 2006; Rushworth et al., 2011). It encompasses c. 83 primarily inbreeding sexual diploid taxa (Li et al., 2017), many of which have relatively narrow geographic ranges. Boechera also includes hundreds of genomically distinct diploid, triploid and tetraploid hybrids that are partially to fully sterile sexually. These hybrids produce most of their seeds through apomixis (without meiotic recombination, chromosome reduction or fertilization), but sexually derived seeds are also occasionally produced (Aliyu et al., 2010). This dual capacity, to produce seeds sexually and apomictically, is called facultative apomixis, and it is characteristic of most if not all angiospermous apomicts (Asker and Jerling, 1992).

Most Boechera taxa belong to a well-supported western North American clade (Alexander et al., 2013), the distribution of which extends from northern Mexico to the Arctic with outlying populations (mostly apomictic and polyploid) in Greenland and around the Great Lakes and the St. Lawrence River. Another clade of nine taxa, previously assigned to the genus Borodinia (Alexander et al., 2013), is here included in Boechera due to the recent discovery of inter-clade hybridization (Windham et al., field observations). The latter are distinctive in being sparsely pubescent and restricted to forested regions of eastern North America and the Russian Far East. Boechera s.l. is, by far, the largest genus of tribe Boechereae, a morphologically disparate group that includes seven other genera whose phylogenetic affinity only became apparent through recent chromosomal and molecular analyses. Indeed, the primary defining characteristic of Boechereae consists of a reduction in chromosome base number from n = 8 to n = 7 (Al-Shehbaz, 2003). Evidence suggests that the n = 8 Boechereae ancestor entered North America from Asia about 5 Mya via the Bering land bridge. The chromosome base number reduction likely occurred thereafter by multiple translocations (Mandakova et al., 2015).

Though Boechera is predominantly autogamous (self-pollinating), interspecific hybrids (allodiploids) involving sexual Boechera diploids, as well as their introgression products, are frequently encountered in nature (Kantama et al., 2007; Beck et al., 2012; Aliyu et al., 2013; Alexander et al., 2015; Li et al., 2017). These are generally apomictic and display broad ecological competencies (Windham and Al-Shehbaz, 2006; Alexander et al., 2015; Windham et al., 2015; Shah et al., 2016). Because of introgression, apomictic Boechera are often confused with sexual diploids, the habitats of which are generally much more difficult to locate. As with many agamic complexes (Asker and Jerling, 1992; Bayer, 1997), this situation complicates Boechera taxonomy (Li et al., 2017).

While apparently common in Boechera, apomixis arising in allodiploid hybrids, which form between two sexual diploid species, is rare in other angiosperms (Carman, 1997). In this respect, many Boechera apomicts also produce unreduced (2n) pollen, which is also generally uncommon among other angiospermous apomicts (Asker and Jerling, 1992). In Boechera, 2n sperm of apomictic diploids can fertilize 1n eggs of co-occurring sexual taxa to produce new and genomically unique triploid apomicts (Bocher, 1951; Alexander et al., 2015; Li et al., 2017). Apomictic Boechera tetraploids also arise in this manner, but these are less common (Schranz et al., 2005; Aliyu et al., 2010).

Frequent hybridization with or without homoeologous recombination (Kantama et al., 2007) explains the proliferation of apomictic alloploid Boechera (Beck et al., 2012; Windham et al., 2015; Li et al., 2017), but how the sexual diploids originate is less obvious. The traditional view is that they arise by range expansion and speciation along ecological gradients (Alexander et al., 2015; Li et al., 2017). However, the slow pace of such speciation is inconsistent with the large numbers of rare sexual diploids described for this youthful genus. Here we provide a cytological and theoretical framework that addresses this question.

Apomixis is verifiable by single seed flow cytometry, which measures embryo-to-endosperm genome ratios, e.g., 2C:3C seeds (diploid embryo and triploid endosperm) are sexual, but 2C:5C or 2C:6C are apomictic (Matzk et al., 2000; Aliyu et al., 2010). However, "types" of apomixis must be determined cytologically (Asker and Jerling, 1992; Hand and Koltunow, 2014). Apomixis is gametophytic in Boechera, which means ovules produce 2n female gametophytes (embryo sacs), which in turn produce parthenogenetic eggs.

The pioneering study of female meiosis (megasporogenesis) and female gametophyte formation in Boechera (Bocher, 1951) was motivated by observations of 2n pollen formation. This, plus two subsequent studies of 2n pollen-forming Boechera (Naumova et al., 2001; Taskin et al., 2004), revealed meiotic first division restitutions that produced dyads of 2n spores in ovules and anthers. On the male side, both spores formed 2n pollen. On the female side, one 2n spore degenerated and the other developed into a 2n gametophyte (Taraxacum-type diplospory). This limited embryological sampling led to an incorrect notion that 2n and 1n pollen in Boechera are diagnostic of apomixis and sex, respectively (Roy, 1995; Windham and Al-Shehbaz, 2006). More thorough sampling in recent years has provided a clearer picture of apomixis development in Boechera (Carman, 2007; Carman et al., 2015). Certain accessions of Boechera microphylla (B. imnahaensis × yellowstonensis) were found to be highly apomictic despite apparently normal male and female meioses (Mateo de Arias, 2015). In these plants, functional pollen grains form from all four 1n microspores. However, on the female side, all meiotically produced spores generally degenerate, and a 2n gametophyte forms adventitiously from a nucellar cell of the ovule wall (Hieracium-type apospory).

To better understand the unusual pervasiveness and origins of multiple apomixis types in Boechera, we expanded our taxonomic sampling to 64 accessions representing 44 operational taxonomic units (OTUs). Our sampling includes sexual and apomictic taxa that span the Boechera phylogeny (Alexander et al., 2013), and it represents a mix of taxa traditionally treated as species as well as recently discovered but as yet unpublished entities. Hereafter, published names are used for the diploid sexual species. However, the apomictic hybrids are identified by genome composition as found in the Boechera Microsatellite Website (BMW) http://sites.biology.duke.edu/windhamlab/ (Li et al., 2017). We show that both apospory (normal male and female meioses with female sexual development failing thereafter) and diplospory (first division restitution male and female meioses) occur frequently in Boechera and are widely dispersed across the genus. Based on these findings, we provide a possible explanation for the origins of rare and allelically poor sexual endemics that are often encountered in habitats otherwise populated by allelically complex apomicts.

#### MATERIALS AND METHODS

#### Plant Materials

Cytological analyses were performed using floral buds taken from plants growing in native habitats, plants transplanted from native habitats, or plants grown from seeds (**Supplementary Table S1**). Seeds were placed on moist filter paper, stratified at 4 °C for 21 days, and planted. Potted seedlings or transplants were grown in 600 mL cone-shaped (68 mm diameter × 255 mm tall) pots or 350 mL square (85 mm wide × 95 mm tall) pots that contained Sunshine Mix #1 potting soil (Sun Gro Horticulture Canada Ltd., Vancouver, BC, Canada). Vernalization was accomplished by cold incubation (4 °C) for 10–12 weeks with minimal lighting (8/16 h day/night photoperiod). Vernalized plants were transferred to controlled-environment greenhouses or growth chambers that maintained a 16/8 h day/night photoperiod using supplemental light provided by a combination of cool white fluorescent bulbs, incandescent bulbs, and high-pressure sodium-vapor lamps. These provided a minimum photosynthetic photon flux of 400 µmol m<sup>−2</sup> s<sup>−1</sup> at the tops of the canopies. Day/night temperatures were maintained at 22/16 °C, and plants were watered regularly with a dilute solution (250 mg L<sup>−1</sup>) of Peters Professional 20:20:20 fertilizer (Scotts, Maryville, OH, United States).

#### Cytological Analyses

Clusters of floral buds at pre-anthesis stages were fixed in formalin acetic acid alcohol (FAA) or Farmer's 3:1 fixative (ethanol acetic acid) for 48 h. The buds were then cleared in 2:1 benzyl benzoate dibutyl phthalate (BBDP) (Crane and Carman, 1987) as follows: 70% EtOH, 30 min; 95% EtOH, 4 h (2×); 2:1 95% EtOH BBDP, 2 h; 1:2 95% EtOH BBDP, 4 h; 100% BBDP, 4 h; and 100% BBDP overnight (kept until analyzed). Pistils were then dissected from the floral buds. Pistil lengths, measured from the base of the pedicel to the top of the stigma (±0.05 mm), were then obtained using a dissection microscope, and the pistils were mounted on slides with a minimal amount of 2:1 BBDP clearing solution. Development was studied using an Olympus (Center Valley, PA, United States) BX53 microscope equipped with differential interference contrast (DIC) optics, an Olympus DP74 digital camera, and Olympus cellSens Dimension Version 1 software.

The following ovule stages were tabulated: (i) pre-meiotic megaspore mother cell (MMC), (ii) meiotic or diplosporous dyad, (iii) sexual tetrad of megaspores, (iv) enlarged functional megaspore with three degenerating megaspores (sexually derived), (v) early-stage 1 or 2-nucleate gametophyte with three degenerating megaspores, (vi) enlarged functional megaspore with one degenerating megaspore (Taraxacum-type diplospory), (vii) early-stage 1 or 2-nucleate gametophyte with one degenerating megaspore (Taraxacum-type diplospory), (viii) early-stage 1 or 2-nucleate gametophyte with no degenerating megaspores (Antennaria-type diplospory), (ix) presence of one or more enlarged non-vacuolate nucellar cells (aposporous initials, AI) that equaled or surpassed the size of the meiocyte (meiotically active MMC, dyad or early-staged tetrad), and (x) enlarged nucellar cell(s) with one or more distinct vacuoles and 1–2 nuclei (early-stage aposporous gametophytes, AES). Because of uncertainties in origin, gametophytes (sexual, diplosporous or aposporous) were not recorded beyond the 2-nucleate stage. Pistil lengths and developmental stages of the majority of scorable ovules in each pistil were recorded.

### Flow Cytometry

Relative levels of nuclear DNA in embryo and endosperm cells of single seeds of B. stricta (**Figure 1, 63**), B. exilis × thompsonii (**Figure 1, 8**), B. imnahaensis × yellowstonensis (**Figure 1, 27**) and B. cf. gunnisoniana 3× (**Figure 1, 6**) were determined. Nuclei of immature and mature seeds were isolated using a mortar, pestle and a few drops of DAPI (4′,6-diamidino-2-phenylindole) containing Partec (Partec North America, Inc., Swedesboro, NJ, United States) buffer (CyStain UV Precise P). Pestles were used to lightly crush the seeds. Fragments were exposed to buffer for several minutes. The nuclei-containing solutions were then filtered through 30 µm nylon filters into 1.2 mL tubes. Nuclear fluorescence was determined using a Partec PA flow cytometer. Seeds with a 2C:3C embryo:endosperm ratio were recorded as sexual, while seeds with 2C:5C, 2C:6C, 2C:7C, or 3C:9C ratios were recorded as apomictic (Aliyu et al., 2010).
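
The seed-scoring rule in the preceding paragraph can be expressed as a short lookup. The following Python is a minimal illustrative sketch, not the software used in this study: the function names `classify_seed` and `apomixis_frequency` and the representation of each seed as a pair of integer C-values are assumptions introduced here for illustration. Ratios not listed in the text are left unclassified rather than guessed.

```python
# Illustrative sketch only: function names and input format are assumptions.
# The embryo:endosperm C-value ratios follow the scoring rule stated in the text.
SEXUAL_RATIOS = {(2, 3)}
APOMICTIC_RATIOS = {(2, 5), (2, 6), (2, 7), (3, 9)}


def classify_seed(embryo_c: int, endosperm_c: int) -> str:
    """Return 'sexual', 'apomictic', or 'unclassified' for a single seed."""
    ratio = (embryo_c, endosperm_c)
    if ratio in SEXUAL_RATIOS:
        return "sexual"
    if ratio in APOMICTIC_RATIOS:
        return "apomictic"
    # Ratios not named in the text are not assigned to either mode.
    return "unclassified"


def apomixis_frequency(seeds):
    """Fraction of classified seeds scored as apomictic (None if none classified)."""
    calls = [classify_seed(embryo, endosperm) for embryo, endosperm in seeds]
    classified = [call for call in calls if call != "unclassified"]
    if not classified:
        return None
    return classified.count("apomictic") / len(classified)
```

For example, `classify_seed(2, 3)` corresponds to a diploid embryo with a triploid endosperm and is scored as sexual, whereas `classify_seed(2, 6)` is scored as apomictic.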

### Allelic Diversity and Geographic Distributions of Sexual Diploids

FIGURE 1 | Meiosis and apomeiosis in 64 Boechera accessions (organized by reproductive mode). (A) Frequencies by taxon of ovules exhibiting sexual tetrads, sexual or Taraxacum-type diplosporous dyads, Hieracium-type aposporous gametophytes, or Antennaria-type diplosporous gametophytes (accession numbers correspond to those in Supplementary Table S1). Tetrad, dyad, and Antennaria-type diplosporous gametophyte frequencies per accession sum to 100%. Aposporous gametophyte frequencies are listed separately (red bars). These develop adventitiously while meiotic tetrads form and degenerate. Median (Med.) numbers of SSR alleles observed among homozygous samples of each sexual accession (Supplementary Table S2) are shown, as are numbers (No.) of correctly staged ovules analyzed per accession. (B) A hypothesis of evolutionary cycling between hybridization-induced apomixis and apomixis-facilitated reticulate evolution of new sexual species. Blue box: taxa with mostly 2n pollen with some reduced and shrunken pollen, which suggests meiotic anomalies due to recent interspecific hybridization but with a transition from diplospory to apospory occurring in some taxa possibly due to early genome diploidization events associated with infrequent selfing (15–21); red box: taxa with mostly fertile reduced pollen coupled with apospory and sexual tetrad formation and degeneration, which suggests more extensive genome diploidization with a gradual restoration of sexual fertility; green box: sexually fertile anthers and pistils with no cytological evidence of apomeiosis.

For sexual diploids, taxonomic names, specimen numbers, locations of origin, and allele lengths for 13 single-locus microsatellite (simple sequence repeat, SSR) loci (A1, BF3, B6, B9, B11, BF15, B18, B20, B266, C8, E9, I3, and I14) were downloaded from the BMW. To minimize inclusion of apomicts (mistakenly collected as sexual diploids), specimens were excluded if they were heterozygous for any of the 13 loci, yielded data for fewer than six of the 13 loci, or represented taxa with fewer than six homozygous specimens. Taxa meeting these criteria were then ranked based on median and mean numbers of alleles per locus (population level allelic polymorphism). This variability was used to identify putative sexual or apomictic ancestors.
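
The SSR specimen filter and taxon ranking described in this section can be sketched as follows. This is a hedged illustration, not the authors' pipeline: the record layout (a dictionary mapping each locus to a pair of allele lengths, or `None` for missing data), the function names `keep_specimen` and `rank_taxa`, and the ascending sort order are all assumptions made here for illustration.

```python
# Illustrative sketch only: record layout and function names are assumptions.
from collections import defaultdict
from statistics import mean, median

MIN_LOCI = 6        # specimens need data for at least six of the 13 loci
MIN_SPECIMENS = 6   # taxa need at least six retained (homozygous) specimens


def keep_specimen(genotype):
    """genotype maps locus -> (allele1, allele2), or None when data are missing."""
    scored = {locus: call for locus, call in genotype.items() if call is not None}
    if len(scored) < MIN_LOCI:
        return False
    # A heterozygous call at any locus flags a possible apomict: exclude it.
    return all(a == b for a, b in scored.values())


def rank_taxa(specimens):
    """specimens: iterable of (taxon, genotype) pairs.

    Returns (taxon, median alleles per locus, mean alleles per locus) tuples,
    sorted from lowest to highest allelic diversity (an arbitrary choice here).
    """
    by_taxon = defaultdict(list)
    for taxon, genotype in specimens:
        if keep_specimen(genotype):
            by_taxon[taxon].append(genotype)

    ranking = []
    for taxon, genotypes in by_taxon.items():
        if len(genotypes) < MIN_SPECIMENS:
            continue
        alleles = defaultdict(set)
        for genotype in genotypes:
            for locus, call in genotype.items():
                if call is not None:
                    alleles[locus].add(call[0])  # homozygous, so one allele suffices
        counts = [len(found) for found in alleles.values()]
        ranking.append((taxon, median(counts), mean(counts)))
    return sorted(ranking, key=lambda row: (row[1], row[2]))
```

Under these assumptions, a taxon whose retained specimens all share one allele at every scored locus has a median of one allele per locus and sorts first.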

### RESULTS

Clearing and mounting of whole pistils using BBDP (Crane and Carman, 1987) was efficient for high throughput analysis of megasporogenesis and gametophyte formation. Each pistil contained, depending on species, from 40 to 200 ovules (Al-Shehbaz et al., 2010). When pistils were mounted horizontally (c. 16 per slide), 20–30% of their ovules were in sagittal orientation, which permitted efficient analyses of MMC origins as well as details of dyad, tetrad and early gametophyte formation.

### Diplospory and Apospory Are Common in Boechera

In most sexual angiosperms, the mature female gametophyte is a seven-celled (eight-nucleate) structure (Polygonum type) that forms from a genetically reduced megaspore of a meiotic tetrad (Johri et al., 1992). Accordingly, the consistent observation of the following four phenomena was taken as strong evidence for near-obligate to obligate sexual reproduction: (i) meiotic tetrad formation, (ii) absence of aposporous gametophytes at the meiotic tetrad stage, (iii) absence of vacuolate enlargement or endomitotic activity in a dyad obtained from an MMC, and (iv) vacuolate enlargement of the surviving megaspore of a meiotic tetrad coupled with endomitotic activity. These criteria were consistently observed in the ovules of 23 of the 64 accessions analyzed (**Figure 1A, 42–64**). In these accessions, the tetrad to functional megaspore stage lasted c. 2 d, which permitted tetrads to accumulate in rows along the placentae (**Figures 2A–C**). Additional photomicrographs diagnostic of sexual ovule development (meiotic tetrads and vacuolate 1- to 2-nucleate gametophytes forming from surviving megaspores of meiotic tetrads) are shown in **Supplementary Figures S1A–P** for eight of the sexual taxa identified herein.

Ovules from 21 of the 64 accessions analyzed (**Supplementary Table S1**, 16 OTUs) generally underwent Taraxacum-type diplospory (**Figure 1A, 1–21**). Here, a first division meiotic restitution occurred, which was followed by a mitotic-like second division to produce a dyad of 2n megaspores (**Figures 2D,E**). The 2n gametophyte then developed from the chalazal-most spore (**Figure 2D**), and the micropylar-most spore degenerated. The dyad stage terminated Taraxacum-type diplosporous megasporogenesis and, as with sexual megasporogenesis (resulting in tetrads), a pause in development preceded gametophyte formation. These pauses allowed diplosporous dyads to accumulate in rows, like sexual tetrads, along the ovary placentae (compare **Figure 2A** with **Supplementary Figure S2A**). Additional photomicrographs diagnostic of Taraxacum-type diplosporous ovule development (vacuolate 1- to 2-nucleate unreduced gametophytes forming from surviving megaspores of unreduced apomeiotic dyads) are shown in **Supplementary Figures S2A–F** for six of the Taraxacum-type diplosporous taxa identified herein. Also shown are unreduced microspore dyads (**Supplementary Figure S2G**), which are commonly produced in the anthers of diplosporous Boechera (Bocher, 1951; Naumova et al., 2001).

Aposporous gametophytes generally replaced all four megaspores of meiotic tetrads in 27 of 64 accessions, which represented 19 OTUs (**Figures 1A, 15–41**, **2H–K**). Photomicrographs diagnostic of apospory (degenerating meiotic tetrads being replaced by vacuolate 1- to 2-nucleate aposporous gametophytes) are shown in **Supplementary Figures S2H–O** for seven aposporous taxa. Because we scored reproduction from the dyad to 2-nucleate gametophyte stages, our apospory frequencies are probably underestimates. This is because some ovules scored as sexual in the dyad to early tetrad stages would have likely produced aposporous gametophytes had they been fixed at a later date. Apospory and diplospory occurred together in seven of the 27 aposporous accessions (**Figures 1A**, **2F**).

Antennaria-type diplospory (mitotic diplospory or gonial apospory) occurred rarely (**Figure 1A**). When it did occur, it began early in ovule development while inner and outer integuments were initiating (**Figures 2L,M**). To our knowledge, this is the first report of Antennaria-type diplospory in Boechera. Aposporous gametophytes also began to form during early integument development (**Figures 2F,J** and **Supplementary Figures S2M,N**). In contrast, gametophyte formation from functional megaspores of sexual tetrads (**Figure 2A** and **Supplementary Figures S1D,F**) and Taraxacum-type diplosporous dyads (**Figure 2D** and **Supplementary Figures S2C–E**) generally occurred as integuments were enclosing the nucellus.

FIGURE 2 | Representative images of sexual megasporogenesis and Taraxacum-type diplosporous, Hieracium-type aposporous, and Antennaria-type diplosporous gametophyte (ES) development in Boechera. (A) Row of three adjacent sexual tetrads in a B. stricta ovule (Figure 1, 63), (B) sexual tetrad in a B. yellowstonensis ovule (Figure 1, 47), (C) sexual tetrad in a B. exilis ovule (Figure 1, 43), (D) unreduced Taraxacum-type dyad in a B. exilis × thompsonii ovule (Figure 1, 8), (E) unreduced Taraxacum-type dyad in a B. crandallii × gracilipes ovule (Figure 1, 20), (F) two unreduced 1-nucleate Hieracium-type aposporous ESs with a degenerating tetrad in a B. retrofracta × stricta ovule (Figure 1, 21), (G) sexual megaspore mother cell (MMC) with a parietal cell in a B. yellowstonensis ovule (Figure 1, 47), (H) unreduced 2-nucleate Hieracium-type aposporous ES with a degenerating sexual tetrad in a B. cusickii × sparsiflora ovule (Figure 1, 24), (I) four unreduced 2-nucleate Hieracium-type aposporous ES in a B. imnahaensis × yellowstonensis ovule (Figure 1, 19), (J) unreduced 1-nucleate Hieracium-type aposporous ES with a degenerating tetrad in a B. retrofracta × retrofracta ovule (Figure 1, 25); (K) an unreduced 2-nucleate and an unreduced 4-nucleate aposporous gametophyte in a B. crandallii × thompsonii ovule (Figure 1, 31), (L,M) unreduced 1-nucleate Antennaria-type diplosporous ES forming directly from the MMC in a B. retrofracta × stricta ovule (Figure 1, 21). Black arrows, degenerating megaspores; white arrows, surviving megaspores; narrow white lines, central column of nucellar cells, which gave rise to the archesporial cell; ai, aposporous initial cell; ii, inner integument; oi, outer integument; p, parietal cell; v, vacuole; bars, 20 µm.

In angiosperms, the nucellus develops by periclinal divisions of subepidermal cells of the funiculus, and the cells of the epidermis divide anticlinally to accommodate this nucellar enlargement. The periclinal divisions produce columns of nucellar cells. The central column extends from the middle of the chalaza to the distal-most position at the micropylar epidermis (see narrow white lines, **Figures 2D,E,G,H,L** and **Supplementary Figures S1I,J**, **S2B,D,K,L,N**). The distal cell of the central column enlarges to produce the archesporial cell. In some angiosperms, enlarging archesporial cells divide mitotically to produce a parietal cell that separates the archesporial cell from the nucellar epidermis (Johri et al., 1992). In our sexual and apomictic taxa, parietal cells formed in 10–20% of the ovules, and they were observed from the MMC stage until they degenerated during early gametophyte formation (**Figures 2B,E,G,H** and **Supplementary Figures S1C,D,G,L,O**, **S2C,I,K–M**). Histochemical evidence suggests that abnormal meioses can also produce parietal-like cells in Boechera (Rojek et al., 2018).

Four taxa were studied by flow cytometry. Seeds from diploid aposporous B. imnahaensis × yellowstonensis generally produced peaks consistent with unreduced gametophyte central cells (4C, from the fusion of two 2n polar nuclei) being fertilized by 1n sperm nuclei (1C). Of 47 seeds successfully tested, 44 exhibited the expected 2C:5C ratio, consistent with reduced pollen formation, two exhibited a sexual 2C:3C ratio, and one exhibited a 2C:7C ratio, which likely reflects 4C central cell fertilization by a 3C sperm from adjacently growing diplosporous triploid B. cf. gunnisoniana 3×. This 96% apomictic seed set confirms our suspicion that aposporous gametophyte frequencies (80% for this taxon; **Figure 1A, 27**) underestimate apomixis penetrance. All seven peak-producing seeds of B. cf. gunnisoniana 3× exhibited the expected 3C:9C ratio for this diplosporous triploid. Of 48 peak-producing seeds of diplosporous B. exilis × thompsonii, 45 produced a 2C:6C ratio, consistent with unreduced pollen formation, two produced a sexual 2C:3C ratio, and one produced a 2C:7C ratio. The latter again suggests fertilization by the adjacently growing B. cf. gunnisoniana 3×. The 24 B. exilis × retrofracta and the two B. retrofracta × stricta peak-producing seeds produced the expected diplosporous 2C:6C ratio. Likewise, all 12 seeds from the sexual B. stricta produced the expected sexual 2C:3C ratio (**Figure 3**).
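The embryo:endosperm ratios reported above follow from simple bookkeeping of C-values, which the short Python sketch below reproduces. It is an illustration under stated assumptions rather than the authors' analysis: the function name and arguments are hypothetical, the central cell is assumed to carry two polar nuclei with the same C-value as the egg, the endosperm is assumed to be fertilized by a single sperm, and the embryo is assumed to be either parthenogenetic (egg only) or sexual (egg plus sperm).

```python
def seed_ratio(egg_c, sperm_c, parthenogenetic):
    """Return (embryo_C, endosperm_C) for one seed.

    egg_c: C-value of the egg (1 for a reduced diploid egg, 2 for an
        unreduced diploid egg, 3 for an unreduced triploid egg).
    sperm_c: C-value of the sperm that fertilizes the central cell.
    parthenogenetic: True if the embryo forms from the egg alone.
    """
    embryo = egg_c if parthenogenetic else egg_c + sperm_c
    endosperm = 2 * egg_c + sperm_c  # two polar nuclei plus one sperm
    return embryo, endosperm


# Cases described in the text (labels are shorthand introduced here).
cases = {
    "sexual diploid": seed_ratio(1, 1, parthenogenetic=False),
    "apomict, reduced pollen": seed_ratio(2, 1, parthenogenetic=True),
    "apomict, unreduced pollen": seed_ratio(2, 2, parthenogenetic=True),
    "apomict, triploid pollen donor": seed_ratio(2, 3, parthenogenetic=True),
    "diplosporous triploid": seed_ratio(3, 3, parthenogenetic=True),
}

for label, (embryo, endosperm) in cases.items():
    print(f"{label}: {embryo}C:{endosperm}C")
```

Under this reading, the single 2C:7C seed of B. imnahaensis × yellowstonensis also carries an unreduced, parthenogenetic embryo, so 45 of 47 seeds (about 96%) would count as apomictic, which matches the percentage given in the text.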

#### Evidence for Homoeologous-Recombination-Driven Reticulation

To evaluate possibilities of apomixis-to-sex reversions in allodiploid Boechera, we searched the BMW for sexual endemics with limited geographic distributions and limited allelic variability (listed at the bottom of **Supplementary Table S2**). One of these, B. mitchell-oldsiana, is endemic to a 4 km stretch along the rim of Hells Canyon in northeastern Oregon. This location is within the center of diversity of two prominent sexual taxa, B. retrofracta and B. sparsiflora. Mean and median numbers of alleles per locus in the homozygous BMW samples for B. mitchell-oldsiana were 1.4 and 1, respectively (**Supplementary Table S2**). According to traditional views, B. mitchell-oldsiana, with its low allelic variability, could represent an ancient, nearly extinct sexual species that has experienced a genetic bottleneck followed by a weak comeback. Alternatively, it may have evolved from a single sexual species along an ecological gradient by directional selection. Then again, it may have evolved by reticulate evolution via a recombination-driven apomixis-to-sex reversion (**Figure 4**). If by directional selection, from a single species, most of its alleles should be found within single plants of its ancestral sexual species. However, if it evolved recently by apomixis-facilitated reticulation, its alleles should be found equally distributed between two ancestral sexual species, and local apomicts formed by hybridizations between these putative parental species should possess all or nearly all of the B. mitchell-oldsiana alleles.

As expected for an apomixis-facilitated reticulation, B. mitchell-oldsiana alleles were nearly evenly distributed between the two putative sexual parents, B. retrofracta and B. sparsiflora, and neither parent alone appeared to be capable of providing all of the needed alleles (**Table 1**, allele columns of putative sexual ancestors). In contrast, each of three local apomictic hybrids contained nearly all of the B. mitchell-oldsiana alleles (**Table 1**, allele columns for the three apomicts). The microsatellite-genotyped apomictic B. retrofracta × sparsiflora hybrids in the BMW represent only a small fraction of hundreds of such hybrids in this region from which B. mitchell-oldsiana may have evolved.
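The comparison summarized above amounts to asking what fraction of the endemic's alleles can be recovered from each candidate source. A minimal Python sketch of that bookkeeping follows. The data layout (a dictionary mapping locus name to a set of allele sizes), the function names, and the toy allele values are hypothetical and are not taken from Table 1.

```python
def coverage(target, source):
    """Fraction of the target's alleles (summed over loci) found in source.

    Both arguments map locus name -> set of allele sizes.
    """
    total = sum(len(alleles) for alleles in target.values())
    if total == 0:
        return 0.0
    found = sum(len(alleles & source.get(locus, set()))
                for locus, alleles in target.items())
    return found / total


def pooled(*taxa):
    """Union of allele sets, locus by locus, across several taxa."""
    merged = {}
    for taxon in taxa:
        for locus, alleles in taxon.items():
            merged.setdefault(locus, set()).update(alleles)
    return merged


# Toy data only: four loci, invented allele sizes.
endemic = {"L1": {100}, "L2": {200}, "L3": {300}, "L4": {400}}
parent_a = {"L1": {100, 102}, "L2": {200}, "L3": {306}, "L4": {410}}
parent_b = {"L1": {108}, "L2": {204}, "L3": {300}, "L4": {400, 404}}
hybrid = {"L1": {100, 108}, "L2": {200}, "L3": {300}, "L4": {400}}

print(coverage(endemic, parent_a))                    # one parent alone
print(coverage(endemic, parent_b))                    # the other parent alone
print(coverage(endemic, pooled(parent_a, parent_b)))  # both parents pooled
print(coverage(endemic, hybrid))                      # a single hybrid plant
```

In the reticulation scenario, each parent alone gives partial and roughly equal coverage, while the pooled parents and a single apomictic hybrid both approach full coverage. Under directional selection from one species, a single parent would be expected to approach full coverage on its own.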

#### DISCUSSION

### Evolutionary Instability: A Hallmark of Youthful Boechera

Angiospermous apomicts are typically perennial outcrossing polyploids that produce 1n pollen and, by a single apomixis type (e.g., apospory or diplospory), produce 2n female gametophytes and parthenogenetically competent eggs (Asker and Jerling, 1992; Carman, 1997, 2007). Apospory and diplospory occurring in the same plant is unusual. Where this occurs, the less frequent type is generally rarely observed, e.g., in Tripsacum and Antennaria (Carman, 2007), Paspalum (Bonilla and Quarin, 1997), Rubus (Czapik, 1996), Poa (Tian et al., 2013), and a few others (Nogler, 1984; Asker and Jerling, 1992; Carman, 1997). However, in several Boechera hybrids, apospory and diplospory occur simultaneously, each at elevated frequencies (**Figure 1A**). Boechera apomicts are atypical in other respects as well. For example, they are usually autogamous, instead of outcrossing, many produce 2n pollen, and many are diploid.

**Figure 1A** places taxa of like reproductive mode together. However, a closer look suggests a possible evolutionary relevance to this clustering. Specifically, the diplosporous apomicts (**Figure 1A, 1–21**) produce dyads of genetically unreduced spores in both male and female organs. In contrast, the aposporous apomicts (**Figure 1A, 15–41**) produce tetrads of genetically reduced spores in both female and male organs. Some of the aposporous apomicts are interracial hybrids, where fertile reduced pollen is expected (**Figure 1A, 22–23, 33, 60**). However, the others are allodiploids (**Figure 1A, 24, 26–32, 35**), which like diplosporous hybrids, should be sexually sterile or semisterile. Herein we propose a mechanism whereby new sexually fertile species may evolve from sexually semisterile but apomictically fertile allodiploids through facultative episodes of genome diploidization (**Figure 4**).

To reacquire meiotic stability after interspecific hybridization, apomictic Boechera must have undergone genome modifications that enhance chromosome pairing and recombination (diploidization). Since progeny of near-obligate apomicts are usually clonal and genetically identical to their mothers, well-established allodiploid Boechera apomicts, which are also facultatively sexual, should have ample time (even hundreds of years) to eventually produce 1n (genomically recombined) male and female gametes simultaneously. In contrast, nonapomictic species hybrids are generally sterile, and these usually die without reproducing (Dobzhansky et al., 1977).

When allodiploid apomicts facultatively produce progeny by production and union of genomically recombined 1n gametes, a 50% reduction in homoeologous chromosome regions occurs. This is accompanied by a compensating increase in homologous (and homozygous) chromosome regions (**Figure 4**). With each additional autogamous generation, a 50% decrease in remaining homoeologous regions occurs. After several generations of selfing, each interspersed with perhaps multiple generations of apomixis, allodiploid apomicts should become sufficiently diploidized (**Figure 4**) for successful and efficient meioses to occur. Their chromosomes at this point represent chiasma-generated composites of alternating homozygous sections of the homoeologous genomes of their parents (Sybenga, 1996; Carman, 2007). This process is analogous to recombinant inbred line (RIL) production where multiple generations of selfing produce new chromosomes consisting of alternating segments of the original parental chromosomes. Chromosome painting studies provide evidence that such inter-genomic recombination in apomictic Boechera is extensive (Kantama et al., 2007; Koch, 2015).
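The halving described in this paragraph can be written as a simple geometric decay. The expression below is an idealized sketch rather than a formula from the source: $H_0$ (the homoeologous fraction of the genome in the newly formed allodiploid) and $H_n$ (the fraction remaining after $n$ sexual selfing generations) are symbols introduced here for illustration, and intervening apomictic generations are assumed to leave $H$ unchanged.

$$
H_n = H_0\left(\tfrac{1}{2}\right)^{n}, \qquad H_1 = \tfrac{1}{2}H_0,\quad H_2 = \tfrac{1}{4}H_0,\quad H_4 = \tfrac{1}{16}H_0 \approx 0.06\,H_0
$$

Under this idealization, roughly 94% of the initially homoeologous regions would have become homozygous after four selfing generations, which is in line with the expectation that only a few sexual generations are needed for diploidization.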

If recombinational loss of parental chromosome regions eliminates alleles responsible for one apomixis type over another, or for apomixis in general, then new apomictic or sexual plants with uniquely recombined genomes may evolve (**Figure 4**). Such processes could explain the existence of a B. imnahaensis × yellowstonensis accession that is mostly diplosporous and another accession of the same combination that is mostly aposporous (**Figure 1A, 19, 27**). It is noteworthy that many aposporous hybrids contain a B. microphylla clade genome (B. thompsonii, B. imnahaensis, or B. yellowstonensis), which suggests that the B. microphylla clade may be predisposed to switch from diplospory to apospory. Our data also suggest that tendencies toward apospory may persist for many sexual generations (**Figure 1A**, note high frequencies of tetrad formation with widely varying frequencies of apospory). While unreduced pollen is occasionally observed among apomicts of other angiospermous families, as well as among sexual plants (Asker and Jerling, 1992; Carman, 1997), the correlations between diplospory or apospory and 2n or 1n pollen, respectively, are unique to Boechera, and these correlations add to the uniqueness of the Boechera agamic complex.

The cytogenetic data available for the plants investigated herein (e.g., production of fertile 1n pollen) support the hypothesis of gradual, reticulation-driven shifts from recently evolved (sexually sterile or semisterile) diplosporous apomicts (**Figure 1A, 1–14**), to plants that produce 1n and 2n pollen and exhibit diplospory, apospory and sex (**Figure 1A, 15–21**), to plants that produce mostly 1n pollen and exhibit mostly apospory and sex (**Figure 1A, 22–41**), and finally to completely sexual plants that produce 1n pollen (**Figures 1A, 42–64,B**). It should be noted that only a very small percentage of progeny, if any, in each hybrid combination might fortuitously undertake this evolutionary route. In this respect, the vast majority of seeds produced by apomictic hybrids are genetic clones of the mother plant. Hence, while apomixis-to-sex reversions may on occasion occur for a given hybrid combination, the parental apomictic hybrid remains apomictic. The definitive test for verifying this process would be to observe it firsthand. As noted above, both diplosporous B. exilis × thompsonii and aposporous B. imnahaensis × yellowstonensis produce about 4% of their seeds sexually. Accordingly, sexual gametophyte formation frequencies among sexually produced progeny (off types) could be determined. If segmental diploidization (**Figure 4**) and apomixis-to-sex reversions occur, they should be detectable within 2–4 generations. Another approach would be to genotype rare sexual endemics and their sexual and apomictic neighbors using phylogenetically stable markers. If a rare sexual endemic evolved recently from another sexual plant, most of its genetic markers should be similar to its progenitor. However, if it evolved by a recombination-driven apomixis-to-sex reversal, then most or perhaps all of its molecular markers should be found in neighboring apomictic hybrids. In turn, these hybrids should contain near equal numbers of alleles from two distinct sexual parents, as was observed for B. mitchell-oldsiana herein (**Table 1**). Interestingly, B. mitchell-oldsiana exhibits a low frequency of apospory (**Figure 1A, 41**), which is consistent with a putative apomixis-to-sex origin.

The B. mitchell-oldsiana germplasm analyzed here is unique among SSR-genotyped diploids. Specifically, its geographic distribution is restricted to a few flourishing populations, within 4 km of each other, in a single Oregon county. Similarly restricted populations of diploids have been reported, but only a few samples of SSR genotypes are available for them. It will be interesting to conduct analyses similar to that shown in **Table 1** as additional rare sexual diploids are more thoroughly SSR genotyped.

Given the documented diversity of apomictic hybrids in Boechera [over 400 unique genomic combinations reported by Li et al. (2017)], it is evident that the association between apomixis and hybridization in this youthful genus is strong and that apomixis arises quickly following the amalgamation of divergent, mostly self-pollinating lineages. Likewise, if reversions from apomixis to sex require only a few successful sexual generations (**Figure 4**), then the entire process could reasonably occur in nature within a few decades. This would include (i) hybridization of sexual diploids, (ii) a homoeologous hybrid apomixis phase, (iii) a weakly apomictic segmental allodiploid phase, and (iv) a fully diploidized fledgling sexual endemic phase with early interspecific hybridizations of its own (**Figure 1B**). Variably repetitive patterns of microsatellite markers, as observed in the BMW (Li et al., 2017), could be explained by such a rapid recombination-driven speciation.

TABLE 1 | Allele frequencies at 13 microsatellite loci for the allelically scant and geographically restricted sexual endemic B. mitchell-oldsiana (mitc) as found in the Boechera Microsatellite Website (BMW).


Corresponding values are presented for its three most allelically similar sexual taxa (putative progenitors), B. retrofracta (retr), B. sparsiflora (spar), and B. puberula (pube), and for three allelically similar triploid apomictic hybrids, the parents of which include one or more of the putative ancestors (MS346, B. retrofracta × sparsiflora × sparsiflora; FW133, B. rectissima × retrofracta × sparsiflora; FW1042, B. cusickii × puberula × retrofracta). Numbers in the heading next to sexual taxa indicate number of homozygous lines in the BMW that were tallied to obtain frequency data. High frequency alleles from the two putative parents are shaded. For the three apomicts, allele presence values are from single plants. This highlights the possibility that sexual B. mitchell-oldsiana may have evolved from a single interspecific apomictic hybrid. Numbers of additional diploids that contain each allele are also listed (a measure of allele rarity). Locations are those from which BMW DNA samples were obtained. CAN, Canada; United States abbreviations as commonly used.

### Apomixis Types May Simply Reflect Temporal and Spatial Variations in Termination of Sexual Development and Onset of Gametophyte Formation

Apomixis in plants (Asker and Jerling, 1992), animals (Suomalainen et al., 1987), and protists (Bilinski et al., 1989) involves three single-cell processes: termination of sexual development, production of unreduced spores or eggs, and parthenogenesis where spores or eggs reinitiate the life cycle without syngamy. It has been hypothesized that these seminal events of apomixis are anciently polyphenic with the corresponding seminal events of sexual reproduction, and that eukaryotes in general have more or less retained, during evolution, abilities to switch from one reproductive mode (phenism) to the other (Carman et al., 2011; Hojsgaard et al., 2014; Albertini et al., 2019). Accordingly, onset timings and locations of unreduced gamete formation, which in angiosperms requires gametophyte formation, could be the event that defines apomixis types in angiosperms (Battaglia, 1989; Carman, 1997). If this hypothesis is correct, then apomixis types are not dependent on apomixis-type-specific mutations per se but on genetically controlled temporal and spatial variations in the induction of unreduced gametophyte formation.

Drought and heat stress can switch facultatively diplosporous Boechera from mostly apomeiotic dyad formation to mostly meiotic tetrad formation (Mateo de Arias, 2015). Hence, some of the variability in dyad to tetrad ratios observed among diplosporous accessions (**Figure 1A, 1–21**), especially those fixed in the field (**Supplementary Table S1**, Windham collections), may have been caused by variations in the weather prior to field collections.

In certain ovules of the present study, sexual development was terminated prior to meiosis and was immediately replaced by unreduced gametophyte formation. This Antennaria-type diplospory occurred while integuments were still budding (**Figures 2L,M**). Likewise, unreduced gametophyte formation also followed the termination of sexual development during early meiotic prophase, which defines Taraxacum-type diplospory (**Figures 2D,E**), and shortly after meiosis in aposporous Boechera (**Figures 2H,J**). That multiple types of apomixis occur in the same plant is evidence that timing and location of unreduced gametophyte formation dictate apomixis type (**Figure 5**). Interestingly, high frequency shifts between types of apomixis, as well as between sexual and apomictic development, have been induced in sexual and apomictic Boechera through pharmacological treatments that affect stress response pathways and DNA methylation (Gao, 2018).

#### Apomixis and Speciation, a Reappraisal

Facilitating the origins of genomically unique sexual species and genera runs counter to long-held opinions concerning the involvement of apomixis in evolution. Historically, biologists considered apomixis, as well as wide hybridization and polyploidy, as antitheses of speciation (Darlington, 1939; Stebbins, 1971; Van Dijk and Vijverberg, 2005). Clearly, these processes block the selection-based shifts in allele frequencies thought to be required for gradual speciation along ecological gradients (Mayrose et al., 2011). However, studies now implicate reticulation as a prominent player in speciation (Carman, 1997; Rieseberg, 1997; Martis et al., 2013; Sochor et al., 2015; Payseur and Rieseberg, 2016; Vargas et al., 2017; Hojsgaard, 2018). In this respect, the immortality conferred by apomixis to allodiploid Boechera should provide them with unlimited time for rare recombinations to occur and for sexually fertile species, which possess multi-species-recombinant genomes, to evolve (**Figure 4**). To date, only a few cases of apomixis-to-sex reversions have been reported (Chapman et al., 2003; Domes et al., 2007; Horandl and Hojsgaard, 2012; Ortiz et al., 2013; Hojsgaard et al., 2014). However, this could change if the speciation mechanism described herein is found to be of more general occurrence among agamic complexes.

Establishment of apomixis-to-sex founder plants, like the recombinational events required to generate them, is probably rare, and this rarity could explain the low levels of allelic variability observed among some of the sexual diploids of Boechera (**Supplementary Table S2**). Also, since geographic ranges of apomicts often exceed those of their sexual progenitors (Bayer, 1997; Hojsgaard et al., 2014), newly formed apomixis-to-sex founder populations could reasonably be allopatric with their most recent sexual ancestors but sympatric with clones of their immediate apomictic parents.

Few diploid apomicts exist outside of Boechera. Generally, they are ephemeral, sexually sterile, and apomictically fertile allodiploids that occasionally form from allotetraploids through parthenogenesis of 1n (=2x) eggs (Asker and Jerling, 1992). Spontaneous haploid parthenogenesis is reasonably common in angiosperms (Dunwell, 2010). Hence, most allotetraploid apomicts probably on occasion produce allodiploids. Published examples have been reported in the following genera: Parthenium (Gerstel and Mishanec, 1950), Hierochloe (Weimarck, 1967), Ranunculus (Nogler, 1982), Allium (Kojima and Nagato, 1997), Hieracium (Bicknell, 1997), and Erigeron (Noyes and Wagner, 2014). If the genomes of such allodiploids are sufficiently divergent as to prevent the formation of fertile and reduced gametes, then apomixis, provided it is still occurring (Noyes and Wagner, 2014), could stabilize the cytotype. However, if the allodiploid apomict happens to produce progeny sexually, especially by selfing, then recombination-driven diploidization with the formation of genomically unique sexual species may eventually occur (**Figure 4**).

#### CONCLUSION

The unique situation in Boechera of self-pollinating, aposporously fertile allodiploids that facultatively produce 1n eggs and sperm may facilitate reversions from apomixis to sex. In fact, multiple genomically unique sexual species could in theory evolve from the same allodiploid, the divergent genomes of which would contain different assortments of homozygous segments of the original parental genomes (**Figure 4**). In this manner, apomixis may serve as an effective springboard in stabilizing reticulate evolution processes (Carman, 1997). Since allodiploidy increases rates of homoeologous recombination (Dewey, 1984; Wang, 1989; Grandont et al., 2014; Poggio et al., 2016), diploidization possibilities should be enhanced. Accordingly, the aposporous allodiploid Boechera identified herein are well suited for studying this putative speciation mechanism. While occurring at a much slower pace, this process could also occur among polyploid apomicts. Here, the process would originate in allodiploids that form from allotetraploid apomicts by haploid parthenogenesis.

### AUTHOR CONTRIBUTIONS

JC designed the study and wrote the manuscript with important contributions from MW and DS. MMdA, MS, and KD conducted the flow cytometry. MW provided the taxonomic guidance. MW, JC, DS, LG, and MS collected the specimens. BK and JC designed the Boechera embryology procedures. JC, MMdA, XZ, LG, BK, DS, MS, BP, and LW conducted the embryological analyses. All authors read and approved the final draft.

#### FUNDING

This work was supported by a United States Department of Commerce, National Institute of Standards and Technology, Advanced Technology Program Grant 70NANB7H7022 to JC; a United States National Science Foundation Grant DEB-0816560 to MW; a National Agricultural Innovation Project award, Indian Council of Agricultural Research, New Delhi to KD; a CREST fellowship, Department of Biotechnology, New Delhi to MS; and Utah Agricultural Experiment Station project awards, Utah State University, Logan to JC (approved as Utah Agricultural Experiment Station journal paper number 9089).

#### ACKNOWLEDGMENTS

The authors thank Michelle Jamison, Devin Wright, Bryan Cox, John Carman Jr., and George Hampton II for assistance in collecting, growing, and preparing Boechera samples for cytology.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00724/full#supplementary-material

FIGURE S1 | Additional examples of meiotic tetrads and immature 1- to 2-nucleate gametophytes (ESs) forming from functional (surviving) megaspores (FMs) in sexual Boechera. Tetrads and distinctly vacuolate ESs forming from FMs are diagnostic of sexual reproduction. Numbers after taxa correspond to accession numbers in Figure 1 and Supplementary Table S1. (A,B) B. formosa, 52; (C,D) B. schistacea, 56; (E,F) B. pendulina, 61; (G,H) B. stricta, 62; (I,J) B. lemmonii, 64; (K,L) B. oxylobula, 59; (M,N) B. juniperina, 44; (O,P) B. sparsiflora, 58. Black arrows, degenerating megaspores; white arrows, surviving megaspores or early developing gametophytes; narrow white lines, central column of nucellar cells, which gave rise to the archesporial cell; P, parietal cell; bars, 20 µm.

FIGURE S2 | Additional examples of diplosporous and aposporous reproduction in Boechera. Distinctly vacuolate gametophytes (ESs) that form from the chalazal member of a dyad are diagnostic of Taraxacum-type diplospory (A–F). Distinctly vacuolate ESs that form from nucellar cells and replace degenerating tetrads are diagnostic of apospory (H–O). Numbers after taxa correspond to accession numbers in Figure 1 and Supplementary Table S1. (A) partial row of unreduced Taraxacum-type diplosporous dyads in a B. exilis × thompsonii, 9, pistil; (B) B. exilis × retrofracta, 1; (C) B. pendulina × thompsonii, 11; (D) B. fendleri × stricta, 16; (E) B. cf. gunnisoniana 3×, 6; (F) B. exilis × retrofracta, 2; (G) unreduced microspore dyads from a 1.1 mm long B. retrofracta × stricta, 21, anther; (H) B. thompsonii × thompsonii, 33; (I) B. thompsonii × thompsonii, 22; (J) B. crandallii × thompsonii, 31 (additional focal plane of Figure 2I); (K) two unreduced 1-nucleate Hieracium-type aposporous ESs, an aposporous initial, and a degenerating tetrad in a B. retrofracta × stricta ovule, 21; (L) B. cusickii × sparsiflora, 24; (M) B. fendleri × stricta, 32; (N) unreduced 1-nucleate Hieracium-type aposporous ES with degenerating unreduced Taraxacum-type dyad in a B. retrofracta × stricta ovule, 21; (O) B. exilis × thompsonii, 30. Black arrows, degenerating megaspores; white arrows, surviving megaspores; narrow white lines, central column of nucellar cells, which gave rise to the archesporial cell; AES1 and 2, 1- and 2-nucleate aposporous ESs, respectively; DES1 and 2, 1- and 2-nucleate diplosporous ESs, respectively; Dy, microspore dyads; P, parietal cell; bars, 20 µm.

TABLE S1 | Collection information for Boechera accessions evaluated cytologically for mode of reproduction. Numbers following taxa names correspond to numbered taxa in Figure 1; BMW, Boechera Microsatellite Website; collection numbers are those of Carman (JC) and Windham (MW).

TABLE S2 | Numbers of SSR alleles observed among the homozygous samples of 59 diploid sexual Boechera taxa as of July, 2017.




**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2019 Carman, Mateo de Arias, Gao, Zhao, Kowallis, Sherwood, Srivastava, Dwivedi, Price, Watts and Windham. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# The Paradox of Self-Fertile Varieties in the Context of Self-Incompatible Genotypes in Olive

*F. Alagna<sup>1†</sup>, M. E. Caceres<sup>2†</sup>, S. Pandolfi<sup>2</sup>, S. Collani<sup>3</sup>, S. Mousavi<sup>2</sup>, R. Mariotti<sup>2</sup>, N. G. M. Cultrera<sup>2</sup>, L. Baldoni<sup>2</sup>\* and G. Barcaccia<sup>4</sup>*

*1 Dipartimento Tecnologie Energetiche (DTE), Centro Ricerche Trisaia, ENEA Agenzia nazionale per le nuove tecnologie, l'energia e lo sviluppo economico sostenibile, Rotondella, Italy, 2 Dipartimento di Scienze Bio Agroalimentari (DiSBA), Istituto di Bioscienze e Biorisorse (IBBR), Consiglio Nazionale Delle Ricerche (CNR), Perugia, Italy, 3 Department of Plant Physiology, Umeå Plant Science Centre, Umeå University, Umeå, Sweden, 4 Laboratorio di Genomica, Dipartimento di Agronomia, Animali, Alimenti, Risorse naturali e Ambiente (DAFNAE), Università di Padova, Legnaro, Italy*

#### *Edited by:*

*Giuseppe Ferrara, University of Bari Aldo Moro, Italy*

#### *Reviewed by:*

*Sara V. Good, University of Winnipeg, Canada Daniele Bassi, University of Milan, Italy*

> *\*Correspondence: L. Baldoni luciana.baldoni@ibbr.cnr.it*

*† These authors have contributed equally to this work*

#### *Specialty section:*

*This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science*

*Received: 04 February 2019 Accepted: 16 May 2019 Published: 26 June 2019*

#### *Citation:*

*Alagna F, Caceres ME, Pandolfi S, Collani S, Mousavi S, Mariotti R, Cultrera NGM, Baldoni L and Barcaccia G (2019) The Paradox of Self-Fertile Varieties in the Context of Self-Incompatible Genotypes in Olive. Front. Plant Sci. 10:725. doi: 10.3389/fpls.2019.00725*

Olive, representing one of the most important fruit crops of the Mediterranean area, is characterized by a generally low fruit yield, due to numerous constraints, including alternate bearing, low flower viability, male-sterility, inter-incompatibility, and self-incompatibility (SI). Early efforts to clarify the genetic control of SI in olive gave conflicting results, and only recently has this control been disclosed, revealing that olive possesses an unconventional homomorphic sporophytic diallelic system of SI, unlike those described in other plants. This system, characterized by the presence of two SI groups, prevents self-fertilization and regulates inter-compatibility between cultivars, such that cultivars belonging to the same incompatibility group are incompatible. Despite the presence of a functional SI, some varieties, in particular conditions, are able to set seeds following self-fertilization, a mechanism known as pseudo-self-compatibility (PSC), as widely reported in previous literature. Here, we summarize the results of previous works on SI in olive, particularly focusing on the occurrence of self-fertility, and offer a new perspective in view of the recent elucidation of the genetic architecture of the SI system in olive. Recent advances in research aimed at unraveling the molecular bases of SI and its breakdown in olive are also presented. The clarification of these mechanisms may have a huge impact on orchard management and will provide fundamental information for the future of olive breeding programs.

Keywords: *Olea europaea* L., pseudo-self-compatibility, pollen-pistil interaction, sporophytic system, self-incompatibility

#### INTRODUCTION

Olive (*Olea europaea* L.) is a perennial diploid species mainly clonally propagated and diffused in the Mediterranean area as one of the oldest tree crops (Loumou and Giourga, 2003; McKey et al., 2010; Zohary et al., 2012; Mousavi et al., 2017). As in many other allogamous, hermaphrodite, wind-pollinated species, olive is characterized by a plentiful flowering, followed by a poor fruit set, which results in low yields (Cuevas and Polito, 2004; Ben Dhiab et al., 2017; Kassa et al., 2019). Environmental conditions, such as temperature, rainfall, and wind, may strongly affect flowering time, flowering intensity, and fertilization (Fernandez-Escobar et al., 2008; Haberman et al., 2017; Benlloch-González et al., 2018). Efficient pollination depends on many factors, such as the presence of exogenous compatible pollen, the duration of stigma receptivity, the number of pollen grains, pollen-ovule ratio, and stigma morphology (Cruden, 2000; García-Mozo et al., 2004; Pinillos and Cuevas, 2009; Rojo et al., 2016). Despite the importance of these factors, the main constraints responsible for the low fruit set of olive are undoubtedly self-incompatibility (SI) and a high percentage of ovary abortion in some cultivars (Reale et al., 2009; Seifi et al., 2015).

Self-incompatibility is one of the most effective systems adopted by flowering plants to prevent inbreeding and maintain a high diversity within species (Ferrer and Good, 2012). Within the sporophytic (SSI) and gametophytic (GSI) categories, several incompatibility systems have been reported, but only three of them have been characterized at the molecular level, one for SSI (Brassicaceae) and two for GSI (one in Solanaceae, Plantaginaceae, and Rosaceae, and one in Papaveraceae) (Higashiyama, 2010). The recent discovery of the diallelic SSI system in olive and other related genera has provided evidence on the SI system operating in the Oleaceae family. In particular, it has been clarified that the inhibition of pollen tube growth takes place at the stigma, and it has been established that there are only two incompatibility groups, i.e., diallelic SI, such that olive cultivars are incompatible within groups and compatible between groups (Saumitou-Laprade et al., 2017a).

In light of this evidence, it has become even more difficult to explain the self-fertility of some varieties, which has been confirmed by numerous studies (Seifi et al., 2012; Selak et al., 2014a; Breton et al., 2016; Marchese et al., 2016a). It is also now clear that pseudo-self-compatibility (PSC)—the failure to reject self-pollen despite the presence of a functional SI system—may occur in olive, and all previous reports on olive self-fertility need to be re-interpreted or ignored when not confirmed by paternity tests of seeds deriving from selfing. In this review, we hope to shed some light on the complexity of SI system in olive, particularly focusing on the evidence for PSC, providing a synthesis of the fragmented work on this topic and a new perspective on SI in olive, supported by experimental work from our laboratory.

#### OLIVE SELF-INCOMPATIBILITY

For a long time, olive was erroneously classified as a GSI species, based on morphological traits shared with taxa manifesting a GSI system, such as wet-type stigma and bi-nucleate pollen (Serrano et al., 2008); however, only a few cytological studies supported the occurrence of this kind of SI system (Serrano et al., 2010). Recently, it has been demonstrated that the GSI model fails to explain the presence of reciprocal differences in fruit set in one-third of mates (Breton et al., 2014; Farinelli et al., 2018). This evidence, together with other observations, such as the inhibition of germination at the stigma and the failure to identify genes controlling GSI in olive (Collani et al., 2012), led to the hypothesis that olive may have an SSI system. The extensive, systematic genetic study carried out by Saumitou-Laprade et al. (2017a,b) definitively confirmed that olive has a sporophytic homomorphic diallelic system, similar to that identified in closely related species, such as *Phillyrea angustifolia* and *Fraxinus ornus* (Saumitou-Laprade et al., 2010; Vernet et al., 2016), and different from any described in other plant families. The SSI system has been classified as diallelic because it is controlled by a single locus with only two alleles (S1 and S2), according to the segregation analysis of the SI trait in a cross population (Saumitou-Laprade et al., 2017a). Thus, only two SI genotypes have been found so far, and olive cultivars can be classified, based on the SI group to which they belong, as G1 = S1S2 or G2 = S1S1. Initial evidence on the olive diallelic SSI system indicated that it is not accompanied by heterostyly (Saumitou-Laprade et al., 2017a); however, dedicated studies to confirm these preliminary data have not been conducted.
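The two-group rule described above can be sketched in a few lines of Python. This is only an illustration of the stated rule (incompatible within groups, compatible between groups), not a model taken from the cited studies; the function names `si_group` and `is_compatible` are hypothetical, and the sketch deliberately ignores PSC and any environmental effects.

```python
# Minimal sketch of the diallelic SSI rule described in the text.
# Function names are hypothetical; only the genotype-to-group mapping
# (G1 = S1S2, G2 = S1S1) comes from the text.
SI_GROUP = {
    frozenset({"S1", "S2"}): "G1",  # heterozygous group
    frozenset({"S1"}): "G2",        # homozygous group
}


def si_group(genotype):
    """Return the SI group of a diploid S-locus genotype, e.g. ("S1", "S2")."""
    return SI_GROUP[frozenset(genotype)]


def is_compatible(pistil_genotype, pollen_parent_genotype):
    """In a sporophytic system the pollen phenotype is set by the diploid
    pollen parent, so a cross is treated as compatible only when the two
    parents belong to different SI groups."""
    return si_group(pistil_genotype) != si_group(pollen_parent_genotype)


if __name__ == "__main__":
    g1, g2 = ("S1", "S2"), ("S1", "S1")
    print(is_compatible(g1, g2))  # between groups
    print(is_compatible(g1, g1))  # within a group, including selfing
```

Under this rule, selfing is simply the special case of a within-group cross, which is why any confirmed selfed progeny has to be explained by a separate mechanism.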

The general behavior of pollen within pistil tissues, under self- or incompatible cross-pollination, is represented in **Figure 1**. After self-pollination (**Figure 1**, panels A–G), most of the pollen grains landing on the stigmatic surface do not germinate, and others start germinating but neither penetrate nor grow into the transmitting tissue of the style, indicating that inhibition of pollen tube growth takes place at the stigmatic level, in accordance with the sporophytic nature of the olive SI. By contrast, during cross-pollination with compatible pollen (**Figure 1**, panels H–J), a high number of pollen grains germinate on the stigma and grow toward the style, and a few of them are also able to penetrate the transmitting tissue of the style and reach the ovule. Generally, only one pollen tube (and rarely two or three) grows toward the ovary and reaches the carpels to penetrate an ovule (Ateyyeh et al., 2000; Seifi et al., 2011).

It has also been observed that the SI group or the genotype of the pollen recipient may have a significant effect on the length that pollen tubes may reach in the stigma prior to being arrested due to the SI reaction. In general, in G1 plants, pollen grains do not germinate at all or tube growth stops shortly after germination, whereas a higher variability can be observed in G2 plants, characterized by short or long pollen tube growth (Saumitou-Laprade et al., 2017a). This behavior, which probably contributed to the earlier difficulties in the classification of olive as GSI or SSI species, might be explained with a different timing of the SI response after the recognition of self-pollen.

### OLIVE PSEUDO-SELF-COMPATIBILITY

The current knowledge on the olive SI system indicates that all olive cultivars are self-incompatible (Saumitou-Laprade et al., 2017a); however, some of them can behave as PSC in particular conditions (probably due to both genetic and environmental factors), being able to overcome SI and produce seeds after self-pollination (Breton et al., 2016; Marchese et al., 2016a; Saumitou-Laprade et al., 2017a). For a long time, olive cultivars were classified into self-compatible and self-incompatible cultivars, with most of them considered as self-incompatible (Wu et al., 2002; Mookerjee et al., 2005; Seifi et al., 2011; Sánchez-Estrada and Cuevas, 2018). Self-compatibility tests carried out in several studies have shown a low but certain rate of self-fertilizing cultivars, which were, consequently, considered to be partially self-compatible (Androulakis and Loupassaki, 1990; Ateyyeh et al., 2000; Moutier, 2002; Kasasbeh et al., 2005; Selak et al., 2011; Taslimpour and Aslmoshtaghi, 2013; Breton et al., 2014; Koubouris et al., 2014; Farinelli et al., 2018). Unfortunately, due to the lack of molecular tests to assess the origin of putatively selfed seeds and the use of unreliable materials and protocols for cross- and self-pollination experiments, some varieties were erroneously considered as self-compatible and were later proved totally self-incompatible. As an example, cv. Arbequina, previously classified as self-compatible (Cuevas, 2005), was then shown to be self-incompatible (Díaz et al., 2006; Marchese et al., 2016b). Similarly, as a probable consequence of the multiple factors affecting self-fertilization success, some varieties, such as cv. Koroneiki, were considered self-incompatible (Mookerjee et al., 2005; Seifi et al., 2011) but later showed a low but reliable self-fertilization rate (Marchese et al., 2016b). Differences were also found among clones of the same cultivar, as in the case of cv. Leccino, which is mostly self-incompatible (de la Rosa et al., 2003), but with some clones partially self-compatible (Solfanelli et al., 2006).

Based on the recent evidence about the SI system operating in olive, the cultivars considered representative of the Mediterranean cultivated olive diversity showed a clear self-incompatibility reaction and inhibition of self-pollen growth (Saumitou-Laprade et al., 2017a). These results strongly support the view that the genetic architecture of olive SI excludes, in theory, any form of self-fertilization. The contrasting evidence that some cultivars may produce selfed progeny hints that a mechanism of incompatibility breakdown exists and that this mechanism is presently still unknown. In view of these findings, we can now conclude that olive shows PSC. This behavior can explain the contradictory data on the self-compatibility tests, and it is in agreement with previous literature reporting that the percentage of successful self-fertilization was significantly lower than the fertilization rate under open- or cross-pollination (Griggs et al., 1975; Ateyyeh et al., 2000; Kasasbeh et al., 2005; Mookerjee et al., 2005). The production of true selfed seedlings has been confirmed by numerous studies where paternity tests have been applied (de la Rosa et al., 2004; Mookerjee et al., 2005; Marchese et al., 2016b).

The PSC pollen growth reaction seems completely different from the compatibility response observed under pollination with compatible pollen. In self-pollinated pistils of cv. Frantoio, in fact, in most cases, pollen grains do not germinate at all, even 6 and 15 days after flower opening and pollen-stigma contact (**Figure 1**, panels K and L), but in a few cases, one or a few pollen tubes may grow into the style and likely reach the ovary (**Figure 1**, panels M–P).

#### FACTORS AFFECTING THE PSEUDO-SELF-COMPATIBILITY

Pseudo-self-compatibility has been observed in numerous self-incompatible species in Asteraceae, Brassicaceae, Fabaceae, Poaceae, Ranunculaceae, Solanaceae, and other families (Good-Avila et al., 2008; Crawford et al., 2015; Liao et al., 2016). In these species, numerous external and internal factors seem to affect the ability of plants to overcome the SI barrier, including the pollen germination speed, the relative growth rate of self-pollen tubes compared to cross-pollen grains, and flower aging (Levin, 1996; Stephenson et al., 2000; Good-Avila et al., 2008; Horisaki and Niikura, 2008). The factors affecting the overcoming of the olive SI system are not yet known, but the available data indicate that both genotypic and environmental factors can play a role in this process. By contrast, flower age does not seem to affect PSC, considering that it has been observed in stigmas of all developmental stages.

#### Environmental Factors

It is well established that environmental conditions may affect self-fertility of self-incompatible plants. In particular, SI can be overcome by high temperatures (Okazaki and Hinata, 1987; Wilkins and Thorogood, 1992; Horisaki and Niikura, 2008), high humidity (Ockendon, 1978), and chemical treatments (Lao et al., 2014; Yang et al., 2018). Also, different environments and artificial pollination techniques may favor self-fertility (Do Canto et al., 2016).

In accordance with these studies, it has been reported that environmental conditions may affect the SI reaction and fruit set in olive, and, among them, temperature appears to play a key role (Orlandi et al., 2010; Selak et al., 2013; Haberman et al., 2017). It is thought, in fact, that SI in olive is temperature-dependent (Suárez et al., 2012), and generally, high temperatures during flowering may reduce the self-fertilization rate (Ayerza and Coates, 2004; Selak et al., 2013). However, the effect of temperature seems strongly genotype-dependent (Griggs et al., 1975; Koubouris et al., 2009; Selak et al., 2013). Furthermore, temperature variations strongly influence the ability of olive pollen to germinate and grow (Koubouris et al., 2009), as well as stigma receptivity and ovule longevity (Selak et al., 2014b), thus affecting the fruit set (Cuevas et al., 1994).

Consistent with the role of environmental factors in PSC expression, results from studies on self-fertilization of olive cultivars varied, particularly among different environmental conditions. For instance, variability in the self-fertilization rate has been observed among different experimental years (Griggs et al., 1975; Cuevas et al., 2001; Lavee et al., 2002; Solfanelli et al., 2006; Selak et al., 2011), orchard locations (Selak et al., 2011), or different conditions, such as those determined by the use of polyethylene cages (Selak et al., 2013). However, considering the high number of environmental factors that change among locations and years, the available data do not allow the identification, with the exception of temperature, of other key environmental factors affecting self-fertility.

#### Genotype-Dependent Factors

In addition to the environmental effects, PSC appears to be strongly influenced by olive genotype, considering that self-fertilization has been exclusively observed in specific olive varieties and never reported in others. For example, successful self-fertilization in cv. Frantoio has been reported in numerous studies (Kasasbeh et al., 2005; Farinelli et al., 2008; Spinardi and Bassi, 2012; Breton et al., 2014), despite the presence of a functional SI system in this variety.

Our histological and molecular study confirmed the ability of cv. Frantoio to overcome SI. By contrast, PSC was not observed for other varieties, such as cvs. Leccino and Dolce Agogia (**Figure 2A**), in agreement with other authors (Spinardi and Bassi, 2012; Farinelli et al., 2018). According to these data, paternity tests with microsatellite markers performed on seeds of cv. Koroneiki, Manzanilla, Cacereña, and Manzanilla de Sevilla obtained by self-pollination (**Figure 2B**), confirmed the origin of zygotic embryos by effective self-fertilization (Saumitou-Laprade et al., 2017a). These results validated the ability of some olive cultivars to overcome the SI barrier and confirm the occurrence of PSC.
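The logic of a marker-based paternity check for putatively selfed seeds can be illustrated with a minimal exclusion test. The sketch below is not the protocol used in the cited studies: the function name, locus names, and allele sizes are hypothetical, and it assumes diploid codominant markers with no mutation, null alleles, or genotyping error.

```python
# Minimal sketch of an exclusion test for selfing with codominant markers.
# All names and allele sizes are hypothetical illustration values.

def consistent_with_selfing(mother, seed):
    """Return True if, at every scored locus, both seed alleles are present
    in the maternal genotype, as expected for a seed derived from selfing.

    mother, seed: dicts mapping a locus name to a tuple of two allele sizes.
    A single locus carrying a non-maternal allele excludes selfing.
    """
    for locus, seed_alleles in seed.items():
        if not set(seed_alleles) <= set(mother[locus]):
            return False
    return True


if __name__ == "__main__":
    mother = {"locusA": (150, 160), "locusB": (200, 200)}
    selfed_seed = {"locusA": (150, 150), "locusB": (200, 200)}
    outcrossed_seed = {"locusA": (150, 170), "locusB": (200, 204)}
    print(consistent_with_selfing(mother, selfed_seed))
    print(consistent_with_selfing(mother, outcrossed_seed))
```

A seed that passes this check is only compatible with selfing; a foreign pollen donor sharing the maternal alleles would pass as well, so the power of the test depends on the number and polymorphism of the loci scored.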

Similar to olive, a significant difference in PSC among genotypes has been reported in other plants (Foster and Wright, 1970; Elgersma et al., 1989; Brennan et al., 2005; Baldwin and Schoen, 2017). According to the quantitative nature of PSC, this trait is typically polygenic (Do Canto et al., 2016). In some Brassicaceae species, variation in PSC has been shown to be caused by genetic variation in genes unlinked to the S-locus but involved in the SI signaling cascade that mediates the rejection of self-pollen (Liu et al., 2007; Baldwin and Schoen, 2017). This pattern has also been confirmed in either GSI or SSI species (Good-Avila and Stephenson, 2002; Mable et al., 2005; Mena-Ali and Stephenson, 2007; Crawford et al., 2015; Liao et al., 2016).

The hypothesized architecture of the S-locus in olive (G1 = S1S2 and G2 = S1S1), which guarantees a perfect 1:1 balance between the two groups (Saumitou-Laprade et al., 2017a), implies that only out-crossing between the two groups could preserve these two unique combinations, while the selfing of the heterozygous group would result in the appearance of the homozygous combination S2S2, which would lead to the imbalance of the populations in favor of one group with respect to the other. Furthermore, initial evidence indicates that cross-pollination between self-fertile varieties and genotypes belonging to the same SI group never occurs (Saumitou-Laprade et al., 2017a). This complex picture of the paradoxical occurrence of PSC in olive is difficult to explain, even when assuming the presence of a "leaky S-allele," as observed in other plant species (Baldwin and Schoen, 2017).
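The segregation argument in the preceding paragraph can be checked by enumerating gamete combinations. The sketch below assumes simple Mendelian segregation at a single locus with equal transmission of both alleles, an assumption made for illustration only; the function name `offspring_ratios` is hypothetical.

```python
# Minimal sketch: expected S-locus genotype frequencies in a progeny,
# assuming Mendelian segregation and equal viability of all gametes.
from collections import Counter
from fractions import Fraction
from itertools import product


def offspring_ratios(parent_a, parent_b):
    """Enumerate all gamete combinations of two diploid parents and return
    the expected frequency of each (sorted) offspring genotype."""
    counts = Counter(tuple(sorted(pair)) for pair in product(parent_a, parent_b))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


G1 = ("S1", "S2")  # heterozygous group, as given in the text
G2 = ("S1", "S1")  # homozygous group, as given in the text

if __name__ == "__main__":
    print(offspring_ratios(G1, G2))  # out-crossing between the two groups
    print(offspring_ratios(G1, G1))  # selfing of the heterozygous group
```

Out-crossing between the groups returns only S1S1 and S1S2 in equal proportions, whereas selfing of G1 is expected to yield S1S1, S1S2, and S2S2 in a 1:2:1 ratio, so one quarter of the selfed progeny would carry the S2S2 combination that the two-group model does not include.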

### TOWARD THE IDENTIFICATION OF MOLECULAR DETERMINANTS OF SELF-INCOMPATIBILITY AND PSEUDO-SELF-COMPATIBILITY

Further studies should be carried out at the microscopic, genomic, and genetic levels to understand the mechanisms underpinning PSC in olive, although it is also crucial to increase knowledge of the molecular events occurring during the SI reaction, which remain unexplored. Studies aimed at identifying female and male determinants of the olive SI system, based on gene similarity to other plant species in which SI was molecularly characterized, have been conducted, but results of these studies indicated that olive flowers do not possess or express genes similar to the GSI determinants identified in other plants (Collani, 2012; Collani et al., 2012). On the contrary, candidate genes for female (*OeSRK-like* and *OeSLG-like*) and male (*OeSCR-like*) determinants, as orthologs of the genes that control the SSI system in the Brassicaceae family, have been cloned and characterized in olive varieties (Collani, 2012; Collani et al., 2012), and gene expression studies showed that the *OeSRK-like* gene is preferentially expressed in pistils at early flowering stages of cv. Leccino and lowly expressed in pistils at later flowering stages of cv. Frantoio, while *OeSCR-like* was found specifically expressed in dehiscent anthers of both cultivars (Collani, 2012). Despite such positive initial evidence, further genetic and molecular findings clearly demonstrated that these genes do not encode the genetic determinants of the olive S-locus, as independent segregation of *OeSRK-like* and *OeSCR-like* genes was documented by means of SNP-based markers and no interaction between *OeSRK-like* and *OeSCR-like* proteins was observed by Yeast-2-Hybrid screens (data not shown). Although the role of these genes remains to be disclosed, our negative results corroborate an olive SI system whose genetic determinism is different from that active in Brassicaceae and from the others molecularly characterized, in line with the peculiar features of olive SSI. This result supports the hypothesis that independent evolution of multiple SI systems has occurred in flowering plants (Ferrer and Good, 2012).

Activity and localization of enzymes, such as RNases, putatively involved in pollen rejection mechanism under self-pollination, have been described in olive (Serrano and Olmedilla, 2012), as well as the occurrence of programmed cell death in olive pollen, as a consequence of the SI response and the differential level of reactive oxygen and nitrogen species between self-compatible and self-incompatible pollen grains (Serrano et al., 2012). However, their role in the SI mechanism needs to be further investigated.

A high throughput transcriptomic study of olive self-pollinated flowers identified a wide set of transcripts showing extensive expression differences between pseudo-self-compatible (cv. Frantoio) and self-incompatible (cv. Leccino) cultivars, confirming that biochemical, physiological, and signaling changes occur when incompatibility is broken down (Alagna et al., 2016). These data represent a valuable resource for the identification of genes related to PSC.

In addition, several enzymes putatively involved in the regulation of pollen tube growth and in the modulation of temperature-dependent reproductive processes were identified by studying the proteomic profile of olive stigma exudate (Rejón et al., 2013). The ReprOlive database, built on the transcriptomic information of olive reproductive tissues (Carmona et al., 2015), as well as the availability of the sequence of cultivated and wild olive genomes (Cruz et al., 2016; Unver et al., 2017), provides further highly valuable tools for the identification of candidate genes involved in SI signaling and for the discovery of the molecular mechanisms involved in PSC. These studies will be facilitated by the trans-generic functional homology of olive SI, which will allow for the application of discoveries from *P. angustifolia* and *F. ornus* species to olive.

#### CONCLUSIONS AND PERSPECTIVES

The important advances made in the study of the olive SI system do not explain the occurrence of self-fertility in some cultivars, confirmed by many studies and certainly regulated by both genetic and environmental factors. New observations should be carried out in order to clarify how germination of pollen grains and their growth within the transmitting tissue of the style may occur in a context of incompatibility. The availability of genetic materials and microscopic, genomic, and transcriptomic resources for the study of olive reproductive constraints should allow the elucidation of the selfing mechanism, as well as the identification of putative genes involved in PSC. Understanding the mechanisms regulating SI and PSC will have a huge impact on olive orchard management and breeding programs, offering new tangible opportunities for improving olive and olive oil production.

#### AUTHOR CONTRIBUTIONS

LB and GB conceived the study. FA wrote the first draft of the manuscript. MEC and SP performed the histological observations. All the authors contributed to the content of the manuscript and revised the manuscript. All the authors agreed on the final version of this review.

#### FUNDING

This work was partially supported by the project BeFOre—"Bioresources for Oliviculture," H2020-MSCA-RISE (G.A. 645595) and by the project INNO.V.O. – "Development of new varieties to face the challenges of olive growing," PSR Umbria 2014-2020.



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Alagna, Caceres, Pandolfi, Collani, Mousavi, Mariotti, Cultrera, Baldoni and Barcaccia. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# A High-Density Linkage Map of the Forage Grass Eragrostis curvula and Localization of the Diplospory Locus

Diego Zappacosta<sup>1</sup>, Jimena Gallardo<sup>1</sup>, José Carballo<sup>1</sup>, Mauro Meier<sup>2</sup>, Juan Manuel Rodrigo<sup>1</sup>, Cristian A. Gallo<sup>1</sup>, Juan Pablo Selva<sup>1</sup>, Juliana Stein<sup>3</sup>, Juan Pablo A. Ortiz<sup>3</sup>, Emidio Albertini<sup>4</sup>\* and Viviana Echenique<sup>1</sup>\*

<sup>1</sup> Departamento de Agronomía, Centro de Recursos Naturales Renovables de la Zona Semiárida (CERZOS-CONICET, CCT Bahía Blanca), Universidad Nacional del Sur, Bahía Blanca, Argentina, <sup>2</sup> Laboratorio Biotecnológico, Asociación de Cooperativas Argentinas Coop. Ltd., Pergamino, Argentina, <sup>3</sup> Laboratorio de Biología Molecular, Facultad de Ciencias Agrarias, Universidad Nacional de Rosario, Instituto de Investigaciones en Ciencias Agrarias de Rosario (IICAR, CONICET-UNR), Zavalla, Argentina, <sup>4</sup> Dipartimento di Scienze Agrarie, Alimentari e Ambientali, Università degli Studi di Perugia, Perugia, Italy

#### Edited by:

Marta Adelina Mendes, University of Milan, Italy

#### Reviewed by:

Elvira Hörandl, University of Göttingen, Germany Robert VanBuren, Michigan State University, United States

> \*Correspondence: Viviana Echenique echeniq@criba.edu.ar Emidio Albertini emidio.albertini@unipg.it

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 21 December 2018 Accepted: 28 June 2019 Published: 12 July 2019

#### Citation:

Zappacosta D, Gallardo J, Carballo J, Meier M, Rodrigo JM, Gallo CA, Selva JP, Stein J, Ortiz JPA, Albertini E and Echenique V (2019) A High-Density Linkage Map of the Forage Grass Eragrostis curvula and Localization of the Diplospory Locus. Front. Plant Sci. 10:918. doi: 10.3389/fpls.2019.00918

Eragrostis curvula (Schrad.) Nees (weeping lovegrass) is an apomictic species native to Southern Africa that is used as forage grass in semiarid regions of Argentina. Apomixis is a mechanism for clonal propagation through seeds that involves the avoidance of meiosis to generate an unreduced embryo sac (apomeiosis), parthenogenesis, and viable endosperm formation in a fertilization-dependent or -independent manner. Here, we constructed the first saturated linkage map of tetraploid E. curvula using both traditional (AFLP and SSR) and high-throughput molecular markers (GBS-SNP) and identified the locus controlling diplospory. We also identified putative regulatory regions affecting the expressivity of this trait and syntenic relationships with genomes of other grass species. We obtained a tetraploid mapping population from a cross between a fully sexual genotype (OTA-S) and a facultative apomictic individual of cv. Don Walter. Phenotypic characterization of F<sub>1</sub> hybrids by cytoembryological analysis yielded a 1:1 ratio of apomictic vs. sexual plants (34:27, χ<sup>2</sup> = 0.37), which agrees with the model of inheritance of a single dominant genetic factor. The final number of markers was 1,114 for OTA-S and 2,019 for Don Walter. These markers were distributed into 40 linkage groups per parental genotype (containing 2 to 123 markers per linkage group), which is consistent with the number of E. curvula chromosomes. The total length of the OTA-S map was 1,335 cM, with an average marker density of 1.22 cM per marker. The Don Walter map was 1,976.2 cM, with an average marker density of 0.98 cM per marker. The locus responsible for diplospory was mapped on Don Walter linkage group 3, together with 65 other markers. QTL analyses of the expressivity of diplospory in the F<sub>1</sub> hybrids revealed the presence of two main QTLs, located 3.27 and 15 cM from the diplospory locus. Both QTLs explained 28.6% of phenotypic variation. Syntenic analysis allowed us to establish the groups of homologs/homeologs for each linkage map. The genetic linkage map reported in this study, the first such map for E. curvula, is the most saturated map for the genus Eragrostis and one of the most saturated maps for a polyploid forage grass species.
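The 1:1 segregation test mentioned in the abstract can be reproduced with a chi-square goodness-of-fit calculation. The sketch below is a plain-Python illustration, not the authors' analysis; the function name is hypothetical, and it assumes one degree of freedom with no continuity correction. Under these assumptions the statistic for 34:27 comes out near 0.80 and the P-value near 0.37, which suggests that the value of 0.37 given in the abstract may correspond to the P-value; this is an inference from the calculation, not something the source states.

```python
# Minimal sketch: chi-square goodness-of-fit test of an observed two-class
# segregation against an expected 1:1 ratio (1 degree of freedom,
# no continuity correction). The function name is hypothetical.
import math


def chi_square_1to1(observed_a, observed_b):
    """Return (chi-square statistic, P-value) for a 1:1 expectation."""
    expected = (observed_a + observed_b) / 2
    stat = ((observed_a - expected) ** 2 + (observed_b - expected) ** 2) / expected
    # With 1 degree of freedom, the upper-tail probability of the chi-square
    # distribution equals erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    stat, p_value = chi_square_1to1(34, 27)  # apomictic vs. sexual F1 plants
    print(f"chi-square = {stat:.2f}, P = {p_value:.2f}")
```

In either reading, the deviation from 1:1 is small relative to the sample size of 61 plants, consistent with the single dominant factor model described in the abstract.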

Keywords: linkage map, weeping lovegrass, apomixis, polyploid, QTL, synteny

## INTRODUCTION


Apomixis, a clonal mode of reproduction through seeds, occurs in numerous plant families and in organisms from other kingdoms (Asker and Jerling, 1992). Apomixis can be divided into two main types based on the origin of the clonal embryos. During adventitious embryony or sporophytic apomixis, the embryo develops directly from a somatic cell in the ovule (usually the nucellus or integument) outside of the sexual embryo sac. Survival of the apomictic embryo depends on the successful fertilization of the meiotically derived embryo sac and its ability to grow sufficiently to gain access to the endosperm (Koltunow and Grossniklaus, 2003). During gametophytic apomixis, unreduced embryo sacs form by mitosis of a megaspore mother cell that avoids meiosis (diplospory) or via a mitotic division of a nucellar cell (apospory) (Koltunow and Grossniklaus, 2003). The embryo then forms by fertilization-independent embryogenesis (parthenogenesis), and the endosperm develops autonomously or after fertilization of the polar nuclei (pseudogamy) (Koltunow et al., 2013). This mode of reproduction is present in more than 400 plant species, representing approximately 40 families. The occurrence of adventitious embryony has been reported in 148 genera, apospory has been reported in 110 genera, and diplospory has been reported in 68 genera (Hojsgaard et al., 2014). The apomictic trait has a polyphyletic origin, and the genes and mechanisms involved in its expression and regulation are diverse. Therefore, research on this reproductive mechanism should focus specifically on each apomictic species (Crane, 2001).

Apomixis has great potential for enhancing plant breeding and seed production, as it enables the fixation and unlimited propagation of complex, heterozygous genotypes (Spillane et al., 2001). Despite the efforts that have been made toward transferring this trait to crop species using various approaches, such attempts have thus far been unsuccessful (Kandemir and Saygili, 2014). Although transgenesis appears to be an optimum way to transfer this trait to crops, it is imperative to determine the molecular pathways and genes responsible for apomixis. Several strategies have been used to gain insight into the genetic basis of apomixis, including interspecific hybridizations between sexual crops and apomictic wild relatives (Savidan et al., 2001), unraveling its genetic control in natural apomicts (Albertini et al., 2005; Corral et al., 2013; Siena et al., 2014; Conner et al., 2015; Pellegrini, 2016; Garbus et al., 2017; Selva et al., 2017), identifying mutants of sexual species that mimic apomixis components (Garcia-Aguilar et al., 2010; Olmedo-Monfil et al., 2010), and recreating an apomictic phenotype in a sexual background (Ravi et al., 2008; d'Erfurth et al., 2009).

Analyses of segregating populations derived from crosses between sexual (as the female parent) and apomictic (as the pollen donor) genotypes have led to the development of several models to explain the inheritance of the trait and its components (apomeiosis and parthenogenesis) (Ozias-Akins and van Dijk, 2007). The proposed mechanisms differ in terms of the number and functions of genes and allelic relationships, as well as the effects of dominance over sexuality (Carman, 1997; Grimanelli et al., 2001; Koltunow and Grossniklaus, 2003). Nevertheless, most of these studies agree that one or a few Mendelian factors control the transference and expression of apomeiosis and its components in most species (Ozias-Akins and van Dijk, 2007; Pupilli and Barcaccia, 2012). By contrast, molecular and cytogenetic analyses suggest that this trait is controlled in a complex genetic manner involving restricted recombination around the apomixis locus, transelimination of gametes, supernumerary chromatic structures, DNA rearrangements, and the presence of transposons in several species (reviewed by Albertini et al., 2010 and Ortiz et al., 2013). These characteristics have been the main drawback preventing the isolation of apomixis determinants in natural species.

Eragrostis curvula (Schrad.) Nees (weeping lovegrass) is a pseudogamous apomictic grass species native to Southern Africa (Streetman, 1963) that is used as forage in the United States, Australia, and Argentina. The E. curvula complex includes cytotypes with different ploidy levels (e.g., 2×–8×) displaying obligate apomixis, facultative apomixis, and sexual reproduction (Voigt and Bashaw, 1976; Voigt et al., 2004). This grass is considered to be an allopolyploid species, although multivalent formation has been recorded in some polyploid genotypes (Vorster and Liebenberg, 1977; Poverene, 1988). Moreover, Burson and Voigt (1996) have shown that this grass behaves as a segmental allotetraploid.

The genus Eragrostis has a unique diplosporous apomictic type (Eragrostis type) characterized by the lack of meiotic stages: the megaspore mother cell undergoes two rounds of mitotic division, leading to the formation of an unreduced 4-nucleate embryo sac containing an egg, two synergids, and one polar nucleus (Meier et al., 2011). Following parthenogenetic development of the unreduced egg cell to form a maternal embryo, the endosperm forms after the polar nucleus is fertilized (pseudogamy). Thus, the apomictic seed maintains the same embryo:endosperm genomic ratio (2:3) as the sexual seed (Crane, 2001). Genetic analysis using segregating populations has led to the proposal of a simple genetic model for the inheritance of apomixis in this species in which apomixis is dominant over sexuality and is controlled by a single locus (Voigt and Bashaw, 1972; Voigt and Burson, 1992). In these reports, the inheritance of apomixis was evaluated only by progeny tests, which capture the complete process. At present, knowledge about the number of regions affecting apomixis in Eragrostis is limited because parthenogenesis is difficult to evaluate. The apomeiosis and parthenogenesis components of apomixis in other grasses are usually inherited together as a single dominant locus [apospory-specific genomic region (ASGR)] (Ozias-Akins and van Dijk, 2007), corresponding to a physically large hemizygous region of reduced recombination (Ozias-Akins et al., 1998; Stein et al., 2007). More recently, Conner et al. (2015) identified an ASGR-BABY BOOM-like (ASGR-BBML) gene as a candidate gene for parthenogenesis in Cenchrus/Pennisetum. Recently, BBM was combined with other edited genes to achieve remarkable progress in transferring clonal reproduction to rice (Khanday et al., 2019; Wang et al., 2019), demonstrating the possibility of engineering key components of plant reproduction to allow the multiplication of heterotic combinations through seeds.

For the past few years, we have been studying the molecular mechanisms involved in E. curvula reproduction, primarily using transcriptomic approaches. As a result, several differentially expressed genes between sexual and apomictic genotypes have been detected (Cervigni et al., 2008a,b; Selva et al., 2012, 2017; Garbus et al., 2017). Moreover, we demonstrated that the proportion of sexual embryo sacs in facultative apomicts increases under stress conditions, indicating that epigenetic and genetic mechanisms underlie the expression of apomixis in this species (Zappacosta et al., 2014; Rodrigo et al., 2017).

Genetic maps including molecular markers allow genetic regions linked to phenotypic characters to be identified and are therefore critical for genetic improvement (Collard et al., 2005). Single-nucleotide polymorphisms (SNPs) are currently the most widely used markers due to their abundance in the genome and the increasing ability to sequence large numbers of individuals in a cost-effective manner (Deschamps et al., 2012). Furthermore, in the past few years, genotyping-by-sequencing (GBS) has emerged as a new concept in marker development. The main advantages of GBS are its potential to detect SNP markers in numerous individuals and to combine genome reduction and barcode technology using a rapid, efficient, low-cost protocol for population mapping studies (Elshire et al., 2011). Worthington et al. (2016) developed the first saturated linkage maps for a polyploid apomict grass species (Brachiaria decumbens) using SNP markers generated by GBS. This saturated map has been used to assess synteny with foxtail millet and to identify flanking markers linked to the ASGR.

In this study, we constructed the first saturated linkage map at the tetraploid level for the diplosporous apomictic grass species E. curvula by combining traditional AFLP, SSR, and high-throughput molecular markers (GBS-SNP) and identified the localization of the diplospory controlling locus. We also used the saturated map to analyze the presence of regulatory regions affecting the expression of diplospory and the syntenic relationships with related grass species, shedding light on this agronomically important trait.

### MATERIALS AND METHODS

#### Plant Material

To obtain a mapping population segregating for the reproductive mode (F<sub>1</sub> type progeny), the tetraploid sexual E. curvula genotype OTA-S (USDA accession PI574506; 2n = 4x = 40) was crossed with the facultative apomictic tetraploid cv Don Walter INTA. Plants of both parental genotypes were placed together in isolation in a confined sector of the greenhouse. We used one maternal plant (OTA-S) and three paternal clones (Don Walter). To ensure cross-pollination, the pollen donor panicles were moved over the OTA-S panicles twice a day. Because it was impossible to perform castration (emasculation) due to the size and morphology of the spikes, some of the resulting seeds were produced by self-fertilization. The resulting seeds were sown in MS medium, and the germinated plants were transplanted to soil in pots and grown in the greenhouse. To confirm the hybrid origin of F<sub>1</sub> individuals, fingerprinting analysis of RAPD male-specific amplicons was carried out as described by Rodrigo et al. (2017). Hybrid F<sub>1</sub> plants were selected based on the presence of at least three paternal amplification bands using the primers shown in **Supplementary Table S1**. Selected individuals were cultivated in 10-L pots under greenhouse conditions with a photoperiod of 15 h light/9 h dark during the spring flowering period (Bahía Blanca, Buenos Aires Province, Argentina; 38°42′ S, 62°16′ W).

#### Cytoembryological Analyses

To assess the reproductive modes of the F<sub>1</sub> plants, megasporogenesis and megagametogenesis were analyzed according to Meier et al. (2011). Inflorescences were collected at the beginning of anthesis (when all embryo sac developmental stages are observable) and fixed in FAA (50% ethanol, 5% acetic acid, and 10% formaldehyde in distilled water). Individual spikelets were dehydrated in a tertiary butyl alcohol series and embedded in Paraplast (Leica Paraplast Plus, United States). The samples were cut into 10-µm sections, stained with safranin-fast green, and observed under a Nikon Eclipse TE300 light transmission microscope (Tokyo, Japan). The reproductive mode was assessed by scoring the two main types of embryo sacs: octanucleated reduced Polygonum-type embryo sacs and tetranucleated nonreduced diplosporous Eragrostis-type embryo sacs. The latter contain an egg cell (2n), two synergids (2n), and one polar nucleus (2n) but lack antipodals (Meier et al., 2011). Plants were considered sexual when they showed only Polygonum-type embryo sacs and apomictic when at least one nonreduced diplosporous embryo sac was observed. At least 30 pistils with normally developed embryo sacs were analyzed per F<sub>1</sub> individual.

#### DNA Extraction

Genomic DNA was extracted from fresh leaf tissue according to Garbus et al. (2017). Briefly, fresh plant material was frozen and ground to a powder in liquid nitrogen using a TissueLyser II (Qiagen). For each sample, 100 mg of tissue was incubated at 65°C in preheated extraction buffer containing 100 mM Tris HCl pH 8, 1.4 M NaCl, 20 mM EDTA pH 8, 2% CTAB (w/v), and 0.5% (v/v) β-mercaptoethanol. Chloroform was subsequently added to reach a 2:1 ratio (buffer:chloroform), and the aqueous phase was collected after centrifugation. DNA was precipitated with one volume of isopropanol and washed with 70% (v/v) ethanol. The pellet was air-dried and resuspended in 50 µL of TE buffer containing 20 µg/ml RNase. DNA concentration was determined by spectrophotometry, and DNA quality was determined based on its integrity in agarose gels. All samples were quantified again using a Qubit Fluorometer (Thermo-Fisher Scientific) prior to library construction.

#### SSR Markers

E. curvula SSR markers previously developed by Garbus et al. (2017) were used for genotyping of the mapping population; the primers are listed in **Supplementary Table S1**. PCR was performed in a final volume of 20 µl containing 1X Taq polymerase reaction buffer, 2.5 mM MgCl<sub>2</sub>, 0.125 mM of each dNTP, 1 µM of each primer, 50 ng of genomic DNA, and 2 U of Taq polymerase (Invitrogen, Brazil). The PCR program consisted of an initial denaturation at 94°C for 4 min, 35 cycles of 94°C for 30 s, 58°C for 1 min, and 72°C for 5 min, and a final extension at 72°C for 5 min. The PCR was performed in a thermocycler (MJ Research). Samples were mixed (2:1, v/v) with denaturing loading buffer (95% formamide and bromophenol blue), denatured at 95°C for 5 min, chilled on ice, and resolved in 6% (w/v) silver-stained polyacrylamide gels.

#### AFLP Markers

AFLP markers were generated as described by Vos et al. (1995) with minor modifications. The sequences of the adapters and primers used for preamplification and selective amplification are shown in **Supplementary Table S1**. The amplification products were mixed with denaturing buffer, denatured at 95°C for 5 min, chilled on ice, and resolved in 6% (w/v) silver-stained polyacrylamide gels. The AFLP markers were given a number (primer combination) and a letter (indicating the order of polymorphic bands).

#### GBS Library Preparation and Sequencing

A DNA GBS library was constructed for 86 F<sub>1</sub> individuals (1 sample each), the two parental genotypes (4 samples each), and two controls. Genomic DNA (50 ng per individual) was processed as described by Elshire et al. (2011) at the Biotechnology Center (UWBC DNA Sequencing Facility, University of Wisconsin, Madison, United States). Digestion was carried out using the methylation-sensitive restriction enzyme ApeKI, followed by the ligation of barcoded adapters. The samples were pooled into one library that was PCR amplified. The library was sequenced to 100 bp in two lanes of the Illumina HiSeq 2500 platform. Details can be found at the Biotechnology Center website<sup>1</sup>.

#### GBS-SNP Discovery

The reads were trimmed using Cutadapt software version 1.14 (Martin, 2011) with the following parameters: (i) low-quality ends (−q = 20); (ii) maximum error rate (−e = 0.1); (iii) overlap length (−O = 1); and (iv) adapter (−a = AGATCGGAAGAGC). The trimmed reads were analyzed with FastQC software version 0.11.5 (Andrews, 2011). Using the barcodes provided by the sequencing service and the filtered reads, de novo SNP discovery and genotype calling were conducted using the UNEAK pipeline implemented in TASSEL software version 3.0 (Glaubitz et al., 2014) with the following parameters: (i) minimum number of reads (−s = 400,000,000); (ii) enzyme used to create the GBS library (−e = ApeKI); (iii) minimum count of a tag to be output (−c = 10); (iv) error tolerance rate in the network filter (−e = 0.02); (v) minimum minor allele frequency (−mnMAF = 0.01); (vi) maximum minor allele frequency (−mxMAF = 0.5); (vii) minimum call rate (−mnC = 0.6); and (viii) maximum call rate (−mxC = 1). This method does not require a reference genome sequence because SNP discovery is performed directly within pairs of matched sequence tags and filtered through network analysis. To further avoid allele miscalling, we considered the number of tags at each SNP for each individual, classifying an individual as homozygous when it had more than five tags and as heterozygous when it had at least one tag for each allele.
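The tag-count rule in the last sentence can be sketched as follows. This is a minimal illustration, not the authors' script: the function name, the genotype labels, and the treatment of low-count single-allele cases as missing are assumptions made for the example.

```python
from typing import Optional

MIN_HOM_TAGS = 6  # "more than five tags" are required for a homozygous call


def call_genotype(count_a: int, count_b: int) -> Optional[str]:
    """Return 'AB', 'AA', 'BB', or None (missing) from per-allele tag counts."""
    if count_a > 0 and count_b > 0:
        return "AB"  # at least one tag for each allele
    if count_a >= MIN_HOM_TAGS:
        return "AA"
    if count_b >= MIN_HOM_TAGS:
        return "BB"
    # Assumption: too few tags to rule out an unsampled second allele.
    return None


# Hypothetical tag counts for five individuals at one SNP.
calls = [call_genotype(a, b) for a, b in [(7, 0), (3, 0), (2, 1), (0, 9), (0, 0)]]
print(calls)
```

Requiring a minimum depth before calling a homozygote reduces the chance that a heterozygote is miscalled simply because the second allele was not sampled.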

#### Data Analysis and Linkage Map Construction

Segregation data from each parental genotype were analyzed independently. The configuration (homozygous/heterozygous) of all polymorphic markers (SSR, AFLP, and GBS-SNP) was recorded for each progeny. A χ<sup>2</sup> test was used to determine the goodness of fit (at p < 0.01) between the observed and expected number of genotypes. GBS-SNPs with unexpected alleles (e.g., one C/G SNP in one parent and C/C in the other showing G/G descendants) were excluded, even if only one offspring carried the unexpected allele. Markers that were heterozygous in only one parent and had a segregation ratio of 1:1 (heterozygous:homozygous) in the progeny were classified as single-dose allele (SDA) markers and used for map construction. Finally, GBS-SNP markers with more than 5% missing data were removed.
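A minimal sketch of the SDA filter described above, assuming each marker is represented as a list of per-progeny calls (`'het'`, `'hom'`, or `None` for missing). The data layout and function names are hypothetical; the thresholds (p < 0.01 for rejection, 5% missing data) are those stated in the text.

```python
import math


def chi2_1to1(n_het: int, n_hom: int) -> tuple:
    """Chi-square statistic and p-value (1 d.f.) against a 1:1 expectation."""
    expected = (n_het + n_hom) / 2
    stat = ((n_het - expected) ** 2 + (n_hom - expected) ** 2) / expected
    # For 1 d.f. the chi-square survival function equals erfc(sqrt(x / 2)).
    return stat, math.erfc(math.sqrt(stat / 2))


def is_sda(calls: list, alpha: float = 0.01, max_missing: float = 0.05) -> bool:
    """Keep a marker if it has few missing calls and fits 1:1 segregation."""
    if not calls:
        return False
    if sum(c is None for c in calls) / len(calls) > max_missing:
        return False
    n_het, n_hom = calls.count("het"), calls.count("hom")
    if n_het + n_hom == 0:
        return False
    _, p_value = chi2_1to1(n_het, n_hom)
    return p_value >= alpha  # the 1:1 ratio is not rejected at p < 0.01


# Hypothetical markers scored on 62 progeny.
print(is_sda(["het"] * 32 + ["hom"] * 30))               # balanced segregation
print(is_sda(["het"] * 50 + ["hom"] * 12))               # distorted segregation
print(is_sda(["het"] * 28 + ["hom"] * 28 + [None] * 6))  # too many missing calls
```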

There are several specific pipelines for polyploids, such as polymapR (Bourke et al., 2018a) and TetraploidSNPMap (Hackett et al., 2017), but they are designed to process markers with allelic dosage values, which are unavailable for GBS data. For this reason, we decided to follow the traditional approach of a single-dose marker model (Li et al., 2014; Thaikua et al., 2016; Worthington et al., 2016). These markers possess a number of advantages over other marker segregation types, mainly in unexplored polyploid species for which the mode of inheritance is uncertain. Simplex markers allow an "assumption-free" linkage map to be created and the use of software designed for diploids (Bourke et al., 2018b). Therefore, the genetic linkage maps were constructed for OTA-S and Don Walter using JoinMap 4.1 software (Van Ooijen, 2006) with the CP (cross-pollinator full-sib population) option. Markers with >98% identity were eliminated. Grouping analysis was carried out using a LOD (logarithm of odds) score threshold of 7.0 or higher. Maps were constructed within each linkage group using the regression mapping algorithm, and map distance units were derived from the Kosambi mapping function with default options. Only linkages with a recombination frequency <0.40 were used for map construction.
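For reference, the Kosambi mapping function converts a recombination frequency r into a map distance of 25·ln[(1 + 2r)/(1 − 2r)] cM. The sketch below is a plain implementation of that standard formula and its inverse; it is independent of JoinMap, and the function names are chosen for illustration.

```python
import math


def kosambi_cm(r: float) -> float:
    """Map distance in cM for a recombination frequency r (0 <= r < 0.5)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("recombination frequency must be in [0, 0.5)")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


def kosambi_r(d_cm: float) -> float:
    """Inverse function: recombination frequency for a distance in cM."""
    return 0.5 * math.tanh(2 * d_cm / 100.0)


# Distances grow faster than r as r approaches the 0.40 cut-off used for mapping.
for r in (0.01, 0.10, 0.40):
    print(r, round(kosambi_cm(r), 2))
```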

#### QTL Mapping

QTL mapping was performed using the Don Walter linkage map with MapQTL 6 software (Van Ooijen, 2009) using the Multiple QTL Mapping (MQM) method (Jansen, 1993, 1994; Jansen and Stam, 1994). Phenotypic data representing the proportion of diplosporous embryo sacs observed in each F<sub>1</sub> hybrid (ranging from 0 in sexual individuals to 100 in apomictic individuals) were used. The LOD threshold to consider a QTL as significant was determined using permutation tests with 10,000 iterations and a genome-wide significance level of 0.05. MQM analysis was performed by setting a mapping step size of 1 cM and a LOD score higher than the threshold. Each significant QTL was characterized by its maximum LOD score, linkage group, position, percentage of explained phenotypic variation, and confidence interval extension (region at either side of the likelihood peak until the LOD score dropped to 2.0). The positions of QTLs on the genetic map were drawn using LinkageMapView software version 2.1.2 (Ouellette et al., 2017).

<sup>1</sup>https://www.biotech.wisc.edu/services/dnaseq
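The idea behind a genome-wide permutation threshold can be sketched as follows. This is not MapQTL's MQM procedure: for brevity it swaps in a single-marker regression LOD, and the toy data, function names, and reduced number of permutations are assumptions made for the example.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """LOD of a two-class marker: (n / 2) * log10(RSS_null / RSS_marker)."""
    n = len(phenotypes)
    mean_all = sum(phenotypes) / n
    rss_null = sum((y - mean_all) ** 2 for y in phenotypes)
    if rss_null == 0:
        return 0.0  # no phenotypic variation to explain
    rss_marker = 0.0
    for cls in (0, 1):
        ys = [y for g, y in zip(genotypes, phenotypes) if g == cls]
        if ys:
            mean_cls = sum(ys) / len(ys)
            rss_marker += sum((y - mean_cls) ** 2 for y in ys)
    rss_marker = max(rss_marker, 1e-12)  # avoid division by zero on a perfect fit
    return (n / 2) * math.log10(rss_null / rss_marker)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=0):
    """Genome-wide threshold: the (1 - alpha) quantile of permutation maxima."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    maxima = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # breaks any genotype-phenotype association
        maxima.append(max(marker_lod(m, shuffled) for m in markers))
    maxima.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return maxima[index]


# Toy data (hypothetical): 60 hybrids, 5 markers, marker 0 linked to the trait.
rng = random.Random(1)
markers = [[rng.randint(0, 1) for _ in range(60)] for _ in range(5)]
phenotypes = [60.0 * g + rng.uniform(0, 20) for g in markers[0]]
threshold = permutation_threshold(markers, phenotypes, n_perm=200)
print(round(threshold, 2), round(marker_lod(markers[0], phenotypes), 2))
```

Taking the maximum LOD across all markers in each permutation is what makes the threshold genome-wide rather than per-marker.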

#### Synteny Analysis

GBS-SNP marker sequences were queried against the genomes of Oropetium thomaeum (VanBuren et al., 2018), Cenchrus americanus (Varshney et al., 2017<sup>2</sup>), Setaria italica (Bennetzen et al., 2012<sup>3</sup>), Zea mays (Jiao et al., 2017), Panicum hallii (Bioproject: PRJNA250527), and Oryza sativa (Kawahara et al., 2013). Markers that aligned to the genomes with an identity >80% and a query coverage >70% were found using BLAST 2.7.1 (Altschul et al., 1990) and used to assign each linkage group to a chromosome and to identify homolog/homeolog groups. Given that the reference genomes used to establish syntenic relationships are available as haplotypes, homologous and homeologous linkage groups are impossible to differentiate. Therefore, we refer to them hereafter as "homologs/homeologs". Circos v0.69 was used to plot synteny between the linkage maps and reference genomes (Krzywinski et al., 2009).
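The hit filtering and chromosome assignment described above can be sketched as follows. The tuple layout, marker names, and majority-vote assignment are assumptions for illustration; only the identity (>80%) and query coverage (>70%) cut-offs come from the text.

```python
from collections import Counter, defaultdict


def best_hits(hits, min_identity=80.0, min_coverage=70.0):
    """Keep the highest-scoring hit per marker among hits that pass both filters.

    hits: iterable of (marker, chromosome, pct_identity, pct_query_coverage, bitscore).
    """
    best = {}
    for marker, chrom, identity, coverage, score in hits:
        if identity > min_identity and coverage > min_coverage:
            if marker not in best or score > best[marker][1]:
                best[marker] = (chrom, score)
    return {marker: chrom for marker, (chrom, _) in best.items()}


def assign_groups(marker_to_lg, marker_to_chrom):
    """Return {linkage group: (most common chromosome, fraction of placed markers)}."""
    counts = defaultdict(Counter)
    for marker, chrom in marker_to_chrom.items():
        if marker in marker_to_lg:
            counts[marker_to_lg[marker]][chrom] += 1
    result = {}
    for lg, chrom_counts in counts.items():
        chrom, n = chrom_counts.most_common(1)[0]
        result[lg] = (chrom, n / sum(chrom_counts.values()))
    return result


# Hypothetical hits for four markers of one linkage group.
hits = [
    ("M1", "Chr5", 95.0, 100.0, 110.0),
    ("M1", "Chr2", 85.0, 90.0, 80.0),
    ("M2", "Chr5", 79.0, 100.0, 70.0),  # fails the identity filter
    ("M3", "Chr5", 90.0, 75.0, 95.0),
    ("M4", "Chr1", 88.0, 100.0, 90.0),
]
marker_to_lg = {"M1": "LG3", "M2": "LG3", "M3": "LG3", "M4": "LG3"}
placed = best_hits(hits)
groups = assign_groups(marker_to_lg, placed)
print(placed, groups)
```

The fraction returned with each assignment gives a simple way to separate linkage groups whose markers fall mostly on one chromosome from those split across several.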

#### Ploidy Level Analysis and Genome DNA Content Estimation

For ploidy level analysis, the parental plants, OTA-S and Don Walter, plus all the hybrid individuals were analyzed to corroborate their ploidy level. Cultivars Victoria and Don Eduardo were used as diploid and hexaploid controls, respectively. Approximately 0.5 cm<sup>2</sup> of fresh leaf tissue was chopped with a sharp razor blade in extraction buffer (100 mM citric acid monohydrate and 0.5% [v/v] Tween 20). The suspensions were then filtered through nylon tissue with 42-µm mesh width. After filtration, samples were pooled in groups of four samples each. One ml of staining buffer (100 mM Tris–HCl, 5.3 mM MgCl<sub>2</sub>, 86 mM NaCl, 0.03 mM sodium citrate, 7.3 mM Triton X-100, 0.003 mM 4′,6-diamidino-2-phenylindole, pH 7.0) was added, and the tubes were stored in the dark on ice for 1 to 4 h before measurements. Fluorescence intensity of 4′,6-diamidino-2-phenylindole-stained nuclei was determined using the flow cytometer Ploidy Analyser PA (Partec, Germany).

For genome DNA content estimation, approximately 0.5 cm<sup>2</sup> of fresh E. curvula leaf tissue, together with an equal amount of Secale cereale cv. Dankovske leaf tissue, was chopped with a sharp razor blade in extraction buffer (5 mM Tris, 2 mM Na2EDTA, 80 mM KCl, 20 mM NaCl, 15 mM β-mercaptoethanol, and 0.1% [v/v] Triton X-100, pH 7.5). The nucleus suspension was filtered and incubated in 100 µl of staining solution consisting of 100 mg/l propidium iodide (PI) stain and RNase A. The stained nucleus suspension was analyzed using a flow cytometer (Partec, Germany). Genome size was estimated based on the corresponding mean value for S. cereale cv. Dankovske (16.19 pg 2C DNA content, Doležel et al., 1998).
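The internal-standard calculation can be sketched as follows. The peak values are hypothetical, and the conversion factor of 978 Mbp per pg is the commonly used value rather than one stated in the text; only the 16.19 pg 2C content of the rye standard comes from the text.

```python
MBP_PER_PG = 978.0  # assumption: widely used conversion, 1 pg DNA ~ 978 Mbp
RYE_2C_PG = 16.19   # S. cereale cv. Dankovske 2C content cited in the text


def sample_2c_pg(sample_peak, standard_peak, standard_2c_pg=RYE_2C_PG):
    """2C DNA content (pg) from the ratio of G1 peak means, sample vs. standard."""
    return standard_2c_pg * sample_peak / standard_peak


def pg_to_mbp(pg):
    """Convert a DNA amount in picograms to megabase pairs."""
    return pg * MBP_PER_PG


# Hypothetical mean fluorescence of the G1 peaks of sample and standard.
content_pg = sample_2c_pg(sample_peak=120.0, standard_peak=400.0)
print(round(content_pg, 3), round(pg_to_mbp(content_pg)))
```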

### RESULTS

#### Mapping Population Development and F<sub>1</sub> Phenotyping

A total of 300 offspring derived from the cross between OTA-S and Don Walter-INTA were obtained. In the first hybrid selection using RAPD markers, 86 plants were selected (**Supplementary Figure S1**), but 19 were ultimately eliminated because they originated from self-pollination of the female plant by the mentor effect of the male pollinator. Five other individuals were eliminated due to a failure in GBS genotyping (low read counts, see below). Consequently, the mapping population consisted of 62 hybrids. The tetraploid level of the parental and hybrid plants was corroborated by flow cytometry.

Cytoembryological analysis (**Supplementary Figure S2**) of 61 individuals of the population (one hybrid did not flower) gave a ratio of apomictic versus sexual individuals that fits 1:1 (34:27; χ<sup>2</sup> test, p = 0.37), which agrees with the model of inheritance of a single dominant genetic factor. **Figure 1** shows the distribution of hybrid plants according to their reproductive mode (2,850 sexual and apomictic pistils observed). Interestingly, the proportion of sexual embryo sacs within the apomictic plants varied from 0 to 97%, indicating that apomixis in E. curvula is a characteristic with highly variable expressivity.
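As a worked check of the 1:1 expectation, with 61 phenotyped hybrids the expected count per class is 30.5. Assuming an uncorrected goodness-of-fit test with one degree of freedom, the statistic is about 0.80 and the associated p-value about 0.37:

```latex
\chi^2 = \frac{(34 - 30.5)^2}{30.5} + \frac{(27 - 30.5)^2}{30.5}
       = \frac{2 \times 12.25}{30.5} \approx 0.80,
\qquad p \approx 0.37 \quad (1\ \text{d.f.})
```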

#### GBS-SNP Identification

The sequencing of the library produced 366,193,356 (100 bp) reads (Bioproject: PRJNA509552). After trimming, 33,350,780 low-quality reads were removed, and 332,842,576 reads were subsequently analyzed using the UNEAK pipeline. Five samples (Z116, Z154, Z217, Z223, and Z252) were eliminated from further analyses due to the low number of reads. The depth of coverage for each sample is listed in **Supplementary Table S2**.

A total of 332.8 million reads were assigned to 178,559 tag pair sites. After removing the markers with missing data in the parental plants, 106,105 GBS-SNP markers were identified. Segregating GBS-SNP markers that were heterozygous in one parent and homozygous in the other were selected, resulting in 28,074 and 33,765 GBS-SNPs for OTA-S and Don Walter, respectively. **Table 1** shows the results obtained after sequential filtering of these data. The final number of GBS-SNP markers was 1,447 for OTA-S and 2,192 for Don Walter (**Table 1**). After including 11 SSR and 93 AFLP markers (as shown in **Table 2**), the final number of markers was 1,489 for OTA-S and 2,255 for Don Walter (including the phenotype in the latter case).

#### Genetic Linkage Map Construction

We constructed two linkage maps corresponding to the female (OTA-S) and male (Don Walter) parents using JoinMap 4.1. As a first step, identical markers were excluded (54 and 78 identical markers were eliminated from the OTA-S and Don Walter data, respectively). The high level of heterozygosity and the maximum number of markers per linkage group allowed by the regression method in the JoinMap software resulted in more groups than expected (20). Therefore, we used the 2n chromosome number to define the number of linkage groups, as other authors have done previously (Li et al., 2014; Worthington et al., 2016).

<sup>2</sup>https://www.ncbi.nlm.nih.gov/bioproject/PRJNA294988

<sup>3</sup>https://www.ncbi.nlm.nih.gov/bioproject/PRJNA32913

TABLE 1 | Steps in the GBS-SNP marker filtering procedure and the number of markers selected in each step for each parental plant (OTA-S and Don Walter) of the E. curvula mapping population.

TABLE 2 | Final number of markers for each parental plant (OTA-S × Don Walter) of the E. curvula mapping population.

The OTA-S map was defined by 1,114 SDA markers distributed in 40 linkage groups (LOD score threshold 7.0 or 9.0, **Supplementary Table S3**), which is consistent with the number of chromosomes, and contained a minimum of 2 and a maximum of 102 markers per linkage group (**Table 3**, **Figure 2**, and **Supplementary Figure S3**). The total length of the OTA-S map was 1,335 cM, with an average marker density of 1.22 cM per marker. The genetic linkage map of the apomictic parent Don Walter was built using 2,019 SDA markers distributed in 40 linkage groups (LOD score threshold 7.0 or 8.5, **Supplementary Table S3**), with 7–123 markers per linkage group (**Table 3**, **Figure 3**, and **Supplementary Figure S4**). The total length of the Don Walter map was 1,976.2 cM, with an average of 0.98 cM per marker. In total, more than 90% of the interlocus gaps in both genetic maps were <4 cM, and only seven and four gaps were >10 cM in the OTA-S and Don Walter linkage maps, respectively. The order and exact positions of markers on the maps are shown in **Supplementary Figures S3**, **S4**, and the GBS-SNPs sequences in **Supplementary Table S4**.

At the mapping threshold stated, the diplospory locus was mapped to Don Walter linkage group 3, along with 65 other markers (**Figure 4**). This locus was flanked by four GBS-SNPs with a recombination frequency of zero, which agrees with previous reports of a low-recombination region controlling the trait in other species (Ozias-Akins and van Dijk, 2007; Albertini et al., 2010; Ortiz et al., 2013).

#### QTL Analysis to Identify Regions Affecting Diplospory Expressivity

To detect loci associated with the expressivity of diplospory in E. curvula, we performed interval-mapping analysis using the phenotypic information derived from cytoembryological analysis of F<sub>1</sub> hybrids (see above) and the genetic linkage map of Don Walter. This analysis detected two genomic regions highly associated with this trait (LOD score >3.9) in Don Walter linkage group 3 (**Figure 5** and **Supplementary Figure S5**). The maximum LOD scores for these two potential QTLs were 6.96 and 7.39, explaining an estimated 13.7 and 14.9% of the phenotypic variation (R<sup>2</sup>), respectively (**Table 4**). One of these two regions is very close to the diplospory locus that was mapped using JoinMap (located at 3.27 cM); thus, it could be considered the major determinant of this trait. The second region is located 15 cM from the diplospory locus. Three additional QTLs were found with a LOD >3 but below the significance threshold (LOD 3.9). Two of these QTLs were localized to linkage group 1 and the other to linkage group 20 (**Table 4** and **Supplementary Figure S5**). The positions and quantitative information about these QTLs are shown in **Table 4**.

TABLE 3 | Distribution of single-dose allele (SDA) markers across the 40 linkage groups on the E. curvula (OTA-S and Don Walter) genetic maps.

#### Syntenic Analysis to Identify Homolog/Homeolog Groups

To identify homolog/homeolog groups in the linkage maps, we mapped the sequences of the GBS-SNP markers against the genomes of other species. Analysis using Oropetium thomaeum as a reference (the closest species with a high-quality genome sequence; VanBuren et al., 2018) showed that 477 (40%) and 900 (45%) GBS-SNP markers from OTA-S and Don Walter, respectively, mapped to unique positions (best match) under the above-mentioned conditions (identity >80% and query coverage >70%). Although the order of the markers and their positions in the O. thomaeum genome are not highly correlated with those of E. curvula, analysis of Circos graphs showed that the markers of each linkage group tended to cluster on the same chromosome (**Figure 6**). For example, OTA-S linkage group 4 matches mainly with O. thomaeum chromosome 4 (dark red lines in **Figure 6A**). On the male side (cv. Don Walter), linkage group 5 of E. curvula matches with O. thomaeum chromosome 3 (red lines in **Figure 6B**). Nonetheless, some groups match with more than one chromosome, such as Don Walter linkage group 8, which matches with O. thomaeum chromosomes 7 and 8. In addition, as shown in **Figure 7**, markers of Don Walter linkage group 3 (containing the diplospory locus) are syntenic with those of O. thomaeum chromosome 5. Most of the linkage groups showed homology primarily with a single chromosome of O. thomaeum (**Supplementary Tables S5**, **S6**). **Table 5** shows the groups of homologs/homeologs considered to be exclusive linkage groups (in which most common markers fell into a single chromosome) or shared linkage groups (in which most markers were divided among two or three chromosomes). This enabled us to identify homolog/homeolog groups for each linkage map, to establish the relationship between the two maps, and to validate the saturated E. curvula genetic maps.

We also analyzed synteny with the genomes of other related species (C. americanus, O. sativa, P. hallii, S. italica, and Z. mays), yielding results similar to those described above (**Supplementary Tables S7**, **S8**). When we compared the markers completely linked to the E. curvula apolocus (TP135456, TP107627, TP79423, and TP95591) with the genomes of related species, we found that the GBS-SNP sequences showed homology with O. thomaeum (Chr5), C. americanus (Chr4), O. sativa (Chr5), P. hallii (Chr5), S. italica (Chr3), and Z. mays (Chr6 and Chr8) (**Supplementary Table S9**).

Finally, we evaluated the genomic DNA content of the parents of the mapping population. Flow cytometry analysis yielded an estimated haploid genome size of 1,312 Mbp for OTA-S and 1,195 Mbp for Don Walter.

### DISCUSSION

Segregation analysis of the reproductive mode in our E. curvula tetraploid mapping population revealed a 1:1 ratio of apomictic versus sexual individuals. This type of inheritance supports the hypothesis that diplospory is controlled by a single dominant genetic factor in E. curvula, as described for other diplosporous apomictic species, such as Taraxacum officinale (Vijverberg et al., 2004) and Tripsacum dactyloides (Grimanelli et al., 1998). Pioneering studies of the inheritance of apomixis in weeping lovegrass were carried out by Voigt and colleagues (Voigt and Bashaw, 1972; Voigt and Burson, 1992), who phenotyped plants by measuring various morphological traits, obtaining a ratio of apomictic versus sexual offspring of 1:1.4. The authors proposed a simple genetic model for the inheritance of apomixis in weeping lovegrass, i.e., apomixis is dominant over sexuality and is controlled by a single gene. Voigt and colleagues categorized plants as apomictic, highly sexual, or sexual, but Savidan (2000) later proposed that plants should be considered apomictic as long as they are able to produce apomictic offspring. When we reanalyzed their results taking into account the concept proposed by Savidan (2000), the proportion changed to 1.7:1 (96:56). The results obtained in this study using cytoembryological observations and molecular markers (both methods are more reliable than the analysis of morphological traits) showed that a single locus controls diplospory in weeping lovegrass and that this trait is dominant over sexuality. Whether the expression of this trait relies on a single gene or on linked cosegregating genes is still unknown.

FIGURE 2 | Linkage groups of the sexual plant OTA-S (E. curvula) obtained using GBS-SNPs, SSRs, and AFLPs. Marker positions are expressed in centimorgans. Different colors represent different marker densities.

In other diplosporous species, although the regions controlling different components of apomixis (apomeiosis, parthenogenesis, and autonomous or pseudogamous endosperm) are physically separated, these regions are inherited either as a single locus (T. dactyloides) or independently (Erigeron annuus) (Grimanelli et al., 1998; Noyes et al., 2007). In the case of apospory, a more frequent apomixis mechanism than diplospory, a dominant locus of simple inheritance has been identified (Akiyama et al., 2004; Calderini et al., 2006; Okada et al., 2011; Ortiz et al., 2013), although in Poa pratensis, two genetic factors are thought to control apospory and parthenogenesis (Albertini et al., 2001). In Pennisetum squamulatum and in species from the genus Paspalum, the ASGR shows a lack of recombination, forming an extensive block (50 Mbp in P. squamulatum; Akiyama et al., 2004) that is fully inherited, thus ensuring the concurrent inheritance of all its components (Ozias-Akins et al., 1998; Labombarda et al., 2002; Stein et al., 2007). Several authors have reported the presence of repetitive elements, pseudogenes, and heterochromatic regions in the ASGR. Koltunow and Grossniklaus (2003) hypothesized that the repetitive sequences act as a sink to sequester factors involved in the sexual reproductive pathway, thereby altering the expression of sexual reproductive processes and possibly causing apomixis. More recently, Kotani et al. (2013) reported that extensive repetitive sequence structures associated with the apospory locus in Hieracium are not required for apomixis. Therefore, it is possible that these structural features and allele divergence occur as a consequence of asexual reproduction and suppressed recombination, which might have evolved to maintain the genetic elements required for apomixis.

FIGURE 4 | Linkage group 3 from the facultative apomictic cv. Don Walter (E. curvula) containing the locus that controls diplospory (APO). Marker positions are expressed in centimorgans.

Although several reports describe the presence of genes in diverse apomictic species, which are differentially expressed or play functional roles in apomictic development (Albertini et al., 2005; Corral et al., 2013; Siena et al., 2014; Conner et al., 2015; Pellegrini, 2016; Worthington et al., 2016; Garbus et al., 2017; Selva et al., 2017), little is known about the gene or genes that control regulatory programs or common pathways among different apomictic species or that trigger the mechanisms underlying apomixis.

In this study, we constructed genetic linkage maps for E. curvula, including one for the female sexual parent and one for the male apomictic parent. These maps are the most saturated maps for the genus Eragrostis and some of the most saturated maps for polyploid forage grass and apomictic species produced to date (Jessup et al., 2003; Stein et al., 2007; Thaikua et al., 2016; Worthington et al., 2016). Nonetheless, additional studies are needed to allow our linkage maps to reach the high resolution of genetic maps of model species, which include thousands of markers mapped with high accuracy and precision. One of the greatest limitations to the construction of the genetic linkage maps produced in this study was the small number of individuals in the mapping population, i.e., 62. A population size >75 should be used (Wu et al., 1992), but this was difficult to achieve for E. curvula due to a variety of factors, such as the complex reproductive mode of this species, the inability to perform castration (emasculation) due to spike size and morphology, and the high frequency of self-pollination in the single tetraploid sexual genotype available (OTA-S). In addition, many other genotypes used as pollen donors were incompatible with the maternal plant. Despite these limitations, this is a high-density map that is consistent with data collected by other authors using similar models and techniques (Worthington et al., 2016; Huang et al., 2018).

TABLE 4 | QTL mapping for diplospory on the linkage groups of the facultative apomictic cv. Don Walter (E. curvula), showing only the QTLs with LOD values higher than 3.

The LOD threshold to consider a QTL as significant was established at a LOD value of 3.9.
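For readers less familiar with LOD scores, the following minimal sketch shows how a two-point LOD can be computed from recombinant counts for markers segregating 1:1 and then compared with a threshold. It is an illustration only: the recombinant counts are hypothetical, and this is not the QTL mapping procedure behind Table 4. Only the population size (62) and the threshold (3.9) are taken from the text.

```python
import math

def two_point_lod(n_recombinant: int, n_total: int) -> float:
    """LOD = log10[L(theta_hat) / L(theta = 0.5)] for a 1:1 segregating marker pair."""
    if n_total <= 0 or not 0 <= n_recombinant <= n_total:
        raise ValueError("counts must satisfy 0 <= n_recombinant <= n_total, n_total > 0")
    # Maximum-likelihood recombination fraction, capped at 0.5 (no linkage).
    theta = min(n_recombinant / n_total, 0.5)
    n_parental = n_total - n_recombinant
    log_l1 = 0.0
    if n_recombinant:
        log_l1 += n_recombinant * math.log10(theta)
    if n_parental:
        log_l1 += n_parental * math.log10(1.0 - theta)
    log_l0 = n_total * math.log10(0.5)
    return log_l1 - log_l0

LOD_THRESHOLD = 3.9      # significance threshold reported in the text
POPULATION_SIZE = 62     # mapping population size reported in the text

# Hypothetical example: 5 recombinants among 62 individuals.
lod = two_point_lod(5, POPULATION_SIZE)
print(f"LOD = {lod:.2f}, significant: {lod > LOD_THRESHOLD}")
```

With a small population, each additional recombinant changes the LOD noticeably, which is one reason the population size limits map resolution.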

Several linkage maps of polyploid species are based exclusively on markers that segregate at a 1:1 ratio (SDA), enabling the use of diploid mapping software such as JoinMap. Allopolyploid species have disomic inheritance, and their genetics is therefore similar to that of diploids, except for the presence of multiple genomes. The assumption that all the markers have a 1:1 disomic inheritance might be an oversimplification because markers with a different segregation pattern were not considered. However, for our dataset, this is a straightforward approach that is well documented in the literature for dealing with GBS-SNP markers in allotetraploid species (Li et al., 2014; Thaikua et al., 2016; Worthington et al., 2016).
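The selection of 1:1 segregating markers can be illustrated with a chi-square goodness-of-fit test. The sketch below is a minimal example under stated assumptions: the genotype counts are hypothetical, the function name is ours, and the 5% significance level is an illustrative choice rather than the filter actually applied in this study.

```python
# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_1DF_005 = 3.841

def fits_one_to_one(n_present: int, n_absent: int,
                    critical: float = CHI2_CRIT_1DF_005):
    """Return (chi2, keep) for a test of 1:1 segregation (presence:absence)."""
    total = n_present + n_absent
    if total == 0:
        raise ValueError("no genotyped individuals")
    expected = total / 2.0
    chi2 = ((n_present - expected) ** 2 + (n_absent - expected) ** 2) / expected
    return chi2, chi2 <= critical

# Hypothetical markers scored in 62 individuals.
for name, present, absent in [("marker_A", 33, 29), ("marker_B", 47, 15)]:
    chi2, keep = fits_one_to_one(present, absent)
    print(f"{name}: chi2 = {chi2:.2f}, retained as 1:1 marker: {keep}")
```

In this toy example, marker_A is retained, whereas marker_B (closer to 3:1) would be set aside, which is the kind of marker the text describes as not being considered.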

E. curvula is considered to be an allopolyploid species, although multivalent formation has also been recorded in some polyploid genotypes, such as Tanganyika and Don Eduardo (Vorster and Liebenberg, 1977; Poverene, 1988). However, multivalents are not frequent in the parental genotypes of the mapping population, where preferential pairing among primary homologs has been demonstrated (Poverene, 1988).

The sizes of the linkage maps of OTA-S and Don Walter are quite different (1,335 cM versus 1,976.2 cM, respectively). This variation in genetic map size is not related to the difference in genome size between genotypes, as we estimated the haploid genome sizes to be 1,312 Mbp for OTA-S and 1,195 Mbp for Don Walter. Thus, although the variation in the genetic sizes of the linkage group maps of both parents is not likely due to differences in genome size, this variation might reflect the differential recombination rates of the genotypes. Indeed, studies of model plants have demonstrated the impact of genome sequence divergence on recombination rates, with a lower recombination rate related to higher levels of genome divergence (Chetelat et al., 2000; Opperman et al., 2004; Li et al., 2006). In addition, recombination rates are known to differ between sexes in both plants and animals (Lorch, 2005). For example, Huang et al. (2018) found that the male genetic map of Clementine mandarin was notably larger than its female counterpart. Another possible reason for the difference in the sizes of the linkage maps is that OTA-S was obtained by bulk seed harvest produced in isolation from four tetraploid (2n = 40) clones derived from PI 299929 and from a cross between PI 299928 and PI 299929 (Voigt, 1976).
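The argument above can be made concrete by dividing each map length by the corresponding genome size. The short sketch below uses only the four values reported in the text. The resulting cM/Mbp figures are crude genome-wide averages that assume the maps and the genome size estimates refer to comparable portions of the genome, which is our simplification.

```python
# (map length in cM, estimated haploid genome size in Mbp), as reported in the text.
MAPS = {
    "OTA-S": (1335.0, 1312.0),
    "Don Walter": (1976.2, 1195.0),
}

def cm_per_mbp(map_cm: float, genome_mbp: float) -> float:
    """Average genetic distance per physical distance."""
    return map_cm / genome_mbp

rates = {name: cm_per_mbp(cm, mbp) for name, (cm, mbp) in MAPS.items()}
for name, rate in rates.items():
    print(f"{name}: {rate:.3f} cM/Mbp")

# The parent with the smaller genome has the longer map, so genome size
# alone cannot explain the difference in map length.
ratio = rates["Don Walter"] / rates["OTA-S"]
print(f"Don Walter / OTA-S: {ratio:.2f}-fold")
```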

When we investigated synteny of the E. curvula genome with genomes of other related species, we identified homologous/homeologous linkage groups by comparing the OTA-S and Don Walter linkage maps with the physical map of O. thomaeum, the closest relative with a high-quality genome sequence. Using this information, it was possible not only to obtain homologous/homeologous groups for each map but also to establish which groups of each map would be equivalent. The synteny analysis of the apolocus-linked markers with the genomes of related species showed an interesting result. The diplosporous genus Tripsacum, a relative of maize, has two RFLP markers (csu68 and umc28) linked to diplospory that are located at a distal position on Z. mays Chr6L (Leblanc et al., 1995). This region is syntenic to Z. mays Chr8 and Chr3 (Savidan et al., 2004). Our results are therefore promising, since the E. curvula apolocus-linked markers showed homology with regions located on maize Chr6L and Chr8. In the other species, our markers showed homology with chromosomes or genomic regions that differ from those reported in the literature as linked to the apolocus (see **Table 6**). These findings support the hypothesis that apomixis is polyphyletic and emerged several times during evolution (Carman, 1997).

The apomictic plants in our mapping population showed different levels of expression of diplospory, with 3–100% of the observed pistils having apomictic embryo sacs. We previously reported (Rodrigo et al., 2017) that OTA-S only shows Polygonum-type embryo sacs, whereas Don Walter is a facultative genotype, with 60–100% diplosporous apomictic embryo sacs. As occurs in most known apomictic plants, these plants are facultative and can switch their developmental program back and forth from the asexual to the sexual route (Brukhin, 2017). This trait appears to be useful for the evaluation of candidate genes, especially genes with quantitative effects. Other studies, such as that of Noyes (2005) on the inheritance of diplospory in Erigeron, have also found a complete gradient of apomixis expression. These different levels of expression of diplospory observed in E. curvula allowed us to evaluate diplospory as a quantitative trait and to look for other genomic regions that could regulate it. Our QTL analysis revealed two main regions very close to the diplospory locus in the Don Walter linkage map (linkage group 3) and three other regions with a LOD value slightly below the LOD significance threshold, including two localized in linkage group 1 and the other in linkage group 20. Although the phenotypic analyses were performed in only one environment, the results are reliable because they agree with the linkage mapping analysis and the position of the diplospory locus. Additionally, to the best of our knowledge, this study is the first conducted to date that treats apomixis as a quantitative trait and provides evidence for an external region that regulates this trait. Another important finding in favor of the presence of regions that regulate this trait is that sexual/apomixis expressiveness is strongly dependent on environmental conditions (Zappacosta et al., 2014; Rodrigo et al., 2017), which, in turn, could be indicative of regulation at the epigenetic level.

Eragrostis-type apomixis has particular characteristics that make it an interesting model for the transfer of apomixis, especially for crops such as maize, which are highly sensitive to changes in the embryo:endosperm ploidy ratio, which should be equal to 2:3. In our model, this ratio is the same as that of sexual endosperm. This is an important difference from other apomictic models in which the situation is variable and relaxed (the endosperm can develop under a wide range of relationships) (Hojsgaard, 2018). The embryo:endosperm ploidy ratio is strictly 2:3 in several model species because at the early stages of embryo and endosperm development, many alleles are silenced (imprinted) depending on their parental origin. Any deviation in the dosage will result in the arrest of endosperm development and seed abortion (Brukhin, 2017). Another interesting aspect of this model is that Eragrostis-type embryo sac development lacks meiotic stages (Crane, 2001), and in a diplosporous plant, compared to an aposporous one, the chances of polyembryony are even lower (Asker and Jerling, 1992; Koltunow and Grossniklaus, 2003; Batygina and Vinogradova, 2007).

Exclusive LGs are the groups in which most of the markers match a single chromosome. Shared LGs are the groups in which most of the markers match multiple chromosomes.

TABLE 6 | Synteny between the apolocus region reported in apomictic species and non-apomictic reference species.

For E. curvula the syntenic analysis was done with the markers cosegregating with the apolocus.
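The 2:3 embryo:endosperm ratio discussed above can be worked through with simple arithmetic. The sketch below expresses ploidy in units of n (one reduced gamete genome = 1). The scenario definitions are our assumptions for illustration: the sexual case follows the standard Polygonum-type double fertilization, the Eragrostis-type case assumes a parthenogenetic unreduced egg and a single unreduced polar nucleus fertilized by a reduced sperm, and the third case is a hypothetical pseudogamous sac with two unreduced polar nuclei, included only to show how the ratio can deviate.

```python
from fractions import Fraction

def embryo_endosperm_ratio(egg: int, sperm_to_egg: int,
                           polar_nuclei: list, sperm_to_central: int) -> Fraction:
    """Embryo:endosperm ploidy ratio, with ploidy counted in units of n."""
    embryo = egg + sperm_to_egg
    endosperm = sum(polar_nuclei) + sperm_to_central
    return Fraction(embryo, endosperm)

SCENARIOS = {
    # Reduced egg (n) + sperm (n); two reduced polar nuclei (n + n) + sperm (n).
    "sexual (Polygonum-type)": embryo_endosperm_ratio(1, 1, [1, 1], 1),
    # Assumed: unreduced egg (2n), no fertilization; one unreduced polar nucleus (2n) + sperm (n).
    "Eragrostis-type diplospory (assumed)": embryo_endosperm_ratio(2, 0, [2], 1),
    # Hypothetical: unreduced egg (2n); two unreduced polar nuclei (2n + 2n) + sperm (n).
    "pseudogamous, two unreduced polar nuclei (hypothetical)": embryo_endosperm_ratio(2, 0, [2, 2], 1),
}

for name, ratio in SCENARIOS.items():
    print(f"{name}: {ratio.numerator}:{ratio.denominator}")
```

Under these assumptions, the first two scenarios give 2:3 and the third gives 2:5, which illustrates why a model that preserves the sexual ratio is attractive for dosage-sensitive crops.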

#### CONCLUSION

Phenotyping of an F<sub>1</sub> population showed that the segregation of diplospory follows a 1:1 (apomictic:sexual) ratio, indicating that a single gene or genomic region is involved in diplospory.

We constructed the first genetic map of E. curvula. This map is the most saturated map for the genus Eragrostis and one of the most saturated maps for a polyploid forage grass and apomictic species constructed to date, with 40 linkage groups per parent. These results are somewhat expected for an allotetraploid with a high degree of heterozygosity.

Our linkage analysis placed the diplospory locus and 65 other markers in a single linkage group (Don Walter LG3). This locus is closely flanked by two QTLs that could be linked to the expressivity of this trait.

The use of the current mapping population gave us the opportunity to construct a genetic map and to locate molecular markers associated with apomixis. Furthermore, this population is composed of individuals that are genetically close but have different reproductive modes, which might allow us to conduct further expression studies that will help identify candidate genes that regulate apomixis. This should also allow us to map other traits that are contrasting in the parental genotypes and are limiting factors for weeping lovegrass production, such as forage quality, a trait related to lignin content. Finally, it might also be possible to map genes involved in biotic and abiotic stress tolerance; these are critical traits in the breeding of this forage grass, which is cultivated in marginal crop regions.

Further studies using the auxin test proposed by Matzk (1991) to evaluate parthenogenesis will allow us to determine whether both traits, diplospory and parthenogenesis, are controlled by genes located in one or more genomic regions.

#### AUTHOR CONTRIBUTIONS

DZ, MM, and VE conceived and designed the study. DZ, JR, and JPS developed the mapping population. DZ, MM, JR, and JO phenotyped the mapping population. JC trimmed GBS data and performed GBS-SNP discovery. DZ, JG, MM, JS, and JO performed genetic mapping. DZ, JG, and CG performed synteny and QTL analysis. All authors participated in manuscript elaboration. VE and EA conducted and supervised the research, obtained funding, and participated in manuscript writing.

#### FUNDING

This work was supported by the Agencia Nacional de Promoción Científica y Tecnológica (ANPCyT, PICT Raíces 2014–1243, 2017–0879 to VE), the Universidad Nacional del Sur (PGI 24/A199 to VE), and the European Union's Horizon 2020 Research and Innovation Programme under the Marie Skłodowska–Curie Grant Agreement No. 645674.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.00918/full#supplementary-material

FIGURE S1 | Selection of hybrid E. curvula plants with RAPD markers. Fragments were amplified with primer 248 and revealed in 6% acrylamide gels. Arrows indicate polymorphisms between the parental plants, and the stars show offspring harboring paternal markers.

FIGURE S2 | Development of the sexual (A–C) and diplosporous embryo sacs (D–F) in plants of weeping lovegrass. Bar: 50 µm. Sections stained with safranin-fast green. (A) Megaspore mother cell and degenerated megaspores, (B) Binucleate embryo sac, (C) Tetranucleate embryo sac, (D) Elongated megaspore mother cell, (E) Binucleate embryo sac, (F) Tetranucleate embryo sac.

FIGURE S3 | Linkage groups of the sexual plant OTA-S (E. curvula) obtained using GBS-SNPs, SSRs, and AFLPs. Marker positions are expressed in centimorgans.

FIGURE S4 | Linkage groups of the facultative apomictic plant Don Walter (E. curvula) obtained using GBS-SNPs, SSRs, and AFLPs. Marker positions are expressed in centimorgans.

FIGURE S5 | Linkage groups from the facultative apomictic cv. Don Walter (E. curvula) showing the QTL positions (in cM) for diplospory. The LOD threshold to consider a QTL as significant is indicated as a dashed line at a LOD value of 3.9.

TABLE S1 | RAPD and SSR primers and AFLP adaptors, pre-amplification and selective primer sequences used to construct the E. curvula genetic maps.

TABLE S2 | Number of reads, length and coverage (average genome size of 1,250 Mb) of the GBS library from E. curvula used for SNP calling. Barcodes for each sample are indicated.

TABLE S3 | LOD score at which each linkage group was determined.

TABLE S4 | SNP names and the alternative sequences of each allele.

TABLE S5 | Synteny between the E. curvula sexual genotype OTA-S linkage groups and the physical map of O. thomaeum.

TABLE S6 | Synteny between the E. curvula facultative apomictic cultivar Don Walter linkage groups and the physical map of O. thomaeum.

TABLE S7 | Synteny between the E. curvula sexual genotype OTA-S linkage groups and the physical maps of Cenchrus americanus, Oryza sativa, Panicum hallii, Setaria italica, and Zea mays.

TABLE S8 | Synteny between the E. curvula facultative apomictic cultivar Don Walter linkage groups and the physical maps of C. americanus, O. sativa, P. hallii, S. italica, and Z. mays.

TABLE S9 | Homology of GBS-SNP markers of E. curvula facultative apomictic cultivar Don Walter with the chromosomes of Oropetium thomaeum, C. americanus, O. sativa, P. hallii, S. italica, and Z. mays.

#### REFERENCES

Asker, S. E., and Jerling, L. (1992). Apomixis in Plants. Boca Raton, FL: CRC Press.



sequencing analysis pipeline. PLoS One 9:e90346. doi: 10.1371/journal.pone.0090346

using maize RFLP markers. Theor. Appl. Genet. 90, 1198–1203. doi: 10.1007/BF00222943

apomixis in sexual crops. J. Biotechnol. 159, 291–311. doi: 10.1016/j.jbiotec.2011.08.028



**Conflict of Interest Statement:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling Editor is currently organizing a Research Topic with one of the authors EA and confirms the absence of any other collaboration.

Copyright © 2019 Zappacosta, Gallardo, Carballo, Meier, Rodrigo, Gallo, Selva, Stein, Ortiz, Albertini and Echenique. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Effects on Plant Growth and Reproduction of a Peach R2R3-MYB Transcription Factor Overexpressed in Tobacco

#### *Edited by:*

*Marta Adelina Mendes, University of Milan, Italy*

#### *Reviewed by:*

*Laura Bassolino, Research Centre for Industrial Crops, Council for Agricultural and Economics Research, Italy Teemu Heikki Teeri, University of Helsinki, Finland*

*\*Correspondence:*

*Livio Trainotti livio.trainotti@unipd.it*

#### *†Present address:*

*Md Abdur Rahim, Department of Genetics and Plant Breeding, Sher-e-Bangla Agricultural University, Dhaka, Bangladesh*

*Francesca Resentini, Instituto de Biología Molecular y Celular de Plantas, Consejo Superior de Investigaciones Científicas (CSIC)-Universidad Politécnica de Valencia, Valencia, Spain*

#### *Specialty section:*

*This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science*

*Received: 14 December 2018 Accepted: 21 August 2019 Published: 18 October 2019*

#### *Citation:*

*Rahim MA, Resentini F, Dalla Vecchia F and Trainotti L (2019) Effects on Plant Growth and Reproduction of a Peach R2R3-MYB Transcription Factor Overexpressed in Tobacco. Front. Plant Sci. 10:1143. doi: 10.3389/fpls.2019.01143*

*Md Abdur Rahim1†, Francesca Resentini1†, Francesca Dalla Vecchia1,2 and Livio Trainotti1,2\**

*1 Department of Biology, University of Padova, Padova, Italy, 2 Orto Botanico, University of Padova, Padova, Italy*

In plants, anthocyanin production is controlled by MYB and bHLH transcription factors. In peach, among the members of these families, *MYB10.1* and *bHLH3* have been shown to be the most important genes for production of these pigments during fruit ripening. Anthocyanins are valuable molecules, and the overexpression of regulatory genes in annual fast-growing plants has been explored for their biotechnological production. The overexpression of peach *MYB10.1* in tobacco plants induced anthocyanin pigmentation, which was particularly strong in the reproductive parts. Pigment production was the result of an up-regulation of the expression level of key genes of the flavonoid biosynthetic pathway, such as *NtCHS*, *NtCHI*, *NtF3H*, *NtDFR*, *NtANS*, and *NtUFGT*, as well as of the proanthocyanidin biosynthetic pathway, such as *NtLAR*. Nevertheless, phenotypic alterations in transgenic tobacco lines were not limited to anthocyanin production. Lines showing a strong phenotype (type I) exhibited irregular leaf shape and size and reduced plant height. Moreover, flowers had reduced anther filament length, nondehiscent anthers, reduced pistil length, aborted nectary glands, and impaired capsule development, but the reproductive parts, including androecium, gynoecium, and petals, were more pigmented than in wild type. Surprisingly, overexpression of peach *MYB10.1* led to suppression of *NtMYB305*, which is required for floral development, and of one of its target genes, *NECTARIN1* (*NtNEC1*), involved in nectary gland formation. *MYB10.1* overexpression up-regulated JA biosynthetic (*NtAOS*) and signaling (*NtJAZd*) genes, as well as *1-aminocyclopropane-1-carboxylate oxidase* (*NtACO*) in flowers. The alteration of these hormonal pathways might be among the causes of the observed floral abnormalities, with defects in both male and female gametophyte development. In particular, only approximately 30% of pollen grains of type I lines were viable, while during megaspore formation, there was a block during FG1 (St3-II). This block seemed to be associated with an excessive accumulation of callose. It can be concluded that the overexpression of peach *MYB10.1* in tobacco not only regulates flavonoid biosynthesis (anthocyanin and proanthocyanidin) in the reproductive parts but also plays a role in other processes such as vegetative and reproductive development.

Keywords: anthocyanin, epidermis, flower, gametophyte, *Nicotiana tabacum*, trichome, R2R3-MYB, transcription factor

### INTRODUCTION

Peach [*Prunus persica* (L.) Batsch] is one of the most economically important fruit crops of the Rosaceae family. World peach and nectarine production is more than 24.97 million tons (http://www.fao.org/faostat/en/#data/QC, accessed on 29 July 2018). The largest world producer of peaches and nectarines is China, followed by Italy and the United States. Fruit color is one of the major quality traits in peach. The visible red coloration of peaches is mainly due to the accumulation of anthocyanin pigments. During ripening, many fruits accumulate different types of bioactive chemicals, including anthocyanins, which help protect human health against cancer and cardiovascular, neurodegenerative, and other chronic diseases (Rao and Rao, 2007; Butelli et al., 2008; Singh et al., 2008). In addition, anthocyanins increase antioxidant levels in serum (Mazza et al., 2002), improve cholesterol distribution (Xia et al., 2007), aid recovery from vision disorders (Matsumoto et al., 2003), help reduce obesity (Tsuda et al., 2003), and protect human red blood cells from oxidative damage (Tedesco et al., 2001). Therefore, red coloration is one of the important objectives of fruit tree breeders for fresh market acceptability (Ravaglia et al., 2013). The accumulation of anthocyanin pigments is genetically determined by genes coding for enzymes of the anthocyanin biosynthetic pathway and by transcription factors (TFs) controlling their expression (Dixon and Steele, 1999; Petroni and Tonelli, 2011; Jaakola, 2013; Albert et al., 2014). R2R3-MYB TFs are the main regulators of the structural genes encoding enzymes of the anthocyanin biosynthetic pathway (Ban et al., 2007; Deluc et al., 2008). In *Arabidopsis*, MYB TFs are classified into three subfamilies based on the number of conserved DNA-binding domains, called MYB domains (Stracke et al., 2001; Feller et al., 2011). 
The MYB TFs are called MYB1R factors (one MYB domain), R2R3-MYB factors (two MYB domains), and MYB3R factors (three MYB domains) based on the number of MYB domain repeats (Stracke et al., 2001). In *Arabidopsis*, there are 137 R2R3-MYB TFs, and some of them regulate flavonoid biosynthesis. In particular, *production of anthocyanin pigment1* (AtPAP1/AtMYB75), AtPAP2 (AtMYB90), AtPAP3 (AtMYB113), and AtPAP4 (AtMYB114) are involved in anthocyanin production (Borevitz et al., 2000; Nesi et al., 2001; Ramsay and Glover, 2005; Stracke et al., 2007; Gonzalez et al., 2008; Heppel et al., 2013), whereas AtMYB123 regulates proanthocyanidin (PA) biosynthesis (Lepiniec et al., 2006). Besides *Arabidopsis*, anthocyanin-promoting MYB TFs have been studied in many species, for instance, in tomato (ANT1; Mathews, 2003), petunia (AN2; Quattrocchio et al., 1999), *Capsicum* (A; Borovsky et al., 2004), grape (MYB1a; Kobayashi et al., 2002), maize (P; Grotewold et al., 1991), sweet potato (MYB1; Mano et al., 2007), snapdragon (ROSEA1, ROSEA2, and VENOSA; Schwinn, 2006), apple (MYB10, MYB1/MYBA; Takos et al., 2006; Ban et al., 2007; Espley et al., 2007; Lin-Wang et al., 2010), strawberry (MYB10, MYB1, and MYB1; Lin-Wang et al., 2010; Salvatierra et al., 2013), and also in peach (MYB10, MYB10.1/2/3; Lin-Wang et al., 2010; Ravaglia et al., 2013; Rahim et al., 2014).

In the transcriptional regulation of anthocyanin biosynthetic genes, R2R3-MYB TFs do not work alone but in a complex, called MBW, which includes basic helix-loop-helix (bHLH) TFs and WD40 proteins (Xu et al., 2015). Mutants, RNAi, and overexpressing transgenic lines have largely been used to study the function of these genes, also in heterologous systems, as in the case of the overexpression of MYBs from several species in tobacco (Yamagishi et al., 2014; Huang et al., 2016; Li et al., 2016; Liu et al., 2016; Naing et al., 2018). These overexpression studies frequently reported the accumulation of anthocyanins in the host and thus, given the beneficial effects on health and the possible use of those pigments as dyes, suggested their use as tools for biofortification (Butelli et al., 2008) or for wider biotechnological applications (Appelhagen et al., 2018).

In this study, the functional characterization of peach MYB10.1, which encodes an R2R3-MYB TF, was carried out in a heterologous system by stable tobacco transformation. Besides the expected effects on the production of anthocyanins, other processes such as vegetative and reproductive development were altered.

#### MATERIALS AND METHODS

#### Plant Materials and Growth Conditions

The peach *MYB10.1* was isolated from cv. "Stark Red Gold," and tobacco (*Nicotiana tabacum*) cv. "Samsun NN" was used to generate transgenic plants. All the cultures were grown in a climate chamber at 22°C under a 16-h light/8-h dark photoperiod.

#### Gene Isolation and Plasmid Construction

The full-length coding sequence (CDS) of *MYB10.1* (ppa026640m) was amplified by polymerase chain reaction (PCR) from fruit cDNA of peach cv. "Stark red gold" with a forward primer (5′-ATGGAGGGCTATAACTTGGGTGT-3′) and a reverse primer (5′-TTAATGATTCCAAAAGTCCACGTT-3′) designed by comparison with other known MYB10 TFs from different species. Polymerase chain reaction products were cloned into the pCR®8/GW/TOPO® vector (Invitrogen), and CDS identity was confirmed by sequencing. The cloned DNA was moved into a binary vector modified in house from pH-TOP (Craft et al., 2005) in order to contain *attR* sites and a *GUS* reporter gene interrupted by a plant intron (Vancanneyt et al., 1990). The CDS in the final expression vector (named *pOp::MYB10.1*) was under the control of a pOp promoter, which is recognized by the synthetic TF LhG4, cloned on an independent plasmid (named *35S::LhG4*) under the control of the CaMV 35S promoter (construct LhG4 in Craft et al., 2005; see **Supplementary Figure S1** for a schematic map of the two constructs). Finally, binary vectors harboring the desired constructs were transferred into *Agrobacterium tumefaciens* strain LB 3101 as previously described (Rahim et al., 2014).
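As a quick sanity check on the primer pair given above, the sketch below verifies that the forward primer begins with the ATG start codon and that the reverse primer, read on the sense strand, ends with a stop codon, as expected for primers spanning a full-length CDS. The helper functions are ours; only the two primer sequences come from the text.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}

FORWARD = "ATGGAGGGCTATAACTTGGGTGT"    # forward primer from the text
REVERSE = "TTAATGATTCCAAAAGTCCACGTT"   # reverse primer from the text

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def gc_fraction(seq: str) -> float:
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

sense_end = reverse_complement(REVERSE)      # 3' end of the CDS on the sense strand
starts_with_atg = FORWARD.startswith("ATG")
ends_with_stop = sense_end[-3:] in STOP_CODONS

print(f"Forward: {len(FORWARD)} nt, GC = {gc_fraction(FORWARD):.2f}, starts with ATG: {starts_with_atg}")
print(f"Reverse: {len(REVERSE)} nt, GC = {gc_fraction(REVERSE):.2f}")
print(f"Sense-strand 3' end: {sense_end} (ends with stop codon: {ends_with_stop})")
```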

**Abbreviations:** 4-MU, 4-methylumbelliferone; *AN2*, *anthocyanin2*; *ANR1*, *anthocyanidin reductase 1*; *ANS*, *anthocyanidin synthase*; bHLH, basic helix-loop-helix; *CHI*, *chalcone isomerase*; *CHS*, *chalcone synthase*; *DFR*, *dihydroflavonol 4-reductase*; *F3H*, *flavanone-3-hydroxylase*; JA, jasmonate; *JAZd*, *jasmonate ZIM-domain d*; LAR, leucoanthocyanidin reductase; MTT, 2,5-diphenyl monotetrazolium bromide; MUG, 4-methylumbelliferyl-β-d-glucuronide; *NEC1*, nectarin1; *PAL*, phenylalanine ammonia-lyase; *PAP1*, *production of anthocyanin pigment1*; qRT-PCR, quantitative real-time polymerase chain reaction; RT-PCR, reverse transcription–polymerase chain reaction; TF, transcription factor; *UFGT*, *UDP-glycose:flavonoid-3-O-glycosyltransferase*; WT, wild type.

### Plant Transformation

Tobacco transformation was carried out following a leaf disc cocultivation protocol (Fisher and Guiltinan, 1995). The selected transformants were identified by their ability to root on 200 mg L−1 kanamycin and 15 mg L−1 hygromycin before being transferred to soil in a greenhouse. Double selection was used to isolate transformants carrying both the *pOp::MYB10.1*  and *35S::LhG4* cassettes. Polymerase chain reaction on genomic DNA confirmed the presence of the transgenes in selected clones.

### Enzymatic **β**-Glucuronidase Assay

The activity of the β-glucuronidase (GUS) enzyme was evaluated with the substrate 4-methylumbelliferyl-β-d-glucuronide (MUG). Frozen tobacco leaf tissues were homogenized in protein extraction buffer (50 mM NaHPO4 pH 7.0, 10 mM EDTA, 0.1% Triton X-100, and 2 mM β-mercaptoethanol) to extract the soluble proteins. The GUS enzymatic assay was carried out by incubating the protein extract in reaction buffer containing the MUG substrate at 37°C. The reaction was stopped with a solution containing 0.2 M Na2CO3. The released 4-methylumbelliferone (4-MU) was quantified with a DTX880 Multimode Detector (Beckman Coulter) according to the manufacturer's instructions. The GUS activity was expressed as nM 4-MU released min−1 μg−1 protein (Jefferson et al., 1987). The protein concentration was measured according to the Bradford method (Bradford, 1976), and data points were normalized by protein quantification.
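The units used above (nM 4-MU min−1 μg−1 protein) imply a rate divided by the amount of protein assayed. The sketch below shows one way such a value could be derived: a least-squares slope of 4-MU concentration against time, normalized by protein. The time course, the protein amount, and the function names are hypothetical, and the conversion from fluorescence to nM 4-MU (via a standard curve) is assumed to have been done beforehand.

```python
def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / sxx

def gus_specific_activity(times_min, mu_nM, protein_ug):
    """GUS activity in nM 4-MU per minute per microgram of protein."""
    return least_squares_slope(times_min, mu_nM) / protein_ug

# Hypothetical time course for one extract containing 2 ug of protein.
times_min = [0, 10, 20, 30]
mu_nM = [0.0, 50.0, 100.0, 150.0]
activity = gus_specific_activity(times_min, mu_nM, protein_ug=2.0)
print(f"GUS activity: {activity:.2f} nM 4-MU min^-1 ug^-1 protein")
```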

#### RNA Extraction and cDNA Synthesis

Total RNA was extracted from tobacco and peach flowers according to Chang et al. (1993). The yield and purity of RNA were checked by means of UV absorption spectra, whereas RNA integrity was ascertained by agarose gel electrophoresis. cDNA was synthesized from 4 μg of total RNA pretreated with 1.0 unit of RQ1 RNase-free DNase I (Promega). Random primers were used in the reaction together with the High Capacity cDNA Archive Kit (Life Technologies) following the manufacturer's instructions.

#### Gene Expression Analysis

The expression profiles of *MYB10.1*, *NtAN2*, *NtAN1b*, *NtMYB305*, *NtJAZd*, *NtNEC1*, and the anthocyanin biosynthetic pathway genes (*NtPAL*, *NtCHS*, *NtCHI*, *NtF3H*, *NtDFR*, *NtLAR*, *NtANR1*, *NtANS*, and *NtUFGT*) in flowers of transgenic and wild-type (WT) tobacco plants were compared by means of reverse transcription (RT)–PCR. Primers were designed using the Lasergene software package (DNASTAR) (see **Supplementary Table S1**). Primer specificities and amplification efficiencies were checked by PCR using a pool of the synthesized cDNAs as template; thereafter, cycle numbers for each gene were optimized. The *ubiquitin conjugating enzyme E2* (*NtUBC2*) was used as a control gene for equal loading. Gel images were digitized using a Bio-Rad Gel Doc XR system, avoiding image saturation.

The expression of peach *MYB10.1*, *MYB10.2*, *MYB10.3*, and *MYB24* in different parts of the peach flower was analyzed by quantitative real-time PCR using an Applied Biosystems 7500 instrument. The reactions were set up in a total volume of 10 μl consisting of 5.0 μl of SYBR Green PCR Master Mix (Applied Biosystems), 0.05 pmol of each forward and reverse primer, and 4.5 μl (1.0 ng/μl dilution) of peach flower cDNA as starting template. Polymerase chain reaction conditions were 95°C for 10 min to activate the enzyme, followed by 40 cycles of 95°C for 15 s, 60°C for 15 s, and 65°C for 34 s. The obtained *CT* values were analyzed using Q-gene software (Muller et al., 2002), considering the means of three independently calculated normalized expression values for each sample. *PpN1* (gene identifier: Prupe.8G137600, formerly ppa009483m, a peach type 2A phosphatase activator TIP41) was used as internal standard for peach.
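The normalization described above can be sketched as follows. This is a minimal illustration of an efficiency-corrected normalized expression, in the spirit of the approach implemented in Q-gene, averaged over three independent values per sample. The amplification efficiencies (set to 2.0, i.e., perfect doubling) and the CT values are hypothetical, and the code is not the Q-gene software itself.

```python
from math import sqrt
from statistics import mean, stdev

def normalized_expression(ct_target: float, ct_ref: float,
                          e_target: float = 2.0, e_ref: float = 2.0) -> float:
    """Target expression relative to the reference gene: E_ref^CT_ref / E_target^CT_target."""
    return (e_ref ** ct_ref) / (e_target ** ct_target)

def mean_normalized_expression(replicates, e_target: float = 2.0, e_ref: float = 2.0):
    """Mean and standard error of independently calculated normalized expression values.

    `replicates` is a list of (ct_target, ct_ref) pairs, one pair per replicate.
    """
    values = [normalized_expression(t, r, e_target, e_ref) for t, r in replicates]
    sem = stdev(values) / sqrt(len(values)) if len(values) > 1 else 0.0
    return mean(values), sem

# Hypothetical CT values (target gene, reference gene) for three replicates.
replicates = [(22.1, 20.0), (21.9, 20.1), (22.0, 19.9)]
mne, sem = mean_normalized_expression(replicates)
print(f"Mean normalized expression: {mne:.3f} +/- {sem:.3f}")
```

Calculating the normalized value per replicate before averaging, rather than averaging CT values first, keeps the replicate-to-replicate variation visible in the standard error.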

#### Electron Microscopy

Tobacco leaves and flower parts at the same developmental stages were observed under low-pressure conditions by means of environmental scanning electron microscopy (ESEM). The experiment was performed using a FEI Quanta 200 instrument at the CUGAS facilities of University of Padova, Italy. Files of the acquired images were used to measure cell dimensions with the software ImageJ.

### Pollen Viability Assay

The pollen viability was assessed by staining grains with 1% 2,5-diphenyl monotetrazolium bromide (MTT) in 5% sucrose (Norton, 1966; Khatun and Flowers, 1995; Rodriguez-Riano and Dafni, 2000). Tobacco flowers were collected when anthers started to burst; pollen grains were dispersed in 800 µl of MTT solution in 1.5-ml tubes for 10 min followed by centrifugation at 10,000 × *g* for 1 min. Heat-killed (80°C for 2 h) WT tobacco pollen grains were used as negative controls. Ten microliters of grain suspension was placed in a Bürker chamber to count pollen grains under a microscope (Leica DM5000B), equipped with a digital image acquisition system. Pollen grains were considered viable only when they turned deep pink (Wang et al., 2004).
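Counts obtained in this way are usually reported as a percentage of viable grains. The sketch below computes that percentage together with a Wilson 95% confidence interval, which is a common choice for proportions and is our addition rather than part of the published protocol. The counts are hypothetical.

```python
from math import sqrt

def pollen_viability(viable: int, total: int, z: float = 1.96):
    """Return (percent viable, lower, upper) with a Wilson score interval in percent."""
    if total <= 0 or not 0 <= viable <= total:
        raise ValueError("counts must satisfy 0 <= viable <= total, total > 0")
    p = viable / total
    denom = 1.0 + z ** 2 / total
    center = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return 100 * p, 100 * (center - half), 100 * (center + half)

# Hypothetical count: 30 deep-pink (viable) grains out of 100 scored.
pct, low, high = pollen_viability(30, 100)
print(f"Viability: {pct:.1f}% (95% CI {low:.1f}-{high:.1f}%)")
```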

#### Pollen *in Situ* Germination and Pistil Observation

Aniline blue staining of pistils was done according to previously described methods (Kho and Baër, 1968; Dumas and Knox, 1983). Briefly, closed flowers (just before anthesis) were emasculated, covered with a paper bag, and left on the plant for an additional 7 to 8 h before hand pollination to allow transmitting tract and ovule development. Pistils were pollinated with a few pollen grains laid on the stigmas with a small brush and, after 48 h, were fixed with absolute ethanol/glacial acetic acid (3:1) for 3 h. The fixed pistils were washed three times with dH2O for 5 min each, softened in 7.5 N NaOH overnight, and washed at least three times in dH2O for 1 h each before staining. Pistils were stained in aniline blue solution (0.1% aniline blue in 0.1 M K2HPO4, pH 10.0) for 1 h and observed under a fluorescence microscope (Leica DM5000B, equipped with a digital image acquisition system) using UV (350–400 nm) light.

#### Ovule Development Analysis

To analyze the defects in ovule development, flowers at different developmental stages from WT and *MYB10.1* overexpressing lines were fixed overnight at 4°C in 3% glutaraldehyde in 0.1 M sodium cacodylate buffer (pH 6.9) and postfixed at 4°C for 2 h in 1% osmium tetroxide in the same buffer. The specimens were dehydrated in a graded series of ethyl alcohol and propylene oxide and embedded in araldite. Sections were cut using an ultramicrotome (Ultracut S, Reichert-Jung, Wien, Austria). For light microscopy, thin sections (1 µm) stained with toluidine blue (1% basic toluidine and 1% Na tetraborate, 1:1 v/v) were observed using a microscope (DMR 5000 Leica), equipped with a digital image acquisition system.

Ovule development in flowers at different developmental stages was also followed in cleared tissues, prepared as reported by Yadegari (1994). Inflorescences were fixed in ethanol:acetic acid 9:1 overnight, followed by two washes with 90% and 70% ethanol. Samples were cleared with chloral hydrate/glycerol/water solution (8:1:2), then dissected under a stereomicroscope and observed using a Zeiss Axiophot D1 microscope equipped with differential interference contrast optics. Images were recorded with an Axiocam MRc5 camera (Zeiss) using the Axiovision program (version 4.1).

In order to detect callose accumulation, whole flowers were fixed and stained as described (Martin, 1959). Handmade thin sections were squashed on a microscope glass and observed using a DMR 5000 Leica microscope, equipped with a digital image acquisition system.

## RESULTS

#### Identification and Cloning of the Peach MYB10.1 cDNA

The MYB10.1 cDNA was isolated from peach fruit cv. "Stark red gold." Analysis of its CDS (720 bp) showed that it encodes an R2R3-MYB TF of 239 amino acid residues. It has nucleotide sequence homology with anthocyanin-promoting *Arabidopsis* MYB TFs like AtPAP1, AtPAP2, AtPAP3, and AtPAP4 (58.6%, 63.3%, 52.6%, and 57.7% similarity, respectively). A higher similarity (79.2%) was found with the MdMYB10 TF (see **Supplementary Table S2**). Constructs used were the same as in Rahim et al. (2014).
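The relation between the CDS length and the protein length can be checked with a short calculation. This is a sketch that assumes the reported 720 bp include the stop codon; the function name is hypothetical.

```python
def protein_length(cds_bp, includes_stop=True):
    """Number of amino acid residues encoded by a CDS of the given length."""
    if cds_bp <= 0 or cds_bp % 3 != 0:
        raise ValueError("CDS length must be a positive multiple of 3")
    codons = cds_bp // 3
    # The stop codon, if counted in the CDS length, encodes no residue.
    return codons - 1 if includes_stop else codons


print(protein_length(720))
```

Under that assumption, 720 bp correspond to 240 codons, of which 239 encode amino acids, in agreement with the protein length given above.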

#### Generation of Transgenic Plants Overexpressing the Peach MYB10.1 Gene

The function of the peach *MYB10.1* gene was analyzed by overexpressing it in the tobacco heterologous system. Thirteen independent lines overexpressing the peach *MYB10.1* gene were obtained from the transformation events, and the presence of the transgene was confirmed by PCR. For the phenotypic and molecular characterization, T0 plants were used (**Figures 1**–**6**; **Tables 1** and **2**); in some cases, T1 and F1 tobacco plants (the latter from crosses between plants transgenic for *35S::LhG4* and *pOp::MYB10.1* plants) were also used for further analysis (**Figures 7** and **8**). The transgenic plants were compared with WT tobacco plants, also propagated by tissue culture, at the same developmental stages.

FIGURE 3 | Analysis of the leaf epidermal (adaxial surface) cells of transgenic tobacco lines. Length of the pavement cells (A), breadth of the pavement cells (B), and number of the pavement cells cm−2 (C) of the tobacco leaf adaxial epidermis. Length and breadth were measured using the ImageJ software program (Schneider et al., 2012) on photographs taken by ESEM. In the top row of panels A and B, data are presented for at least five cells (dots) from each plant (column, mean ± SE) of the three used for each independent line (a block for each line); two lines were used for each phenotype (a different color for each phenotype, red for type I and blue for type II). In the bottom row, values are averaged according to the lines (mean ± SD), with the same color code as in the top row. Panel C describes the density of pavement cells per surface unit. In this case, data are presented only per single line. Asterisks indicate statistically different values (P < 0.05) from WT with a nested one-way analysis of variance test.
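The two levels of summary used in Figure 3 (cells within plants, plants within lines) can be sketched as follows. This is an illustration only: the measurements are hypothetical, the function names are not from the study, and the line-level SD is assumed to be computed across plant means. It does not reproduce the nested analysis of variance itself.

```python
from math import sqrt
from statistics import mean, stdev


def mean_se(values):
    """Mean and standard error of the mean for one plant's cell measurements."""
    return mean(values), stdev(values) / sqrt(len(values))


def summarize_line(plants):
    """Summarize one transgenic line.

    plants maps a plant identifier to its list of cell measurements.
    Returns per-plant (mean, SE), then the line mean and SD across plant means.
    """
    per_plant = {pid: mean_se(cells) for pid, cells in plants.items()}
    plant_means = [m for m, _ in per_plant.values()]
    return per_plant, mean(plant_means), stdev(plant_means)


# Hypothetical pavement cell lengths (µm): three plants, five cells each.
line = {
    "plant1": [10.0, 12.0, 11.0, 13.0, 14.0],
    "plant2": [20.0, 22.0, 21.0, 23.0, 24.0],
    "plant3": [30.0, 32.0, 31.0, 33.0, 34.0],
}
per_plant, line_mean, line_sd = summarize_line(line)
print(per_plant["plant1"], line_mean, line_sd)
```

Keeping the plant as the unit for the line-level summary avoids treating cells from the same plant as independent replicates.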

phenotypes. N/A indicates the absence of capsules and ovules/seeds in type I transgenic plants.

tobacco lines at anthesis. The PCR cycle number was optimized for each gene, and the *Nicotiana tabacum* (Nt) ubiquitin conjugating enzyme E2 encoding gene (NtUBC2, Koyama et al., 2003) was used as a control for equal loading. The genes included in the expression analysis are as follows: MYB10.1, peach MYB10.1 TF gene; NtMYB305, tobacco MYB305 TF gene, orthologous to the Arabidopsis stamen filament growth related genes AtMYB21, AtMYB24, and AtMYB57 (Cheng et al., 2009); NtPAL, phenylalanine ammonia-lyase; NtCHS, chalcone synthase; NtCHI, chalcone isomerase; NtF3H, flavanone-3-hydroxylase; NtDFR, dihydroflavonol 4-reductase; NtLAR, leucoanthocyanidin reductase; NtANR1, anthocyanidin reductase 1; NtANS, anthocyanidin synthase; UFGT, UDP-glucose:flavonoid-3-O-glucosyltransferase; NtJAZd, encodes a jasmonate ZIM-domain protein; NtAOS, allene oxide synthase; NtNEC1, nectarin 1; and NtACO, 1-aminocyclopropane-1-carboxylate oxidase.

#### Phenotype of Transgenic Tobacco Plants Overexpressing the Peach MYB10.1 Gene

Several *MYB10.1* overexpressing lines were obtained, among which individuals had phenotypes ranging from WT-like to severe impairment in growth and development, but none showed evident signs of anthocyanin production. Transgenic plants showing strong defects in vegetative and reproductive characteristics are collectively described as type I, whereas type II includes independent clones showing only minor changes in vegetative and reproductive development when compared with WT tobacco plants (**Figure 1**). Six independent transgenic lines (named 1.2, 5.2, 9.2, 17.2, 19.2, and 20.2) could be assigned to type I and seven (2.2, 8.2, 10.2, 16.2, 18.2, 22.2, and 24.2) to type II. Two independent lines for each category of phenotype (type I: 1.2 and 17.2; type II: 2.2 and 10.2) were analyzed further. Since the transgenic plants also contained a GUS reporter gene under the same transcriptional regulatory system (see *Materials and Methods*), GUS activity was assayed. Results showed at least fourfold higher activity in type I compared to type II transgenic plants, while no GUS activity was detected in the WT (see **Supplementary Figure S2**). Thus, GUS activity was, as expected, a good marker for MYB10.1 expression, as it was higher in clones showing the strong phenotypes (type I).

At the vegetative level, type I transgenic plants showed reduced plant height (**Figure 1**) and leaf size; furthermore, leaf shape was also altered compared to type II and WT tobacco plants (**Figures 2A**–**E**). An ESEM analysis of the leaf epidermis showed that the pavement cells of both type I and type II transgenic lines were significantly shorter than WT ones (**Figures 2F**–**J** and **3A**). Moreover, the breadth of the pavement cells was significantly increased only in type I transgenic lines, while there were no significant differences between type II and WT (**Figure 3B**). In addition, the number of pavement cells per unit area was also significantly reduced in type I transgenic lines, while in type II lines it was similar to WT (**Figure 3C**).

FIGURE 6 | Detailed type I MYB10.1 phenotype during flower development. (A) Alteration of floral development in type I compared to (C) WT [stages were defined according to Koltunow et al. (1990)]. Bars = 200 pixel. (B) Cleared sections in wild-type developing ovules from stage st3-I to st3-V, the seven-celled embryo sac. (E) In ovules of overexpressing lines, the development is blocked to st3-I. (D) Thin sections of WT and type I ovules confirm the developmental arrest. (F) Aniline blue staining evidenced the persisting deposition of callose in pollinated type I plants, while the callose disappears in pollinated wild-type flowers. Ovules stages are according to Schneitz et al. (1995). ap, antipodal cells; cc, central cell; ec, egg cell; fg, FG; ii, inner integument; oi, outer integument; syn, synergid cells; v, vacuole. Bars are 1 cm in A and B; 50 µm in B, D, E, F St3-I; 200 µm in F St3-V.

TABLE 2 | Percent pollen viability of type I (strong) transgenic tobacco flowers compared to WT.

FIGURE 7 | Phenotypes of transgenic (A, B, and C, overexpressing MYB10.1) and WT (D) tobacco seedlings; (A) T1 seedlings from the type II 2.2 line (selfed); (B) T1 seedlings from the type II 10.2 line (selfed); (C) F1 seedlings from the cross: ♀ 35S::LhG4 x ♂ 35S::MYB10.1. Arrows indicate pigmented cotyledons of transgenic tobacco seedling.

Type I transgenic plants showed abnormalities in floral development, whereas type II transgenic plants had flowers similar to WT (**Figure 4**). Type I transgenic flowers had a marked reduction in the length of the calyx, corolla, androecium, and gynoecium compared to type II and WT (**Figures 4A**, **B**; **Table 1**). The lengths of the stamen filament were drastically or partially reduced in type I or type II flowers, respectively; as a result, anthers did not reach stigmas, thus preventing self-pollination (see **Supplementary Figure S3**). In addition, there was no nectary formation (**Figure 4C**) in type I tobacco flowers. Tobacco flowers are usually pigmented only in petals; type II flowers mimicked WT ones, while in type I transgenic flowers, besides the stronger pigmentation observed in petals, some purple pigmentation was also observed in anthers and ovaries (**Figures 4B**, **C**). Moreover, in type I transgenic plants, anthers usually did not dehisce; nonetheless, at an extremely late stage, when flowers were about to drop off, a few anthers were sometimes observed to release some pollen at their tips (see **Supplementary Figure S4A**). At that stage, stigmas had completely lost their ability to allow germination of pollen grains, and thus all the flowers were fated to drop, so that only remnants of inflorescences without capsules remained on the plant (see **Supplementary Figure S4B**). As a consequence of this altered development, no seeds were produced by type I transgenic plants. On the contrary, an increase in petal pigmentation and purple coats in developing seeds were the most striking phenotypic differences observed in type II transgenic plants compared to WT (**Figures 4A**–**E**).

#### Alteration of Gene Expression in Transgenic Tobacco Flowers

The expression of some genes related to anthocyanin biosynthesis and floral development was analyzed by RT-PCR in both the transgenic and the WT tobacco flowers at anthesis. As shown in **Figure 5**, transcript amounts of the peach *MYB10.1* gene were higher in type I transgenic flowers than in type II ones. As expected, *MYB10.1* transcripts were not detected in WT flowers. The expression level of the endogenous *NtAN2* (an R2R3-MYB related to the regulation of anthocyanin biosynthesis in tobacco flower and the ortholog of *MYB10.1*) was low in type I, moderate in type II, and high in WT tobacco flowers. On the other hand, the expression pattern of endogenous *NtAN1b* (encoding bHLH TF) was similar to WT in both types of transgenic flowers.

The transcript abundance of anthocyanin biosynthetic genes ranged from slightly [chalcone isomerase (NtCHI), flavanone-3-hydroxylase (NtF3H), dihydroflavonol 4-reductase (NtDFR), anthocyanidin synthase (NtANS), and UDP-glucose:flavonoid-3-O-glucosyltransferase (NtUFGT)] to strongly [phenylalanine ammonia-lyase (NtPAL), chalcone synthase (NtCHS)] increased in transgenic plants of both types, with an induction that seems even stronger in type II than in type I clones (e.g., for the CHI, F3H, and DFR genes). The enhanced expression levels of the aforementioned biosynthetic genes correlate with the purple seed coat produced by type II transgenic tobacco plants. A similar expression pattern was also found for leucoanthocyanidin reductase (NtLAR), whereas anthocyanidin reductase 1 (NtANR1) expression was low and not very different from that in WT, suggesting that MYB10.1 might also regulate PA biosynthetic genes in tobacco flowers.

As type I transgenic tobacco plants had shown defects in their reproductive organs, other genes encoding MYB TFs related to floral development were also analyzed in this study. In *Arabidopsis*, three MYBs (AtMYB21, AtMYB24, and AtMYB57) are predominantly expressed in flowers (Cheng et al., 2009; Li et al., 2006). In tobacco, NtMYB305, besides being the orthologous MYB of the aforementioned *Arabidopsis* genes, controls nectary and flavonoid biosynthetic gene expression in flowers (Liu et al., 2009). Considering these MYBs, a phylogenetic analysis was carried out, and NtMYB305 was confirmed to be orthologous to *Arabidopsis* AtMYB21, AtMYB24, and AtMYB57, whereas only a single peach gene (ppa011751, named MYB24 following the *Arabidopsis* nomenclature) was found in this clade (see **Supplementary Figure S5**). The expression of *NtMYB305*, known to be at maximum between stages 9 and 10 (Liu et al., 2009), was reduced in type I and similar to WT in type II transgenic flowers (**Figure 5**). As the expression of *NtNEC1*, a gene encoding the major nectarin protein, is controlled by *NtMYB305*, it was analyzed in *MYB10.1* overexpressing plants. Not surprisingly, *NtNEC1* was not expressed in type I transgenic flowers, which also had extremely low amounts of *NtMYB305* transcripts, whereas it was expressed at similar levels in both type II and WT flowers. These profiles correlate with the nectary glandless phenotype of type I transgenic tobacco flowers (**Figure 4C**).

Jasmonic acid (JA) is required for normal androecium development in *Arabidopsis* acting also through the induction of MYB genes (Mandaokar et al., 2006). As type I plants had phenotypes resembling *Arabidopsis* mutants defective in JA synthesis or signaling (Mandaokar and Browse, 2009), *NtAOS* (*allene oxide synthase*, for JA biosynthesis) and *NtJAZd*  (encodes a jasmonate ZIM-domain protein, for JA signaling) were tested for their expression. Both genes were up-regulated in type I transgenic flowers, but had levels similar to WT in type II clones.

The type I transgenic T0 plants were unable to set seed either by self-pollination, as occurs in WT flowers, or by manual pollination, even with WT pollen. Ethylene plays a role in megasporogenesis (De Martinis and Mariani, 1999), and thus to investigate whether the hormone had a role in the female reproductive organ development of type I plants, the expression of the *NtACO* gene encoding 1-aminocyclopropane-1-carboxylate oxidase was analyzed. The result showed that the transcript level of *NtACO* was up-regulated in type I transgenic flowers, whereas it was at WT levels in type II plants.

#### Functional Analysis of Floral Reproductive Parts of Transgenic Lines

To gain insight into the inability of type I transgenic plants to set seed, their male and female reproductive parts were analyzed in more detail. For this purpose, pollen viability was assessed in nondehiscent, dehiscent (only a few anthers underwent dehiscence, and only at a very late stage, when the flowers had already entered the senescence process), and partially dehiscent anthers. The pollen viability assay (Wang et al., 2004) showed that approximately 67.2% of pollen grains were viable in dehiscent, 34.24% in nondehiscent, and 34.38% in partially dehiscent anthers, while WT pollen viability was 90% (**Table 2**). Therefore, it seems that transgenic pollen grains are less viable than WT ones but still able to fertilize the ovules, even if the anthers do not open to release grains. To discriminate whether the problem was due to the failure of anther opening or to the pollen maturation process, reciprocal crosses were performed. To check the fertility of the type I transgenic pistils, stigmas of T0 flowers were manually pollinated with WT pollen. Fertilization failed completely, and all the flowers were shed from the plants. On the contrary, when emasculated WT stigmas were hand pollinated with type I pollen from late-dehiscing anthers, fertilization was successful in 85% of cases (see **Supplementary Table S3**). It has to be noted that most of the seedlings coming from these crosses were GUS positive, thus proving that they originated from transgenic pollen grains (GUS and MYB10.1 are on the same cassette). This indicates that *MYB10.1* affects the fertility of the female reproductive part more than that of the male one.

To further investigate the effect of *MYB10.1* overexpression on fertility, female gametophyte development was monitored in type I and WT plants (**Figures 6A**, **C**). Three pistils from each of five independent lines were analyzed, and a block during ovule formation was noticed. In an analysis of 150 ovules each, in cleared and thin sections, a block was observed during FG1, corresponding to ovule developmental stage 3-I, and was confirmed by thin-section investigations (**Figures 6D**, **E**).

Aniline blue staining showed that the deposition of callose at the level of the megaspore persisted during development of ovules in type I plants up to the anthesis stage, while in WT the polymer was reabsorbed soon after the tetrad stage (**Figure 6F**).

Type II plants were not considered in these analyses as their flowers were fertile and similar to WT ones except for the color of their corollas and seed teguments.

#### Phenotype of T1 Generation of Transgenic Lines

The T1 seeds of type II plants and F1 seeds from crosses between plants transgenic for *35S::LhG4* and *pOp::MYB10.1* plants were sown in Petri dishes on filter paper soaked with water containing kanamycin and hygromycin to allow the growth of only the transgenic plants expressing *MYB10.1*. Some seedlings showed stunted growth, with cotyledons that looked bleached and accumulated pigments, and died afterwards (**Figures 7A**, **B**); on the contrary, no pigments were found in WT cotyledons, which grew normally (**Figure 7D**). The cotyledons of F1 seedlings (♀ *35S::LhG4* × ♂ *pOp::MYB10.1*) also showed anthocyanin accumulation and chlorophyll bleaching (**Figure 7C**). After germination on Petri dishes, some of these T1 seedlings were transferred to soil and grown in the greenhouse. The presence of the transgene was confirmed by histochemical GUS staining and later by PCR. Several individuals of the T1 generation originating from the self-fertilization of type II transgenic plants showed pigment accumulation also in the calyx, capsule, and developing seed coats (**Figure 8**), while no pigment accumulation was observed in the capsule of their previous (T0) generation (**Figure 4D**).

## DISCUSSION

There are several reports on both homologous and heterologous overexpression of *MYB* genes like *MdMYBA* in tobacco (Ban et al., 2007), *MdMYB1* in *Arabidopsis* (Takos et al., 2006), *MdMYB10* in apple (Espley et al., 2007), *PyMYB10* in *Arabidopsis* (Feng et al., 2010), *VvMYB5a* (Deluc, 2006) and *VvMYBA1* (Li et al., 2011) in tobacco and grape, *LeANT1* in tobacco and tomato (Mathews, 2003), *IbMYB1a* in *Arabidopsis* (Chu et al., 2013), *IbMYB1* in *Arabidopsis* and sweet potato (Mano et al., 2007), *GMYB10* in tobacco (Elomaa et al., 2003), and *AtPAP1* in tobacco and *Arabidopsis* (Borevitz et al., 2000). Ban et al. (2007) reported that the ectopic expression of *MdMYBA* in tobacco induces the accumulation of anthocyanins in the reproductive tissues. Feng et al. (2010) also demonstrated similar results when *PyMYB10* was overexpressed in *Arabidopsis*. The ectopic expression of *MYB10.1* showed similar anthocyanin accumulation patterns, although always limited to reproductive parts, particularly petals and developing seeds. Tobacco plants are naturally able to make anthocyanins, but the pathway is only active in the flower, where both MYB (*NtAn2*) and bHLH (*NtAN1a* and *NtAn1b*) genes are expressed (Bai et al., 2011). On the contrary, the anthocyanin biosynthetic pathway is not active in tobacco vegetative parts, such as leaves, where *NtAN1a* and *NtAn1b* are not expressed, and thus the overexpression of *MYB10.1* could enhance the anthocyanin pigmentation only in the reproductive parts (flowers and young seeds). The pattern of pigment accumulation is thus dependent on *NtAN1a* and *NtAn1b*, the two bHLH genes that have been shown to participate in the MBW complex in tobacco and whose transcription is presumably not induced by MYB10.1 in vegetative parts, as it was not in flowers (**Figure 5**).
The inability of MYB10.1 to induce *NtAN1b* expression makes it different from other R2R3MYBs, as the tobacco *NtAn2*, whose ability to induce the expression of *NtAN1a* and *NtAn1b* provided the necessary elements of the MBW complex to induce anthocyanin synthesis also in leaves (Bai et al., 2011). This inability makes MYB10.1 quite peculiar if compared to other R2R3MYBs of its clade (see **Supplementary Figure S4**). Indeed, Chu et al. (2013) showed that overexpression of *IbMYB1a* gene (coding for sweet potato R2R3-MYB TF) up-regulates structural genes, like *CHI*, *F3H*, and *DFR*, in the anthocyanin biosynthetic pathway in transgenic *Arabidopsis*. Huang et al. (2013a) also demonstrated that overexpression of *EsMYB1* (encoding a R2R3-MYB TF from *Epimedium sagittatum*) up-regulates important flavonoid-related genes in both transgenic tobacco and *Arabidopsis*. Huang et al. (2013b) also found significant up-regulation of anthocyanin biosynthetic genes such as *NtCHS*, *NtCHI*, *NtDFR*, and *NtANS* in transgenic tobacco lines overexpressing *MrMYB1* gene (encoding R2R3-MYB) from Chinese bayberry. Also, the heterologous expression of *MYB10.1* modulates the transcription of most of the structural genes in the anthocyanin biosynthetic pathway, thus leading to purple anthocyanin pigmentation in the reproductive tissues of transgenic tobacco plants, but only in flower. These findings suggest that the overexpression of peach *MYB10.1* up-regulates anthocyanin production in tobacco flower by inducing the transcription of the biosynthetic genes through the interaction with the endogenous WD40 and bHLH coactivators.

In addition to the expected regulation of anthocyanin biosynthesis, overexpression of *MYB10.1* caused other phenotypic variations in transgenic tobacco lines. Analysis of the expression pattern of *MYB10.1* in transgenic tobacco plants showed that plants with higher expression of the transgene (type I) were defective in their vegetative and reproductive development. On the contrary, tobacco plants with moderate expression of the transgene (type II) were normal in their vegetative and floral development, except for seed coat pigmentation. The type I transgenic lines exhibited shorter plants and reduced leaf length, but similar leaf breadth as compared to type II and WT. This is probably due to irregular cell size and shape, which here have been measured only on the most accessible cells, i.e., those of the epidermis. Epidermal cells, indeed, showed reduced length and increased breadth, leading to a reduced number of epidermal cells per unit area. This suggests an involvement of *MYB10.1* in plant cell growth and development, through interference with a similar regulatory system composed of TTG1(WD40)-GL3/EGL3(bHLH)-GL1(R2R3-MYB) that has been widely studied in *Arabidopsis* epidermal cell differentiation (Grebe, 2012). Although there are reports that sustain that "the trichomes of *Arabidopsis* and *Nicotiana* are merely analogous structures and that the *MYB* genes regulating their differentiation are specific and separate" (Payne et al., 1999) and that the *bHLH* genes have no effect on epidermal cell development in plants belonging to the Asterid division (Serna and Martin, 2006), we cannot exclude that the high expression levels of *MYB10.1* achieved by means of the pOp/LhG4 expression system (Rutherford et al., 2005) had an impact on cell fate determination, reducing cell sizes and thus decreasing plant and leaf growth.
Altered expression of MYB/bHLH TFs not only induced anthocyanin biosynthesis in tomato leaves, but also led to the up-regulation of *MIXTA-like* and *GLABRA2* (*GL2*) TFs, regulators of epidermal cell patterning and trichome differentiation (Outchkourov et al., 2018). Future molecular investigations on vegetative parts of type I plants and the development of lines with epidermis-specific expression will help to elucidate the genes and pathways affected by *MYB10.1* overexpression in vegetative parts.

The role of MYBs of the PAP/MYB10 clade in flower development was up to now limited to pigment and thus color development, whereas MYBs of other clades have been described to participate in cell patterning and differentiation. It is known that *AtMYB21*, *AtMYB24*, and *AtMYB57* are mainly expressed in the flowers and play a critical role in reproductive organ development (Shin et al., 2002; Li et al., 2006; Cheng et al., 2009; Dubos et al., 2010; Reeves et al., 2012). Yang et al. (2007) showed that *AtMYB24* was expressed in flowers, specifically in microspores and ovules, and plays an important role in anther development. Overexpression of *AtMYB24* caused pleiotropic phenotypes such as reduced plant height as well as defective anther development. Double mutant (*myb21, myb24*) analysis in *Arabidopsis* (Mandaokar et al., 2006; Reeves et al., 2012) showed that *AtMYB21* and *AtMYB24* are involved in floral organ development, such as flower opening, petal expansion, anther filament elongation, anther dehiscence, inhibition of lateral vascular development in unfertilized carpels, and abscission of sepals, petals, and stamens. In tobacco and petunia, the R2R3-MYBs most similar to *AtMYB21* and *AtMYB24* are *NtMYB305* and *EOBII* (see **Supplementary Figure S5**), which are involved in nectary gland formation (Liu et al., 2009) and biosynthesis of phenylpropanoid volatiles (Spitzer-Rimon et al., 2010; Colquhoun et al., 2011), respectively. Similarly, in cotton, *GhMYB24* (R2R3-MYB) is preferentially expressed in anthers/pollen (Li et al., 2013). Its overexpression in *Arabidopsis* leads to sterile plants, while low to moderate expression levels of the transgene produce fertile plants. A similar phenotype was observed when peach *MYB10.1* was overexpressed in tobacco plants, even though the two MYBs belong to different clades.
It has to be noted that peach also has a gene of the MYB21/24/57 subclade (see **Supplementary Figure S5**) whose expression in the first three whorls of the flower (**Supplementary Figure S6**) is much higher than those of the previously characterized *MYB10* genes (Rahim et al., 2014), and thus we could speculate that, also in peach, flower development is controlled by MYB24, whereas MYB10.1 controls anthocyanin synthesis. Nonetheless, the out-titration of MYB proteins achieved by heterologous expression with strong expression systems, such as the CaMV 35S promoter or a dual system as here, might affect the action of endogenous tobacco MYBs, interfering with functions that they do not (or only marginally) control in the original species.

In transgenic type I tobacco plants overexpressing peach *MYB10.1*, *NtMYB305* showed extremely low levels of expression, similar to those described in NtMYB305 RNAi lines (Liu et al., 2009). Thus, the nectary phenotype of *MYB10.1* overexpressing plants might be explained by the suppression of *NtMYB305*. More complicated is the explanation of the anther phenotype, as *NtMYB305* has been reported not to be expressed in this flower part (Liu et al., 2009), in contrast to its close homologues from *Arabidopsis* (MYB21/MYB24), cotton (Li et al., 2013), and peach (**Supplementary Figure S6**). Jasmonate (JA) controls several plant processes, including plant growth, fertility, development, anthocyanin accumulation, and defense, and within the signaling cascades activated by JA, the jasmonate-ZIM domain (JAZ) repressor proteins are the major components (reviewed in Browse, 2009; Pauwels and Goossens, 2011). Thines et al. (2007) demonstrated that the AtJAZ1 protein represses the transcription of JA responsive genes, and that application of exogenous JA initiates JAZ1 degradation. Qi et al. (2011) reported that JAZ proteins interact with bHLH TFs like transparent testa8 (TT8), GLABRA3 (GL3), and enhancer of GLABRA3 (EGL3) and R2R3-MYB TFs such as MYB75/PAP1 and GLABRA1 (GL1) to repress JA-mediated anthocyanin production and trichome formation. In the absence of JA, JAZ proteins bind to the downstream TFs and limit their transcriptional activity, while the availability of JA leads to degradation of the JAZ proteins to free the downstream TFs for the transcriptional regulation of target genes (Chini et al., 2007; Qi et al., 2011). Jasmonic acid is involved in stamen development and pollen maturation process in plants.
It has been demonstrated (Mandaokar et al., 2006) that *AtMYB21* and *AtMYB24* are induced by JA, while JAZ proteins interact with AtMYB21 and AtMYB24 to decrease their transcriptional function (Song et al., 2011); upon perception of JA signal, COI1 recruits JAZs to the SCFCOI1 complex for ubiquitination and degradation through the 26S proteasome to release AtMYB21 and AtMYB24, thus triggering transcription of various important genes for JA-mediated stamen development. Moreover, it has been shown that AtMYB21 acts within a negative feedback loop that regulates expression of multiple JA biosynthetic genes and, together with AtMYB24, also affects AtARF6 and AtARF8 activity and that a portion of the *myb21 myb24* flower phenotypes may be caused by decreased ARF activity (Reeves et al., 2012). The overexpression of *MYB10.1* also up-regulates the JA biosynthetic and signaling pathway genes *NtAOS* and *NtJAZd* (orthologous to the *AtJAZ1* gene) in type I transgenic tobacco flowers. This suggests an imbalance in JA action in these transgenic plants, which show a phenotype similar to JA-deficient mutants but increased levels of transcripts coding for JA synthesis and action. This imbalance occurs only when high levels of *MYB10.1* accumulate in transgenic plants, thus leading to accumulation of flavonoids. It is well known that flavonoids inhibit auxin transport (Peer et al., 2011), and thus the increase in flavonoid concentration might impair auxin action, leading also to reduced JA levels, which could explain the anther phenotype. On the contrary, type II and WT showed a normal expression pattern of JA-related genes, and flowers had normal development, probably because these plants could react to the higher *MYB10.1* levels by decreasing *NtAN2* and *NtMYB305* levels, thus keeping the MYB-bHLH-WD40 (MBW) complex below a threshold level that does not stimulate excessive flavonoid production. Li et al.
(2004) characterized a tomato mutant, *jasmonic acid–insensitive1* (*jai1*), with defects in JA signaling resulting in female sterility. They also reported that the sterility was due to a defect in the maternal control of seed maturation linked with the loss of accumulation of JA-regulated proteinase inhibitor proteins in reproductive tissues. Thus, besides auxin, an impaired JA metabolism might be among the causes of altered female fertility in type I tobacco plants overexpressing *MYB10.1*. Indeed, type I plants exhibit a block during female gametophyte development, in particular during the FG1 stage (St3-I), with respect to the WT. The *Arabidopsis* AtARF6 and AtARF8 are also expressed in the ovule and the embryo sac and, with auxin, are crucial for proper gynoecium maturation (Reeves et al., 2012).

The plant hormone auxin is important for gametophytic developmental processes, and it was observed that manipulation of its levels results in changes in cell fate (Panoli et al., 2015). Disruption of auxin gradients due to flavonoid inhibition of auxin transport might be among the causes of the arrest of megagametophyte development.

At the same time, ethylene plays an important role in the early stages of female sporogenesis and ovule fertilization in tobacco (De Martinis and Mariani, 1999). Suppression of a pistil-specific *NtACO* gene caused female sterility due to an arrest in ovule development in transgenic tobacco plants. Type I transgenic tobacco flowers showed the opposite result with higher expression level of *NtACO*, possibly to balance the altered JA and IAA levels or because of a feedback mechanism trying to overcome the block of ovule development.

The nectary is rich in carbohydrates and secretes nectar to attract pollinators, as well as defending floral reproductive tissues against microorganisms (Carter et al., 2007). The tobacco nectar contains five different types of nectarin proteins (NEC1 to NEC5), of which NEC1 is the most abundant, and its expression is restricted to the nectary (reviewed in Carter and Thornburg, 2004). The expression of *NtNEC1* is regulated by tobacco *NtMYB305* (Liu et al., 2009). The expression levels of *NtMYB305*, *NtNEC1*, *NtNEC5*, and anthocyanin biosynthetic genes such as *NtPAL* and *NtCHI* were reduced in tobacco in an *NtMYB305* knockdown experiment. The type I transgenic flowers also showed similar effects. These transgenic flowers were defective in floral development and had no nectary gland; this could be due to either the extremely low expression of *NtMYB305*, which could not activate transcription of *NtNEC1* in type I transgenic tobacco flowers, or a direct interference of MYB10.1 with *NtNEC1* transcription. Considering these findings, it is hypothesized that the overexpression of *MYB10.1*, by altering IAA/JA/ethylene levels, suppresses *NtMYB305* expression, thus causing defective flowers/missing nectary.

Based on the present results obtained through transgenic plant analysis, it can be concluded that overexpression of *MYB10.1* in tobacco regulates anthocyanin biosynthesis in the reproductive parts. Furthermore, its misregulation interferes, directly or indirectly, with other processes such as vegetative and reproductive development, suggesting caution when choosing MYBs and promoters to induce the synthesis of anthocyanins to improve plant nutritional quality (Appelhagen et al., 2018).

#### AUTHOR CONTRIBUTIONS

LT and MR designed the research. MR, FV, and FR conducted the experiments. LT and MR analyzed data. FR, MR, and LT wrote the manuscript. All authors read and approved the manuscript.

#### FUNDING

This research work was supported by Ministero delle Politiche Agricole Alimentari e Forestali-Italy through the project "DRUPOMICS" (grant DM14999/7303/08) and University of Padova (grant CPDA072133/07). MAR was supported by a "Fondazione CARIPARO" fellowship.

#### SUPPLEMENTARY MATERIALS

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01143/full#supplementary-material




**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Rahim, Resentini, Dalla Vecchia and Trainotti. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# *REM34* and *REM35* Control Female and Male Gametophyte Development in *Arabidopsis thaliana*

*Francesca Caselli1, Veronica Maria Beretta1, Otho Mantegazza1, Rosanna Petrella1, Giulia Leo1, Andrea Guazzotti1, Humberto Herrera-Ubaldo2, Stefan de Folter2, Marta Adelina Mendes1, Martin M. Kater1 and Veronica Gregis1\**

*1 Dipartimento di Bioscienze, Università degli Studi di Milano, Milan, Italy, 2 Laboratorio Nacional de Genómica para la Biodiversidad, Unidad de Genómica Avanzada, Centro de Investigación y de Estudios Avanzados del Instituto Politécnico Nacional, Irapuato, Mexico*

#### *Edited by:*

*Sergio Lanteri, University of Turin, Italy*

### *Reviewed by:*

*Gabriela Carolina Pagnussat, National University of Mar del Plata, Argentina Maria Beatrice Bitonti, University of Calabria, Italy*

> *\*Correspondence: Veronica Gregis veronica.gregis@unimi.it*

#### *Specialty section:*

*This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science*

*Received: 18 March 2019 Accepted: 01 October 2019 Published: 24 October 2019*

#### *Citation:*

*Caselli F, Beretta VM, Mantegazza O, Petrella R, Leo G, Guazzotti A, Herrera-Ubaldo H, de Folter S, Mendes MA, Kater MM and Gregis V (2019) REM34 and REM35 Control Female and Male Gametophyte Development in Arabidopsis thaliana. Front. Plant Sci. 10:1351. doi: 10.3389/fpls.2019.01351*

The REproductive Meristem (REM) gene family encodes transcription factors belonging to the B3 DNA binding domain superfamily. In *Arabidopsis thaliana*, the *REM* gene family is composed of 45 members, preferentially expressed during flower, ovule, and seed development. Only a few members of this family have been functionally characterized: *VERNALIZATION1* (*VRN1*) and, most recently, *TARGET OF FLC AND SVP1* (*TFS1*) regulate flowering time, while *VERDANDI* (*VDD*), together with *VALKYRIE* (*VAL*), controls the death of the receptive synergid cell in the female gametophyte. We investigated the role of *REM34*, *REM35*, and *REM36*, three closely related and linked genes similarly expressed in both female and male gametophytes. Simultaneous silencing by RNA interference (RNAi) caused about 50% of the ovules to remain unfertilized. Careful evaluation of both ovule and pollen development showed that this partial sterility of the transgenic RNAi lines was due to a postmeiotic block in both female and male gametophytes. Furthermore, protein interaction assays revealed that REM34 and REM35 interact, which suggests that they work together during the first stages of gametogenesis.

Keywords: gametophyte development, REM, transcriptional regulation, ovule, pollen, post-meiotic division, *Arabidopsis thaliana*

#### INTRODUCTION

In higher plants, the alternation between the diploid sporophytic generation and the haploid gametophytic generation is a fundamental characteristic of the life cycle. The formation of the gametophyte from the sporophyte is the result of two sequential processes, sporogenesis and gametogenesis. Angiosperms are heterosporous plants, characterized by the production of two types of unisexual gametophytes, the megagametophyte (embryo sac) and the microgametophyte (pollen). Development of both female and male gametophytes can be divided into two main steps: sporogenesis, during which meiosis occurs giving rise to haploid spores, and gametogenesis, which leads to the formation of the gametes (Berger and Twell, 2011).

In Arabidopsis, the female gametophyte develops in the gynoecium. The first step of megasporogenesis consists of the formation of the ovule primordia, in which one cell differentiates into the megaspore mother cell (MMC) or megasporocyte; the MMC undergoes meiosis, giving rise to four haploid megaspores. Only one of them, the functional megaspore, continues its development and goes through three mitotic divisions, forming a mature embryo sac composed of eight nuclei and seven cells: three antipodal cells, one central cell containing two polar nuclei, and one egg cell flanked by two synergids (Mansfield and Briarty, 1991).

In the anthers, the microspore mother cell gives rise, through meiosis, to four microspores, which develop into mature pollen grains, containing two sperm cells surrounded by the vegetative cell (Hafidh et al., 2016).

The transition from sporogenesis to gametogenesis is directly correlated with the cell cycle transition from meiosis to mitosis. During gametogenesis, the number of mitotic divisions (two for the male and three for the female gametophyte) has to be tightly regulated and coordinated with cytokinesis. This cell division process is complex and requires the integration of different pathways such as those involved in cell cycle progression, chromatin modifications, and hormonal signaling. Moreover, mitotic progression during gametogenesis is also affected when interfering with basic biological processes like organelle and ribosome biogenesis (Shi et al., 2005; Li et al., 2009; Wang et al., 2012).

In both gametophytes, the retinoblastoma-related protein (RBR) plays a key role in the regulation of the cell cycle by inhibiting cell cycle entry through repression of E2F transcription factors. The *rbr* mutation results in uncontrolled nuclear proliferation in both gametophytes (Ebel et al., 2004; Ingouff et al., 2006; Johnston et al., 2008). More recently, *RBR* was also associated with the activation of meiosis, when the MMC undergoes meiosis and subsequently forms the functional megaspore (Zhao et al., 2017).

In all eukaryotic organisms, cell cycle progression is tightly linked to the activation and degradation of different cyclin-dependent kinases (CDKs). During both female and male gametophyte development, the activity of two homologous RING finger E3 ubiquitin ligases, RHF1 and RHF2, is required for the degradation of the CDK inhibitor ICK4/KRP6, which allows the correct progression of the cell cycle. In the *rhf1 rhf2* double mutant, both female and male gametophytes fail to complete their development and are arrested at the FG1 and microspore stages, respectively (Liu et al., 2008).

The transcriptional activity in different cell types during plant development is dependent on epigenetic modifications, such as chromatin remodeling and histone modifications. Failure in the establishment of such modifications can cause different defects throughout the plant's life cycle. During gametogenesis, silencing of the *CHROMATIN-REMODELLING PROTEIN 11* (*CHR11*) within the embryo sac causes an arrest of nuclear proliferation from stage FG1 to FG5 (Huanca-Mamani et al., 2005). Furthermore, mutations in the histone acetyl transferase genes *HAM1* and *HAM2* cause an arrest in the early stages of both megagametogenesis and microgametogenesis (Latrasse et al., 2008).

Genetic studies have identified a large number of loci that control gametophyte development. Molecular cloning and characterization of some of them have provided insights into sporocyte formation, meiosis/mitosis, and gametophyte development. Detailed phenotypic and molecular characterization of mutants remains a big challenge, partly because such mutants are difficult to work with, as they are often partially sterile or even lethal (Muralla et al., 2011).

In the context of finding new players involved in the control of this process, the *REM* transcription factor gene family promises to be a good candidate, since two of the four REMs that have been functionally characterized, *VERDANDI* (*VDD* or *REM20*) and *VALKYRIE* (*VAL* or *REM11*), have a function in gametophyte development (Matias-Hernandez et al., 2010; Mendes et al., 2016). The other two members, *VERNALIZATION1* (*VRN1* or *REM5*) and *TARGET OF FLC AND SVP1* (*TFS1* or *REM17*), were shown to be involved in the control of flowering time (Levy, 2002; Sung and Amasino, 2004; Richter et al., 2019).

The expression patterns of *REM* genes were analyzed by Mantegazza et al. (2014), showing that the majority of the members of this family are preferentially expressed during flower and seed development. Through this analysis, we identified *REM34*, *REM35*, and *REM36*, which are mainly expressed in the reproductive meristems but also throughout different stages of flower development. *REM34*, *REM35*, and *REM36* are located in a cluster containing a total of nine *REM* genes on the fourth chromosome of *Arabidopsis*. *REM34*, *REM35*, and *REM36* are very similar, which might indicate a possible functional redundancy.

Insertional mutants already analyzed for *REM34* and *REM36* are not complete knock-outs and showed no visible phenotype whereas no insertional mutants are available for *REM35* (Mantegazza et al., 2014). Since these genes are located in linkage on the *Arabidopsis* genome, it is also practically impossible to obtain multiple mutant combinations by crossing the available mutant lines.

Therefore, in this study, we investigated the role of *REM34*, *REM35*, and *REM36* through their simultaneous downregulation by RNA interference. Plants in which at least *REM34* and *REM35* were down-regulated showed an early arrest in the development of both female and male gametophytes. The process of mega/microsporogenesis was not affected, and meiosis took place. However, the subsequent mitoses did not occur after spore formation, suggesting that these genes play a role in gametogenesis progression.

#### MATERIALS AND METHODS

#### Plant Material and Growth Conditions

All experiments were performed in *Arabidopsis thaliana* ecotype Columbia-0 (Col-0). Plants were grown in a controlled environment at 20–22°C either under long day conditions (16 h light/8 h dark) or under short day (8 h light/16 h dark) conditions for 4 weeks after germination and then transferred to long day conditions. The *suf4-1 pSUF4*:*SUF4*-*GUS* seeds were donated by S.D. Michaels. Tobacco plants were germinated and grown at 20–22°C under long day conditions.

#### RNA Interference and *35S:EAR\_REM34* Constructs

To obtain the *REM\_RNAi* construct, DNA fragments of 252, 232, and 254 base pairs specific for the coding sequence of each of the genes *REM34, REM35,* and *REM36* were selected (the primers used to amplify the fragments are listed in **Supplementary Table 1**). The specificity of the fragments was checked by BLAST against the Arabidopsis genome.

The three selected regions were PCR amplified, adding *BsaI* sites to the primers, and cloned in a single Golden Gate reaction into a pENTR™ vector previously modified to function as a Golden Gate acceptor, producing the pENTR-*RNAi\_REM* vector. The Gateway LR reaction (Invitrogen™ Gateway™ recombination cloning system) was then performed to sub-clone the *RNAi\_REMs* fragments into the pFGC5941 vector, which was used to transform Arabidopsis. Primers that were used are listed in **Supplementary Table 1.**

The EAR motif was added to the C terminus of the *REM34* coding sequence (see primer sequences in **Supplementary Table 1**). The fragment was cloned into the pB2GW7 plasmid (35S) passing through the pENTRY-D-TOPO vector (Invitrogen™ Gateway™ recombination cloning system). Arabidopsis plants were transformed using the floral-dip method (Clough and Bent, 1998).

#### Quantitative RT-PCR

Total RNA was extracted from whole inflorescences. RNA samples were treated with DNase (TURBO DNA-free®; Ambion, http://www.ambion.com/) and reverse-transcribed using the ImProm-II™ Reverse Transcription System (Promega). Diluted aliquots of the cDNAs or genomic DNA were used as templates in qRT-PCRs, using the iQ SYBR Green Supermix (Bio-Rad) to detect target synthesis. All the experiments were performed with three technical replicates for each of the three biological replicates, with the exception of the expression analysis of *REM34, REM35,* and *REM36* in the T1 *REM\_RNAi* plants, in the T1 35S:*REM34\_EAR* plants, and for T-DNA abundance evaluation. Primers employed for these analyses are listed in **Supplementary Table 1**.
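
The text does not state how relative transcript levels were computed from the qRT-PCR data. As a minimal sketch, assuming the widely used 2^-ΔΔCt calculation with a single reference gene and 100% amplification efficiency, the relative expression of a target gene could be derived from technical-replicate Ct values as follows. The function name and all Ct values are hypothetical and serve only to illustrate the arithmetic.

```python
from statistics import mean


def relative_expression(target_sample, ref_sample, target_control, ref_control):
    """Relative expression by 2^-ddCt (an assumed method, not stated in the text).

    Each argument is a list of technical-replicate Ct values:
    target/reference gene in the sample of interest and in the control.
    """
    delta_sample = mean(target_sample) - mean(ref_sample)
    delta_control = mean(target_control) - mean(ref_control)
    return 2 ** -(delta_sample - delta_control)


# Hypothetical Ct values (three technical replicates each), for illustration only.
fold = relative_expression(
    target_sample=[26.1, 26.0, 25.9],   # target gene, transgenic line
    ref_sample=[20.0, 20.1, 19.9],      # reference gene, transgenic line
    target_control=[24.0, 24.1, 23.9],  # target gene, wild type
    ref_control=[20.0, 20.0, 20.0],     # reference gene, wild type
)
print(f"relative expression vs. control: {fold:.2f}")
```

With these made-up values the target Ct rises by two cycles relative to the reference, so the sketch reports roughly a quarter of the control level. Real analyses would also need to account for primer efficiency and biological replicates.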

#### Silique Length, Seed Number Evaluation, and Reciprocal Crosses

For each line, 10 siliques (dissected from three different plants) were measured, and seed, aborted seed, and non-fertilized ovule numbers were counted. For this purpose, a Leica® MZ 6 microscope was used.

For the reciprocal crosses between wild-type and *REM\_RNAi*  #1 plants, mature siliques as well as open flowers and buds in an advanced stage of development were removed from the inflorescence of the mother plant, along with the meristem and smallest buds. Remaining buds were emasculated by removal of all floral organs except for the ovary. Then, anthers in the correct stage of development were taken from other flowers and used to pollinate the stigma. The numbers of seeds and unfertilized ovules were assessed for at least five pistils for each cross, and three biological replicas of the experiment were performed.

#### *In Situ* Hybridization Analysis

*In situ* hybridization analyses for *REM34, REM35,* and *REM36* were performed following the same protocol and employing the same probes described by Mantegazza et al. (2014). Evaluation of the expression profile in the inflorescence and flower meristems was used as a positive control.

#### Protein–Protein Interaction Analysis

Yeast two-hybrid assays were performed in the yeast strains PJ69-4A and PJ69-4α (de Folter and Immink, 2011). The coding sequences of *REM34, REM35,* and *REM36* were cloned in the pDEST32 (bait vector, BD; Invitrogen) and pDEST22 (prey vector, AD; Invitrogen) Gateway vector. The bait constructs were tested for autoactivation on selective yeast synthetic dropout medium lacking Leu, Trp, and His supplemented with 1, 3, 5, 10, or 15 mM of 3-aminotriazole, in order to set the screening conditions. After mating, colonies were plated on the proper selective media and grown for 5 days at 20°C.

The same coding sequences were also cloned in the pYFPN43 and pYFPC43 vectors, to perform the BiFC assay. Agrobacterium, transformed with the vectors and the viral suppressor p19 construct, was used to infiltrate tobacco leaves. The abaxial surfaces of infiltrated leaves were imaged 3 days after inoculation. As positive control for the infiltration, the already published VAL-VDD interaction was tested (Mendes et al., 2016). As negative controls, the constructs containing the proteins of interest were co-transformed with the empty pYFN43 and pYFC43 vectors. Furthermore, REM34 homodimerization, which was not observed in the Y2H assays, was also employed as a negative control (**Supplementary Figure 5**).

#### Female Gametophyte Characterization

Female gametophytes were cleared and analyzed as previously described by Brambilla et al. (2007). Inflorescences were prepared for observation using the following protocol: flowers were emasculated and the next day harvested. The emasculated pistils were left O/N at 4°C in a 1:9 acetic acid:ethanol solution. Samples were rehydrated by subsequent washes with ethanol 90 and 70% and then incubated O/N at 4°C in clearing solution (160 g chloral hydrate, 50 g glycerol, and H2O to a final volume of 250 ml). Pistils at different developing states were separated from the other floral organs and opened to evaluate the female gametophyte morphology. For these experiments, a Zeiss Axiophot® microscope equipped with differential interference contrast (DIC) optics was used.

#### *In Vitro* Pollen Germination

For this experiment, the protocol published by Bou Daher et al. (2009) was followed applying minor modifications.

Pollen grains were plated on small glass plates containing 2.5 ml of pollen germination medium [PGM: 18% sucrose, 0.01% boric acid, 1 mM CaCl2, 1 mM Ca(NO3)2, 1 mM MgSO4, 0.5% agarose, pH 7]. The plates were incubated overnight at 22°C, with wet paper to maintain humidity. The next day, pollen germination and growth were evaluated with a Zeiss Axiophot® microscope.

#### Aniline Blue Staining

Flowers were emasculated and, after 24 h, pollinated. The pollinated ovaries were collected at two different time points: 5 and 24 h after pollination. Samples were fixed and stained overnight in absolute ethanol/glacial acetic acid 9:1, as previously described by Mori et al. (2006). Subsequently, they were transferred into an 8 M NaOH solution for 1 h at 50°C. Finally, the carpels were washed twice with ddH2O for 10 min. The staining was performed with a modified aniline blue solution (aniline blue 2%, glycerol 1 M ddH2O) (Takeuchi and Higashiyama, 2016). Samples were stored at 4°C for 3 h or overnight. The observation was done under UV light (350–400 nm) with a Zeiss Axiophot® microscope.

#### Pollen DAPI Staining

Pollen was stained according to Park et al. (1998). Mature pollen was obtained by placing 3–4 open flowers in a microcentrifuge tube containing 300 µl of 4′,6-diamidino-2-phenylindole (DAPI) staining solution (0.1 M sodium phosphate (pH 7), 1 mM EDTA, 0.1% Triton X-100, 0.4 µg/ml DAPI high grade, Sigma). After brief vortexing and centrifugation, the pollen pellet was transferred to a microscope slide and observed with a Zeiss Axiophot® microscope. Pollen at earlier stages of maturation was also analyzed by dissecting single anthers. Anthers were disrupted on microscope slides and squashed in DAPI staining solution (1 µg/ml) under a coverslip.

#### GUS Staining

β-Glucuronidase (GUS) assays were performed as described by Resentini et al. (2017). Pistils at different developmental stages were dissected and fixed in acetone 90% and incubated O/N at 37°C. After staining, they were cleared using the protocol described above.

#### Alexander Staining for Pollen Grains

Staining of pollen grains was performed as described by Peterson et al. (2010). After fixation (performed with 6 alcohol:3 chloroform:1 acetic acid), the anthers were placed on a microscope slide with a few drops of staining solution (10 ml 95% alcohol, 1 ml malachite green (1% solution in 95% alcohol), 50 ml distilled water, 25 ml glycerol, 5 ml acid fuchsin (1% solution in water), 0.5 ml orange G (1% solution in water), 4 ml glacial acetic acid, and distilled water (4.5 ml) to a total of 100 ml). Samples were analyzed with a Zeiss Axiophot® microscope.

#### CLSM Analysis

For confocal imaging, the Laser Scanning Confocal Microscope Nikon A1 was used. Inflorescences were fixed as described by Braselton et al. (1996). Samples were then excited using a laser (532 nm), and emission was detected between 570 and 740 nm.

#### RESULTS

#### RNAi Mediated Silencing of *REM34, REM35, and REM36*

Since *REM34, REM35,* and *REM36* are very similar and in linkage, an RNA interference approach was adopted to investigate their role during reproductive development in Arabidopsis*.*

Due to sequence divergence, even in the B3 DNA binding domain (Romanel et al., 2009), it was impossible to design a single artificial small interfering RNA fragment able to silence the three *REM* genes simultaneously. Therefore, a multiple RNA interference (RNAi) technology was used to express a single chimeric double-stranded RNA that targeted the three *REM* genes under the control of CaMV35S (Miki et al., 2005; Bucher et al., 2006) (**Figure 1A**).
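
The layout of such a chimeric hairpin (a sense arm made of the joined gene-specific fragments, an intron spacer, and the antisense arm) can be sketched in a few lines. This is only an illustration of the general inverted-repeat design: the sequences below are invented placeholders, not the actual *REM34*, *REM35*, *REM36*, or chsA intron sequences, and the function names are hypothetical.

```python
# Illustrative only: all sequences are made-up placeholders.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def build_hairpin(fragments, intron):
    """Join the fragments into one sense arm, then append the intron spacer
    and the reverse complement of the sense arm (the antisense arm)."""
    sense = "".join(fragments)
    return sense + intron + reverse_complement(sense)


# Hypothetical short stand-ins for the three gene-specific fragments.
fragments = ["ATGGCTAAC", "GGATCCTTA", "CCGTTAGCA"]
intron = "GTAAGTTTTCAG"  # placeholder spacer, not the chsA intron

hairpin = build_hairpin(fragments, intron)
sense_len = sum(len(f) for f in fragments)

# The antisense arm pairs with the sense arm, so the transcript can fold back.
assert reverse_complement(hairpin[-sense_len:]) == hairpin[:sense_len]
print(len(hairpin))
```

For the fragment lengths given in the Materials and Methods (252, 232, and 254 bp), the same layout would give a sense arm of 738 bp, matched by an antisense arm of equal length.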

We selected three regions specific for the coding sequence of *REM34*, *REM35*, and *REM36*. The regions selected for *REM34* and *REM36* are highly specific for the genes of interest and were expected not to have any off target in the Arabidopsis genome. The RNAi fragment that targets *REM35* has a partial complementarity with *REM36*, and, at a lower level, with *REM37*, whose expression is almost undetectable in most Arabidopsis tissues (Mantegazza et al., 2014; Klepikova et al., 2016).

Forty *REM\_RNAi* T1 transgenic *Arabidopsis* lines were obtained. We evaluated the down-regulation of the *REM* genes in nine different T1 lines (**Figure 1B**), which all showed defects in silique and gametophyte development.

Silencing of the three target genes was confirmed in the T2 generation by qRT-PCRs (**Supplementary Figure 1**). Furthermore, we showed that the RNAi construct was specific for its targets by testing the expression of *REM37* and *REM39*. The latter was chosen because *REM39* is highly expressed in the tissues where *REM34, REM35*, and *REM36* are also active (Mantegazza et al., 2014; **Supplementary Figure 1**).

#### *REM\_RNAi* Lines Have a Reduced Ovule Number and Seed Set Compared to Wild-Type Plants

We selected three *REM\_RNAi* lines (#1, #4, and #5), with different levels of silencing of *REM36,* for further investigations in the T2 generation. In line #1, *REM36* showed a downregulation of around 50%, while in lines #4 and #5, *REM36* was found to be slightly upregulated compared to the wild-type (**Figure 1B**).

FIGURE 1 | Multiple RNA interference lines. (A) Schematic representation of the RNAi construct. The REM34-REM35-REM36 sense and antisense fragments are separated by the chsA intron, to allow the hairpin structure formation. (B) qRT-PCR on nine different *REM\_RNAi* T1 inflorescences, showing a strong downregulation of *REM34* and *REM35* and different levels of *REM36* expression*.*

In the T2 generation, silique length and seed number were evaluated for the three selected lines. The *REM\_RNAi #T2.1* line showed a decrease of 35.3% in the silique length and a 19.4% reduction in total ovule number (**Figures 2A**–**C**). Furthermore, on average 66% of the ovules failed to be fertilized (**Figures 2C**, **D**). The other two *REM\_RNAi* lines, #*T2*.4 and #*T2*.5, showed a similar phenotype even if the percentage of unfertilized ovules was lower, 35.3 and 45.4%, respectively (**Figure 2C**).
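
The percentages above can be related to the absolute wild-type means reported in the Figure 2 legend (13.4 mm silique length and 47.4 ovules per silique). The short sketch below simply redoes that arithmetic; the helper names are hypothetical, and the seed and ovule counts passed to `percent_unfertilized` are invented to show the calculation, not taken from the data.

```python
def reduced_value(wild_type_mean, percent_reduction):
    """Value expected after applying a percentage reduction to a wild-type mean."""
    return wild_type_mean * (1 - percent_reduction / 100)


def percent_unfertilized(seeds, unfertilized):
    """Share of ovules in a silique that were not fertilized, in percent."""
    return 100 * unfertilized / (seeds + unfertilized)


# Wild-type means from the Figure 2 legend; reductions reported for line #T2.1.
silique_mm = reduced_value(13.4, 35.3)   # about 8.7 mm
ovules = reduced_value(47.4, 19.4)       # about 38.2 ovules per silique

# Hypothetical counts for one silique, for illustration only.
example = percent_unfertilized(seeds=17, unfertilized=33)

print(f"{silique_mm:.2f} mm, {ovules:.1f} ovules, {example:.1f}% unfertilized")
```

Both derived values fall inside the ranges given in the Figure 2 legend for the *REM\_RNAi* lines (7.8 to 10.7 mm and 29.8 to 38.5 ovules), which is consistent with the percentages quoted in the text.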

The *REM\_RNAi #1* line was selected to further investigate the sterility phenotype caused by the downregulation of *REM34, REM35,* and *REM36.* This line was propagated to the T3 generation, where plants homozygous for the *REM\_RNAi*  construct were selected. Even if the RNAi construct has a dominant effect, we evaluated whether the sterility observed in the *REM\_RNAi T2* segregating lines was exacerbated in plants homozygous for the construct. For this purpose, the seed set of the *REM\_RNAi T3.1* homozygous line was evaluated.

Interestingly, comparing both the *REM\_RNAi #T2.1* and the *REM\_RNAi #T3.1*, we noticed that the percentage of ovule abortion was the same, suggesting that the silencing of REMs is probably acting both at the sporophytic and gametophytic levels.

Since the two lines in which *REM36* was not downregulated displayed a milder phenotype compared to the *REM\_RNAi #T2.1 and REM\_RNAi #T3.1* lines, in which all three genes were downregulated, it is possible that *REM36* is partially redundant to *REM34* and *REM35* during gametophyte development. On the contrary, the ovule number was the same in all three *REM\_RNAi*  lines (**Figures 2A**, **C**), indicating that *REM36* is not involved in the determination of the ovule primordia number.

To further confirm that no phenotypic differences were detectable between plants homozygous and heterozygous for the T-DNA insertion, we analyzed the silique content of 10 *REM\_RNAi #T2.4* and 10 *REM\_RNAi* #*T2.5* T2 plants in which the construct was still segregating, and we found no significant differences among the herbicide-resistant plants (**Supplementary Figures 2A, C**). For both *REM\_RNAi #T2.4* and *#T2.5* lines, a relative evaluation of T-DNA copies in each of the nine plants considered was performed. The RT-PCR analyses showed varying amounts of T-DNA amplicons, which were clearly unrelated to the ovule abortions and the overall seed set observed in all the *REM\_RNAi #T2.4* and *#T2.5* individuals analyzed (**Supplementary Figure 2**). The *ACTIN7* amplicon was used as normalizer, and the herbicide resistance BAR gene was used to estimate the abundance of T-DNA copies.

These analyses allowed us to exclude the possibility that the reduced seed set was linked to the presence of a heterozygous T-DNA insertion (Curtis et al., 2009; Clark and Krysan, 2010) and suggest that either the sporophytic silencing of *REM34*, *REM35,* and *REM36* affects the gametophyte or that the mobile siRNA diffuses from the sporophyte to the gametes (Mlotshwa et al., 2002; Melnyk et al., 2011; Skopelitis et al., 2018).

To understand if the reduced seed set was due to problems in the female or the male gametophyte, we performed reciprocal crosses between *REM\_RNAi #T3.1* and wild-type plants. As a control, both *REM\_RNAi #T3.1* (homozygous for the T-DNA triggering the RNAi silencing) and wild-type plants were manually selfed, in order to evaluate if the manipulation of the flower was affecting the fertility of the analyzed plants (**Figure 2E**).

When *REM\_RNAi #T3.1* pistils were pollinated with *REM\_RNAi #T3.1* pollen, 73.3% of the ovules failed to be fertilized, while wild-type lines manually pollinated with wild-type pollen resulted in 19.5% unfertilized ovules. When the *REM\_RNAi #T3.1* line pistils were pollinated with wild-type pollen, the percentage of unfertilized ovules was 78.6%, indicating a strong contribution of the female reproductive organ defects to this phenotype. Interestingly, when wild-type pistils were pollinated with *REM\_RNAi #T3.1* pollen, 61.0% of the ovules were still not fertilized (**Figure 2E**). Moreover, we observed a high variability in the number of unfertilized ovules using *REM\_RNAi* pollen, as shown in **Figure 2E**. Macroscopic inspection revealed a decrease in pollen grain number compared to wild-type anthers and a lack of adherence of the pollen to the wild-type stigma; both observations were further investigated (see below). All these considerations strongly suggest that both female and male reproductive organs are affected in the *REM\_RNAi* lines.
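
To make the comparison among the four crosses easier to follow, the reported percentages can be tabulated against the manually selfed wild-type control. The sketch below is only a reading aid: subtracting the wild-type × wild-type value as a baseline for the manipulation itself is an assumption made here for illustration, not a statistical analysis performed in the study.

```python
# Percentage of unfertilized ovules per cross, keyed as (female, male),
# using the values reported in the text.
crosses = {
    ("wild-type", "wild-type"): 19.5,
    ("REM_RNAi", "REM_RNAi"): 73.3,
    ("REM_RNAi", "wild-type"): 78.6,
    ("wild-type", "REM_RNAi"): 61.0,
}

# Assumed baseline: unfertilized ovules attributable to hand pollination alone.
baseline = crosses[("wild-type", "wild-type")]

# Excess over the baseline for each cross, in percentage points.
excess = {pair: round(value - baseline, 1) for pair, value in crosses.items()}

for (female, male), points in excess.items():
    print(f"{female} (female) x {male} (male): +{points} points over control")
```

Read this way, both crosses involving only one transgenic parent remain far above the control, in line with the conclusion that the female and the male side each contribute to the reduced seed set.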

#### *REM34, REM35,* and *REM36* Are Expressed in Both Female and Male Reproductive Organs in Adjacent Sporophytic and Gametophytic Cells

Previously, the expression pattern of the *REM* genes was characterized in the shoot apex by *in situ* hybridization analysis, showing that *REM34, REM35,* and *REM36* are expressed from the earliest stages of reproductive development of *Arabidopsis* in the inflorescence meristem and flower meristem and during the first stages of flower development with the exception of sepals (Franco-Zorrilla et al., 2002; Mantegazza et al., 2014).

In order to analyze the expression profiles in more detail during male and female sporophytic/gametophytic development, we performed *in situ* hybridization analysis for *REM34, REM35,* and *REM36* in both female and male reproductive organs. The flower stages are described according to Smyth et al. (1990) and Schneitz et al. (1995).

In Arabidopsis, pollen mother cells differentiate inside the young anther, ovule primordia arise from the placenta in stage 8 of flower development, and differentiation is completed at stage 13.

At stage 8/9 of flower development (Smyth et al., 1990), hybridization signals were detected for all three genes, in the anthers where the pollen mother cells differentiate and within the carpels, although in this case, the signal was stronger in the placenta and ovule primordia (**Figure 3A**).

At stage 10, a strong signal was always detected in developing ovules and pollen (**Figure 3B**). Our analysis revealed that the timing of expression of the three REM genes coincided with male and female sporogenesis.

During subsequent stages of flower development, stages 11–12, gametophyte development initiates in both female and male organs. During these stages, a decrease in the signal was clearly observed in anthers (**Figure 3C**); during these stages, pollen reaches maturity, and the vegetative and generative cells are differentiated after mitosis (Park et al., 1998). In contrast, a strong signal was detected during ovule development when the surviving megaspore undergoes three rounds of mitosis and passes from stages 3-I to 3-V (Schneitz et al., 1995). Interestingly, when the ovule is at its very last stage of development, 3-VI (Schneitz et al., 1995), a strong signal was detected in the funiculus, in the innermost integument and, inside the mature female gametophyte, in the central cell region (**Figure 3D**).

FIGURE 2 | *REM\_RNAi* lines have shorter siliques and a reduced seed set compared to the wild-type. (A) Graph showing the mean length of 10 wild-type and 10 *REM\_RNAi #T2.1, #T3.1, #T2.4,* and #*T2.5* siliques. A wild-type silique measures on average 13.4 mm; the siliques from the different *REM\_RNAi* lines were found to measure on average between 7.8 and 10.7 mm (p < 0.01 for all comparisons with the wild-type; ANOVA and *post hoc* Tukey HSD test were used). (B) Example of wild-type and *REM\_RNAi #T2.1* siliques (bar = 5 mm). (C) Graph showing the mean number of ovules/silique in the wild-type and *REM\_RNAi #T2.1, #T3.1, #T2.4,* and *#T2.5* plants, divided into seeds and unfertilized ovules. Compared to the wild-type situation, in which each silique contains on average 47.4 ovules, the *REM\_RNAi* siliques have on average 29.8 to 38.5 ovules (p < 0.01 for all comparisons with the wild-type; ANOVA and *post hoc* Tukey HSD test were used). On average between 35.3 and 66.0% of ovules, depending on the line analyzed, failed to be fertilized, while no aborted ovules were detected in the wild-type situation (p < 0.01 for all comparisons with the wild-type; ANOVA and *post hoc* Tukey HSD test were used). (D) Example of wild-type and *REM\_RNAi #T2.1* seed sets (bar = 5 mm). (E) Reciprocal crosses analysis between wild-type and *REM\_RNAi #T3.1* plants. As a control, both wild-type x wild-type and *REM\_RNAi #T3.1* x *REM\_RNAi #T3.1* crosses were performed. Crosses are indicated female x male (p < 0.01 for all comparisons with the wild-type of the unfertilized ovule number; ANOVA and *post hoc* Tukey HSD test were used).

FIGURE 3 | *REM34*, *REM35,* and *REM36* expression patterns. (A) In flowers at ST8-9, the signal in the carpel is restricted to the tissue of the placenta and ovule primordia. At the same stage, a clear signal is also visible in the anthers. (B) At ST10-11, the signal is present in the ovules, which are completing megagametogenesis, and in the anthers, where the pollen grains are undergoing the first mitotic division. (C) At ST12, when pollen reaches maturity, the signal is no longer visible in the anthers. (D) In flowers at anthesis, the target genes are expressed in the mature female gametophytes, in particular in the funiculus, inner integuments, and central cell. Flower stages are described according to Smyth et al., 1990 (bar = 20 µm).

The expression analysis of *REM34, REM35,* and *REM36* showed that these three *REMs* also share a similar expression pattern during anther/pollen and carpel/ovule development.

The expression patterns of *REM34, REM35*, and *REM36*, combined with the phenotypes observed in the *REM\_RNAi* lines, indicate an important role for these genes in the development and production of viable male and female structures and gametes.

#### In *REM-RNAi* Lines the Female Gametophyte Is Unable to Complete Its Development

The expression profile of *REM34, REM35,* and *REM36* suggests that these genes play a role during ovule development. Furthermore, the reciprocal crosses showed that between 73.3 and 78.6% of the ovules in the *REM\_RNAi #T3.1* line, which is homozygous for the RNAi cassette, were not fertilized (**Figure 2E**).

Based on this evidence, we hypothesized that the ovule defects in the *REM\_RNAi* lines might be due to an arrest in ovule development. Therefore, a detailed evaluation of female gametophyte development was carried out in the homozygous *REM\_RNAi #T3.1* line. In this line, 42.9% (227/529) of the ovules failed to complete their development and were arrested at the FG1 stage (**Figure 4A**). These ovules were characterized by an embryo sac containing one large cell, the functional megaspore, with a single nucleus; the remaining ovules completed their development and reached the FG7 stage (**Figures 4B**, **C**). The same phenotype was observed in the *REM\_RNAi #T2.4* and *#T2.5* lines, both of which derived from hemizygous mothers (**Supplementary Figure 3**).
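The arrest frequency reported above can be cross-checked with a few lines of Python. The sketch below recomputes the percentage from the ovule counts given in the text (227 of 529) and adds a 95% Wilson score interval. The interval is an illustrative addition that is not part of the original analysis, and the helper name `wilson_interval` is hypothetical.

```python
import math


def wilson_interval(successes: int, total: int, z: float = 1.96) -> tuple:
    """Return the (low, high) Wilson score interval for a binomial proportion."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


# Ovule counts reported in the text for the REM_RNAi #T3.1 line
arrested, scored = 227, 529

fraction_arrested = arrested / scored
low, high = wilson_interval(arrested, scored)

print(f"arrested at FG1: {100 * fraction_arrested:.1f}%")  # prints 42.9%
print(f"95% Wilson interval: {100 * low:.1f}% to {100 * high:.1f}%")
```

The Wilson interval is used here only because it behaves well for proportions away from 50% and for moderate sample sizes; any other binomial interval could be substituted.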

To confirm that, in the *REM\_RNAi* lines, the defective female gametophytes were arrested at the FG1 stage, after meiosis, we crossed the *pSUF4:SUF4-GUS* marker line with the *REM\_RNAi #T3.1* line. In the *pSUF4:SUF4-GUS* marker line, GUS expression is not detectable during megasporogenesis, but it becomes visible after meiosis, once the functional megaspore is formed, and marks all the nuclei of the embryo sac (Resentini et al., 2017). In *REM\_RNAi #T3.1* pistils, both wild-type-like ovules, with more than one nucleus, and ovules arrested at the FG1 stage, with only the nucleus of the functional megaspore, expressed the *GUS* reporter (**Figure 4D**), suggesting that the defect in female gametophyte development was post-meiotic.

To investigate the arrest at the FG1 stage in detail, we carried out confocal laser scanning microscopy (CLSM) on developing *REM\_RNAi #T3.1* ovules. Feulgen staining clearly marked the cell walls of the ovule integuments and the dividing nuclei of the embryo sac, allowing the gametophytic developmental stages to be recognized. In the same *REM\_RNAi #T3.1* pistil, we observed ovules that developed normally until stage FG4 (**Figure 4E**) and ovules that were arrested at FG1, in which the embryo sac contains the functional megaspore and the three degenerating spores on top of it (**Figure 4F**).

FIGURE 4 | *REM\_RNAi #T3.1* female gametophyte characterization. (A) Analysis of cleared mature carpels of both wild-type (n = 11) and *REM\_RNAi #T3.1* (n = 13). In wild-type mature carpels, all the ovules reach the FG7 stage (542/542 ovules), while in the *REM\_RNAi* line, 227/529 ovules are arrested at the FG1 stage. (B), (C) Cleared ovules collected from both wild-type (B) and *REM\_RNAi #T3.1* (C) mature carpels. In the wild-type, 100% of the embryo sacs reach the FG7 stage, while in the RNAi line, about 43% of the embryo sacs (227/529) show an arrest at the FG1 stage (bar = 20 µm). (D) *pSUF4:SUF4-GUS* in the *REM\_RNAi* line. In the uppermost ovule, two nuclei are stained, indicating the progression of gametogenesis up to the FG4 stage. In the lowest ovule, one nucleus is stained, indicating an arrest at the FG1 stage. Arrowheads mark nuclei (bar = 50 µm). (E–F) CLSM analysis of *REM\_RNAi #T3.1* ovules. In the same carpel, it was possible to observe ovules progressing in their development (E) and ovules arrested at the FG1 stage (F); asterisks indicate three out of the four nuclei. v, vacuole; fm, functional megaspore (bar = 10 µm).

#### The *REM-RNAi* Lines Showed a Post-Meiotic Defect of the Male Gametophyte

From the analysis of wild-type carpels pollinated with *REM\_RNAi* pollen, we observed that 61.1% of the ovules were not fertilized, suggesting that the male gametophyte in these lines is also defective (**Figure 2E**). To understand the cause of this defect, we first carried out an *in vitro* pollen germination assay, which showed a 30% decrease in the germination rate of *REM\_RNAi #T3.1* pollen compared to the wild type (**Figures 5A**, **B** and **Supplementary Figure 4**).

The growth rate of *REM\_RNAi* pollen tubes and their ability to correctly target ovules were also evaluated *in vivo* by means of aniline blue staining (**Supplementary Figure 4**). The *REM\_RNAi* pollen tubes did not show any growth defect: they reached the end of the pistil at the same time as the wild-type pollen tubes, and the mature ovules were correctly targeted (**Supplementary Figure 4**). We noticed that, as mentioned before, the number of *REM\_RNAi* pollen grains appeared to be lower and that the pollen did not adhere well to the stigma papillae, which could explain the high variability observed in the backcrosses between wild-type pistils and *REM\_RNAi* pollen (**Figure 2E**).

To understand the cause of the male sterility phenotype, pollen grains were collected from mature anthers and treated with Alexander's stain, which colors viable pollen red. In the wild type, all the collected pollen was viable, whereas in the *REM\_RNAi #T3.1* anthers, 33.9% of the grains were not stained, indicating that those pollen grains were non-viable and did not appear to contain any cytoplasm (**Figures 5C**–**E**). Interestingly, the percentage of non-viable pollen grains in the *REM\_RNAi* line corresponds to the decreased germination capability observed *in vitro,* suggesting that the grains that are unable to produce a pollen tube are the degenerated ones.

To investigate the pollen defect in more detail, confocal laser scanning microscopy (CLSM) was used. **Figure 5F** shows wild-type pollen from a mature anther; the intine and exine layers were clearly distinguishable and, inside the pollen grain, the sperm cell and vegetative cell nuclei were stained. In contrast, in the *REM\_RNAi #T3.1* mature anthers, a high percentage of pollen grains appeared shrunken and empty; neither sperm nor vegetative cells were identified, although the intine and exine layers looked intact (**Figure 5G**).

To understand when the pollen grains degenerated, we visualized their nuclei with DAPI staining at different developmental stages (**Figures 6A**–**F** and **Supplementary Figure 3**). At the microspore stage, all *REM\_RNAi #T3.1* grains were characterized by the presence of a single bright nucleus localized at the center of the cell, indicating that the pollen, as in the wild-type, passed through meiosis correctly (**Figures 6A**, **D**). After meiosis, in the wild-type, the microspores underwent a first mitotic division that produced one vegetative and one generative nucleus (**Figure 6B**). Subsequently, the second round of mitosis led to the formation of the mature pollen grain, which contained the vegetative cell and two small sperm cells, each with a bright and elongated nucleus (**Figure 6C**). Interestingly, in *REM\_RNAi #T3.1* anthers, some grains were characterized by the lack of nuclei; this phenotype was also detectable at the tricellular stage (**Figures 6E**, **F** and **Supplementary Figure 3**).

Thus, after meiosis, *REM\_RNAi* anthers displayed both viable pollen grains, with two sperm cell nuclei and a distinct vegetative nucleus, and non-viable pollen grains, in which no DNA was detectable (**Figures 6E**, **F**). This is consistent with what was observed in the CLSM analysis.

All this evidence suggests that the degeneration of pollen grains observed in the *REM\_RNAi* lines could be due to a post-meiotic block in their development, a defect similar to the one observed in the female gametophytes.

#### REM35 Formed Homodimers and Heterodimers With REM34

REM transcription factors can form functional heterodimers (Mendes et al., 2016). To determine whether REM34, REM35, and REM36 could also function *via* dimer formation, yeast two-hybrid assays were performed. This approach revealed that REM35 interacts strongly with itself and with REM34, while no interactions were detected with REM36 (**Figure 7A** and **Supplementary Figure 5**).

All the interactions observed in the yeast two-hybrid assays were confirmed *in vivo* with a Bimolecular Fluorescence Complementation (BiFC) assay in *Nicotiana benthamiana* leaves (**Figures 7B**, **C** and **Supplementary Figure 5**). This finding suggests that REM34 and REM35 could act as heterodimers.

#### Downregulation of *REM34, REM35,* and *REM36* Altered Expression of Genes Involved in Post-Meiotic Divisions

As described above, the downregulation of *REM34, REM35,* and partially *REM36* resulted in a post-meiotic arrest in both female and male gametophytes, suggesting that these transcription factors could be involved in regulating mitosis progression during gametogenesis.

To elucidate the molecular mechanism causing this block, we used qRT-PCR to measure the expression levels of genes that control gametogenesis. We focused on genes that, when mutated or overexpressed, cause defects similar to those observed in the *REM\_RNAi* gametophytes. These genes were divided into three categories based on the biological process in which they are involved: ribosome biogenesis (*MDS, NLE*), cell cycle control (*RBR, KRP6*), and chromatin regulation (*HAM1, HAM2*).

*MDS*, which, together with *NLE,* is involved in the biogenesis of the 60S ribosomal subunit and is essential during megagametogenesis (Chantha et al., 2010), was downregulated in the *REM\_RNAi #T3.1* line. *KRP6*, a CDK inhibitor whose overexpression causes a block in mitosis progression during female and male gametophytic development, was also downregulated in the *REM\_RNAi* lines.

Among the genes involved in chromatin modifications, two histone acetyltransferases (HATs), *HAM1* and *HAM2,* were selected. Only *HAM2* was downregulated in the *REM\_RNAi #T3.1* line (**Figure 8**).

These results suggest an intricate interconnection among regulators and effectors that together ensure a correct gametogenesis program.
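Relative expression values of this kind are commonly derived from qRT-PCR cycle thresholds (Ct) with the 2^-ΔΔCt method. The sketch below assumes that method, which the text does not state explicitly, and follows the normalization described for the expression data (UBI as the reference gene, wild-type Col set to 1). All Ct values are hypothetical placeholders rather than measured data, and `relative_expression` is an illustrative helper name.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change by the 2^-ddCt method, assuming ~100% primer efficiency."""
    delta_sample = ct_target_sample - ct_ref_sample      # dCt in the sample
    delta_control = ct_target_control - ct_ref_control   # dCt in the control
    return 2 ** -(delta_sample - delta_control)


# Hypothetical Ct values (NOT from the study): one target gene and the UBI reference
col = {"target": 24.0, "UBI": 18.0}    # wild-type Col, used as calibrator
rnai = {"target": 25.5, "UBI": 18.0}   # an RNAi line

fold_col = relative_expression(col["target"], col["UBI"], col["target"], col["UBI"])
fold_rnai = relative_expression(rnai["target"], rnai["UBI"], col["target"], col["UBI"])

print(f"Col: {fold_col:.2f}, RNAi: {fold_rnai:.2f}")  # prints Col: 1.00, RNAi: 0.35
```

With these placeholder values, a Ct shift of 1.5 cycles relative to the calibrator corresponds to roughly a third of the wild-type transcript level, which is how a value below 1 should be read as downregulation.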

and unable to be stained in red (bar = 20 µm). (E) Mature anthers from wild-type and *REM\_RNAi #T3.1* flowers were dissected, and the released pollen was collected and treated with Alexander's stain to discriminate between viable and non-viable pollen grains. In the wild type, 100% of the pollen grains were viable, while 33.9% of *REM\_RNAi #T3.1* pollen was found to be non-viable (wt n = 1,337, *REM\_RNAi #T3.1* n = 874; p < 0.01 for all comparisons with the wild-type; ANOVA and *post hoc* Tukey HSD test). (F–G) CLSM analysis of wild-type (F) and *REM\_RNAi #T3.1* (G) mature anthers. All the wild-type grains are round and contain the vegetative nucleus and the two sperm cell nuclei; in the *REM\_RNAi #T3.1* anthers, it is possible to visualize both pollen grains at the two-nuclei stage and degenerated pollen grains without any visible nucleus (bar = 10 µm).

microspore stage (A and D), all the grains contain a well-defined central nucleus. At the bicellular stage, in all wild-type grains (B), the spermatic and vegetative nuclei are distinguishable, while in the *REM\_RNAi #T3.1* line (E), some grains, marked with an asterisk, do not display any nucleus. Wild-type mature pollen grains (C) are characterized by the presence of two sperm cells and one vegetative nucleus. In *REM\_RNAi #T3.1* mature pollen (F), the asterisk marks a mutant pollen grain without a nucleus (bar = 10 µm).

#### Overexpression of the REM34\_EAR Chimeric Protein

The genes that were downregulated in the *REM\_RNAi* lines might be targets of the REM transcription factors, which would suggest that REM34 and REM35 act as transcriptional activators. To investigate this possibility, we fused REM34 with the dominant EAR repressor domain (an approach known as chimeric repressor silencing technology, CRES-T) and transformed wild-type *Arabidopsis* plants with this construct. Five transgenic lines that overexpressed the *REM34\_EAR* chimeric gene at different levels in the T1 generation were obtained (**Figure 9A**).

In the T2 generation, silique length and ovule number were measured in two independent lines (*REM34\_EAR #T2.1* and *REM34\_EAR #T2.7*). In both selected T2 *REM34\_EAR* lines, we observed a decrease in silique length of 23.1 and 25.0%, respectively, and the presence of 55.0 and 42.6% aborted ovules, similar to what was observed in the *REM\_RNAi* lines (**Figures 9B**–**E**).

The phenotype of the aborted ovules was further evaluated in cleared mature carpels of the *REM34\_EAR #T2.1* and *#T2.7* lines. We detected both ovules at the FG7 stage, with the seven cells clearly distinguishable, and ovules at the FG1 stage, characterized by a single-cell embryo sac (**Figure 9F**).

To confirm the post-meiotic block in the male gametophyte as well, mature pollen of both the *REM34\_EAR #T2.1* and *#T2.7* lines was stained with DAPI. Similarly to what was observed in the *REM\_RNAi* lines, some pollen grains were able to reach the tricellular stage, while others appeared shrunken and degenerated, with no visible nuclei (**Figure 9G**).

The strong similarity between the *REM34\_EAR* and the *REM\_RNAi* phenotypes might suggest that the overexpression of the chimeric REM34\_EAR protein was causing co-suppression of other *REM* genes. To exclude this possibility, we investigated the expression level of *REM35, REM36, REM37,* and *REM39* in the *REM34\_EAR #T2.1* and *#T2.7* lines. The level of expression of the endogenous *REM34* was not taken into account, as the perturbation of *REM34* expression alone did not cause any evident phenotypic defects (**Supplementary Figure 6**) (Franco-Zorrilla et al., 2002; Mantegazza et al., 2014). The results showed that the closely related *REMs* were not affected, suggesting that the expression of the REM34\_EAR chimeric protein caused the observed phenotypes.

### DISCUSSION

#### Functional Analysis of *REM* Genes

The plant-specific REM family in Arabidopsis is composed of 45 genes, generated through multiple duplication events, which are mostly expressed during flower and ovule development (Romanel et al., 2009). Although the expression pattern of these genes suggests that they could play an important role in regulating developmental processes such as shoot architecture and flower development, only a few of them have so far been associated with a function (Levy, 2002; Matias-Hernandez et al., 2010; Mendes et al., 2016). This might be due to their functional redundancy, but also to the fact that they are often genetically linked in the genome.

showing the interactions between REM34 and REM35 and between REM35 and REM35, on –L-W-H + 2.5 3-AT selective media. Empty pDEST32 vector was employed as a negative control. (B–C) BiFC experiments in tobacco leaf cells showing the reconstituted YFP fluorescence (green) between (B) REM34 and REM35 fusions to the C- and N-terminal fragments of YFP, respectively, and (C) REM35 fusions to the C- and N-terminal fragments of YFP (bar = 50 µm).

wild-type and *REM\_RNAi #T3.1*. The expression of selected genes was normalized to that of UBI, and the expression level in Col was set to 1.

Here, we investigated the function of the linked duplicated *REM34, REM35,* and *REM36* genes by a multiple RNAi approach and showed that *REM34*, *REM35,* and partially *REM36* are involved in male and female gametophytic development during post-meiotic divisions. A similar multiple RNAi approach was previously employed to simultaneously silence up to six target genes in *Arabidopsis thaliana* (Czarnecki et al., 2016).

The *REM\_RNAi* construct was found to be a very efficient tool: by selecting specific gene sequences, we were able to silence the three target genes with a single construct and a single transformation event. Importantly, the construct proved to be highly specific for the three genes of interest, without any obvious off-target activity. Transgenic lines showing silencing of the *REM* genes under study were all characterized by a reduced seed set and an arrest in female gametophyte development at the earliest stages of gametogenesis. Since *REM34, REM35,* and *REM36* appeared to be mainly expressed in sporophytic tissues throughout Arabidopsis reproductive development, the CaMV35S promoter was chosen to drive the expression of the RNAi fragments. The activity of the CaMV35S promoter seems to be low during female and male gametophyte development, but it has been shown that this promoter can be successfully employed to silence genes during gametophytic development (Acosta-García and Vielle-Calzada, 2004; Mendes et al., 2016). A valid hypothesis is that the observed gametophyte phenotypes are caused indirectly by the silencing of *REM34, REM35,* and *REM36* in the female and male sporophytic cells. However, it is also important to consider that the RNAi construct is dominant and that it can trigger a non-cell-autonomous and systemic silencing signal, which might be maintained throughout the different phases of plant development (Mlotshwa et al., 2002; Melnyk et al., 2011; Skopelitis et al., 2018).

Since functional redundancy is a common phenomenon in plants (Briggs et al., 2006), this kind of RNAi approach will be helpful for the functional characterization of members of highly redundant families, especially those that are genetically linked. Furthermore, since silencing of genes by RNAi is often incomplete, this approach could favor the analysis of genes whose knock-out leads to lethality or complete sterility.

#### REM Protein Interactions

Protein interaction studies revealed that REM34 and REM35 were able to interact with each other, while no interaction was found with REM36. This supports the hypothesis that *REM36* might not be able to completely substitute for *REM34* and *REM35* functions. Interactions between REM factors have been reported before: VDD and VAL, two functionally characterized REM factors involved in synergid degeneration upon fertilization (Mendes et al., 2016), also interact with each other. Furthermore, both VAL and REM35 were able to form homodimers. These characteristics might well be a common feature of the REM family and, from the perspective of the guilt-by-association principle, it would be informative to analyze all possible REM protein interactions. The same approach was shown to be extremely useful for the characterization of the MADS domain transcription factor family, for which extensive protein–protein interaction studies effectively guided genetic studies and the functional characterization of many of its members (de Folter et al., 2005; Gregis et al., 2006; Fornara et al., 2008; Immink et al., 2009).

the mean length of 10 *REM34\_EAR #T2.1 #T2.7* siliques; compared to the wild-type, the two lines have a reduction in silique length (p < 0.01 for all comparisons with the wild type; ANOVA and *post hoc* Tukey HSD test). (C) Example of wild-type and *REM34\_EAR #T2.1* siliques (bar = 5 mm). (D) Graph showing the mean number of ovules/silique in the wild-type and *REM34\_EAR #T2.1 #T2.7* plants. Both lines were characterized by a reduction in the total seed set of around 10% compared to the wild-type (p < 0.01 for all comparisons with the wild type; ANOVA and *post hoc* Tukey HSD test). On average, between 55.0 and 46.2% of ovules, depending on the line analyzed, failed to be fertilized, while no aborted ovules were detected in the wild type (p < 0.01 for all comparisons with the wild type; ANOVA and *post hoc* Tukey HSD test). (E) Example of wild-type and *REM34\_EAR #T2.1* seed sets (bar = 5 mm). (F) Cleared ovules sampled from mature *REM34\_EAR #T2.1* carpels; the asterisk marks the one blocked at the FG1 stage (bar = 20 µm). (G) DAPI-stained pollen grains sampled from mature *REM34\_EAR #T2.1* anthers; some grains were able to reach the tricellular stage and showed fluorescent nuclei, while others appeared degenerated, with no visible nucleus (bar = 20 µm).

#### *REM34* and *REM35* Control Female and Male Gametogenesis

We discovered that, in the *REM\_RNAi* lines, both the male and female germ lines were able to go through meiosis correctly, but they were not able to pass the FG1 stage, suggesting a role for *REM34* and *REM35* in the control of gametogenesis in Arabidopsis.

Although the *REM* gene family was named after the specific meristematic expression of its first member *AtREM1,* which was later named *REM34* (Franco-Zorrilla et al., 2002), our data showed that *REM34, REM35,* and *REM36* are also expressed during gametophytic development: they were expressed from the specification of both carpel and anther primordia throughout all the stages of anther and carpel development. In the carpel, the signal is strongly localized in the placenta and ovule primordia and in the developing ovules.

Indeed, our detailed morphological analysis of both female and male gametophytes of the *REM\_RNAi* lines showed that 35 to 65% of the female gametophytes were unable to undergo mitosis and were arrested at the FG1 stage, when the MMC acquires functional megaspore identity.

*REM36* seemed to be partially redundant with *REM34* and *REM35.* Indeed, in the two lines in which the level of *REM36* expression was higher compared to the wild-type, the penetrance of the embryo sac defect was lower. However, in all *REM\_RNAi* lines, we also observed a decrease of around 20% in the total ovule number, irrespective of the expression level of *REM36.* Thus, *REM36* might be involved in embryo sac development together with *REM34* and *REM35* but does not control ovule primordia specification.

In these lines, pollen development was also affected, showing the same post-meiotic arrest as the embryo sac. Thus, in Arabidopsis, the REM34, REM35, and partially REM36 transcription factors seem to be required post-meiotically for gametophytic development.

Further confirmation of their role during both female and male gametogenesis came from the analysis of different *35S:REM34\_EAR* lines. These plants, in which *REM34* fused to the EAR repressor domain was overexpressed, showed the same post-meiotic arrest in both embryo sac and pollen development, suggesting that a complex formed by REM34 and REM35 could act as a positive transcriptional regulator of gametogenesis.

Because of the redundancy and genetic linkage of the three genes of interest, most of this functional study was conducted using RNAi. This approach was found to be very effective in silencing *REM34, REM35,* and *REM36,* but the transgenic lines cannot easily be employed for genetic studies, because the RNAi construct acts dominantly and because the level of silencing of the target genes can vary between lines and across generations. Despite these difficulties, the analyses performed on both segregating and homozygous lines suggest that these three genes can influence gametogenesis by acting mainly at the sporophytic level. This hypothesis is also supported by the expression pattern of these genes which, as shown by the *in situ* hybridization analysis, are expressed in the sporophytic tissues of both pistils and anthers when gametogenesis is taking place. The observation that *REM34, REM35,* and *REM36* appeared to be expressed throughout all stages of gametogenesis in the embryo sac leaves open the possibility that they play a direct role in the female gametophyte. The use of an embryo sac-specific promoter could help to validate this hypothesis and to better distinguish between the sporophytic and gametophytic roles of *REM34, REM35,* and *REM36.*

To understand how the *REM* genes under study act, we tested whether the downregulation of *REM34, REM35,* and *REM36* perturbed the expression of genes known to be involved in the progression of gametogenesis. These genes were classified according to their biological function into three categories: cell cycle control, chromatin remodeling, and ribosome biogenesis. Interestingly, we observed that several genes involved in different biological pathways were downregulated in the *REM\_RNAi* lines. This observation suggests that *REM34, REM35,* and, to some extent, *REM36* are involved in the control of very early steps of gametogenesis. In particular, they may regulate the expression of different targets, both directly and indirectly, along the genetic network that controls gametophytic development in *Arabidopsis*.

Among the downregulated genes, the one that stands out most is *HAM2*, a HAT that, together with its homolog *HAM1*, belongs to the MYST clade of the HAT family and was shown to be involved in the post-meiotic control of female and male gametophytic development (Latrasse et al., 2008). In mammals, the MYST protein family was found to be involved in many fundamental cell functions such as cell cycle progression and DNA repair (Pillus, 2008; Sapountzi and Côté, 2011). Furthermore, the human MYST4 acetylase was found to be expressed and involved in the control of gametogenesis as well (McGraw et al., 2007). In *Arabidopsis*, the *ham1 ham2* double mutant is lethal, while keeping one of the two genes heterozygous for the mutant allele resulted in a post-meiotic arrest of both female and male gametophyte development (Latrasse et al., 2008). This phenotype is similar to the one observed in the *REM\_RNAi* as well as in the *35S:REM34\_EAR* lines. Interestingly, *HAM1* and *HAM2* were also found to be involved in the control of flowering time *via* the epigenetic regulation of *FLOWERING LOCUS C* (*FLC*), which is also a target of VRN1, one of the four *REM* genes whose function is known so far, suggesting a common mechanism throughout plant development. The artificial silencing of these two acetyltransferases causes an early flowering phenotype (Xiao et al., 2013), which was also noticed in the *REM\_RNAi* lines (data not shown). The downregulation of *HAM2* and the phenotypic similarity between the *REM\_RNAi* lines and the *HAM* downregulation lines suggest that these genes might be involved in the control of the same biological processes throughout *Arabidopsis* development. Further analysis will be needed to confirm the possible interaction between *REM34, REM35,* and the HATs HAM1 and HAM2.
The observed downregulation of the other analyzed target genes could be due to the general deregulation of transcription caused by the reduced expression of the chromatin remodeling factor *HAM2.*

While not much is known about the *REM* gene family, substantial information is available for other transcription factor families characterized by the presence of the B3 DNA-binding domain. These include, in particular, the well-characterized *auxin response factor* (*ARF*) family, known to play a crucial role in regulating auxin responses, and the *related to ABI3/VP1* (*RAV*) family, which was found to be involved in hormonal regulation during different stages of *Arabidopsis* development (Swaminathan et al., 2008). The plant hormone auxin was found to be involved in gametogenesis (Pagnussat et al., 2009; Panoli et al., 2015). Indeed, perturbation of auxin transport in the embryo sac causes an arrest in the earliest stages of megagametogenesis (Ceccato et al., 2013). Auxin biosynthesis in the male gametophyte was also recently shown to be essential for the transition from microsporogenesis to microgametogenesis (Yao et al., 2018). Because of the phenotypic similarities between the auxin-defective mutants (Pagnussat et al., 2009; Panoli et al., 2015; Ceccato et al., 2013) and the *REM\_RNAi* lines, and because of the link between transcription factors containing the B3 DNA-binding domain and the regulation of hormonal responses, it is tempting to speculate that the role that *REM34, REM35,* and *REM36* play in gametogenesis is also based on the regulation of hormone-related processes.

In summary, we gained new information about the expression pattern and function of *REM34, REM35,* and *REM36* during gametophyte development in *Arabidopsis*; these genes might control post-meiotic divisions in both the embryo sac and pollen grains. These findings further underline the importance of *REM* genes during reproductive development in plants. Although these genes are often highly redundant and physically linked in the genome, we are slowly gaining a better understanding of their functions in plant development. Of course, we are seeing only the tip of the iceberg, and a large amount of work remains to be done to understand in detail the molecular and genetic mechanisms by which *REM* genes function.

#### DATA AVAILABILITY STATEMENT

All datasets generated for this study are included in the manuscript/**Supplementary Files**.

#### AUTHOR CONTRIBUTIONS

FC performed most of the experiments and wrote the manuscript. VB performed morphological analyses and contributed to writing the manuscript. OM and GL designed and employed the RNA interference and EAR constructs. MM designed and performed the CLSM experiment and performed part of the backcrosses. RP performed the BiFC experiments. HH-U and SF designed the Y2H screening and helped FC with the experiment. AG designed the *in vitro* pollen germination experiment and the aniline blue analyses. MK contributed to the design of the experiments and helped write the manuscript. VG designed the research, helped with the experiments, and wrote the manuscript.

#### FUNDING

VG was supported by Ministero dell'Istruzione, dell'Università e della Ricerca MIUR, SIR2014 MADSMEC, Proposal number RBSI14BTZR. The post-doctoral fellowship of AG was supported by MIUR, SIR2014 MADSMEC, Proposal number RBSI14BTZR.

The PhD fellowships of FC and RP were supported by the Doctorate School in Molecular and Cellular Biology, Università degli Studi di Milano. FC was supported by PROCROP-H20MC\_RISE15LCOLO\_M. RP was supported by H2020-MSCA-RISE-2015 ExpoSEED Proposal Number: 691109.

Work in the SF laboratory was financed by the Mexican National Council of Science and Technology (CONACyT) grants CB-2012-177739 and FC-2015-2/1061, and SF acknowledges support of the Marcos Moshinsky Foundation and the European Union H2020-MSCA-RISE-2015 project ExpoSEED (grant no. 691109).

#### ACKNOWLEDGMENTS

We thank Simona Masiero, Francesca Resentini, and Lucia Colombo for helpful suggestions and valuable discussions. We also thank Annamaria Piva, Radha Cighetti, and Francesco Gozzo (University of Milan); Toshiaki Mitsui and Marouane Baslam (Department of Applied Biological Chemistry, Graduate School of Science & Technology, Niigata University, Ikarashi, Nishi-ku, Niigata, Japan); and Ravishankar Palanivelu (School of Plant Science, University of Arizona, Tucson) for their technical support.

Part of this work was carried out at NOLIMITS, an advanced imaging facility established by the Università degli Studi di Milano.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01351/full#supplementary-material




**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Caselli, Beretta, Mantegazza, Petrella, Leo, Guazzotti, Herrera-Ubaldo, de Folter, Mendes, Kater and Gregis. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Natural Variation in Ovule Morphology Is Influenced by Multiple Tissues and Impacts Downstream Grain Development in Barley (Hordeum vulgare L.)

*Laura G. Wilkinson¹, Xiujuan Yang¹, Rachel A. Burton¹, Tobias Würschum² and Matthew R. Tucker¹*

¹ School of Agriculture, Food and Wine, University of Adelaide, Urrbrae, SA, Australia; ² State Plant Breeding Institute, University of Hohenheim, Stuttgart, Germany

#### Edited by:

Marta Adelina Mendes, University of Milan, Italy

#### Reviewed by:

Raffaella Battaglia, Council for Agricultural and Economics Research, Italy; Sureshkumar Balasubramanian, Monash University, Australia

\*Correspondence: Matthew Tucker matthew.tucker@adelaide.edu.au

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 03 July 2019; Accepted: 04 October 2019; Published: 31 October 2019

#### Citation:

Wilkinson LG, Yang X, Burton RA, Würschum T and Tucker MR (2019) Natural Variation in Ovule Morphology Is Influenced by Multiple Tissues and Impacts Downstream Grain Development in Barley (Hordeum vulgare L.). Front. Plant Sci. 10:1374. doi: 10.3389/fpls.2019.01374

The ovule plays a critical role in cereal yield as it is the site of fertilization and the progenitor of the grain. The ovule primordium generally comprises three domains, the funiculus, chalaza, and nucellus, which give rise to distinct tissues including the integuments, nucellar projection, and embryo sac. The size and arrangement of these domains vary significantly between model eudicots, such as *Arabidopsis thaliana*, and agriculturally important monocotyledonous cereal species, such as *Hordeum vulgare* (barley). However, the amount of variation in ovule development among genotypes of a single species, and its functional significance, remain unclear. To address this, whole-mount clearing was used to examine the details of ovule development in barley. Nine sporophytic and gametophytic features were examined at ovule maturity in a panel of 150 European two-row spring barley genotypes, and compared with grain traits from the preceding and same generation. Correlations were identified between ovule traits and features of the grain they produced, which in general highlighted negative correlations of nucellus area and ovule area with grain weight. We speculate that the amount of ovule tissue, particularly the size of the nucellus, may affect the timing of maternal resource allocation to the fertilized embryo sac, thereby influencing subsequent grain development.

Keywords: barley, ovule, nucellus, grain, pistil, yield

## INTRODUCTION

Barley is a cereal that has sustained humans for thousands of years and remains a crop of key agricultural and economic importance (Samuel 1996; Eitam et al., 2015). A large portion of the economic value of barley comes from the endosperm of the grain, which is a key source of calories for direct consumption by livestock and humans (Sands et al., 2009), as well as a source of protein and fermentable sugars for the malting and brewing industries (Gupta et al., 2010). The effects of climate change are predicted to negatively impact global barley yield over the next 50 years, and as such, efforts have been directed towards breeding elite barley genotypes with higher yield and robust tolerance to environmental stress (Tester and Langridge 2010).

Before grain can be produced, barley plants must generate floral heads, also referred to as inflorescences or spikes. In two-rowed barley, the central rachis of each spike is flanked by individual spikelets (flowers), which contain multiple florets of which only the central floret is fertile. All of the organs and tissues required for self-fertilization and seed production are located within this floret. This includes a single ovule within a single ovary (pistil), which together comprise the female reproductive organs that give rise to and protect the bulk of the tissues within the grain. The number of mature florets has been linked to barley yield (Alqudah and Schnurbusch, 2014), while the size and number of cells within the pistil have been linked to spike dry weight in wheat (Guo et al., 2015; Guo et al., 2016). Heat and drought stress have been shown to compromise aspects of pistil and ovule maturation, leading to defects in fertilization and grain development in wheat (*Triticum aestivum*) and maize (*Zea mays*; Saini et al., 1983; Jäger et al., 2008; Oury et al., 2016; Onyemaobi et al., 2018). Thus, correct development of the female reproductive organs is an important determinant of floral fertility and yield, especially under conditions of environmental stress. A greater understanding of ovule and pistil development in cereal crops may provide breeding targets for improved yield and yield stability.

The ovule establishes the basic framework for seed production and its development has been heavily studied in a range of species including *Arabidopsis* (Schneitz et al., 1995; Pinto et al., 2019) and rice (Itoh et al., 2005; Colombo et al., 2008), highlighting similarities in tissue development and function. In general, ovule primordia consist of three domains including a proximal connective tissue (the funiculus), central chalaza, and distal nucellus (Schneitz et al., 1995). The funiculus acts as a stalk to connect the ovule to the maternal plant in *Arabidopsis*, but is absent in cereal species (Engell 1994; Maheswari 1950). Instead, this role is fulfilled by the chalaza, which connects the ovule to the ovary and also gives rise to the integuments that surround the nucellus. The nucellus is located at the distal tip of the ovule and gives rise to the megasporocyte and embryo sac (germline; Shirley et al., 2018; Pinto et al., 2019). Similar to *Arabidopsis*, embryo sac development in monocotyledonous cereals such as barley follows the *Polygonum*-type (Willemse and van Went, 1984; Schneitz et al., 1995). However, barley ovules are much larger than those of *Arabidopsis* due to a multilayered nucellus that contributes up to 65% of the ovule area at maturity (Wilkinson and Tucker 2017). In general, cereal ovules are described as crassinucellar, meaning the megasporocyte is separated from the ovule epidermis by at least two cell layers (Wilkinson et al., 2018). This differs from *Arabidopsis*, which produces a small tenuinucellate nucellus dominated by a megasporocyte that directly adjoins the nucellar epidermis. In barley, the nucellus appears intermediate between the two forms in that it is large and multilayered but tenuinucellate during early development (Bennett et al., 1973; Engell, 1989).
Despite these differences, in both *Arabidopsis* and barley the nucellus gives rise to a single megasporocyte (megaspore mother cell) that undergoes meiosis to produce a tetrad of four reduced megaspores (megasporogenesis), one of which is selected to become the functional megaspore and initiate embryo sac development (megagametogenesis). In both barley and *Arabidopsis*, the embryo sac consists of seven different cell types with discrete functions. The positions of the egg apparatus, central cell, and antipodal cells are conserved, but the number of antipodal cells varies considerably from three in *Arabidopsis* to at least 30 in barley and other cereals (Brink and Cooper 1944; Diboll 1968; Chaban et al., 2011).

Different ovule tissues play distinct roles in downstream seed development. For example, the integuments differentiate into a seed coat that provides physical protection for the seed (Debeaujon et al., 2000), while the embryo sac gives rise to the endosperm and embryo after fertilization of the central cell and egg cell, respectively. Genes that influence integument growth or endosperm divisions in *Arabidopsis* have been implicated in the control of seed size (Garcia et al., 2005; Ingouff et al., 2006; Adamski et al., 2009; Batista et al., 2019), suggesting a role for both maternal and filial factors in the control of seed development. The role of the nucellus in downstream seed development is less clear, but the tissue is likely to fulfill distinct functions depending on the species. In *Arabidopsis*, the nucellus gives rise to the female germline before diminishing during subsequent pre-fertilization stages (Xu et al., 2016). In barley and wheat, similar to *Arabidopsis*, the nucellus gives rise to the germline. However, it subsequently increases in size and only diminishes after fertilization, when constituent cells undergo programmed cell death (PCD) and differentiate to form the nucellar projection (Dominguez et al., 2001; Thiel et al., 2008). This tissue fulfills a key role in funneling maternal nutrients into the endosperm through the endosperm transfer cells. Defects in differentiation of the nucellar projection, through down-regulation of the *Jekyll* gene for example, result in severe defects in grain fill (Radchuk et al., 2006). Previous studies in a small panel of barley genotypes suggested that nucellus size varies at ovule maturity, although the effect on grain development was not determined (Wilkinson and Tucker 2017). One possibility is that remobilization of reserves *via* nucellar PCD provides a local nutrient source for early endosperm development, and hence variation in nucellus size might impact key features of seed size and morphology.

This study aimed to define the bounds of normal ovule morphology in a population of 150 two-row spring barley genotypes, and investigate possible correlations between ovule morphology and mature grain traits. Correlations were identified between different ovule tissues, revealing that the nucellus and embryo sac both contribute to overall ovule size but in a genotype-dependent manner. Small but significant correlations were identified between mature ovule features and the grain they produced, and suggest that increased ovule and nucellus size may have a negative impact on grain size and weight.

## METHODS AND MATERIALS

#### Plant Growth

A panel of 150 European two-row spring barley genotypes, representing a sub-panel of the genotypes described by Comadran et al. (2012) that show limited population structure and similar flowering time, was sourced from the James Hutton Institute, Scotland. In 2014, plants were grown in one large glasshouse at The Plant Accelerator, Adelaide, Australia, in a 50:50 cocopeat:clay-loam soil mixture (v/v), under a 22/15°C day/night temperature regime with natural light conditions. Seeds were sown in May and harvested in September, at a density of one plant per pot. Harvested grain were hand-threshed, analyzed, and re-sown in 2015. The resulting plants were grown in two small glasshouses, in the same soil mixture, and under the same temperature regime, but with different day length due to later sowing (July). Plants were grown in triplicate with a randomized pot sequence. Pistils were harvested from all plants as described below and phenotypes were compared between the replicate samples in the two glasshouses (GH14 and GH17). Statistical analysis suggested there was a small but significant difference in mature ovule measurements; for example, the average ovule area in GH14 was 170,323.42 µm² (n = 573) compared to 176,697.89 µm² (n = 457) in GH17 (*t*-test, p = 0.001; **Figure S1A**). Ratios of the GH14 to GH17 measurements varied from 0.96 to 1.0 and suggested that most features were slightly smaller in GH14 (**Figure S1B**). Because ovules from all genotypes were sampled from both glasshouses, average values were used for all subsequent analyses. In September 2016, a selection of 10 genotypes (Cecilia, Forum, Gant, Akita, Optic, Host, Foxtrot, Wren, Salka, and Lina) was re-grown to assess whether differences in ovule measurements were reproducible.
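The glasshouse comparison can be checked with a short calculation. The sketch below uses only the two mean ovule areas reported above; the variable names are illustrative and are not taken from the study.

```python
# Mean mature ovule areas reported for the two glasshouses (µm²).
gh14_mean = 170_323.42  # n = 573 ovules
gh17_mean = 176_697.89  # n = 457 ovules

# A ratio below 1 means the feature was smaller in GH14 than in GH17.
ratio = gh14_mean / gh17_mean

# Relative difference, expressed as a percentage of the GH17 mean.
percent_smaller = 100 * (gh17_mean - gh14_mean) / gh17_mean

print(f"GH14/GH17 ratio: {ratio:.2f}")
print(f"GH14 ovules smaller by about {percent_smaller:.1f}%")
```

The ratio for ovule area sits at the lower end of the 0.96 to 1.0 range reported for all traits.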

#### Sample Collection and Microscopy

Pistils were dissected from anthesis stage florets based on physical appearance similar to stage 9.5 of the Waddington Scale (Waddington et al., 1983), and by the presence of bright yellow anthers that only released pollen when gently crushed. At least three pistils (five at most) were hand dissected from the middle of one inflorescence of all three replicates of each genotype, where possible. After clearing (see below), damaged or insufficiently cleared samples were discarded, eventually leaving between 4 and 15 mature ovules (stage Ov10; see **Figure S2**) for analysis per genotype.

In order to assess earlier stages of ovule development, the Waddington Scale (Waddington et al., 1983) was again used to stage florets during spike development. Five florets were collected from the middle of at least three inflorescences from Waddington stage 6 until fertilization. At stages prior to Waddington stage 8.5 whole florets were collected; for samples after this stage only the pistil was collected. Pre-anthesis samples were analyzed in 10 genotypes showing diverse morphology at anthesis, including Cecilia, Forum, Gant, Akita, Optic, Host, Foxtrot, Wren, Salka, and Lina. These stages are represented in **Figure 1** and **Figure S2** by the Salka genotype.

#### Clearing and Microscopy

Pistils were fixed in FAA (10% formalin, 5% glacial acetic acid, 50% ethanol, 35% Millipore H₂O, plus a drop of Triton X-100) overnight, then dehydrated through an ethanol series (3 × 30 min at each step of 70, 80, 90, 95, 100%) and placed into Hoyer's solution as described in Wilkinson and Tucker (2017). Ovules within the cleared pistil tissue were observed using differential interference contrast (DIC) microscopy and Nomarski prisms on a Zeiss Axio Imager M2, and captured as z-stack images encompassing the entire ovule from dorsal to ventral aspect in 40 optical sections. Composite images for figures were assembled in Adobe Photoshop and Illustrator (both version CC 2018; Adobe Inc., USA).

#### Quantitative Analysis of Mature Ovule Morphology

Nine morphological traits were measured from the z-stack images, using Zeiss Zen Blue (2012) software as described in Wilkinson and Tucker (2017). Each trait represents a one- or two-dimensional measurement, and data reflect the widest point of the region of interest visible within the z-stack. The nine measurements collected were: ovule area (OV\_A), ovule transverse width (OV\_T), ovule longitudinal height (OV\_L), embryo sac area (ES\_A), embryo sac transverse width (ES\_T), embryo sac longitudinal height (ES\_L), nucellus area (NUC\_A), nucellus proportion (NUC\_P), and integument width (INT\_W). Measurements were averaged from between 4 and 15 anthesis stage ovules (stage Ov10), representing at least two of the three replicate plants from each genotype. Low sample numbers (< 4 ovules) due to poor plant health, insufficient clearing, tissue damage, or staging errors resulted in elimination of 23 genotypes from the analysis, reducing the initial population of 150 genotypes to a functional population of 127. Integument "area" was not measured due to difficulties in accurately scoring the boundaries. Thus, what is presented as ovule area is essentially nucellus plus embryo sac area.
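The per-genotype averaging and filtering described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the dictionary keys, the toy ovule values, and the choice to derive nucellus proportion from the averaged areas are all assumptions made for the example.

```python
from statistics import mean

MIN_OVULES = 4  # genotypes with fewer scored ovules were excluded from analysis


def summarize_genotype(ovules):
    """Average per-ovule areas (µm²) for one genotype.

    `ovules` is a list of dicts with hypothetical keys "NUC_A" and "ES_A".
    Returns None when fewer than MIN_OVULES ovules were scored.
    """
    if len(ovules) < MIN_OVULES:
        return None
    nuc_a = mean(o["NUC_A"] for o in ovules)
    es_a = mean(o["ES_A"] for o in ovules)
    # Ovule area is treated here as nucellus plus embryo sac area,
    # because integument area was not scored.
    ov_a = nuc_a + es_a
    return {
        "OV_A": ov_a,
        "NUC_A": nuc_a,
        "ES_A": es_a,
        "NUC_P": 100 * nuc_a / ov_a,  # nucellus proportion in percent
    }


# Toy values for illustration only; they are not measurements from the study.
toy_ovules = [
    {"NUC_A": 120_000, "ES_A": 45_000},
    {"NUC_A": 125_000, "ES_A": 50_000},
    {"NUC_A": 130_000, "ES_A": 55_000},
    {"NUC_A": 125_000, "ES_A": 50_000},
]
summary = summarize_genotype(toy_ovules)
too_few = summarize_genotype(toy_ovules[:3])
```

With these toy values the nucellus makes up roughly 71% of the ovule area, and the three-ovule subset is rejected by the sample-size filter.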

#### Data Analysis

Genotypic variance components were obtained from linear mixed models with a random genotypic effect and their significance was tested by model comparison with likelihood ratio tests where the halved P values were used as an approximation (Stram and Lee 1994). Repeatability (R) was estimated from these models following the approach suggested by Piepho and Möhring (2007). Trait correlations, dendrograms, and principal component analyses were performed using default parameters in the "corrplot" package (https://cran.r-project.org/web/packages/corrplot/corrplot.pdf) in R with RStudio (R version 3.5.0; RStudio®, USA). Unless indicated otherwise, Pearson's correlation coefficients and associated significance values are shown. Figures were assembled in Adobe Illustrator CC 2018 (version 22.0.0, Adobe Inc., USA).
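The halved P value used for testing a single variance component can be illustrated with a short sketch. It assumes the likelihood ratio statistic is referred to a chi-square distribution with one degree of freedom, whose upper tail has the closed form erfc(sqrt(x/2)). The repeatability function shows a commonly used entry-mean formula as a stand-in; it is an assumption for illustration and not necessarily the estimator of Piepho and Möhring (2007). Function names and inputs are hypothetical.

```python
import math


def halved_lrt_p_value(loglik_full, loglik_reduced):
    """Halved P value for a likelihood ratio test of one variance component.

    The statistic 2 * (logL_full - logL_reduced) is compared with a
    chi-square distribution with 1 degree of freedom, and the resulting
    P value is halved because the variance is tested on the boundary (zero).
    """
    lrt = max(2.0 * (loglik_full - loglik_reduced), 0.0)
    p_chi2_1df = math.erfc(math.sqrt(lrt / 2.0))  # upper tail, 1 df
    return 0.5 * p_chi2_1df


def repeatability(var_genotype, var_residual, n_reps):
    """Entry-mean repeatability: genotypic variance over the variance of a
    genotype mean. Shown as a generic sketch only."""
    return var_genotype / (var_genotype + var_residual / n_reps)
```

For example, a likelihood ratio statistic of about 3.84 gives an unadjusted P value of 0.05 and a halved P value of 0.025.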

#### Grain Trait Measurements

Grain traits were analyzed using a SeedCount™ SC4 (Seed Count Australasia, Condell Park, Australia) at the University of Adelaide, following the manufacturer's instructions. Data were obtained from two generations of the same genotypes, i.e., grain that was sown and collected in 2014 ("2014" grain), and grain that was sown and collected in 2015 ("2015" grain). Ovule phenotypes were collected from plants that gave rise to the 2015 grain. In total, sufficient numbers and replicates of grain and ovule data were obtained for 73 genotypes.

FIGURE 1 | Clearing of whole barley florets and pistils reveals different stages of ovule development. (A) At stage Ov2 the ovule has initiated in the center of the carpel and is characterized by the selection of an archesporial cell and initiation of the inner integument. (B) At stage Ov3 the megaspore mother cell (MMC) has differentiated from the archesporial cell and both integuments are observed. (C) At stage Ov4 the MMC has undergone meiosis, giving rise to four haploid daughter cells. The nucellar dome is obvious between the two growing integuments. (D) At stage Ov5/6 the functional megaspore has been selected and possibly initiated mitosis to form the embryo sac (ES), while the nucellar dome is not fully enclosed. (E) At stage Ov7/8 the integument has closed over the nucellus thus forming the micropyle. During this stage, the embryo sac will complete mitotic divisions and cellularize, producing two synergid cells, an egg cell, a central cell with two polar nuclei, and at least three antipodal cells. (F) At stage Ov9a growth of the ovule has increased rapidly, and the antipodal cells proliferate to become a group of 15 to 45 small and tightly clustered cells. (G) At stage Ov9b growth of the ovule has begun to slow, and the antipodal cells are distinctly less tightly clustered. (H) At stage Ov10 the ovule reaches anthesis, or reproductive maturity, discernible from stage Ov9b by slightly larger nucellus and embryo sac areas, and greater spacing of the antipodal cell nuclei. (I) At stage Ov11 the ovule is fertilized, as determined by a combination of a large increase in ovule size, visibility of sperm nuclei, lack of visibility of the polar nuclei, irregular shapes of antipodal cells, and clusters of small nuclei at the periphery of the embryo sac. ac, archesporial cell; es, embryo sac; fg1, one-cell female gametophyte; fm, functional megaspore; ii, inner integument; mmc, megaspore mother cell; oi, outer integument; nuc, nucellus. A solid line indicates the bounds of the embryo sac, a dashed line indicates the bounds of the antipodal cell cluster, arrowheads indicate additional clusters of nuclei after fertilization. Images from the genotype Salka. Scale bars = 100 µm.

## RESULTS

#### Progression of Ovule Development in Barley Can Be Tracked by Whole Pistil Clearing

Nine distinct stages of barley ovule development were discernible in cleared floral tissue using DIC microscopy (**Figure 1**). These stages were aligned with ovule staging systems previously reported for rice to aid analysis and cross-species comparisons, and are referred to as stages Ov2 to Ov11 (Lopez-Dee et al., 1999; Itoh et al., 2005). In barley, stages Ov2 to Ov4 encapsulate outgrowth of the initial ovule primordium which includes: integument initiation and archesporial cell differentiation (Ov2), integument outgrowth and megaspore mother cell differentiation (Ov3) and integument "over-growth" and meiosis/megaspore selection (Ov4) (**Figures 1A**–**C**; **Figure S2**). Stages Ov5 to Ov8 incorporate the events of embryo sac mitosis. These were difficult to precisely phenotype based only on embryo sac features (**Figures 1D**, **E**), but other morphological differences were evident. At stage Ov5/6, the nucellar dome was still obvious and ovules showed evidence of functional megaspore expansion (**Figure 1D**), while at stage Ov7/8, the integuments had fully encapsulated the nucellus and the embryo sac contained two to four free nuclei (**Figure 1E**). At stage Ov9a, a fully cellularized embryo sac was present in ovules and antipodal cells had started proliferating, concurrent with massive proliferation/expansion of the nucellus and embryo sac (**Figure 1F**). Antipodal divisions appeared to be complete at stage Ov9b, and the antipodals themselves were less tightly clustered compared to Ov9a (**Figure 1G**). Ovules at stage Ov10 were fully mature and discernible from Ov9a/b by clear definition of individual antipodal cells that contained enlarged nuclei, a central cell containing two unfused polar nuclei and an egg cell with a prominent nucleus (**Figure 1H**). Stage Ov11 was used to broadly group ovules that had recently been fertilized (**Figure 1I**).
This series of ovule stages from Ov2 to Ov10 is equivalent to the phase of barley pistil development represented by stages W6 to W10 on the Waddington scale (**Figure S2**). Ovule maturity (Ov10) appeared to have been reached by W9.5 to W10, just prior to anthesis and "blooming" as described by Brenchley (1920). In the absence of fertilization, only minor changes in ovule morphology could be observed in late stage pistils despite some changes in anther growth and stigma opening, while lodicules were not examined.

#### Mature Ovule Morphology Varies Among Two-Row Spring Barley Genotypes

Mature ovule morphology (Ov10) was measured in all 150 genotypes in terms of 2-dimensional area and 1-dimensional distances (Wilkinson and Tucker 2017). Regional areas were measured by following the widest boundary of the tissue of interest at any point within the z-stack. In the majority of genotypes, ovules at maturity exhibited an overall similar appearance including a prominent embryo sac, large antipodal nuclei, and an enlarged central vacuole (**Figures 1H**, **S3B**). Immature ovules were occasionally identified that showed an unusually small antipodal cluster, central cell, and short distance between the micropyle and top of the embryo sac (**Figure S3A**). At the other extreme, fertilized ovules could easily be distinguished by the presence of irregularly shaped antipodal cell nuclei, clusters of small nuclei at the periphery of the embryo sac and a much larger ovule area (**Figures 1I**, **S3C**). The incidence of these "extremes" that were immature or fertilized may reflect sampling error, natural mutants, and/or indicate that reproductive maturity is not perfectly synchronous between the anther and ovule in all barley genotypes. For the purposes of this study, the most dramatic extremes were considered to be incorrectly staged ovules and were not examined further, leaving sufficient data for analysis and comparison of 127 genotypes (**Table S1**).

Quantification of ovule morphology revealed natural variation in all traits (**Figure 2**, **Table 1** and **S1**). With the possible exception of integument width, most traits followed a normal distribution. The most variable trait was embryo sac area, with an average size of 48,876.2 ± 10,844.2 µm² and a standard deviation (SD) equating to approximately 22% variation in size. Ovule area and nucellus area were comparatively less variable, observed to be 174,421.2 ± 19,857.8 µm² (11.4% SD) and 125,560.8 ± 13,408.7 µm² (10.7% SD), respectively. Consistent with this, the transverse and longitudinal measurements of the embryo sac varied more than the transverse and longitudinal measurements of the ovule (**Table 1**). Of all traits measured, ovule transverse width was the least variable, followed by integument width and the proportion of nucellus within the ovule (calculated as nucellus area/ovule area). Statistical analysis suggested that the difference between genotypes (i.e., the genotypic variance) was significant for all traits (**Table 1**). Moreover, repeatability (R) estimates were moderate to moderately high, ranging from 0.27 for ovule area to 0.59 for integument width, and sit in the range expected for complex traits based on a single location trial.
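The percentage variation quoted for each trait is the SD expressed relative to the mean. The sketch below recomputes it from the mean ± SD values given above; the dictionary layout and function name are illustrative.

```python
# Mean and SD (µm²) for three traits, as reported in the text.
traits = {
    "ES_A": (48_876.2, 10_844.2),    # embryo sac area
    "OV_A": (174_421.2, 19_857.8),   # ovule area
    "NUC_A": (125_560.8, 13_408.7),  # nucellus area
}


def percent_sd(mean_value, sd_value):
    """Standard deviation as a percentage of the mean."""
    return 100 * sd_value / mean_value


for name, (mean_value, sd_value) in traits.items():
    print(f"{name}: SD is {percent_sd(mean_value, sd_value):.1f}% of the mean")
```

The computed values agree with the approximately 22%, 11.4%, and 10.7% stated for embryo sac, ovule, and nucellus area.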

For all traits, at least 30 genotypes were found to have phenotypic variation that fell outside one SD (**Table 1**, **S2**). Examples of genotypes showing distinct differences are shown in **Figure 3**. For example, Lina, Foxtrot, Wren, and Salka all produced large ovules (**Figures 3A**–**D**, respectively), with a particularly large nucellus in Lina and Salka. In contrast, Gant, Cecilia, Akita, and Forum produced relatively small ovules (**Figures 3E**–**H**, respectively), with Akita and Forum producing a relatively small nucellus. These data indicate that the ovule traits under examination in this panel show significant variation between genotypes, and might therefore be used to examine their relationship to each other and the downstream events of seed development.

#### Ovule Component Tissues Show Similar Relationships Despite Genotype-Specific Differences

Pearson correlation analysis was used to assess if overall variation in ovule morphology is a result of coordinated development of all ovule tissues, or whether growth of one tissue is more important (**Figures 4A**, **S4**). Nucellus area and ovule area showed a slightly higher correlation (*r* = 0.86, p < 0.001) than embryo sac area and ovule area (*r* = 0.77, p < 0.001; **Figure 4B**). In contrast, embryo sac area and nucellus area showed a significant but low correlation (*r* = 0.33, p < 0.001). Meanwhile, both embryo sac area and ovule area were negatively correlated with nucellus proportion (ovule area to nucellus proportion: *r* = -0.38, p < 0.001; embryo sac area to nucellus proportion: *r* = -0.87, p < 0.001). This suggests that while an increase in nucellus area reliably leads to larger ovule area, excessive embryo sac growth may achieve a similar outcome but at the expense of the nucellus. Bigger ovules were also more likely to have thinner integuments, as all ovule traits were negatively associated with integument width, particularly ovule area and nucellus area, although the correlation was low (ovule area to integument width: *r* = -0.25, p < 0.01; nucellus area to integument width: *r* = -0.24, p < 0.005).

TABLE 1 | Summary of natural variation, repeatability, and genotypic variance in nine mature ovule traits in 127 genotypes of two-row spring barley. OV\_A, ovule area (µm²); OV\_T, ovule transverse width (µm); OV\_L, ovule longitudinal height (µm); ES\_A, embryo sac area (µm²); ES\_T, embryo sac transverse width (µm); ES\_L, embryo sac longitudinal height (µm); INT\_W, integument width (µm); NUC\_A, nucellus area (µm²); NUC\_P, nucellus proportion (%).

FIGURE 3 | Examples of barley genotypes showing differences in ovule morphology at maturity. (A) A representative large ovule in Lina. (B) A representative large ovule in Foxtrot. (C) A representative large ovule in Wren. (D) A representative large ovule in Salka. (E) A representative small ovule in Gant. (F) A representative small ovule in Cecilia. (G) A representative small ovule in Akita. (H) A representative small ovule in Forum. Composite images were created by overlaying sequential optical sections from the z-stack. Scale bar = 100 µm.
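Pairwise trait correlations of this kind can be sketched in a few lines. The example computes Pearson's *r* from paired genotype means and approximates a two-sided P value with the Fisher z-transformation; the test actually used for the reported P values is not stated, so the approximation is an assumption, and the function names are illustrative.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def fisher_z_p_value(r, n):
    """Approximate two-sided P value for H0: rho = 0 (requires |r| < 1, n > 3).

    Uses the Fisher z-transformation, under which atanh(r) * sqrt(n - 3)
    is approximately standard normal.
    """
    z = math.atanh(r) * math.sqrt(n - 3)
    return math.erfc(abs(z) / math.sqrt(2))


# With 127 genotypes, even modest coefficients are significant.
p_low_negative = fisher_z_p_value(-0.25, 127)  # ovule area vs integument width
p_low_positive = fisher_z_p_value(0.33, 127)   # embryo sac vs nucellus area
```

Under this approximation, *r* = -0.25 with 127 genotypes falls below p = 0.01 and *r* = 0.33 falls below p = 0.001, in line with the thresholds reported for those two comparisons.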

The contribution of both embryo sac and nucellus traits to ovule size was reflected in the principal component analysis (PCA) plot (**Figure 4C**). The component indicators for embryo sac transverse width and longitudinal height (*r* = 0.82, p < 0.001) were closely positioned on the PCA plot in contrast to those for ovule transverse width and longitudinal height (*r* = 0.63, p < 0.001; **Figure 4B**). This suggests that variation in embryo sac area is more likely to be due to proportional variation in both transverse and longitudinal dimensions, whereas variation in ovule area may be due to independent changes in either direction. The PCA plot revealed an even spread of genotypes without obvious clustering, in addition to several clear outliers for each trait. Genotypes previously identified to have "extreme" phenotypes (**Table S2**) were located at the periphery of the PCA plot, which provides some insight into how variation in either nucellus area or embryo sac-related traits influences ovule area. For example, the large-ovule phenotypes of Golden Promise, Salka, and Wren appear to be driven by a combination of large nucellus and embryo sac traits. This differed from genotypes such as Host, which produced an "average" sized ovule with a relatively large nucellus area, and the above-average ovule area of Foxtrot, which was predominantly due to enlarged embryo sac traits. Other genotypes, such as Forum and Gant, produced an overall small-ovule phenotype due to smaller nucellus area. This indicates that although variability in embryo sac traits impacts ovule morphology, the overall "size" of the barley ovule is heavily dependent upon nucellus area.

To assess whether ovule features, in particular ovule area and nucellus area, were reproducible over subsequent generations, 10 genotypes were re-sown for analysis (**Table S3**, **Figure S5**). Genotypes were chosen to cover a range of ovule sizes (from small to large): Cecilia, Forum, Gant, Akita, Optic, Host, Foxtrot, Wren, Salka, and Lina. Overall ovule area at maturity showed a Pearson correlation of 0.93 (p < 0.001) between subsequent generations, while nucellus area was slightly more variable but still significant at 0.79 (p < 0.01). Integument width also showed a significant correlation of 0.81 (p < 0.01). Embryo sac area showed a correlation of 0.56 but did not meet significance criteria. This confirms that in a subset of the population, ovule area, nucellus area and integument width show reproducible features across generations. This is an important finding for future studies since it suggests a genetic component influences variation in ovule development, and this might be investigated by quantitative genetic screens.
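With only 10 genotypes, a correlation must be fairly large to reach significance. The sketch below converts each reported coefficient into a *t* statistic with n - 2 = 8 degrees of freedom and compares it with standard two-sided critical values. It assumes a conventional *t*-test on Pearson's *r*, which the text does not state explicitly, and it works from the rounded coefficients, so it is a consistency check rather than a reanalysis.

```python
import math

# Two-sided critical values of Student's t for 8 degrees of freedom (n = 10).
T_CRIT_DF8 = {0.05: 2.306, 0.01: 3.355, 0.001: 5.041}


def t_statistic(r, n):
    """t statistic for testing a Pearson correlation against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


def strictest_alpha(r, n=10):
    """Return the smallest tabulated alpha that |t| exceeds, or None."""
    if n != 10:
        raise ValueError("critical values are tabulated for n = 10 only")
    t_abs = abs(t_statistic(r, n))
    passed = [alpha for alpha, crit in T_CRIT_DF8.items() if t_abs > crit]
    return min(passed) if passed else None


reported = {"OV_A": 0.93, "NUC_A": 0.79, "INT_W": 0.81, "ES_A": 0.56}
levels = {trait: strictest_alpha(r) for trait, r in reported.items()}
```

Under these assumptions, ovule area passes the 0.001 threshold, nucellus area and integument width pass 0.01, and the embryo sac coefficient of 0.56 does not reach 0.05, matching the pattern described above.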

#### Grain Traits Vary Between Genotypes and Share Relationships With Same-Generation Ovule Morphology

The variation in ovule measurements identified in the barley panel provided an opportunity to assess quantitative relationships between sub-ovule tissues and grain traits. Grain from plants grown in 2014 and 2015 were analyzed using a SeedCount™ SC4 instrument (Seed Count Australasia, Condell Park, Australia). Grain samples were sorted to focus only on genotypes showing evidence of good fill and at least 50 grain in at least 2 replicates. After filtering, 73 genotypes remained for comparison between years and with ovule phenotypes (**Table S4**).

FIGURE 4 | Relationships between different barley ovule traits at anthesis. (A) Anthesis ovule from Golden Promise showing the different regions used for measurements. Scale bar = 100 µm. (B) Heat map representing correlations between nine mature ovule traits measured in 127 genotypes of European two-row spring barley. Positive correlations are shaded blue, negative correlations are shaded red. Numbers within boxes represent the correlation coefficient (r) value. Both box color and r value are only shown for those with a p-value of < 0.05. OV\_A, ovule area (µm²); OV\_T, ovule transverse width (µm); OV\_L, ovule longitudinal height (µm); ES\_A, embryo sac area (µm²); ES\_T, embryo sac transverse width (µm); ES\_L, embryo sac longitudinal height (µm); INT\_W, integument width (µm); NUC\_A, nucellus area (µm²); NUC\_P, nucellus proportion (%). (C) Principal component analysis of 127 genotypes of European two-row spring barley based on nine mature ovule traits. Key cultivars of interest to this study are highlighted in red (larger ovule features) or green (smaller ovule features).

Within each generation of grain, similar trends were observed. Grain width and thickness appeared to be the major indicators of grain weight compared to grain length (**Figure S6**). Similarly strong positive correlations were identified between grain weight and two-dimensional grain area (*r* = 0.71, p < 0.001 and 0.69, p < 0.001 within 2014 and 2015 grain, respectively). Despite being grown in glasshouses with similar environmental regimes, differences in sowing dates in 2014 and 2015 appeared to significantly impact fill; grain weight was 42.8 ± 0.99 in 2014 *vs*. 35.2 ± 0.56 in 2015 (t-test, p = 0.001). This is perhaps unsurprising given the later sowing date. However, several correlations were identified across the 2 years. For example, grain weight showed a small but significant correlation between years (*r* = 0.39, p < 0.001; **Figures S6**, **S7A**). A similar between-year correlation was identified for grain width (*r* = 0.30, p < 0.01; **Figure S5**), in addition to slightly weaker correlations for grain thickness and area (**Figure S6**).

Grain traits were compared to ovule measurements from the same 73 genotypes (**Figures S6, S7**). No significant correlations were observed between grain morphology in 2014 and ovule morphology in progeny plants. In contrast, several significant correlations were identified between ovule morphology and features of the grain they produced (**Figures S6, S7**). Ovule transverse width, ovule area and nucellus area showed small but consistent negative correlations with all of the grain traits. For example, nucellus area showed a negative correlation with grain area (*r* = -0.37, p < 0.01; **Figure S7B**) and grain weight (*r* = -0.37, p < 0.01; **Figure S6**). The only positive correlation was observed between integument width and grain thickness (*r* = 0.35, p < 0.01; **Figure S7C**). Despite embryo sac area showing a strong correlation with ovule area in the same panel (*r* = 0.83, p < 0.001), no correlation was observed between any embryo sac measurement and any grain measurement.

These effects appeared to be even more prominent when considering phenotypic extremes (**Figure S8**). Genotypes showing the largest nucellus area (n = 20) were compared to those showing the smallest nucellus area (n = 20; **Figure S8A**). As expected, these genotypes also showed a corresponding difference in overall ovule size (**Figure S8B**). In addition, grain weight (*t*-test, p = 0.01; **Figure S8C**) and grain area (*t*-test, p = 0.008; **Figure S8D**) were significantly reduced in those genotypes showing a larger nucellus and larger ovule. We considered that these differences in grain weight might be due to differences in fertility and the number of grain per spike. Mature spikes from at least 8 tillers of the 10 variable genotypes described above (Cecilia, Forum, Gant, Akita, Optic, Host, Foxtrot, Wren, Salka, and Lina) were scored for grain number and spike length. Analysis revealed differences in the number of grain per spike (from 18 to 24, average 21.6 ± 1.6) but these did not show any significant correlation with grain or ovule size. Hence, although the physiological basis for this variation in grain weight and size remains unknown, the results presented here are consistent with a pre-fertilization sporophytic ovule component that influences downstream features of grain development.
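The comparison of phenotypic extremes described above can be sketched as follows. This is a minimal illustration, not the analysis used in the study: the text reports a *t*-test without stating the variant, so Welch's unequal-variance form is assumed here, and the helper names (`split_extremes`, `welch_t`), genotype labels, and all values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def split_extremes(records, key, n):
    """Return (n smallest, n largest) records ranked by rec[key]."""
    if 2 * n > len(records):
        raise ValueError("extreme groups would overlap")
    ranked = sorted(records, key=lambda rec: rec[key])
    return ranked[:n], ranked[-n:]


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)  # squared standard error of mean(a)
    vb = variance(b) / len(b)  # squared standard error of mean(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Hypothetical per-genotype means (arbitrary units), for illustration only.
records = [
    {"genotype": "G1", "nucellus_area": 52.0, "grain_weight": 44.1},
    {"genotype": "G2", "nucellus_area": 55.0, "grain_weight": 43.0},
    {"genotype": "G3", "nucellus_area": 58.0, "grain_weight": 42.5},
    {"genotype": "G4", "nucellus_area": 61.0, "grain_weight": 40.9},
    {"genotype": "G5", "nucellus_area": 64.0, "grain_weight": 40.2},
    {"genotype": "G6", "nucellus_area": 67.0, "grain_weight": 38.8},
    {"genotype": "G7", "nucellus_area": 70.0, "grain_weight": 37.5},
    {"genotype": "G8", "nucellus_area": 73.0, "grain_weight": 36.9},
]

small, large = split_extremes(records, "nucellus_area", n=3)
t, df = welch_t(
    [rec["grain_weight"] for rec in small],
    [rec["grain_weight"] for rec in large],
)
print(f"t = {t:.2f}, df = {df:.1f}")
```

A positive *t* here means that the small-nucellus group has the heavier grain, which is the direction of the effect reported in the text. Converting *t* and the degrees of freedom to a p-value requires the *t* distribution, which a statistics library would normally supply.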

### DISCUSSION

The plant ovule is a key reproductive organ that supports growth of the female gametophyte and establishes an environment for seed development. Previous studies in barley have focused on the role of the ovule nucellus around the time of fertilization and beyond (Radchuk et al., 2006; Thiel et al., 2008; Tran et al., 2014), revealing its role as a nutrient transfer tissue and identifying key genes that control its maturation and function (e.g. *Jekyll*; Radchuk et al., 2006). Despite this, little information is available regarding early stages of ovule and nucellus development in cereal monocots, or whether variation in nucellus growth affects downstream seed development. Tissue-specific components of fertility and seed development are relatively unexplored in cereal species, but may hold promise for future attempts to increase yield through modified breeding strategies, increased yield potential, or protection against stress (Whitford et al., 2013; Alqudah and Schnurbusch 2014; Brinton and Uauy 2018; Wilkinson et al., 2018).

This study examined the range of natural variation present in mature ovule phenotypes among a population of two-row spring barleys. Nine distinct stages of ovule development were identified by tissue clearing and morphological analysis, and these were aligned with previous staging studies in rice (Lopez-Dee et al., 1999; Itoh et al., 2005). Stages Ov2 to Ov4 describe the initiation of the germline lineage in the ovule, stages Ov5 to Ov8 incorporate mitotic divisions of the embryo sac, stages Ov9a and 9b show expansion/proliferation of ovule tissues, and stage Ov10 reflects reproductive maturity at anthesis. These ovule stages were aligned with the Waddington scale (Waddington et al., 1983), to simplify staging of ovule development in barley. A diagram showing the alignment of these scales is shown in **Figure S2**.

Data were collected from 150 genotypes at stage Ov10 to establish an "average" phenotype for the ovule at maturity and to identify genotypes showing variation in ovule morphology. It is generally reported that male and female reproduction in cereals is synchronized (Bennett et al., 1973; Kubo et al., 2013), thus the developmental stage of the anthers should reflect that of the less-accessible ovule. Here, developmental stage was predicted in two ways: (1) by assessing similarity of the anthesis pistil to that described in the Waddington scale, and (2) by determining whether the anthers of each floret were yellow in color and ready to release pollen. In some cases, pistil clearing revealed unexpectedly small ovule features, consistent with ovule immaturity. At the other extreme, some pistils contained an overly large ovule that appeared to have been fertilized (Diboll 1968; Engell 1994; Maeda and Miyake, 1996; Maeda and Miyake, 1997; **Figure S3**). Collection of ovules at points before and after maturity, despite attempts to stage for reproductive maturity, may reflect sampling error. However, as anther and pistil phenotypes were used as a staging reference, we speculate that late male and female reproductive development are not perfectly synchronized in some two-row spring barley genotypes.

#### Large Ovules in Barley Typically Contain an Enlarged Nucellus

As the megasporangium, or the tissue that ultimately gives rise to the female germline, the nucellus is a key component of ovule fertility. Nucellus area varied up to ±29% in the barley panel, and was tightly coupled to overall ovule size. Despite this, as ovule size increased, the proportion of nucellus tended to decrease. The reason for this was embryo sac expansion, since embryo sac area showed a clear negative correlation with nucellus proportion. Hence, increased embryo sac and nucellus area may both drive increases in ovule area, but the expanding embryo sac increases in size at the expense of the nucellus. This might be facilitated by pre-fertilization degradation of nucellus cells adjoining the embryo sac, genotype-specific proliferation of antipodal cells and/or through mechanical compression of nucellus cells over time (Johri et al., 2013).
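The arithmetic behind this pattern can be shown with a short worked example. It assumes that nucellus proportion is calculated as nucellus area divided by total ovule area, expressed as a percentage; this definition, the function name, and the two sets of values are assumptions made for illustration and are not taken from the measurements in this study.

```python
def nucellus_proportion(nucellus_area, ovule_area):
    """Nucellus area as a percentage of ovule area (assumed definition)."""
    if ovule_area <= 0 or not 0 <= nucellus_area <= ovule_area:
        raise ValueError("areas must satisfy 0 <= nucellus_area <= ovule_area")
    return 100.0 * nucellus_area / ovule_area


# Two hypothetical ovules (areas in arbitrary units). The larger ovule has
# the larger nucellus in absolute terms, but other tissues such as the
# embryo sac have expanded more, so the nucellus share of the ovule falls.
small_ovule = {"ovule_area": 100.0, "nucellus_area": 60.0}
large_ovule = {"ovule_area": 140.0, "nucellus_area": 75.0}

for label, ovule in (("small", small_ovule), ("large", large_ovule)):
    share = nucellus_proportion(ovule["nucellus_area"], ovule["ovule_area"])
    print(f"{label} ovule: nucellus area {ovule['nucellus_area']}, share {share:.1f}%")
```

In this sketch, nucellus area and ovule area increase together while nucellus proportion declines, which is how a positive correlation between the two areas can coexist with a negative trend in proportion.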

Despite the fact that nucellus development varies between species, the functional significance of intra- and inter-species variation in nucellus size has remained unclear (Rudall et al., 2005; Rudall et al., 2008; Endress 2011; Lu and Magnani 2018). In cereals, hypotheses suggest that a bigger nucellus might provide a larger repository of amino acids, carbohydrates, or hexose sugars that are required for the early stages of grain development (Wilkinson et al., 2018), but this has yet to be conclusively shown. The multilayered tissue may also act as a buffer that sustains female fertility during periods of abiotic stress (Saini and Aspinall, 1982; Saini et al., 1983). Alternatively, a larger nucellus may facilitate formation of an optimal environment for signaling during gametogenesis. Developmental signals such as phytohormones are transmitted through the nucellus (Cheng et al., 2006; Pagnussat et al., 2009) and contribute to gametogenesis prior to fertilization (Lora et al., 2017; Juranić et al., 2018). Moreover, studies of mutants that produce extra female germline-like cells in rice and *Arabidopsis* reveal a key difference in the competency of these ovule cells to enter meiosis. "Extra" germline cells in rice typically enter meiosis, but this is not the case in *Arabidopsis*. This may reflect a specific feature of the larger nucellus in rice and its ability to provide more stimulatory signals for germline development (Nonomura et al., 2003; Lora et al., 2019).

#### A Link Between Nucellus Growth and Grain Development?

Correlation analysis in this study revealed small but significant negative relationships between ovule and grain phenotypes in the same generation, such that genotypes with a larger nucellus and ovule were more likely to produce smaller and lighter grain. In barley, the nucellus undergoes PCD and forms the nucellar projection, which functions as a transfer tissue facilitating movement of maternal nutrients to the developing embryo and endosperm (Thiel et al., 2008). Delayed PCD of nucellar cells dramatically reduces barley grain fill (Radchuk et al., 2006). Hence, one possible reason for the inverse relationship between nucellus size and seed weight is that a large nucellus may take longer to fully differentiate into a transfer tissue (i.e. the nucellar projection), thereby slowing down subsequent influx of nutrient into the fertilized embryo sac to support early endosperm divisions. Whether this relates exclusively to the syncytial phase or cellularization phase of grain development is unclear. In both rice and *Arabidopsis*, early syncytial stages of seed development play a critical role in seed size (Sundaresan 2005; Folsom et al., 2014). However, in barley at least, syncytial divisions appear to take place before maternal nutrient transfer pathways are fully established. Studies investigating the timing of nucellar PCD and differentiation suggest that flow of maternal nutrients into the fertilized embryo sac coincides with cellularization at around 5–6 days after pollination (Radchuk et al., 2006; Radchuk et al., 2010; Radchuk et al., 2012; also reviewed in Lu and Magnani 2018). Hence, the stage at which the embryo sac gains access to maternal nutrients is likely to be a factor influencing endosperm cellularization and subsequent seed size and weight (Weschke et al., 2003). A recent study in *Arabidopsis* provides some support for an antagonistic relationship between the nucellus and endosperm (Xu et al., 2016). 
Fertilization of the central cell in *Arabidopsis* cues degeneration of the nucellus *via* a series of MADS-box factors including *AGL62* and the B-sister gene *TT16*; these factors also act to repress nucellus growth and facilitate development of the chalazal endosperm. In the absence of *AGL62* function, the nucellus fails to degenerate and seed development aborts (Xu et al., 2016). Further examination of barley grain development will be required to distinguish when differences in weight and area appear, and to assess how this relates to differentiation of the nucellus.

The inverse relationship between nucellus size and seed weight appears somewhat distinct from other female traits examined in cereal species. For example, several studies identified a positive correlation between pistil size and yield traits, including grain size, in wheat and sorghum (Yang et al., 2009; Guo et al., 2015). This led to the hypothesis that floral nutrient allocation is a determinant of not only floral organ size but floral survival, and thus total yield (Guo et al., 2016). Although the relationship between pistil size and ovule size was not investigated here, grain number was examined in a subset of the population. No obvious correlation was identified to suggest that the variation in ovule development might be attributed to differences in floret fertility or grain number.

Other unanswered questions from this study relate to the reproducibility of the ovule-grain relationship, whether it can be uncoupled by different environmental conditions and whether the underlying genetic basis for this variation can be identified. Importantly the variation between genotypes was significant for all traits. In addition, analysis of ten genotypes in 2015 and 2016 indicated that sporophytic features of ovule development are relatively stable across consecutive generations, showing correlations of 0.79 to 0.93 (p < 0.01; **Figure S5**). This suggests there is likely to be a genetic component underlying ovule variation, which might be the focus of future genome wide association studies (GWAS). However, while the 2015 ovule features showed a relationship with features of 2015 grain, no significant relationship was identified with features of 2014 grain. This might be explained by different sowing dates, glasshouse conditions and/or day length. Genotype-specific differences in female fertility and ovule morphology have been identified after heat and water stress in wheat (Saini et al., 1983; Onyemaobi et al., 2018). In future studies it will be intriguing to investigate the response of this barley panel to periods of controlled stress, particularly in relation to the stability of ovule features and their impact on grain development.

#### Integument Development Varies in Barley Ovules At Maturity

In general, growth of the integuments and seed coat has been proposed to set an upper limit to final seed size (Windsor et al., 2000; Nakaune et al., 2005; Adamski et al., 2009; Fang et al., 2012; Du et al., 2014; Li and Li 2015). In *Arabidopsis* for example, the cytochrome P450 genes KLUH/CYP78A5 (KLU) and CYP78A9 influence proliferation of integument cells and subsequently the cells of the seed coat, ultimately regulating ovule fertility and overall seed size (Ito and Meyerowitz 2000; Adamski et al., 2009; Sotelo-Silveira et al., 2013; Zhao et al., 2018). In wheat, silencing of *TaCYP78A5* (the wheat orthologue of *KLU*) was found to restrict seed coat cell proliferation and cause a 10% reduction in grain size (Ma et al., 2015). During barley ovule development, the two integuments grow to completely encapsulate the nucellus before stage Ov7/8. In this barley panel, integument width was found to be the least variable of the nine ovule traits measured. Genotypes with larger ovules tended to have thinner integuments, but no clear correlation was detected between integument width and seed area or weight. One limitation of the imaging method used in this study is the lack of information regarding the number of cells in each tissue type. These measurements would be required to determine whether features of the integuments, other than width alone, contribute to quantitative variation in ovule and seed development.

#### Variable Ovule Phenotypes Provide Tools for Future Studies

Results from this study indicate that the size of mature ovules in barley is likely to be determined by a combination of factors such as nucellus proliferation, embryo sac expansion, and mechanical restriction *via* the integuments. How these features interact at the molecular and cellular level remains unclear, as does the link between sub-ovule features and grain size and weight. In future studies it may be possible to analyze these relationships at even greater resolution. For example, Mendocilla-Sato et al. (2017) recently reported a method for morphometric analysis of rice ovules that holds promise for analysis of cell shape, volume, and number. In addition, whole-mount clearing techniques such as ClearSee and PeaClarity (Kurihara et al., 2015; Palmer et al., 2015) are compatible with fluorescent markers and stains, providing avenues to track, visualize, and quantify individual cells and tissue types. Thus, in the future, similar data might be utilized to investigate the genetic architecture of barley ovule development, potentially highlighting candidate genes that contribute to female fertility, reproductive stress tolerance and grain morphology.

## DATA AVAILABILITY STATEMENT

All datasets generated for this study are included in the article/ **Supplementary Materials**.

## AUTHOR CONTRIBUTIONS

MT, RB, and LW conceived the study. LW undertook the majority of experiments. TW contributed to data analysis and presentation. LW and XY contributed to development of staging schemes. LW and MT wrote the manuscript. All authors read and edited the manuscript.

## FUNDING

We acknowledge funding from the Australian Research Council (DP180104092) to MT.

### ACKNOWLEDGMENTS

We thank Thomas Laux, Chao Ma, Andrea Matros, Denghao Wu and members of the Tucker laboratory for technical support and stimulating discussions.

## SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01374/full#supplementary-material


contributed to spring growth habit and environmental adaptation in cultivated barley. *Nat. Genet.* 44, 1388. doi: 10.1038/ng.2447


microdissection. *Plant Cell Physiol.* 54 (5), 750–765. doi: 10.1093/pcp/pct029


**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

The handling editor declared a past co-authorship with one of the authors [MT].

*Copyright © 2019 Wilkinson, Yang, Burton, Würschum and Tucker. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Reproductive Systems in Paspalum: Relevance for Germplasm Collection and Conservation, Breeding Techniques, and Adoption of Released Cultivars

*Carlos A. Acuña\*, Eric J. Martínez, Alex L. Zilli, Elsa A. Brugnoli, Francisco Espinoza, Florencia Marcón, Mario H. Urbani and Camilo L. Quarin*

Instituto de Botánica del Nordeste, Consejo Nacional de Investigaciones Científicas y Técnicas, Facultad de Ciencias Agrarias, Universidad Nacional del Nordeste, Corrientes, Argentina

#### Edited by:

Gianni Barcaccia, University of Padova, Italy

#### Reviewed by:

Michele Bellucci, Italian National Research Council (CNR), Italy Suresh Kumar, Indian Agricultural Research Institute (ICAR), India

\*Correspondence:

Carlos A. Acuña cacuna@agr.unne.edu.ar

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 02 January 2019 Accepted: 07 October 2019 Published: 21 November 2019

#### Citation:

Acuña CA, Martínez EJ, Zilli AL, Brugnoli EA, Espinoza F, Marcón F, Urbani MH and Quarin CL (2019) Reproductive Systems in Paspalum: Relevance for Germplasm Collection and Conservation, Breeding Techniques, and Adoption of Released Cultivars. Front. Plant Sci. 10:1377. doi: 10.3389/fpls.2019.01377

The objective of this review is to analyze and describe the impact that mode of reproduction in Paspalum has on germplasm conservation, genetic improvement, and commercialization of cultivars. Germplasm collection and conservation can now be rethought considering the newly available information on how diversity is allocated in nature and how it can be transferred between the sexual and apomictic germplasm using novel breeding approaches. An inventory of species and accessions conserved around the world is analyzed in relation to the main germplasm banks. Because of the importance of apomixis in Paspalum species, different breeding approaches have been used and tested. Knowledge of the inheritance of apomixis, the variable expressivity of the trait, and techniques for early identification of apomicts has helped to improve the efficiency of the breeding methods. Novel breeding techniques are also being developed and are described with regard to their advantages and limitations. Finally, the impact of reproductive mode on the adoption of the released cultivars is discussed.

#### Keywords: apomixis, polyploidy, plant collection, germplasm conservation, breeding methods

## INTRODUCTION

*Paspalum* L. is a large genus of the Poaceae with nearly 310 species that are mainly distributed in the Americas (Morrone et al., 2012). *Paspalum* species are well represented in rangelands used for cattle production systems, but some species are also cultivated for forage, turf, and cereal. Although modes of reproduction are highly variable within *Paspalum*, polyploidy and apomixis are common, affecting allocation of plant diversity among species and populations. The richness of information available for the genus allowed us to review the impact of mode of reproduction on germplasm collection and conservation, genetic improvement, and adoption of the released cultivars.

## POPULATION DIVERSITY AND GERMPLASM COLLECTION

Ploidy level and mode of reproduction have a great impact on species genetic diversity. Germplasm collection and conservation of the genetic variation contained within species are the basis for selection as well as for plant improvement. *Paspalum* is one of the largest genera of the Panicoideae subfamily with the great majority of its species being native to the Americas, distributed throughout the tropics, subtropics, and temperate regions (Zuloaga and Morrone, 2005; Rua et al., 2010; Morrone et al., 2012). Ploidy levels in *Paspalum* are variable, ranging from 2*x* to 12*x*. The tetraploid and diploid cytotypes are most common in nature (**Figure 1**). Mode of reproduction of 72 *Paspalum* species has been determined; 22 reproduce exclusively by apomixis, 27 by sexuality and 23 reproduce by both reproductive paths, but at different ploidy levels (Ortiz et al., 2013).

For many *Paspalum* species, there are several individual collections and chromosomal records, but there are still only a few species whose genetic system and population diversity have been characterized. In the past two decades, extensive collections of germplasm and population diversity analyses have been made for some *Paspalum* species (Urbani et al., 2002; Daurelio et al., 2004; Cidade et al., 2008; Sartor et al., 2011; Brugnoli et al., 2013; Upadhyaya et al., 2014; Eudy et al., 2017). The patterns of genetic diversity in these natural populations have been determined using morphological, ecological, cytological, and molecular traits. Most of these species form agamic complexes with sexual diploid and apomictic polyploid cytotypes, e.g., *P. notatum* Flüggé (Gates et al., 2004), *P. simplex* Morong ex Britton (Caponio and Quarin, 1987; Espinoza and Quarin, 1997), *Paspalum alcalinum* Mez, syn. *P. buckleyanum* Vasey (Burson, 1997; Sartor et al., 2011), *P. denticulatum* Trin. (Quarin and Burson, 1991; Sartor et al., 2011) and *P. rufum* Nees (Burson, 1975; Norrmann et al., 1989; Sartor et al., 2011; Sartor et al., 2013). *Paspalum dilatatum* Poir and *P. scrobiculatum* L. are two examples of agamic complexes, but at the polyploid level (Bashaw et al., 1970; Pritchard, 1970; Chao, 1974; Quarin and Hanna, 1980; Evers and Burson, 2004). The available information related to population diversity, ploidy levels and mode of reproduction for some of the *Paspalum* species can be used to devise strategies for germplasm collection and conservation.

### Paspalum notatum and P. simplex

*Paspalum notatum* is an important component of rangelands in South America, and is used as cultivated pasture and turf in warm, humid areas (Blount and Acuña, 2009). It is a multiploid species with sexual and cross-pollinated diploid and apomictic tetraploid cytotypes (Burton, 1948; Burton, 1955). Some triploids and pentaploids have been reported (Quarin, 1992; Tischler and Burson, 1995; Daurelio et al., 2004). The tetraploid cytotype is naturally distributed throughout the range of the species, from northern Mexico to central Argentina, while the diploid is restricted to Northeastern Argentina (Gates et al., 2004). Daurelio et al. (2004) analyzed the genetic structure of three natural populations of *P. notatum* using molecular markers; one was diploid, another was tetraploid and sympatric to the diploid, and the third was tetraploid and allopatric. Greater variability was observed within the tetraploid population sympatric to the diploid, indicating that sympatry of diploid and tetraploid populations seemed to promote the generation of variability in apomictic systems by interploidy gene flow, as previously stated by Quarin (1992).

*Paspalum simplex* is a species native to South America (Urbani et al., 2002) with forage potential for semi-arid regions (Brugnoli et al., 2014). Diploid, triploid, tetraploid, and hexaploid genotypes were reported, the tetraploid being the most frequent (Urbani et al., 2002). A diversity analysis of diploid, tetraploid, and mixed diploid-tetraploid populations of *P. simplex* showed greater genetic variation in the tetraploid populations sympatric to a diploid population than in tetraploid allopatric populations (Brugnoli et al., 2013). Tetraploid allopatric populations are more uniform and generally a single genotype predominates. A Principal Coordinates Analysis (PCoA) carried out on 20 populations of *P. simplex* showed most dispersion within the 2*x* population and the mixed 2*x*–4*x* (**Figure 2**, adapted from Brugnoli et al., 2013 and Brugnoli et al., 2014). On the other hand, the tetraploid and mixed 4*x*–6*x* populations showed very limited dispersion, as is the case of the tetraploid populations from Corrientes and Villa Ana, where practically all individuals were located at the same point, indicating reduced intrapopulation variation (**Figure 2**). Most diversity was present among 4*x* populations. No correlation was observed between genetic and geographical distances.

Both species have similar mechanisms of generating variability in apomicts, so the search and conservation of such variability should be carried out in mixed populations (2*x*–4*x*) or in zones of contact between diploids and tetraploids. Additionally, individual plant collections from diverse locations with contrasting environments should be used as more diversity is expected among the apomictic genotypes from different locations.

#### Paspalum vaginatum and P. distichum

*Paspalum vaginatum* Sw. is a high-quality turf mainly used in the coastal regions across the tropics and subtropics, and *P. distichum* L. is a wild relative native to South America. Both species belong to the Disticha group of *Paspalum*. They are differentiated because *P. distichum* has spikelets with pubescent glumes, whereas *P. vaginatum* has glabrous glumes (Chase, 1929). However, the identification of both species usually generates confusion because the type specimen for *P. distichum* contained pieces of both *P. distichum* and *P. vaginatum* (Eudy et al., 2017). *P. vaginatum* has typically been described as an allogamous sexual diploid (Bashaw et al., 1970) but later on a tetraploid cytotype was also reported (Hojsgaard et al., 2009). In contrast, *P. distichum* has been mentioned as a polyploid species with tetraploid and hexaploid cytotypes, including pentaploids and hyperpentaploids (Quarin and Burson, 1991; Echarte et al., 1992; Echarte and Clausen, 1993; Hojsgaard et al., 2009; Rua et al., 2010).

Eudy et al. (2017) characterized 90 diploid and seven polyploid accessions belonging to the Disticha group of *Paspalum* using seven SSR markers selected to differentiate closely related lines. They found many lines with identical SSR profiles, which were part of a group of individual selections made at the same time from golf courses or in natural areas. On the other hand, five genotypes from two germplasm banks (USDA and UGA) that were thought to be clonally identical had different SSR profiles. These results support the idea that genotyping material immediately after collection, storing this DNA fingerprint, and periodically reconfirming identity may be an appropriate approach to maintain plant diversity and avoid duplications in germplasm banks (Eudy et al., 2017). The authors recommended further plant collections in native South American habitats to increase diversity in the USA germplasm conservation system. Although they suggested vegetative plant collections, seed collections would be more appropriate since *P. vaginatum* is a cross-pollinated species.

#### Paspalum dilatatum

*Paspalum dilatatum* Poir (Dallisgrass) is a forage species widely distributed in the subtropical and temperate regions around the world. It belongs to the informal subgeneric Dilatata group and is a multiploid species of hybrid origin with sexual tetraploid, apomictic pentaploid, and hexaploid cytotypes (Bashaw and Holt, 1958; Burson et al., 1991; Evers and Burson, 2004). Seven biotypes have been described for *P. dilatatum*: three are sexual tetraploid (*P. dilatatum* ssp. *flavescens*, *P. dilatatum* "Virasoro", and *P. dilatatum* "Vacaria"), one is apomictic pentaploid (*P. dilatatum* ssp. *dilatatum*), and three are apomictic hexaploid (*P. dilatatum* "Chirú", "Uruguaiana", and "Torres") (Burson et al., 1991; Evers and Burson, 2004; Williams et al., 2011). The predominant biotype is the apomictic pentaploid, which is widespread in South America and naturalized in other parts of the world (Evers and Burson, 2004).

An evolutionary analysis of the informal Dilatata group of *Paspalum* was carried out with microsatellite markers to clarify the relationships among biotypes and evolutionary pathways (Speranza, 2009). A clear genetic differentiation in nuclear microsatellite loci was observed between sexual tetraploid biotypes of the Dilatata group, which supports the hypothesis proposed by Burson (1983) that at least some of these biotypes have an independent origin. In addition, the low gene flow observed between the tetraploid biotypes maintains their genetic identity (Speranza, 2009). Two sexual tetraploid biotypes, *Paspalum dilatatum* ssp. *flavescens* and "Virasoro", showed a high level of homozygosity for all microsatellite loci evaluated. In contrast, the sexual tetraploid biotype "Vacaria" showed a greater number of polymorphic loci, consistent with its lower degree of autogamy. Most apomictic biotypes seemed to have a uniclonal origin with very rare sexual recombination. Speranza (2009) suggested that the mechanisms for the formation of apomictic genotypes involve either unreduced female gametes or euploid pollen grains from the pentaploid biotype.

These results suggest that sexual tetraploid germplasm should be collected to include the greatest diversity. Single plant collections at each location may be appropriate since the sexual tetraploid germplasm behaves as autogamous.

#### Paspalum scrobiculatum

*Paspalum scrobiculatum* L. (kodo millet) is one of the few *Paspalum* species native to the Old World tropics and is used as a cereal in India (Clayton, 1975; de Wet et al., 1983). This species has a particular genetic system within the *Paspalum* genus. It is a multiploid species with 4*x*, 6*x*, 8*x*, 10*x*, and 12*x* cytotypes. While the tetraploid is sexual, the hexaploid and decaploid are diplosporous apomictic, with some potential for apospory in the hexaploid cytotypes (Bashaw et al., 1970; Pritchard, 1970; Chao, 1974; Ma et al., 2009). Chao (1974) observed regular meiosis and sexual reproduction, with potential for apomixis due to the presence of aposporous embryo sacs, for the 12*x* cytotype.

A diversity analysis on morphological traits was performed for some germplasm collections of *P. scrobiculatum*. Dhagat (1978) studied the genetic diversity among 6 exotic and 90 indigenous germplasm accessions of *P. scrobiculatum*. Morphological and physiological traits, such as days to 50% flowering and maturity, plant height, and straw yield were the most important traits for the differentiation among the germplasm accessions from different geographical regions. Parihar (1985) found that quantitative traits such as plant height, tillers/plant, flag leaf length, peduncle length, and ear length showed the maximum percentage contribution to genetic divergence of 100 genotypes of *P. scrobiculatum*.

Formation of a core collection is an important strategy to enhance use of diverse germplasm with agronomically beneficial traits in applied breeding (Upadhyaya et al., 2014). A germplasm collection of kodo millet from India composed of 656 accessions was evaluated for 20 morphoagronomic traits. A core subset of 75 accessions (~11%) was selected from the germplasm collection maintained in the ICRISAT genebank, Patancheru, India. These core collections are ideal genetic resources to identify new sources of variation to be used in crop improvement (Upadhyaya et al., 2014).

#### Other Paspalum Species

Plicatula is an informal taxonomic subgeneric group within *Paspalum* (Chase, 1929) with a great diversity of species, biotypes, and forage attributes (Novo et al., 2017). This group includes approximately 30 species, most of which are tetraploid and apomictic (Ortiz et al., 2013). *Paspalum plicatulum* has sexual, self-sterile diploid, and apomictic tetraploid cytotypes, while *P. atratum* Swallen, *P. guenoarum* Arechav., and *P. nicorae* Parodi have only apomictic tetraploid cytotypes (Ortiz et al., 2013).

Microsatellite markers were evaluated in different species of *Paspalum* conserved in Brazil. Some species of the Plicatula group, such as *P. atratum* and *P. plicatulum*, showed a high intraspecific genetic variability, but there was no clear distinction among different species (Cidade et al., 2013). This lack of differentiation among the different taxa of the Plicatula group was attributed to the genetic similarity of these species as well as interspecific hybridization throughout their evolution (Cidade et al., 2013). Many species of the Plicatula group are morphologically variable (Oliveira, 2004) and this variation could be obtained and conserved from accessions collected at different environments.

*Paspalum buckleyanum* Vasey (syn. *P. alcalinum* Mez) is a wild forage species adapted to rangelands with alkaline and saline soils. It is a multiploid species with sexual self-fertile diploid, facultative apomictic tetraploid, and obligate apomictic pentaploid, hexaploid, and heptaploid cytotypes (Burson, 1997; Hojsgaard et al., 2009; Sartor et al., 2011). A genetic variability analysis with AFLP markers in three hexaploid populations of *P. buckleyanum* showed that all individuals of the same population had identical genetic profiles, indicating that a single genotype predominated in each population (Sartor et al., 2013). Single-plant collections at each site may be the best approach for increasing the conserved diversity of *P. buckleyanum*, since the hexaploid forms monoclonal populations.

*Paspalum denticulatum* and *P. rufum* are also multiploid species with sexual diploids and apomictic polyploids (Siena et al., 2008; Sartor et al., 2011). Pure tetraploid populations of *P. denticulatum* behaved as facultative apomicts with a low level of variation in comparison to the mixed (2*x*–4*x*) population (Sartor et al., 2013). Diploid and diploid-tetraploid mixed populations of *P. rufum* exhibited greater genetic diversity than that observed in uniformly tetraploid populations. However, the level of variability in pure tetraploid populations of *P. rufum* was observed to be greater than that in the tetraploid populations of *P. denticulatum*. The levels of genetic and genotypic variability in diploid and mixed populations of *P. denticulatum* and *P. rufum* may be mainly due to the reproductive system of diploid members and the gene flow from diploids to polyploids (Sartor et al., 2013). In both species, collection of diversity would be most efficient by focusing on obtaining many individuals or seed from mixed 2*x*–4*x* populations.

### GERMPLASM CONSERVATION

Plant genetic resources are essential for sustainable agricultural production. Germplasm banks are centers for conservation of genetic resources under conditions that prolong their viability and availability. There are 1,750 germplasm banks in the world, which include nearly 7.5 million accessions belonging to cultivated species, wild and local varieties of importance to humans (food crops) and livestock (forages) (FAO, 2014).

There are two main and complementary methods to conserve genetic resources, i.e., *ex situ* and *in situ*. *Ex situ* conservation is characterized by preserving the genetic resources off-site (Plucknett et al., 1987). According to Gepts (2006), 90% of these genetic resources are conserved as seeds in cold storage at low relative humidity (3–7%). Germplasm can also be conserved as living plants in field gene banks, particularly when the species are sterile or produce recalcitrant seeds, which are those that cannot survive desiccation without loss of viability. Germplasm can also be maintained *in vitro* (in test tubes on plant nutrient medium). More specialized and technically intensive methods are being used or investigated, such as cryopreservation (liquid nitrogen: -196°C), artificial seeds, pollen, and DNA (FAO, 2014). The second method of conservation is *in situ* (on-site). This type of conservation can take place in farmers' fields (for cultivated crops) or in natural environments (for wild relatives of crop plants or wild species). Comparing both conservation methods, the advantages of *ex situ* conservation are the capacity to store a large number of accessions, the ease of access to the germplasm for characterization, evaluation and distribution, and secure conservation conditions. However, *in situ* conservation is used because landraces are an important component of indigenous cultures, it allows evolution to proceed, its cost is low, and it is the primary form of conservation for wild crop relatives (Gepts, 2006).

Most *Paspalum* germplasm collections have been conserved *ex situ* in 11 gene banks around the world (**Table 1**). This germplasm is mainly conserved as seeds at low temperature and low humidity. However, living plants are also preserved in a few gene bank collections. The National Bureau of Plant Genetic Resources (NBPGR), India, contains the largest number of accessions, but all belonging to *P. scrobiculatum* (kodo millet). U.S. National Plant Germplasm System (USDA), EMBRAPA (Brazil) and IBONE (Argentina) germplasm banks preserved the greatest number of species with 48, 51, and 72, respectively.

An important germplasm bank of *Paspalum* species exists in Argentina. This gene bank includes 434 accessions from 72 species. These were collected in different regions of the natural distribution of the species. These materials are preserved mainly as seed, but are also conserved as plants in greenhouses or grown in the field (**Table 2**). The species represented by the greatest number of accessions are: *P. notatum* (77), *P. dilatatum* (41), and *P. simplex* (34), which together represent 35% of the total number of accessions preserved in this gene bank.

TABLE 2 | Germplasm conserved at Facultad de Ciencias Agrarias, Universidad Nacional del Nordeste, and Instituto de Botánica del Nordeste, Corrientes, Argentina.

Apomixis, present in a large number of *Paspalum* species and accessions, allows for conserving polyploid and highly heterozygous genotypes through seed instead of live plants. This advantage increases the conservation efficiency taking into account that a larger number of accessions can be conserved at lower cost. Additionally, this germplasm is conserved with lower risk of disease transmission as compared with asexual generations by vegetative propagation. The main disadvantage of apomixis for germplasm conservation relates to the fact that only for some species it is possible to access the diversity contained in apomictic genotypes through hybridization (Kumar et al., 2017). The availability of compatible sexual germplasm is the key for hybridizing apomictic accessions (Miles, 2007; Kumar et al., 2013).

Since polyploidy and apomixis predominate in *Paspalum*, the possibility of transferring the natural diversity present in apomictic genotypes to sexual synthetic tetraploid populations may be crucial for utilizing the germplasm. Zilli et al. (2018) have described a novel breeding approach developed for *P. notatum* that allows the transfer of the diversity present in a group of apomictic ecotypes distributed throughout the Americas to a single synthetic tetraploid population. A similar approach is being used in the Plicatula group of *Paspalum*, which includes nearly 30 species (Novo et al., 2017). These populations are expected to contain a high level of diversity and can be useful for preserving the genetic diversity in species and groups of related species within the genus.

### GENETIC IMPROVEMENT

#### Impact of Apomixis: Inheritance, Expressivity, and Early Identification of Apomicts

Polyploidy is present in around 75% of *Paspalum* species, ranging from triploid to 12-ploid, tetraploidy being the most common polyploid type (Ortiz et al., 2013). Polyploidy and apomixis are strongly related in *Paspalum*, and most polyploid cytotypes reproduce by apomixis. The studies on apomixis and its genetic control have been of great interest in recent decades, mainly because of the interest in transferring apomictic reproduction to the major economic crops; however, transfer of apomixis to crop species has not been successful, mainly because the gene(s) involved have not yet been identified (Kumar et al., 2019). One of the objectives of manipulating apomixis for breeding purposes relates to the possibility of fixing heterosis in F1 for the traits of interest (Hanna, 1995; Miles, 2007; Kumar et al., 2017). The superiority of apomictic hybrids is expected to be retained across the reproductive cycles.

The generation of sexual tetraploid individuals by chromosome doubling of sexual diploids in *P. notatum* (Burton and Forbes, 1960; Quarin et al., 2001; Quesenberry et al., 2010), *P. simplex* (Cáceres et al., 1999), and *P. plicatulum* (Sartor et al., 2009) allowed the generation of segregating populations by means of hybridization between sexual and apomictic tetraploids. Martínez et al. (2001) used F1, F2 and back-crosses to study the inheritance of apospory in *P. notatum*. The authors reported that apomixis was inherited as a major dominant factor with distorted segregation in favor of sexual individuals, probably due to some pleiotropic lethal effect of the major gene(s) or partial lethality factors linked to the apospory locus. Recent cytogenetic studies demonstrated meiotic abnormalities associated with apospory in *P. notatum* (Dahmer et al., 2008; Podio et al., 2012) supporting this hypothesis. Later, Martínez et al. (2007) reported that apospory could not be transferred by monoploid male gametes (*n* = *x*), supporting the hypothesis postulated by Nogler (1984) that apomixis can only be transferred under heterozygous conditions. In addition, Aguilera et al. (2015) reported distorted segregation in favor of sexual inter-specific hybrids of crosses between an induced sexual tetraploid *P. plicatulum* and an apomictic *P. guenoarum*. Interestingly, when back-crossing the induced sexual tetraploid plant of *P. plicatulum* with apomictic F1 hybrids, no distortion was observed in the segregation patterns.

Apomictic reproduction generally does not imply the exclusive generation of clonal progenies; facultative apomictic reproduction is observed in almost all species of *Paspalum* studied to date. In the case of apomictic hybrids of *P. notatum*, the expressivity of apospory is more variable than in natural apomictic ecotypes, ranging from 1 to 100% (Martínez et al., 2001; Acuña et al., 2009; Acuña et al., 2011; Zilli et al., 2015; Marcón et al., 2019). The causes of the variation in apospory expressivity in the progeny of sexual × apomictic crosses remain uncertain, but some hypotheses have been proposed. For instance, Nogler (1984) postulated the timing of apospory induction as the cause of variable expressivity of apospory in apomictic hybrids of *Ranunculus sp.*; if the apospory induction occurs prior to meiosis, the result would be an aposporic embryo sac; otherwise, it would be sexual. The author observed that apospory expressivity in inter-specific hybrids of *Ranunculus* decreases when back-crossing with the female parent. This hypothesis would explain the results reported by Acuña et al. (2009, 2011) in apomictic hybrids of *P. notatum*, where the proportion of highly apomictic hybrids decreases in the progeny when crossing first and second generation apomictic hybrids × sexual genotypes. Zilli et al. (2015) reported that hybrids of *P. notatum* exhibited low or high apospory expressivity but a reduced proportion of hybrids exhibited intermediate levels of expressivity. Therefore, the variable expressivity may be related to a single major gene. In addition, temporal variation in the level of expressivity of apomixis has been reported in several *Paspalum* species. For instance, Quarin (1986) reported variable apospory expressivity in *P. cromyorrhizon*, attributing the phenomenon to variation in the photoperiod; however, the author indicated that other environmental factors such as temperature and water stress could have a significant effect on apospory expressivity. Burton (1982a) reported that photoperiod and water and nitrogen deficit did not produce a shift from apomictic to sexual reproduction in natural apomictic *P. notatum* plants. However, the methodology used to determine apomixis expressivity, by evaluation of the homogeneity of the progeny, would not be sensitive enough to detect small changes in expressivity. Rios et al. (2013a) reported that higher apospory expressivity in *P. notatum* is observed during summer, whereas the expressivity is lower during spring and fall, attributing this variation to photoperiod. The mechanism(s) involved in the variation in apospory expressivity remain unclear; given their importance for breeding not only apomictic grasses but also, in the future, major apomictic crops, further research should address this topic.

Determining the mode of reproduction and the expressivity of apomixis has been important for breeding programs. The progeny test was the first technique adopted in the genus *Paspalum*. This technique is based on evaluation of the uniformity of the offspring and its identity to the maternal plant. It was used for determining mode of reproduction and expressivity of apomixis in *P. notatum* (Burton, 1948; Burton, 1982a; Ortiz et al., 1997; Rebozzio et al., 2011). It is a reliable technique because the progeny is observed directly, but it is time-consuming and demanding, as it requires growing a large number of progenies in the field, and it is limited by the few morphological markers available. The use of DNA-based markers may overcome these limitations (Chandra et al., 2010; Yadav et al., 2019). The generation of linkage maps and identification of genomic regions in the genus *Paspalum* (Pupilli et al., 2001; Labombarda et al., 2002; Martínez et al., 2003; Pupilli et al., 2004; Stein et al., 2004; Stein et al., 2007) allowed the identification of the apomixis-controlling genomic region. These kinds of studies allowed the development of molecular markers linked to apospory in *P. notatum* (Pupilli et al., 2001; Martínez et al., 2003; Pupilli et al., 2004; Stein et al., 2004; Stein et al., 2007; Rebozzio et al., 2012) and *P. simplex* (Labombarda et al., 2002). The availability of molecular markers allows breeders to achieve early classification of reproductive mode in segregating progenies, saving time and resources (Zilli et al., 2015; Brugnoli et al., 2019). Because of the variable apospory expressivity reported in *Paspalum* species, marker-based classification alone does not allow identification of highly apomictic hybrids; therefore, auxiliary techniques such as mature embryo sac observation, flow cytometry, or progeny tests, using morphological or molecular markers, are needed. 
Mature embryo sac observation has been extensively used for determining mode of reproduction and apospory expressivity in *P. notatum* (Ortiz et al., 1997; Martínez et al., 2001; Quarin et al., 2001; Acuña et al., 2007; Martínez et al., 2007; Acuña et al., 2009; Acuña et al., 2011; Rebozzio et al., 2011; Zilli et al., 2015; Zilli et al., 2018), *P. cromyorrhizon* (Quarin, 1986), *P. malacophyllum* (Hojsgaard et al., 2016), *P. rufum* (Siena et al., 2008; Delgado et al., 2014; Soliman et al., 2019), among others. This technique was developed by Young et al. (1979) and recently modified by Zilli et al. (2015); it requires plants at flowering stage and provides reliable information about expressivity of apomixis (Ortiz et al., 1997) and also is inexpensive, rapid, and straightforward (Zilli et al., 2018). Flow cytometry on seeds was used in *P. notatum* (Urbani et al., 2017), *P. simplex* (Brugnoli et al., 2014), *P. malacophyllum* (Hojsgaard et al., 2016), *P. rufum* (Delgado et al., 2014), and in inter-specific hybrids of the Plicatula group of *Paspalum* (Aguilera et al., 2015). This technique provides similar information to embryo sac observation, but determining expressivity of apomixis at seed stage, though giving a more accurate approximation to what is expected in the progeny, comes at a higher cost in comparison to embryo sac observations. Therefore, molecular markers linked to apospory could be used for determining mode of reproduction at seedling stage, planting in the field only the apomictic ones, and mature embryo sac observation or flow cytometry on seeds could be used to identify and select highly apomictic hybrids for further evaluations, saving time and resources. Identification of obligate apomictic hybrids is a key factor for breeding programs to ensure genetic stability through the successive reproductive cycles, which is the most important factor for cultivar development.
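
As a rough illustration of how seed flow cytometry readings could be turned into an expressivity estimate, the sketch below classifies each seed by its endosperm:embryo DNA-content ratio and reports the fraction of classified seeds scored as apomictic. The expected ratios (about 1.5 for sexual seeds and about 2.5 for pseudogamous aposporous seeds), the tolerance, the function names, and the peak values are all assumptions made for the example; they are not taken from the studies cited above.

```python
def classify_seed(embryo_peak, endosperm_peak, tol=0.25):
    """Classify one seed from its embryo and endosperm fluorescence peaks.

    Assumed endosperm/embryo ratios (illustrative, not from the source):
      ~1.5 -> sexual (reduced egg and two reduced polar nuclei, each
              fertilized by a reduced sperm)
      ~2.5 -> apomictic (unreduced parthenogenetic egg; two unreduced
              polar nuclei fertilized by one reduced sperm)
    """
    ratio = endosperm_peak / embryo_peak
    if abs(ratio - 1.5) <= tol:
        return "sexual"
    if abs(ratio - 2.5) <= tol:
        return "apomictic"
    return "unclassified"


def apospory_expressivity(seeds, tol=0.25):
    """Fraction of classified seeds that were scored as apomictic."""
    labels = [classify_seed(embryo, endosperm, tol) for embryo, endosperm in seeds]
    n_apomictic = labels.count("apomictic")
    n_sexual = labels.count("sexual")
    scored = n_apomictic + n_sexual
    if scored == 0:
        raise ValueError("no seed could be classified")
    return n_apomictic / scored


# Hypothetical (embryo, endosperm) peak positions in arbitrary units.
example_seeds = [(100, 250), (100, 248), (100, 152), (100, 255), (100, 200)]
print(apospory_expressivity(example_seeds))
```

Seeds whose ratio falls outside both windows are left unclassified rather than forced into a class, which keeps ambiguous histograms from biasing the estimate.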

#### Breeding Methods For Diploid and Polyploid Germplasm

#### Ecotype Selection

The most common breeding approach adopted for warm-season grasses in the last century was the selection within the natural variability present in the species (Gates et al., 2004). This method was adopted from vegetatively propagated species and consists of selection of ecotypes from the natural distribution area of the species, selecting among germplasm bank introductions or landraces. Ecotypes are evaluated in multiple locations and years for forage and/or grain production and quality, or turf performance, in addition to tolerance for biotic and abiotic stress. This breeding method is particularly useful for apomictic species due to the possibility of clonal reproduction by seeds (Gates et al., 2004). However, this methodology is suitable mainly for short- and mid-term objectives, as no novel variability is created (Zilli et al., 2018). Despite this, the majority of the released cultivars of *Paspalum* were selected using this breeding approach.

#### Hybridization

Two hybrid cultivars were released for diploid *P. notatum*: Tifhi 1 (Hein, 1958) and Tifhi 2. These two diploid hybrids were selected by considering the specific combining ability of pairs of self-incompatible genotypes (Burton, 1984). These hybrids exhibited greater forage yield and liveweight gain than the cultivar Pensacola. However, their adoption by farmers was not significant because of the high cost of seed production (Burton, 1984).

The general idea of using hybridization in apomictic plants relates to the possibility of releasing the natural diversity present in apomictic ecotypes and fixing superior F1 hybrids. These novel apomictic hybrids are expected to be genetically stable through indefinite asexual generations (Miles, 2007). Apomixis involves the formation of a chromosomally unreduced embryo from a megaspore mother cell or a somatic cell; however, these plants produce genetically recombined and chromosomally reduced male gametes (Hanna, 1995). Therefore, an apomictic plant could be used as male parent when a sexual compatible plant of same ploidy level is available (Kumar et al., 2017). The efficiency of crosses is expected to be greater when parents have the same ploidy level and number of chromosomes. In a few *Paspalum* species, sexual tetraploid genotypes were generated by chromosome doubling of sexual diploids. Induced sexual 4*x* plants of *P. notatum*, like their diploid progenitors, were allogamous due to self-incompatibility. Burton et al. (1970) indicated that self-incompatibility may be the result of the S-Z mechanism, commonly observed in grasses.

Several attempts have been made to obtain highly heterotic apomictic hybrids by crossing sexual × apomictic genotypes. Promising hybrids were generated in *P. notatum* (Burton, 1992; Acuña et al., 2009; Acuña et al., 2011; Zilli et al., 2015). Burton (1992) described the generation of four segregating families obtained by crossing three induced tetraploid sexual plants × two apomicts. One of these families exhibited greater forage and seed yield and was informally named Tifton 54. Acuña et al. (2009) obtained 591 hybrids by crossing 11 experimental sexual tetraploid plants × 5 apomicts. Several hybrids exhibited heterosis for agronomic and morphological traits; however, the proportion of highly apomictic hybrids in the progenies was low (11%). In addition, the authors reported that sexual hybrids behaved as self-compatible like their apomictic parents, and unlike the self-incompatible diploid and induced sexual tetraploids. Acuña et al. (2011) generated 2,700 hybrids by crossing 11 sexual F1 hybrids × 5 apomictic F1 hybrids and the apomictic cultivar Argentine. The authors reported the occurrence of heterosis for all agronomic and morphological traits evaluated. The proportion of highly apomictic hybrids was lower (3%) than observed by Acuña et al. (2009). Zilli et al. (2015) generated 524 hybrids crossing three experimental sexual plants × nine apomicts. Variable levels of heterosis were observed depending on the parental combination and the evaluated trait. The authors observed that segregation for mode of reproduction depended on the male parent used. However, the proportion of highly apomictic hybrids was low (8%). The low proportion of highly apomictic hybrids is one of the most important obstacles to the generation of superior apomictic hybrids. Finally, the first apomictic hybrid cultivar of *P. notatum* was recently released commercially (Urbani et al., 2017).
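
Heterosis in crosses such as these is commonly expressed relative to the mean of the two parents or relative to the better parent. The sketch below shows both standard definitions; which measure each cited study used is not stated here, and the yield values are hypothetical numbers chosen only to make the arithmetic easy to follow.

```python
def mid_parent_heterosis(f1, parent_a, parent_b):
    """Percent superiority of the hybrid over the mean of its two parents."""
    mid_parent = (parent_a + parent_b) / 2.0
    return 100.0 * (f1 - mid_parent) / mid_parent


def better_parent_heterosis(f1, parent_a, parent_b):
    """Percent superiority over the better parent (assumes higher is better)."""
    better = max(parent_a, parent_b)
    return 100.0 * (f1 - better) / better


# Hypothetical forage yields (g per plant) for one hybrid and its parents.
hybrid, sexual_parent, apomictic_parent = 120.0, 100.0, 80.0
print(round(mid_parent_heterosis(hybrid, sexual_parent, apomictic_parent), 1))
print(round(better_parent_heterosis(hybrid, sexual_parent, apomictic_parent), 1))
```

For a cultivar release, the better-parent figure is usually the more demanding test, since a hybrid can exceed the parental mean while still yielding less than the best existing apomict.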

Recently, Brugnoli et al. (2019) generated 232 hybrids of *P. simplex* by crossing two induced sexual plants × seven apomicts. The average proportion of apomictic hybrids was 1:2.4 (apomictic:sexual hybrids), but was variable and depended on the parental combination. In addition, the authors reported no difference in agronomic and morphologic traits between sexual and apomictic hybrids. Several apomictic hybrids combined agronomic traits of interest and will be further evaluated.
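
A segregation ratio like the one above is typically compared with a Mendelian expectation using a chi-square goodness-of-fit test. The sketch below does this against a 1:1 expectation, which assumes a single dominant factor carried in simplex condition by the apomictic parent; that expectation is an assumption for the example. The class counts are back-calculated from the reported 232 hybrids and the average 1:2.4 ratio, so they are approximate and are not published counts.

```python
import math


def chi_square_two_classes(observed, expected_ratio):
    """Chi-square goodness-of-fit test for two phenotypic classes (df = 1)."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # With one degree of freedom the upper-tail probability has a closed form.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


total_hybrids = 232
# Approximate counts implied by a 1:2.4 apomictic:sexual ratio.
apomictic = round(total_hybrids * 1 / (1 + 2.4))
sexual = total_hybrids - apomictic

chi2, p = chi_square_two_classes([apomictic, sexual], [1, 1])
print(f"apomictic={apomictic}, sexual={sexual}, chi2={chi2:.2f}, p={p:.2e}")
```

Under these assumptions the deviation from 1:1 is in the direction of an excess of sexual hybrids. Because the ratio varied with the parental combination, a per-family test would be more informative than this pooled one.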

The use of hybridization between related species is a suitable approach for breeding in the genus *Paspalum*. Inter-specific hybrids were generated in the Plicatula group of *Paspalum* using a chromosome-doubled sexual plant of *P. plicatulum* and apomictic ecotypes of different species of the Plicatula group (Novo et al., 2013; Pereira et al., 2015; Novo et al., 2016; Novo et al., 2017). New highly apomictic hybrids were generated, and several were characterized as superior for forage yield, cold tolerance, and cattle preference with respect to the apomictic male parent (Da Costa Huber et al., 2016; Motta et al., 2017; Novo et al., 2017). This approach may also be used to hybridize *P. vaginatum* and *P. distichum*, since *P. vaginatum* is sexual and cross-pollinated and *P. distichum* is mainly tetraploid and apomictic (Bashaw et al., 1970; Ortiz et al., 2013). The generation of an induced sexual tetraploid plant will be needed, but this method may facilitate the creation of genetically uniform seeded turf cultivars.

Another hybridization approach was used in an attempt to increase ergot (*Claviceps paspali*) resistance in *P. dilatatum*. The widely distributed forage *P. dilatatum* is apomictic and pentaploid, and is highly susceptible to ergot. *P. urvillei* is a closely related species but is tetraploid and sexual. Caponio and Quarin (1990) generated inter-specific hybrids by crossing a sexual tetraploid ecotype of *P. dilatatum* and a sexual tetraploid genotype of *P. urvillei*. These hybrids were back-crossed to *P. dilatatum* and evaluated for tolerance to ergot and seed yield (Schrauf et al., 2003); one of these hybrids was selected for superior performance and released as the cultivar "Primo" (INASE, 2013).

Molecular markers have been useful for hybridization in apomictic *Paspalum* species. Molecular markers linked to apospory were used for early identification of aposporic hybrids in *P. notatum* (Zilli et al., 2015; Zilli et al., 2018; Marcón et al., 2019) and *P. simplex* (Brugnoli et al., 2019). In addition, molecular markers were used to assess the efficiency of crossing methods (Aguilera et al., 2015; Zilli et al., 2015; Novo et al., 2017; Brugnoli et al., 2019). Random molecular markers (ISSR and SSR) were successfully used to predict the segregation for mode of reproduction and heterosis for forage yield in *P. notatum* (Marcón et al., 2019). The greater the genetic distance between parents, the greater the fraction of apomictic hybrids within the progeny and the level of heterosis for forage yield, indicating the advantage of crossing unrelated parents.
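
One common way to quantify the genetic distance between two prospective parents from band-based markers is the Jaccard distance computed on presence/absence profiles. The sketch below assumes that measure and uses made-up profiles; the cited study may have used a different coefficient, and the parent names are hypothetical.

```python
def jaccard_distance(profile_a, profile_b):
    """Jaccard distance between two binary band profiles (1 = band present)."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must score the same set of bands")
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    if either == 0:
        return 0.0  # no bands in either profile; treated as identical
    return 1.0 - shared / either


# Hypothetical profiles for one sexual parent and two apomictic candidates.
sexual_parent = [1, 1, 0, 1, 0, 1, 0, 0]
apomict_1 = [1, 1, 0, 1, 0, 0, 0, 0]
apomict_2 = [0, 1, 1, 0, 1, 0, 1, 0]

for name, profile in [("apomict_1", apomict_1), ("apomict_2", apomict_2)]:
    print(name, round(jaccard_distance(sexual_parent, profile), 3))
```

Ranking candidate male parents by such a distance is one simple way to prioritize crosses between unrelated genotypes before any field evaluation.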

#### Recurrent Restricted Phenotypic Selection (RRPS)

Mass selection is a useful method for self- and open-pollinated sexual species, suitable for highly heritable traits. After the development of an efficient technique for hybridizing *P. vaginatum*, vegetatively propagated F1 hybrids are selected from segregating families (Hanna et al., 2013). Selected F1 hybrids may be part of the next cycle of crosses and selection. Recently, new techniques for hybridizing *P. scrobiculatum* were developed leading to the generation of large segregating families (Hariprasanna, 2017); therefore, mass selection may be suitable for improving this species. Moreover, the adoption of the breeding schemes will also depend on the availability of resources in breeding programs.

Recurrent restricted phenotypic selection (RRPS) is mass selection on which restrictions are imposed in order to increase its efficiency; this method was used by Burton (1974) for improving diploid sexual germplasm of *P. notatum*. The author applied restrictions such as the use of grids including 25 plants each, inter-crossing the selected plants instead of using open pollination (doubling genetic gain because selection is imposed on both maternal and paternal progenitors), controlling the inter-crossing by bagging inflorescences, and shaking the bags at flowering. Selected plants were represented by two inflorescences, and the high self-sterility of the plants reduced the likelihood of selfing. Burton (1982b) proposed some modifications to this method, discarding progenies based on the performance of maternal progenitors, and improving cultural practices to achieve flowering during the first growing period. These modifications allowed the use of a 1-year selection cycle instead of two, achieving a genetic gain four times greater than conventional mass selection and saving time and resources. Diploid cultivars Tifton 9, TifQuik and UF-Riata of *P. notatum* were obtained using this selection methodology (Blount and Acuña, 2009).
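
One way to read the gain figures above is through the breeder's equation with a parental-control term. The sketch below assumes that expected gain per year is c × i × h² × σP / L, where c is 0.5 when only the female parents are selected (open pollination) and 1.0 when the selected plants are intercrossed, and L is the cycle length in years. The values used for selection intensity, heritability, and phenotypic standard deviation are placeholders, and the comparison is an interpretation, not a calculation reported by the cited authors.

```python
def gain_per_year(parental_control, intensity, heritability, sd_pheno, years_per_cycle):
    """Expected genetic gain per year under the breeder's equation."""
    gain_per_cycle = parental_control * intensity * heritability * sd_pheno
    return gain_per_cycle / years_per_cycle


# Placeholder values: they cancel out of the ratio computed below.
intensity, heritability, sd_pheno = 1.5, 0.25, 8.0

# Conventional mass selection: only female parents selected, 2-year cycle.
mass = gain_per_year(0.5, intensity, heritability, sd_pheno, years_per_cycle=2)
# RRPS as modified: both parents selected (intercrossing), 1-year cycle.
rrps = gain_per_year(1.0, intensity, heritability, sd_pheno, years_per_cycle=1)

print(rrps / mass)
```

Under these assumptions the two restrictions each contribute a factor of two, which matches the fourfold advantage described in the text; the other restrictions (gridding, bagging) would act mainly through heritability and are not modeled here.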

Due to the increasing interest in breeding *P. vaginatum* as an eco-friendly turf, RRPS could be suitable for improving this species. In the case of breeding programs focused on turf cultivars, as in the case of *P. vaginatum* and some *P. notatum*, the breeding scheme and selection methodology should be adjusted according to the propagation method (seed or vegetative) of the new cultivar (Hanna et al., 2013). In addition, RRPS would also be appropriate for improving the sexual diploid and polyploid germplasm of other *Paspalum* species.

#### Use Of Synthetic Sexual Tetraploid Populations

The use of synthetic sexual tetraploid populations as base population in breeding programs focused on apomictic species and adopting the RRPS for breeding these populations was first proposed for *P. notatum* (Burton, 1992). A mostly sexual tetraploid population was established in southern USA in 1974. Five cycles of RRPS were conducted, achieving improvement in individual plant biomass yield; but then, the program was discontinued. Another attempt was made in 1983, where two populations, obtained using different crosses, were established. After three cycles of selection 28 plants were identified as superior, and one of them, hybrid #7 produced more forage than cultivar Argentine and showed greater ergot resistance and seed yield. However, no cultivars were released from these populations and the program was discontinued. The lack of success was probably due to the limited understanding of the inheritance of apomixis at the time (Miles, 2007).

Recently, Zilli et al. (2018) generated a synthetic sexual tetraploid population of *P. notatum* composed of 306 plants by crossing three experimental sexual tetraploid genotypes with 10 natural apomictic genotypes and intercrossing 29 sexual F1 hybrids (**Figure 3A**). The 10 apomictic male parents were selected to represent the natural distribution area of the species, and to transfer the genetic variability from the apomictic germplasm to the sexual. In addition, this population was characterized for mode of reproduction, fertility, and ploidy level. It constitutes a base population for breeding. Furthermore, Zilli et al. (2019) estimated the genetic variability of this population and its ancestors by molecular markers, agronomic and morphologic traits, and seed fertility, and determined that the genetic variability contained in the apomictic germplasm was effectively transferred to the sexual synthetic germplasm. Additionally, the authors reported individuals from the sexual synthetic tetraploid population combining traits of interest for breeding programs. The availability of a sexual synthetic population allows breeders access to genetic variability in the sexual counterpart of apomictic × sexual crosses in breeding programs aimed to obtain highly productive apomictic hybrids. RRPS would be a suitable breeding method; however, due to the fact that improving apomictic species is focused on obtaining superior apomictic hybrids, the use of recurrent selection based on combining ability (Comstock et al., 1949) is expected to be more appropriate for accumulating additive as well as non-additive genetic effects (**Figure 3B**). In addition, this method allows the evaluation of the apomictic hybrids obtained from each test cross as potential new cultivars.

The generation of inter-specific hybrids in the Plicatula group of *Paspalum* (Novo et al., 2017) would allow the creation of a synthetic sexual tetraploid population for breeding and conservation purposes by inter-crossing selected F1 sexual hybrids. This breeding method could be also appropriate for improving other apomictic species, as was demonstrated in *Brachiaria* spp. (Miles and Escandón, 1997; Miles et al., 2006).

#### Genetic Transformation

Earlier plant breeding was restricted to the use of genes of the same or related species with different degrees of difficulty. The transfer of genes between unrelated species is not possible using conventional breeding approaches. Genetic engineering allows breeders to clone a gene from any organism and insert it into another organism, allowing the introduction of novel genetic variation for breeding (Vogel and Burson, 2004). Since the generation of the first genetically modified tall fescue (Wang et al., 1992), great progress has been made in many temperate forage species such as annual and perennial ryegrass, red and tall fescue, and white clover, but only a few in warm-season species, where alfalfa received most of the attention (Wang and Brummer, 2012). Genetic transformation also allows the down- or up-regulation of specific genes (Wang and Brummer, 2012).

In the genus *Paspalum*, just two species have been used for genetic transformation, *P. notatum* and *P. dilatatum*. However, no transgenic cultivar has been released. Transgenic *P. notatum* plants have been generated using biolistic apparatus (Smith et al., 2002; Gondo et al., 2005; Agharkar et al., 2007; James et al., 2007; Luciani et al., 2007; Sandhu et al., 2007; Zhang et al., 2007; Giordano et al., 2014; Mancini et al., 2014; Muguerza et al., 2014). In these studies, phosphinothricin acetyltransferase (*bar* and *pat* genes) and neomycin phosphotransferase II (*npt2* gene) were used as selectable markers. Sandhu et al. (2007) reported that the transgenic *P. notatum* plants were resistant to glufosinate ammonium under field and greenhouse trials. James et al. (2007) reported the transfer of a transcription factor from xeric *Hordeum spontaneum*, generating transgenic *P. notatum* plants tolerant to severe salt stress and dehydration under controlled environment conditions. Muguerza et al. (2014) were successful in obtaining low lignin content *P. notatum* plants by downregulation of cinnamyl alcohol dehydrogenase gene expression. The authors reported that four out of nine transgenic plants exhibited a significant increase of 5.6 to 10.4% in *in vitro* dry matter digestibility. A similar approach was used by Giordano et al. (2014) on *P. dilatatum*, achieving a reduction of up to 20% in lignin content and an increase of up to 4% in *in vitro* dry matter digestibility. Mancini et al. (2014) developed a modified transformation method aimed at the future evaluation of candidate genes for apomictic reproduction, which will be an important advance in the identification of the gene(s) responsible for apomictic reproduction and its potential transfer to major crops.

Genetic transformation is a potentially useful tool for genetic improvement of forage and turf crops. However, its potential has been limited by stringent regulation of transgenic plants, which delays the release and adoption of new cultivars because of concerns regarding pollen flow (Rios et al., 2017). Sandhu et al. (2009, 2010) evaluated gene flow using glufosinate-resistant apomictic *P. notatum* plants as pollen donors and sexual diploids and apomictic tetraploids as pollen receptors placed between 0.5 and 3.5 meters apart. The authors reported a frequency of 0.03% of transgenic triploids or near-triploids in the progeny of sexual diploids, and less than 0.16% in the progeny of apomictic tetraploids. In addition, most of the triploids and near-triploids exhibited very low vigor and fertility. Therefore, ploidy level and apomictic reproduction act as barriers to pollen-mediated gene transfer from transgenic to non-transgenic plants. The drawback of this approach is that any gene transferred to wild apomictic relatives would be very efficiently propagated over generations.

Mutagenesis and somaclonal variation are potentially useful tools for genetic improvement, overcoming the regulations imposed on transgenics. One of the first attempts in *Paspalum* was reported by Burton and Jackson (1962) using radiation breeding in apomictic prostrate *P. dilatatum* var. *pauciciliatum*. The authors reported the generation of mutant plants for vegetative and floral traits, but no improvements in ergot resistance or seed quality and yield were found. In addition, the radiation treatment did not produce a shift from apomictic to sexual reproduction. Recently, Heckart et al. (2010) were successful in using *in vitro* selection to develop herbicide-resistant plants of *P. vaginatum*. The authors used tissue culture to look for somaclonal variation, using sethoxydim (2-cyclohexen-1-one, 2-[1-(ethoxyimino) butyl]-5-[2-(ethylthio) propyl]-3-hydroxy-) as the selection agent. Whole-plant resistance was confirmed in greenhouse studies. This methodology allows the generation of non-transgenic plants resistant to herbicide. The plants obtained have not been released as cultivars. Mutation was employed on cultivars Argentine and Wilmington of *P. notatum* for improving turf quality, by using either seedlings or rhizomes with different mutagenic treatments: X-rays, gamma rays, ethyl methane sulphonate (Rios et al., 2013b) and by exposure of cells to sodium azide in tissue culture (Kannan et al., 2015). A total of 40 mutant plants were evaluated for utility turf, and some of them exhibited superior turf performance in comparison to the original cultivars (Rios et al., 2017).

#### CULTIVAR ADOPTION

Approximately 94 cultivars have been released for the genus *Paspalum* L. (**Table S1**). Cultivars have been developed for different uses, e.g., cereal, turf or forage. In some cases, cultivars developed for forage are also used as utility turf, such as the cultivar Argentine of *P. notatum* Flüggé. Although the genus has nearly 310 species, only eight of these have produced cultivars. The species with most cultivars are *P. scrobiculatum* L., *P. vaginatum* Sw. and *P. notatum* (**Table S1**).

With 34 cultivars, *P. scrobiculatum*, commonly known as 'kodo millet', is the species with the greatest number of cultivars. This is the only species of the genus that can be considered domesticated since it has been cultivated as an annual cereal in India for at least 3,000 years (de Wet et al., 1983). Currently, it is still grown as a major food source in India, particularly in the Deccan Plateau. It is also harvested as a secondary or wild cereal in India, Indonesia, Philippines, Thailand, Vietnam, and West Africa (Hariprasanna, 2017). It is also grown as a pasture crop in arid regions. Most cultivars were released as cereals in India during the past 30 years, and one, named Paltridge, was released for forage in Australia in 1966 (**Table S1**). *P. scrobiculatum* occurs in moist regions across the tropics and subtropics of the Old World (de Wet et al., 1983). It is a vigorous annual herb, 60 to 90 cm tall, which roots at lower nodes. The cultivated cytotype is tetraploid (2*n*=4*x*=40), and autogamous due to cleistogamy (Hariprasanna, 2017). Although several breeding techniques are being used in kodo millet, all released varieties are pure-line (single-plant) selections from landraces or introduced germplasm.

*Paspalum vaginatum*, known as 'seashore paspalum', is the *Paspalum* species with the second largest number of cultivars (**Table S1**). This species is cultivated as perennial turf in brackish and coastal environments across the tropics and subtropics around the world (Duncan and Carrow, 2000). The species has morphological characteristics that make it desirable as a turf, such as a spreading growth habit, tolerance to low mowing, deep green color, good density, and overall turf quality (Hanna et al., 2013). Its popularity as a warm-season turf mainly results from its salt tolerance and ease of propagation (Duncan and Carrow, 2000; Eudy et al., 2017). Diploid (2*n* = 2*x* = 20), tetraploid (2*n* = 4*x* = 40) and hexaploid (2*n* = 6*x* = 60) cytotypes have been reported for *P. vaginatum*. However, the most widespread cytotype is the diploid, which is allogamous due to self-incompatibility (Hanna et al., 2013). Most cultivars released for *P. vaginatum* have been developed in USA, and the University of Georgia's program has made an outstanding contribution. Germplasm collection from native environments or golf courses, evaluation in the target environment, and selection and release of the best clones were part of the breeding method used during the 21st century. Nowadays, hybridization is commonly used in the most important breeding programs around the world (Raymer et al., 2007). Although most cultivars are vegetatively propagated, a few seed-propagated cultivars are commercially available. Clonal cultivars are commercialized by sod farms around the world. In contrast, seed of seeded cultivars is produced by interplanting compatible parental clones and harvesting the F1 seed, as described by Hanna et al. (2013).

*Paspalum notatum* is the third most important species of the genus in terms of the number of released cultivars (**Figure 4**). This species, commonly known as bahiagrass, is mainly cultivated as forage in the subtropical belt around the world, especially throughout Florida and the Coastal Plain and Gulf Coast Region of Southeastern USA (Blount and Acuña, 2009). Persistence under intense and frequent grazing, and adaptation to poor sandy soils are probably the reasons for its adoption as cultivated pasture (Gates et al., 2004). It is also sown or sodded extensively as utility turf, particularly in roadways, including interstate highways (Busey, 1989). Only cvv. Pensacola and Argentine are commercialized for utility turf, although they are mainly sold for forage. The species has mainly diploid (2*n* = 2*x* = 20) and tetraploid (2*n* = 4*x* = 40) types. The diploid is sexual and cross-pollinated; its natural distribution is restricted to northeastern Argentina (Burton, 1955; Burton, 1967). The tetraploid is an aposporous apomict, and is naturally distributed from central Argentina to Northern Mexico (Gates et al., 2004). There are diploid and tetraploid cultivars, and the breeding methods used to improve the two types differ. The widely known RRPS breeding method was created for improving diploid bahiagrass, and most diploid cultivars were developed using this method (Burton, 1989; Anderson et al., 2011). With the exception of cv. Boyero UNNE, all tetraploid cultivars are the result of direct selection from introduced germplasm in USA, Australia, and Japan (Blount and Acuña, 2009). Boyero UNNE is the first cultivar developed using hybridization to produce an apomictic F1 hybrid (Urbani et al., 2017).

*Paspalum dilatatum*, known as Dallisgrass, is the fourth species in terms of the number of cultivars (**Figure 4**). This species is used as forage with the particular aspect that it can be grown at higher latitudes than any other *Paspalum* species (Evers and Burson, 2004). It is currently grown in Australia, USA, and Uruguay. The species is native to the Americas and includes sexual tetraploid types, and apomictic pentaploid, and hexaploid types. Most cultivars released in USA, Argentina, Japan, and Australia are pentaploid (**Table S1**). There are also two hexaploid cultivars released in Uruguay and USA. All of these apomictic cultivars resulted from ecotype selection. There is one interesting case, which is the only tetraploid cultivar, cv. Primo, which resulted from interspecific hybridization between 4*x* sexual *P. dilatatum* and *P. urvillei* followed by several cycles of backcrossing to *P. dilatatum*. The objective was to transfer ergot tolerance from *P. urvillei* to *P. dilatatum* using this breeding technique (Schrauf et al., 2003).

The rest of the *Paspalum* spp. cultivars belong to the informal taxonomic group Plicatula. The group has 30 species; most of them are tropical or subtropical grasses with interesting qualities as forage (Zuloaga and Morrone, 2005). Rapid growth, high seed yield and aggressiveness for colonizing poor soils are characteristics that stand out for these species. Although there are conflicting taxonomic issues within the group, cultivars have been registered for four species, i.e., *P. atratum*, *P. guenoarum*, *P. plicatulum* and *P. nicorae* (**Table S1**). All cultivars within Plicatula are apomictic tetraploids and they were all released as forages.

*Paspalum atratum* is an upright grass with the ability to produce large forage yields concentrated during the warm season. It is adapted to a variety of soils, from well drained sandy soils to poorly drained soils that stay saturated for several months (Kalmbacher et al., 1997; Hare et al., 1999a). The outstanding trait of *P. atratum* is its late flowering, which allows for an extended vegetative phase and a concentrated and uniform reproductive phase (Hare et al., 2001; Marcón et al., 2018). This results in pastures having high nutritive value during the growing season and high seed yield. It is possible that all released cultivars of *P. atratum* belong to the same accession, or even to the same genotype, since they were all selected out of a germplasm collection from Campo Grande, Brazil. The species is cold sensitive, which limits its adoption in latitudes greater than 30° (Kalmbacher et al., 1997; Marcón et al., 2018). Currently it is cultivated in Northern Argentina and Indonesia (Hare et al., 1999a; Hare et al., 1999b; Altuve and Bendersky, 2003).

*Paspalum guenoarum* is also an erect species, which is also robust, like *P. atratum,* but is more cold-tolerant. It is adapted to well drained, acid soils (Ramírez, 1954). Currently, no seed is commercially available for any of the cultivars of *P. guenoarum*.

Seed of the three cultivars released in Australia for *P. plicatulum* was recently added to the germplasm bank of IBONE. Plants obtained from these seeds were grown in Corrientes (Argentina), and morphologically characterized. They were all classified as *P. lenticulare* Kunth instead of *P. plicatulum*. The main difference between these two species is the presence of ramifications in the inflorescence of *P. lenticulare* (Oliveira and Valls, 2008). *P. nicorae* is a rhizomatous grass, which exhibits an aggressive colonizing behavior when grown on sandy soils. No commercial seed is available for cultivars of *P. plicatulum* and *P. nicorae*.

Mode of reproduction has been a key trait for developing the 94 cultivars listed in **Table S1**. If the whole genus is considered, 68% of the released cultivars are sexual and 32% are apomictic. The lack of domestication stands out among *Paspalum* species with the exception of *P. scrobiculatum*. It is believed that apomixis is a barrier to domestication resulting in the absence of apomixis in most economically important crops (Darlington, 1939). Thus, it is not unexpected that *P. scrobiculatum* reproduces sexually since it has been cultivated as a cereal for several thousand years. Sexuality in *P. scrobiculatum* has allowed the formation of a large number of self-pollinated and genetically uniform landraces in its area of cultivation, and that variation was used for selecting the released cultivars. In contrast, sexual reproduction in cross-pollinated *P. vaginatum* has been used for hybridization and generation of a large number of hybrids, which are evaluated as clonally propagated or, less commonly, seeded turf. Since *P. scrobiculatum* and *P. vaginatum* have the largest number of cultivars in the genus, sexuality is common among *Paspalum* cultivars. Apomixis predominates in the natural distribution area of the rest of the species, i.e., *P. notatum*, *P. dilatatum*, *P. atratum*, *P. plicatulum*, *P. guenoarum*, and *P. nicorae* (Ortiz et al., 2013). The case of *P. notatum* is particular since, although the diploid cytotype is restricted to a small area located in northern Argentina, it is well represented among cultivars (**Figure 4**). Adaptation of the diploid cytotype to the environment of Southeastern USA may be the key for its predominance and popularity. Tetraploid cultivars are also well represented for *P. notatum*, mainly through ecotypes and also with a new hybrid. The novel sexual tetraploid population recently created may allow for more efficient generation of new hybrids with a combination of desirable forage or turf characteristics (Zilli et al., 2018).

Apomixis has been the rule among cultivars of *P. dilatatum*, *P. atratum*, *P. plicatulum*, *P. guenoarum*, and *P. nicorae*. The recent generation of sexual tetraploid germplasm for the Plicatula group is allowing hybridization and gene flow among species, and may allow for the release of novel apomictic hybrid cultivars.

#### CONCLUDING REMARKS

Polyploidy and apomixis are defining variables for allocation of plant diversity in *Paspalum*. Germplasm collection is expected to be more efficient if polyploid populations from contrasting environments are explored since the level of diversity depends on adaptation rather than geographical distances. Individual plant collections are suggested for most areas since monoclonal populations are common. Although of limited geographical distribution, rich levels of diversity are also present in sexual diploid populations and mixed diploid-tetraploid populations. In contrast to monoclonal populations, these mixed populations contain rich diversity, and intensive plant collection is recommended for these particular areas.

There are several germplasm banks conserving *Paspalum* germplasm, but most species are conserved in Argentina, Brazil, and USA. There are also banks conserving a rich diversity for *P. scrobiculatum* in Asia due to its importance as a human food crop. Most of this germplasm is conserved as seed at low temperature and low humidity. The information for most banks is available online; however, this is not the case for a few of them. This review reports the complete list of accessions and species conserved in one of the most important banks for the genus, which is located in the region with the highest levels of diversity. Additionally, the recent generation of synthetic sexual tetraploid populations is expected to facilitate the conservation of entire species or taxonomic groups within single populations.

A variety of breeding methods are currently available for *Paspalum*. All of them are strongly dependent on mode of reproduction. Recent advances in breeding approaches developed for the apomictic species allow breeders to use the diversity locked in ecotypes for the generation of hybrids exhibiting heterosis for a variety of agronomically important traits. However, most current cultivars for apomictic species of *Paspalum* are the results of direct selection from accessions conserved and distributed among different regions. Moreover, novel breeding techniques developed for apomicts in *Paspalum* are expected to serve as a model for apomictic species of other genera and also for future apomictic crops if the trait is finally transferred to major crop species, such as maize and rice. The variable expressivity of apomixis is a key aspect to investigate in order to generate more efficient breeding methods.

Among the eight *Paspalum* species that are cultivated, *P. scrobiculatum* contains the greatest number of cultivars resulting from its relevance as a cereal in Asia. A species cultivated for turf, *P. vaginatum*, follows in importance because of its turf quality and adaptation to the high salinity of coastal environments. *P. notatum* is the third in importance because of its multiple uses as forage, turf, and soil stabilization in acid and poor soils. The other five species have each been selected for forage production, differing in their area of adaptation, e.g., *P. dilatatum* is adapted to mild temperate and sub-tropical environments while cultivars of the Plicatula group are largely adapted to tropical areas, although some are adapted to sub-tropical areas. Sexual reproduction is the rule among the most economically important *Paspalum* cultivars belonging to *P. scrobiculatum*, *P. vaginatum*, and *P. notatum*. However, a large number of apomictic cultivars have been released, mainly for forage. The adoption of an apomictic species or cultivar seems to be directly related to the stability over generations of agronomically important traits. The availability of novel breeding techniques for apomictic species is expected to have an impact mainly for future cultivars of *P. notatum* and species of the Plicatula group.

### AUTHOR CONTRIBUTIONS

CA contributed by designing the manuscript structure, writing, and compiling the manuscript. EM, AZ, EB, and FE contributed by writing different sections of the manuscript. FM, MU, and CQ contributed by organizing and summarizing the information on tables and figures.

#### ACKNOWLEDGMENTS

The authors would like to thank Drs. John W. Miles and Alan V. Stewart for reviewing this manuscript, including English grammatical issues.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01377/full#supplementary-material




**Conflict of Interest:** The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

*Copyright © 2019 Acuña, Martínez, Zilli, Brugnoli, Espinoza, Marcón, Urbani and Quarin. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.*

# Genetic Mapping of the Incompatibility Locus in Olive and Development of a Linked Sequence-Tagged Site Marker

#### Edited by:

Inaki Hormaza, Institute of Subtropical and Mediterranean Hortofruticultura La Mayora (IHSM), Spain

#### Reviewed by:

Juan De Dios Alché, Experimental Station of Zaidín (EEZ), Spain

Carlos Romero, Polytechnic University of Valencia, Spain

#### \*Correspondence:

Luciana Baldoni luciana.baldoni@ibbr.cnr.it

† These authors have contributed equally to this work

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 27 August 2019 Accepted: 16 December 2019 Published: 28 January 2020

#### Citation:

Mariotti R, Fornasiero A, Mousavi S, Cultrera NGM, Brizioli F, Pandolfi S, Passeri V, Rossi M, Magris G, Scalabrin S, Scaglione D, Di Gaspero G, Saumitou-Laprade P, Vernet P, Alagna F, Morgante M and Baldoni L (2020) Genetic Mapping of the Incompatibility Locus in Olive and Development of a Linked Sequence-Tagged Site Marker. Front. Plant Sci. 10:1760. doi: 10.3389/fpls.2019.01760

Roberto Mariotti<sup>1†</sup>, Alice Fornasiero<sup>2,3†</sup>, Soraya Mousavi<sup>1</sup>, Nicolò G.M. Cultrera<sup>1</sup>, Federico Brizioli<sup>1</sup>, Saverio Pandolfi<sup>1</sup>, Valentina Passeri<sup>1</sup>, Martina Rossi<sup>1</sup>, Gabriele Magris<sup>2,3</sup>, Simone Scalabrin<sup>4</sup>, Davide Scaglione<sup>4</sup>, Gabriele Di Gaspero<sup>2</sup>, Pierre Saumitou-Laprade<sup>5</sup>, Philippe Vernet<sup>5</sup>, Fiammetta Alagna<sup>6</sup>, Michele Morgante<sup>2,3</sup> and Luciana Baldoni<sup>1</sup>\*

<sup>1</sup> CNR - Institute of Biosciences and Bioresources (IBBR), Perugia, Italy, <sup>2</sup> Institute of Applied Genomics, Udine, Italy, <sup>3</sup> Department of Agricultural, Food, Environmental and Animal Sciences, University of Udine, Udine, Italy, <sup>4</sup> IGA Technology Services, Udine, Italy, <sup>5</sup> University of Lille, CNRS, UMR 8198 - Evo-Eco-Paleo, F-59000, Lille, France, <sup>6</sup> ENEA - Trisaia Research Centre, Rotondella, Italy

The genetic control of self-incompatibility (SI) has been recently disclosed in olive. Intervarietal crossing confirmed the presence of only two incompatibility groups (G1 and G2), suggesting a simple Mendelian inheritance of the trait. Double digest restriction associated DNA (ddRAD) sequencing of a biparental population segregating for incompatibility groups was performed and high-density linkage maps were constructed in order to map the SI locus and identify candidate genes and linked markers. The progeny consisted of a full-sib family of 229 individuals derived from the cross 'Leccino' (G1) × 'Dolce Agogia' (G2) varieties, segregating 1:1 (G1:G2), in accordance with a diallelic self-incompatibility (DSI) model. A total of 16,743 single nucleotide polymorphisms were identified, 7,006 in the female parent 'Leccino' and 9,737 in the male parent 'Dolce Agogia.' Each parental map consisted of 23 linkage groups and showed an unusually large size (5,680 cM in 'Leccino' and 3,538 cM in 'Dolce Agogia'). Recombination was decreased across all linkage groups in pollen mother cells of 'Dolce Agogia,' the parent with higher heterozygosity, compared to megaspore mother cells of 'Leccino,' in the context of a species that showed exceptionally high recombination rates. A subset of 109 adult plants was assigned to either incompatibility group by a stigma test and the DSI locus was mapped to an interval of 5.4 cM on linkage group 18. This region spanned approximately 300 Kb in the olive genome assembly. We developed a sequence-tagged site marker in the DSI locus and identified five haplotypes in 57 cultivars with known incompatibility group assignment. A combination of two single-nucleotide polymorphisms (SNPs) was sufficient to predict G1 or G2 phenotypes in olive cultivars, enabling early marker-assisted selection of compatible genotypes and allowing for a rapid screening of inter-compatibility among cultivars in order to guarantee effective fertilization and increase olive production. The construction of high-density linkage maps has led to the development of the first functional marker in olive and provided positional candidate genes in the SI locus.

Keywords: genetic map, Olea europaea, double digest restriction associated deoxyribonucleic acid sequencing, self-incompatibility, functional markers

### INTRODUCTION

In cultivated olive (Olea europaea subsp. europaea var. europaea), cross-breeding activities have been delayed by the particularly long generation time (Santos-Antunes et al., 2005), the extended juvenile phase, the highly demanding nursery practices, such as the forcing of seedling growth (Rugini et al., 2016), and the time and space needed for plant growing (Picheny et al., 2017). In olive, breeding programs last about 30 years on average (Lavee et al., 2014; Rallo et al., 2016) and have been limited to the empirical selection of a few sporadic intraspecific crosses (Rallo et al., 2008), or to clonal selection (Manaï et al., 2007; Gomes et al., 2008; Trapero et al., 2013; Mousavi et al., 2019), while the time needed to select new cultivars in other fruit crops has been greatly reduced, partly through the application of new efficient genomic tools (Biscarini et al., 2017; Laurens et al., 2018; Cai et al., 2019). However, the worldwide importance of olive cultivation and the new challenges posed by ongoing climate change are leading to an ever increasing demand for new cultivars (Gutierrez et al., 2009; Urban, 2015; Bosso et al., 2016).

One of the current limitations to olive productivity is represented by its complex self- and inter-incompatibility system (Saumitou-Laprade et al., 2017a; Alagna et al., 2019), a barrier that may seriously curb yield and restrict the varietal choice for planting to only a few inter-compatible or self-fertile varieties. Fruit set deficiencies due to ineffective pollination are generally underestimated by farmers; however, it has been demonstrated that supplemental pollination may significantly increase olive production (Ayerza and Coates, 2004), indicating the importance of an effective pollination design of olive orchards.

In olive, as well as in other species of the Oleaceae family, such as Phillyrea (Phillyrea angustifolia) and ash (Fraxinus excelsior), an incompatibility system known as diallelic self-incompatibility (DSI) has been described (Saumitou-Laprade et al., 2010; Vernet et al., 2016; Saumitou-Laprade et al., 2018). It has been hypothesized that two alleles exist at the DSI locus in cultivated olives, S and s, with S dominant over s, and with only two possible genotypic combinations (Ss and ss), corresponding to the G1 and G2 incompatibility groups, respectively. Stigma test analysis never detected G1 × G1 or G2 × G2 compatibility (Saumitou-Laprade et al., 2017a), and G1 × G2 crosses always generate balanced G1:G2 = 1:1 progenies. All olive cultivars seem to be self-incompatible, even if pseudo-self-fertility might occur for some cultivars in particular conditions (Alagna et al., 2019).
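The expected 1:1 segregation follows directly from the hypothesized genotypes. The sketch below is only an illustration of that reasoning, not a method from the study: the function names `cross` and `compatible` and the genotype encoding are assumptions introduced here.

```python
from collections import Counter
from itertools import product

# Assumed encoding of the hypothesized DSI model: "Ss" plants are G1, "ss" plants are G2.
GROUP_OF = {"Ss": "G1", "ss": "G2"}


def cross(parent_a: str, parent_b: str) -> dict:
    """Return the expected proportion of each incompatibility group in the progeny."""
    counts = Counter()
    for allele_a, allele_b in product(parent_a, parent_b):
        # Sorting puts the dominant allele first, so "sS" and "Ss" are treated alike.
        genotype = "".join(sorted(allele_a + allele_b))
        counts[GROUP_OF[genotype]] += 1
    total = sum(counts.values())
    return {group: n / total for group, n in counts.items()}


def compatible(group_a: str, group_b: str) -> bool:
    """Under the two-group model, only crosses between different groups are compatible."""
    return group_a != group_b


# A G1 (Ss) x G2 (ss) cross yields half G1 and half G2 progeny.
print(cross("Ss", "ss"))
print(compatible("G1", "G1"), compatible("G1", "G2"))
```

Because SS genotypes would require a G1 × G1 cross, which the model treats as incompatible, the lookup table deliberately contains only the two genotypes named in the text.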

Although the DSI mechanism in olive is known, many important aspects remain to be clarified, such as the location of the incompatibility locus on the genome, the identification of candidate genes controlling this trait, and markers closely linked to incompatibility. The availability of such information will allow for a systematic screening of olive cultivars to identify their group of incompatibility through genotyping with linked markers.

Olive is a diploid species (2n = 2x = 46) with a genome size of approximately 1.4 Gb (Cruz et al., 2016), with a mean C-value of 1.59 pg (1.56 Gb), where more than 30% of the sequences are represented by tandem repeats (Barghini et al., 2014). Up to now, only intraspecific crosses have been used for olive mapping, and very early studies were performed with dominant markers (de la Rosa et al., 2003; Wu et al., 2004; Aabidine et al., 2010; Khadari et al., 2010). Recently, denser maps have been produced by the use of codominant markers, such as diversity arrays technology (DArT) (Domínguez-García et al., 2012; Atienza et al., 2014), simple-sequence repeat (SSR) (Sadok et al., 2013), and single-nucleotide polymorphism (SNP) markers (Marchese et al., 2016; İpek et al., 2017; Unver et al., 2017).
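The C-value and the genome size in base pairs quoted above are related by a fixed conversion factor. As a minimal sketch, assuming the commonly used equivalence of 1 pg of DNA to about 0.978 × 10<sup>9</sup> bp (the function name is hypothetical), the conversion can be checked as follows:

```python
# Commonly used conversion: 1 pg of DNA corresponds to about 0.978e9 base pairs.
BP_PER_PG = 0.978e9


def c_value_to_gb(picograms: float) -> float:
    """Convert a C-value in picograms to a genome size in gigabases (Gb)."""
    return picograms * BP_PER_PG / 1e9


# The 1.59 pg mean C-value reported for olive corresponds to roughly 1.56 Gb.
print(round(c_value_to_gb(1.59), 2))
```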

SNPs are sequence-tagged markers widely used for association and genetic mapping due to their wide distribution along the genome, suitability for high-throughput genotyping, and ease of scoring (Vezzulli et al., 2008; Deulvot et al., 2010; Lou et al., 2017). Molecular markers linked to the traits of interest can be identified through different strategies, such as genetic linkage mapping based on biparental populations (Curtolo et al., 2017; Ji et al., 2018; Zheng et al., 2018; Sapkota et al., 2019), or through genome-wide association studies (GWAS), based on unrelated individuals (Khan et al., 2013; Nicolas et al., 2016; Elsadr et al., 2019). The generation of high-resolution linkage maps, a prerequisite for gene positional cloning, allows the genetic dissection of quantitative trait loci, assists in comparisons of synteny, and provides marker order for anchoring sequence scaffolds or physical contigs to linkage groups. In perennial fruit crops, the availability of markers tightly linked to traits under selection may strongly facilitate the progress of breeding programs (Minamikawa et al., 2018).

Linkage analysis can be performed on multi-generation families derived by the cross, back-cross, or selfing of homozygous or heterozygous lines. In fruit trees, the use of these progenies is hindered by the lack, in most cases, of homozygous genotypes and by the long generation time (Bai et al., 2007; Jaillon et al., 2007). For this reason, full-sib F1 families deriving from inter- or intra-specific varietal crosses of highly heterozygous parents are generally used in tree species (Bartholomé et al., 2015) and linkage analysis is conducted separately for each parent using a two-way pseudo-testcross mapping strategy (Grattapaglia and Sederoff, 1994). Pseudo-testcross mapping has been carried out in many fruit crops, such as apricot (Irisarri et al., 2019), peach (Zeballos et al., 2016), oil palm (Bai et al., 2018), clementine (Ollitrault et al., 2012), grapevine (Zhu et al., 2018), apple (Di Pierro et al., 2016), and forest trees, like poplar (Zhigunov et al., 2017) and oak (Bodénès et al., 2016). A number of markers closely linked to important simple or complex traits have been identified with this strategy (Zhu et al., 2015; Zhai et al., 2016). Marker-trait associations, either for qualitative or quantitative trait loci (QTL), should allow the final breeding value of genotypes to be predicted in advance, making it possible to discard unwanted genotypes immediately after seed germination and to grow only individuals that will later display the traits of interest (Edge-Garza et al., 2015; Migicovsky and Myles, 2017).
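In a two-way pseudo-testcross, each biallelic marker is assigned to a parental map according to which parent is heterozygous. The following is a minimal sketch of that sorting step under simplifying assumptions (diploid, biallelic SNPs, genotypes written as two-letter strings); the function and marker names are hypothetical and do not describe the pipeline used in any of the cited studies.

```python
def is_heterozygous(genotype: str) -> bool:
    """True when the two alleles of a diploid genotype differ, e.g. 'AG'."""
    return genotype[0] != genotype[1]


def classify_marker(female: str, male: str) -> str:
    """Assign a marker to a parental map from the two parental genotypes."""
    female_het = is_heterozygous(female)
    male_het = is_heterozygous(male)
    if female_het and not male_het:
        return "female map"      # ab x aa: segregates 1:1 in the F1
    if male_het and not female_het:
        return "male map"        # aa x ab: segregates 1:1 in the F1
    if female_het and male_het:
        return "intercross"      # ab x ab: segregates 1:2:1 in the F1
    return "uninformative"       # both parents homozygous: no segregation


# Hypothetical parental genotypes (female, male) for four SNPs.
markers = {
    "snp_1": ("AG", "AA"),
    "snp_2": ("CC", "CT"),
    "snp_3": ("AG", "AG"),
    "snp_4": ("TT", "TT"),
}
for name, (female, male) in markers.items():
    print(name, classify_marker(female, male))
```

Markers in the first two classes behave like a testcross for one parent, which is why each parental map can be built separately; intercross markers are typically handled apart or used to align the two maps.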

In the present work, an F1 progeny derived from the cross of two highly heterozygous and completely self-incompatible cultivars, 'Leccino' × 'Dolce Agogia,' belonging to the G1 and G2 incompatibility groups, respectively (Saumitou‐Laprade et al., 2017b), was genotyped by ddRAD sequencing in order to construct linkage maps. This cross was made because the parents show different phenotypes for numerous important agronomic traits. In particular, 'Dolce Agogia' is a vigorous cultivar with alternate bearing that is resistant to some of the most dangerous olive pathogens, such as Verticillium dahliae and Spilocaea oleagina (Iannotta and Scalercio, 2012; Arias-Calderón et al., 2015), whereas 'Leccino' has medium vigor and constant bearing and is susceptible or only partially resistant to the pathogens indicated above (Salman, 2017), but was recently reported as tolerant to Xylella fastidiosa (Giampetruzzi et al., 2016), the most devastating emerging plant pathogen for Mediterranean agriculture (Suffert et al., 2009). The maps generated in the present work represent a significant improvement over the previous ones (Marchese et al., 2016; İpek et al., 2017; Unver et al., 2017), because they include a higher number of SNP markers with a uniform genome distribution and can, therefore, serve as saturated maps for trait mapping. We used the 'Leccino' map to identify the location of the DSI locus, to select sequence scaffolds of 'Leccino' that were anchored to the DSI by linked markers, and to develop and validate SNP markers for the incompatibility trait.

#### MATERIALS AND METHODS

#### Plant Material and Deoxyribonucleic Acid Extraction

A full-sib F1 progeny of 229 individuals was generated from the cross between 'Leccino,' as female parent, and 'Dolce Agogia,' as male parent. The progeny, hereafter referred to as Le×DA, comprises a first set of 16-year-old seedlings (155 individuals) and a second set of 5-year-old seedlings (74 individuals) grown in an experimental field, for a total of 229 F1 genotypes. The progeny showed high phenotypic variability, in particular for the length of the juvenile phase, plant vigor, tree habit, fruit bearing, and rooting ability of cuttings (Hedayati et al., 2015). DNA was extracted from leaves using the DNeasy Plant Mini Kit (Qiagen).

#### Parentage Analysis of the Progeny

'Leccino,' 'Dolce Agogia,' and the Le×DA progeny were genotyped with a set of SSR markers for confirming the parentage (Baldoni et al., 2009). PCRs were performed as previously reported (Mousavi et al., 2017) and the amplified fragments were separated on an ABI 3130 Genetic Analyzer capillary sequencer (Applied Biosystems, Foster City, CA). Alleles were called using the GeneMapper 3.7 software (Applied Biosystems, Foster City, CA). Parentage analysis was performed using CERVUS version 3.0.7 program (Kalinowski et al., 2007) to sort out seedlings derived from open pollination.
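The logic of screening out seedlings from open pollination can be illustrated with a simple Mendelian exclusion check. This is only a sketch: CERVUS uses a likelihood-based assignment, whereas the code below merely counts loci at which a seedling cannot have received one allele from each putative parent. The locus names and allele sizes are invented for illustration.

```python
from typing import Dict, Tuple

Genotype = Tuple[int, int]  # two SSR allele sizes (bp) at one locus


def compatible_at_locus(child: Genotype, mother: Genotype, father: Genotype) -> bool:
    """True if the child can carry one allele from each parent at this locus."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def count_mismatches(child: Dict[str, Genotype],
                     mother: Dict[str, Genotype],
                     father: Dict[str, Genotype]) -> int:
    """Count loci that exclude the candidate parent pair.

    Loci missing from any of the three profiles are skipped.
    """
    shared = child.keys() & mother.keys() & father.keys()
    return sum(
        not compatible_at_locus(child[locus], mother[locus], father[locus])
        for locus in shared
    )


# Hypothetical SSR profiles (made-up allele sizes, not data from the study).
leccino = {"ssr1": (120, 124), "ssr2": (200, 210)}
dolce_agogia = {"ssr1": (118, 126), "ssr2": (204, 210)}
seedling_true = {"ssr1": (124, 118), "ssr2": (210, 210)}
seedling_open = {"ssr1": (124, 130), "ssr2": (200, 212)}

print(count_mismatches(seedling_true, leccino, dolce_agogia))  # 0
print(count_mismatches(seedling_open, leccino, dolce_agogia))  # 2
```

In practice a small number of mismatches is usually tolerated to allow for genotyping errors and null alleles, which is one reason likelihood-based programs are preferred over strict exclusion.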

#### Phenotyping for Incompatibility Groups of the Le×DA Progeny

The offspring that had reached the mature phase and started blooming (109 out of 229, all belonging to the first set of 16-year-old seedlings) were phenotyped for the incompatibility group. Data on 91 individuals were previously reported (Saumitou-Laprade et al., 2017a); all the others were phenotyped de novo following the same protocol. In order to minimize errors, phenotyping of all individuals was repeated on previously clonally propagated plants grown in a separate field. Before blooming, flowering twigs of each tree were protected from open pollination with blossom bags. At full blooming, bagged twigs were collected and 20 open flowers per tree were used for the analysis. Sepals and petals were removed from 10 flowers and pistils were placed in two separate plates (one for each of the two pollen donors) onto Brewbaker and Kwack medium. The stigmas of five pistils in one plate were pollinated with pollen of 'Leccino' (G1) and the other five with pollen of 'Dolce Agogia' (G2). From the remaining 10 flowers, pollen was collected and used for pollinating 'Leccino' and 'Dolce Agogia' stigmas. The same protocol applied by Saumitou-Laprade et al. (2017a) was used to observe the growth of pollen tubes on the stigmas.

#### Double Digest Restriction Associated Deoxyribonucleic Acid Sequencing of Parents and Progeny

Genomic DNA was digested with the SphI and MboI restriction enzymes, following the double digest restriction associated DNA sequencing (ddRADseq) method proposed by Peterson et al. (2012) and modified by Scaglione et al. (2015). Fragments were added to a ligation reaction containing barcoded adapters, pooled, and then fractionated by agarose gel electrophoresis. DNA in the size range between 350 and 600 bp was purified using a QIAquick Gel Extraction kit (Qiagen, Venlo, Netherlands). Enrichment PCR was performed with PCR primers that incorporate Illumina hybridization/sequencing sites and index sequences for combinatorial multiplexing. Quality, quantity, and reproducibility of libraries were assessed using a Caliper instrument (DNA High Sensitivity chip). Sequencing was carried out on an Illumina HiSeq 2500 instrument, generating 125-bp paired-end reads.
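The double digest and size selection can be sketched in silico as follows. This is an illustrative simplification, not the pipeline used in the study: it scans one strand of a sequence for SphI (GCATG^C) and MboI (^GATC) sites, both of which are palindromic, and keeps fragments that are bounded by one site of each enzyme and fall within the selected size window. Applying the 350 to 600 bp window directly to the genomic fragment, ignoring adapter length, is an assumption made here for simplicity.

```python
import re

SPHI_SITE, SPHI_OFFSET = "GCATGC", 5  # SphI cuts GCATG^C
MBOI_SITE, MBOI_OFFSET = "GATC", 0    # MboI cuts ^GATC


def cut_positions(seq, site, offset, label):
    """Return (cut position, enzyme label) for every site, including overlapping ones."""
    return [(m.start() + offset, label) for m in re.finditer(f"(?={site})", seq)]


def ddrad_fragments(seq, min_len=350, max_len=600):
    """Return (start, end) of fragments flanked by one SphI and one MboI cut."""
    cuts = sorted(
        cut_positions(seq, SPHI_SITE, SPHI_OFFSET, "SphI")
        + cut_positions(seq, MBOI_SITE, MBOI_OFFSET, "MboI")
    )
    fragments = []
    for (pos1, enzyme1), (pos2, enzyme2) in zip(cuts, cuts[1:]):
        # Only fragments with two different ends receive both adapters.
        if enzyme1 != enzyme2 and min_len <= pos2 - pos1 <= max_len:
            fragments.append((pos1, pos2))
    return fragments


# Toy sequence: one SphI site and one MboI site separated by a poly-A spacer.
toy = "A" * 10 + SPHI_SITE + "A" * 400 + MBOI_SITE + "A" * 10
print(ddrad_fragments(toy))  # [(15, 416)]
```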

#### Single-Nucleotide Polymorphism Calling

Reads were aligned using Bowtie 2 software (Langmead and Salzberg, 2012) with default parameters against a whole-genome assembly of 'Leccino' (Muleo et al., 2016). The reference consisted of 509,032 scaffolds, with N50 length of 10,037 bp, amounting to a total of 1.429 giga base pairs (Gbp). Alignments with mapping quality < 4 were removed. Segregating sites were identified using the software Stacks (Catchen et al., 2011) and a bounded SNP model with alpha = 0.05 and upper error (epsilon) of 0.1. Genotypes were called with a minimum coverage of eight reads. Heterozygous genotypes were called within a read coverage ratio of 0.15-0.85 for reference and alternate alleles. Segregating loci were retained if genotypes were called in > 150 progeny.
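The coverage, allele-ratio, and call-rate filters described above can be expressed as a small sketch. It is a plain threshold rule written for illustration and does not reproduce the bounded SNP likelihood model of Stacks; the function names and the genotype labels are hypothetical.

```python
def call_genotype(ref_reads, alt_reads, min_cov=8, het_low=0.15, het_high=0.85):
    """Threshold-based genotype call at one biallelic site.

    Returns None when coverage is below min_cov; otherwise the genotype is
    heterozygous if the alternate-allele read fraction lies in [het_low, het_high].
    """
    total = ref_reads + alt_reads
    if total < min_cov:
        return None
    alt_fraction = alt_reads / total
    if alt_fraction < het_low:
        return "ref/ref"
    if alt_fraction > het_high:
        return "alt/alt"
    return "ref/alt"


def retain_locus(calls, min_called=150):
    """Keep a locus only if genotypes were called in more than min_called progeny."""
    return sum(call is not None for call in calls) > min_called


print(call_genotype(3, 4))    # None (coverage below eight reads)
print(call_genotype(5, 5))    # ref/alt
print(call_genotype(1, 19))   # alt/alt
```

With 229 progeny, the call-rate threshold corresponds to tolerating missing data in fewer than 79 individuals per locus.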

#### Genetic Mapping

Genetic maps were generated with the double pseudo-testcross approach for each parent using SNP genotypic data and phenotypic binary data for the incompatibility group. Linkage groups were obtained using the R/qtl module of the R statistical package with a logarithm of odds (LOD) threshold > 10 and a recombination rate of 0.20. Markers were first ordered using MSTMAP (Wu et al., 2008) with default parameters. A Perl implementation of the SMOOTH program (van Os et al., 2005) was used to remove errors within haplotypes. Markers were re-ordered using MSTMAP. Marker distances were calculated using the Kosambi function {map distance in Morgans equals ¼ ln [(1 + 2r)/(1 − 2r)], i.e., 25 ln [(1 + 2r)/(1 − 2r)] in centiMorgans (cM), where r is the observed recombination frequency}. Markers with distorted segregation were identified using a χ<sup>2</sup> test (α = 0.05).
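As a worked illustration of the Kosambi mapping function, the sketch below converts a recombination frequency into a map distance. The standard form, ¼ ln [(1 + 2r)/(1 − 2r)], yields Morgans, so the factor becomes 25 when the result is expressed in cM. The function name is arbitrary.

```python
import math


def kosambi_cm(r):
    """Kosambi map distance in centiMorgans for a recombination frequency r."""
    if not 0 <= r < 0.5:
        raise ValueError("recombination frequency must satisfy 0 <= r < 0.5")
    return 25.0 * math.log((1 + 2 * r) / (1 - 2 * r))


for r in (0.01, 0.10, 0.20):
    print(f"r = {r:.2f} -> {kosambi_cm(r):.2f} cM")
```

For small r the distance is close to 100·r cM, while at r = 0.20, the recombination threshold quoted for grouping, it is roughly 21 cM, reflecting the correction for undetected double crossovers.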

#### Comparative Genomic Analyses

Scaffolds of 'Leccino' that were anchored to the DSI locus by flanking markers were aligned with the genome assembly of the wild olive (Olea europaea var. sylvestris, GCA\_002742605.1) using (B)LASTZ. Gene annotation was performed using the highest hit in blastp alignments with the NCBI protein database.

#### Development of a Sequence-Tagged Site Marker for Diallelic Self-Incompatibility

PCR primers were designed on the sequence of scaffold\_6030 using the program Primer3 version 4.0 (forward 5'-3': TTTTGGGTGCGAATTGTCCA, reverse 5'-3': AGGCCACTGTATTTCTAACTCG) for the amplification of a 476-bp fragment spanning two adjacent SNPs that co-segregate with DSI. PCRs were performed using 25 ng of template DNA and Q5 High-Fidelity DNA polymerase (New England Biolabs). The thermal profile consisted of 98°C for 30 s, followed by 35 cycles at 98°C for 10 s, 60°C for 20 s, and 72°C for 30 s, and a final extension at 72°C for 2 min. Amplicon size was checked by 1% agarose gel electrophoresis. Amplicons of the expected size showing unique bands were sequenced using the BigDye Terminator v1.1 Cycle Sequencing Kit (Thermo Fisher Scientific) and an ABI PRISM 3130 XL Genetic Analyzer (Applied Biosystems, Foster City, CA). The obtained sequences were aligned using BioEdit 7.1.7 (www.mbio.ncsu.edu/BioEdit/bioedit.html) to identify polymorphisms. In order to phase single-nucleotide substitutions in heterozygous individuals and to obtain full sequences from individuals carrying heterozygous small indels, amplicons were cloned by using pGEM-T Easy Vector (Promega) and Escherichia coli XL1 blue strain. DNA from different colonies for each genotype was amplified and sequenced as described above.

#### RESULTS

#### Genetic Maps

The parentage analysis validated the origin of 95% of the 241 seedlings, for a total of 229 true-to-type F1 genotypes. The remaining 12 individuals were excluded from further analyses. A total of 16,743 segregating loci were identified in the Le×DA progeny by ddRAD sequencing. The parental maps consisted of 23 linkage groups, including 9,737 RAD loci in 'Dolce Agogia' and 7,006 RAD loci in 'Leccino' (Table 1). Linkage groups were numbered consistently with the chromosome numbering of the wild olive genome assembly (Unver et al., 2017). The male parental map ('Dolce Agogia') included 1,829 genetic bins (i.e., positions on the genetic map with a unique segregation pattern) for a total length of 3,538 cM. The female parental map ('Leccino') included 2,311 genetic bins for a total length of 5,680 cM. The average distance between adjacent genetic bins was 2.46 cM in 'Leccino' and 1.93 cM in 'Dolce Agogia'.
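The summary figures in this paragraph can be rechecked from the reported counts. The short sketch below assumes that the average bin spacing was obtained as total map length divided by the number of bins, which reproduces the published values; the dictionary layout is purely illustrative.

```python
# Values reported in the text for the two parental maps.
maps = {
    "Leccino": {"rad_loci": 7006, "bins": 2311, "length_cm": 5680},
    "Dolce Agogia": {"rad_loci": 9737, "bins": 1829, "length_cm": 3538},
}

total_loci = sum(m["rad_loci"] for m in maps.values())
print(f"Total segregating loci: {total_loci}")  # 16743

for name, m in maps.items():
    spacing = m["length_cm"] / m["bins"]  # assumed definition of average bin spacing
    print(f"{name}: {spacing:.2f} cM between adjacent bins")

# Share of seedlings confirmed as true-to-type by parentage analysis.
validated_fraction = 229 / 241
print(f"Validated seedlings: {validated_fraction:.1%}")

# Female-to-male ratio of total map length.
length_ratio = maps["Leccino"]["length_cm"] / maps["Dolce Agogia"]["length_cm"]
print(f"Leccino / Dolce Agogia map length ratio: {length_ratio:.2f}")
```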

TABLE 1 | Statistics of the parental linkage maps. Chromosome number (Chr), restriction associated DNA (RAD) markers, length (cM), and bins are reported for 'Leccino' and 'Dolce Agogia'.


All linkage groups of the 'Leccino' map were consistently longer than those of the 'Dolce Agogia' map, presumably as a result of a consistently higher recombination rate between homologous chromosomes in megaspore mother cells of 'Leccino' than in pollen mother cells of 'Dolce Agogia.' The shorter map obtained in 'Dolce Agogia' was not explained by runs of homozygosity at the chromosome termini compared to 'Leccino,' except for the upper 34 cM of Linkage Group (LG) 16 in 'Leccino,' which lacked segregating sites in 'Dolce Agogia.' The longer map obtained in 'Leccino' was, on the other hand, not explained by the presence of isolated markers with high likelihood of genotypic errors, which usually locally inflate genetic distances due to inconsistent genotype calls compared to neighboring markers. It was noteworthy that the suppression of recombination was observed in the parent with the higher level of heterozygosity, as revealed by the larger number of segregating sites.

#### Structural Variation Between Parental Genomes and Segregation Distortion

We used 2,305 RAD loci segregating from both parents with an allelic status "ABxCD" for pairing and aligning linkage groups between parental maps. The alignment of the two parental maps did not show inter-chromosomal translocations. Two large inversions were detected in LG 1, involving a segment of about 30 cM, and in LG 8, involving a segment of about 12 cM. We did not detect suppression of recombination around each inversion in any of the parental maps and we mapped segregating sites within the inverted regions in each parental map. In both cases, the parental genomes were therefore homozygous for alternative structural variants, but the haplotypes carrying the inversion were ancient enough to have accumulated segregating sites. Minor rearrangements in marker order were identified on LGs 2, 4, 5, 7, and 17. Extended regions with distorted segregation (α = 0.05) were detected on several linkage groups in both maps, with a higher frequency being observed in the 'Dolce Agogia' map (Figure 1). Localized distortion was also detected at markers flanking the lower side of the incompatibility locus (see below) with an excess of marker alleles in phase with the incompatibility recessive allele.
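One simple way to flag candidate inversions from markers shared by two maps is to look for runs of markers whose order is reversed in one map relative to the other. The sketch below is a toy illustration under that assumption, not the procedure used in the study: it takes (position in map A, position in map B) pairs for one linkage group and reports intervals of map A over which the map B positions strictly decrease. Real data would need tolerance for local ordering noise.

```python
def inverted_runs(pairs, min_markers=3):
    """Report map-A intervals where shared markers appear in reversed order in map B.

    pairs: iterable of (pos_a, pos_b) in cM for markers shared by both maps.
    Returns a list of (start_a, end_a) for runs of at least min_markers markers.
    """
    ordered = sorted(pairs)  # order markers along map A
    runs, start = [], 0
    for i in range(1, len(ordered) + 1):
        # A run ends at the last marker or when map-B position stops decreasing.
        if i == len(ordered) or ordered[i][1] >= ordered[i - 1][1]:
            if i - start >= min_markers:
                runs.append((ordered[start][0], ordered[i - 1][0]))
            start = i
    return runs


# Made-up positions: markers between 20 and 50 cM are reversed in map B.
shared = [(0, 0), (10, 10), (20, 50), (30, 40), (40, 30), (50, 20), (60, 60), (70, 70)]
print(inverted_runs(shared))  # [(20, 50)]
```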

#### Identification of the Incompatibility Locus

A stigma test was performed on 106 adult Le×DA individuals. The incompatibility group phenotyping assigned 56 individuals to the G1 group, 47 to the G2 group, and discarded 3 individuals due to uncertain phenotype. Segregation of the phenotype followed the expected 1:1 ratio (χ<sup>2</sup> = 0.79). A self-pollination test of 'Leccino' and 'Dolce Agogia' showed no growth of pollen tubes on the stigmas, confirming their complete self-incompatibility.
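The goodness-of-fit statistic for the 1:1 segregation can be recomputed from the reported counts (56 G1 and 47 G2). The sketch below is a minimal check using only the standard library; for one degree of freedom the chi-square tail probability equals erfc(sqrt(x/2)), and the critical value at the 0.05 level is 3.84.

```python
import math


def chi_square_one_to_one(n_a, n_b):
    """Chi-square statistic and p-value (1 df) for a 1:1 segregation hypothesis."""
    expected = (n_a + n_b) / 2
    stat = ((n_a - expected) ** 2 + (n_b - expected) ** 2) / expected
    # Survival function of the chi-square distribution with one degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


stat, p_value = chi_square_one_to_one(56, 47)
print(f"chi-square = {stat:.2f}")                    # chi-square = 0.79
print(f"consistent with 1:1 at 0.05: {stat < 3.84}")  # True
```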

The incompatibility locus was mapped as a Mendelian trait in the 'Leccino' map. The locus was located on LG 18 within an interval of 5.4 cM. Within this interval, the RAD markers on scaffold\_6030 (sizing 42,057 bp) co-segregated with the trait in all 103 individuals of the progeny with a reliable incompatibility group assignment (Figure 2). The upper border of the locus was supported by a recombination event observed in one individual between the incompatibility group and two RAD markers located in scaffold\_63515 (sizing 3,877 bp) and in scaffold\_26872 (sizing 32,989 bp). The lower border of the genetic interval was supported by five crossing-over events observed in the progeny between the incompatibility group trait and a RAD marker in scaffold\_7600 (sizing 34,918 bp) and by even more crossing-overs with a RAD marker in scaffold\_13712 (sizing 37,403 bp). All these scaffolds of 'Leccino' aligned to a region between coordinates 8,500,000 and 9,100,000 of chromosome 18 in the genome assembly of a wild olive (Unver et al., 2017) (Figure 3).

#### Alignment of Diallelic Self-Incompatibility Markers and Scaffolds With the Wild Olive Genome Assembly and Candidate Genes in the Region

The projection of 'Leccino' scaffolds containing the RAD flanking markers from either side of the DSI locus onto the assembly of chromosome 18 identified a seemingly short physical interval of a few dozen Kbs, comprised between scaffolds 13712 and 7600, on one side, and scaffold 26872 on the other side (Figure 3). The region around scaffolds 13712, 7600, 26872 in the wild olive assembly, corresponding to the chromosomal interval between 8.5 and 8.7 Mb, encodes five predicted proteins (Supplementary Table S1, IDs 1 to 5). However, the inconsistent physical position of scaffold\_7600, which contains a RAD marker co-segregating with DSI and aligns outside of the interval defined by the flanking markers, and the split alignment of the initial 14 Kb of scaffold\_7600 on one side of the locus and the remaining 12 Kb of scaffold\_7600 on the opposite strand at the other side of the locus suggest that the physical interval might be substantially longer. A 450-kb inversion, indicated by a double-head arrow in Figure 3, or an assembly error in the wild olive genome between two flanking sequence gaps would reconcile the genetic marker order in the 'Leccino' map and the assembly of scaffold\_7600 and would define a physical interval for the DSI locus in the wild olive assembly from coordinates 8,720,000 to 9,080,000, with inverted orientation (220 Kb-580 Kb relative coordinates in Figure 3). Under this hypothesis, the DSI region in the assembly of wild olive would encompass 12 additional predicted proteins, bringing the total number of candidate genes to 17 (Supplementary Table S1, IDs 6 to 17). RNA-Seq data also showed that other regions outside of the predicted gene models are transcribed in the physical interval of wild olive.

#### Validation of Sequence-Tagged Site Markers Linked to the Incompatibility Group

We amplified and sequenced a PCR fragment (hereafter referred to as Oe-DSI-locus-fragment-A), spanning two SNPs on scaffold\_6030 that showed co-segregation with RAD loci in the 'Leccino' map. Sequencing of 165 genotypes, which included cultivars and progenies, identified only two polymorphic sites at positions 63 and 283 bp of the Oe-DSI-locus-fragment-A, which define unambiguously the two groups of incompatibility (Table 2 and Supplementary Table S2). In the parental cultivars Leccino and Dolce Agogia, the amplified fragment showed four alleles differentiated by eleven SNPs (Table 3). The inheritance of four haplotypes segregating from 'Leccino' (S-A/s-a) and 'Dolce Agogia' (s-b/s-c) and the linkage of the S-A haplotype with the genetic determinant of the G1 phenotype was confirmed by analyzing the Oe-DSI-locus-fragment-A in the Le×DA progeny (Figure 4). Combining the stigma test results with the genotypic data, the dominant allele S-A and three recessive ones, s-a, s-b, s-c, were identified.

arrow below indicates the region of order inconsistency between 'Leccino' map and the wild olive genome assembly.

When the same fragment was amplified in 57 olive cultivars with known incompatibility group assignment, an additional haplotype was found, the S-B allele, which represents a new dominant allele distinguishing G1 cultivars (Supplementary Table S2). The five haplotypes carried a total of 12 SNPs and 2 small indels (Table 3). In the 20 G1 cultivars, the dominant S-A allele was detected 13 times and S-B was present in apparent homozygous state in six out of seven cases and in one case with the haplotype s-b, detected by cloning. Both dominant alleles were found in combination with s-a, s-b, and s-c recessive alleles. The presence of only one dominant allele in some cultivars could be due to homozygosity or to the inability to amplify the second allele (null allele). Among the 37 G2 cultivars, the most frequent recessive haplotype was s-b, also showing a high percentage of homozygosity (Supplementary Table S2). A skewed geographical distribution was observed for some alleles between Italy and Spain, the two most represented countries in our data set, such as the S-B allele, mainly present in Spanish cultivars, and s-c, only present within the Italian ones (Supplementary Figure S1).

TABLE 2 | Incompatibility group phenotypes and corresponding haplotype combinations of Oe-DSI-locus-fragment-A observed in the analyzed olive cultivars.

a Genotype at position 63 bp of Oe-DSI-locus-fragment-A.

b Genotype at position 283 bp of Oe-DSI-locus-fragment-A.

\*The second allele could be considered as null allele.

TABLE 3 | Haplotypes of Oe-DSI-locus-fragment-A and position (base pair distance from the forward primer) of the polymorphisms [single-nucleotide polymorphisms (SNPs) and indels] identified. SNP combinations that identify uniquely each incompatibility group are indicated in bold.
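The dominance relationships among the Oe-DSI-locus-fragment-A haplotypes can be summarized as a small lookup: a genotype carrying S-A or S-B is predicted to be G1, and a genotype carrying only s-a, s-b, or s-c is predicted to be G2. The sketch below encodes this rule and enumerates the classes expected from the 'Leccino' × 'Dolce Agogia' cross. It is illustrative only; the haplotypes are linked markers rather than the genetic determinant itself, so the prediction assumes no recombination between marker and locus, and the function name is hypothetical.

```python
from collections import Counter
from itertools import product

DOMINANT = {"S-A", "S-B"}          # haplotypes in phase with the G1 determinant
RECESSIVE = {"s-a", "s-b", "s-c"}  # haplotypes in phase with the recessive allele


def incompatibility_group(hap1, hap2):
    """Predict the incompatibility group from two marker haplotypes."""
    for hap in (hap1, hap2):
        if hap not in DOMINANT | RECESSIVE:
            raise ValueError(f"unknown haplotype: {hap}")
    return "G1" if {hap1, hap2} & DOMINANT else "G2"


leccino = ("S-A", "s-a")       # female parent, G1
dolce_agogia = ("s-b", "s-c")  # male parent, G2

print(incompatibility_group(*leccino))       # G1
print(incompatibility_group(*dolce_agogia))  # G2

# Four equally likely progeny genotypes, two per group (expected 1:1 segregation).
progeny = Counter(
    incompatibility_group(maternal, paternal)
    for maternal, paternal in product(leccino, dolce_agogia)
)
print(dict(progeny))
```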

#### DISCUSSION

Species with a long juvenile phase require many years to reach the first flowering and fruit setting (Purba et al., 2001; Flachowsky et al., 2011; Yang et al., 2016), delaying the generation of crossbreed, F2, and backcross populations and slowing down the process of genetic improvement. In these cases, the availability of genomic tools to assist the selection of new genotypes becomes mandatory in order to guide the choice of parental combinations, to select genotypes carrying the traits of interest, and to introgress specific alleles into new varieties. Markers linked to the traits under selection, identified either through genetic mapping approaches or by means of genomewide association studies, represent the most powerful tools for breeding in woody perennial crops, offering new opportunities to develop early selection strategies and new ways to integrate variation from different sources (Varshney et al., 2005; Montanari et al., 2013; Bink et al., 2014; Kole et al., 2015; Muranty et al., 2015).

The use of high-throughput sequencing technologies is speeding up the genotyping of mapping populations, providing an unprecedented high number of markers that allows the construction of dense genetic maps (Kujur et al., 2015; Liu et al., 2017). In this work, we have used ddRAD sequencing for genotyping 7,006 and 9,737 segregating sites in a biparental population, which largely exceeded the number of genetic bins that can be resolved in a progeny of 229 individuals. Olive tree is known to be a highly heterozygous species with very high levels of nucleotide diversity observed even in cultivated varieties (Gros-Balthazard et al., 2019). Here we show that the species is characterized also by very high levels of recombination as attested by the lengths of the two parental genetic maps. We observed a significant difference in the length between the male Dolce Agogia (3,538 cM) and the female Leccino (5,680 cM) maps. Heterochiasmy, i.e., the presence of different crossover frequencies in male and female meiosis, has frequently been observed in plants, without a fixed trend of higher frequency in male or female meiosis (Lenormand and Dutheil, 2005). In Arabidopsis, for example, a dramatically higher crossing over rate (575 cM vs. 332 cM) is observed in male than in female meiosis (Giraut et al., 2011). Theory would predict that haploid selection determines heterochiasmy, with the sex experiencing more intense selection during the haploid phase showing lower recombination (Lenormand and Dutheil, 2005). In a highly heterozygous and obligately outcrossing species such as olive, we can expect stronger selection among male gametophytes that could explain the observed difference in genetic map length. An alternative explanation could be provided by a high frequency of chromosomal inversions or other chromosomal rearrangements that suppress recombination. The identification of two large inversions for which the two parental varieties are homozygous for alternative alleles makes us believe that in such a highly heterozygous species, each individual accession may be heterozygous for a much larger set of inversions. As 'Dolce Agogia' appears to be more heterozygous than 'Leccino,' the lower recombination frequency observed could be explained by a higher frequency of heterozygous chromosomal rearrangements that suppress recombination.

'Dolce Agogia' (s-b/s-c) allele combinations and segregation of four haplotypes in the Le×DA progenies.

In this case, the generation of markers did not represent a limiting factor for trait mapping, whose resolution was rather limited by the size of the progeny subject to phenotyping. Phenotyping of reproductive traits requires adult plants and, in the case of DSI group assignment, is also labor intensive. We mapped the DSI locus to a 5.4 cM genetic interval, which is estimated to correspond to a physical distance of approximately 300 Kb in a chromosome-scale assembly of a wild olive genome (Unver et al., 2017). Gene prediction and gene annotation in the wild olive assembly did not provide any obvious functional candidate, except for transcription factors that might be involved in gynoecium development regulation. Two candidate genes encode proteins putatively related to flower development: the ortholog of Arabidopsis STYLISH 1 encoding a binding protein with nuclear localization that promotes formation of stylar and stigmatic tissues and proliferation of stylar xylem (Gomariz-Fernández et al., 2017; Min et al., 2019), and FAR1 related sequence 5, which is expressed in hypocotyls, inflorescences stems and flowers (Li et al., 2017; Ma and Li, 2018). However, the prediction of proteins with uncharacterized function in the same region, the presence of non-annotated transcribed regions, the expected intraspecific presence/absence variation between genomes, make it necessary to proceed with the complete assembly of the two 'Leccino' haplotypes across the entirety of the DSI locus.

While we confirmed the monogenic nature of the DSI system in olive, as postulated by Saumitou-Laprade et al. (2017a), five haplotypes were identified using a sequence-tagged site (STS) marker in the locus and, via linkage mapping and association mapping, we demonstrated that two of them are in phase with the dominant genetic determinant of the G1 incompatibility group. The STS haplotypes were linked to the DSI genetic determinant, but they do not correspond to the DSI alleles. It was noteworthy that the cumulative frequency of the STS haplotypes in phase with the dominant S allele was roughly half the cumulative frequency of haplotypes in phase with the s allele, although the accessions analyzed in this paper are not a natural population. It is possible to speculate that this condition may reflect a balancing selection for maintaining G1 and G2 genotypes in a cultivated population at frequencies that maximize pollination rate.

In the set of cultivars analyzed in this paper, we found only two dominant S-A and S-B alleles that confer the G1 phenotype and no cultivar was observed carrying both dominant alleles. It was also noteworthy that some haplotypes were more frequent in groups of cultivars typical of specific geographic locations. For instance, the allele s-c showed 18% frequency in the sample and was almost exclusively present in varieties of Central Italy, leading to an excess of homozygous genotypes in that geographic area, whereas the S-B haplotype was mostly present in cultivars from the Iberian Peninsula. Variation of haplotype frequencies at DSI-linked markers in geographically distant populations may simply reflect genetic drift or may indicate adaptive evolution changing the frequency of DSI alleles. Further analyses will help to clarify these issues and to identify additional untapped variation at the DSI locus.

From an evolutionary point of view, the maintenance of a homomorphic DSI in a hermaphrodite species like O. europaea is unexpected and remains to be explained. Indeed, the SI systems are susceptible to rapid invasion by new self-incompatibility alleles, which should experience a strong negative frequency-dependent advantage. In Oleaceae species in which DSI was identified, such invasion was not observed (Saumitou-Laprade et al., 2010; Vernet et al., 2016; Saumitou‐Laprade et al., 2017a; Saumitou‐Laprade et al., 2017b; Saumitou-Laprade et al., 2018). The homomorphic DSI system seems to "resist" and this is an intriguing finding (Barrett, 2019). In the absence of obvious evolutionary constraints that could prevent the selection of new SI specificities, a molecular constraint appears to be the best candidate explanation. Therefore, the molecular characterization of the S locus region in Olea and its comparison with other Oleaceae species are of strong interest. The present work opens the way and should contribute to solving the evolutionary paradox of the stable homomorphic DSI in Oleaceae.

The STS marker identified in the present work represents a new tool for large-scale screening of thousands of olive accessions from their traditional areas of cultivation across the Mediterranean shores and from new growing areas, and will make it possible to determine their incompatibility group. In addition, it represents a starting point for the identification of the genetic determinant of such a peculiar incompatibility system. A comprehensive understanding of the genetic control of DSI can offer great opportunities to characterize cultivars for their incompatibility group, increase olive production, and guide orchard design with an optimal spatial distribution of inter-compatible varieties.

#### CONCLUSIONS

This work provides markers for fast and reliable genotyping of olive cultivars for their incompatibility group, offering great opportunities to rapidly screen and identify inter-compatible varieties, to plan inter-varietal crosses, and to reduce the time for seedling selection. It also represents the initial effort for the identification of the genetic determinants of incompatibility, a starting point for understanding the molecular mechanisms underlying the DSI system in olive.

The genetic maps of 'Leccino' and 'Dolce Agogia' will also serve to identify Mendelian loci or QTLs responsible for other important traits that segregate in the Le×DA progeny. Information yet to be gained on the number and location of the genetic determinants of those traits, along with the DSI-linked ddRAD and STS markers developed in this paper, may pave the way to the application of genomics-assisted breeding in olive.

#### DATA AVAILABILITY STATEMENT

Raw reads of ddRAD-Seq have been deposited in Short Read Archive under the BioProject number PRJNA594490. The matrix of genotypic calls has been deposited in the figshare repository with the DOI 10.6084/m9.figshare.11352068.

#### AUTHOR CONTRIBUTIONS

RM, PS-L, PV, MM and LB conceived the study. SP, RM, NC and LB provided the plant material. RM, SM, MR, PS-L, PV and FA performed the stigma tests. RM, NC, VP, MR, AF, GM, SS, DS, GDG and MM performed the molecular, sequencing and bioinformatic analyses. FB, LB, RM, SM and VP wrote the first draft of the manuscript. PS-L, PV, DS, GDG and MM contributed to the writing and revision of the manuscript. All the authors agreed on the final version of this work.

#### FUNDING

The research was supported by the European Union's Horizon 2020 Research and Innovation Program Marie Sklodowska-Curie - Before Project (Grant Agreement No 645595), by the EU projects "OLIVE4CLIMATE – LIFE" (LIFE15 CCM/IT/000141) and by the Rural Development Program of Umbria Region, 2014-2020 – Measure 16.2.1, INNO.V.O. - Development of alternative varieties to face the new challenges of olive growing, SIAN n. 84250258245.

#### ACKNOWLEDGMENTS

We thank the Istituto Tecnico Agrario "Ciuffelli" (ISIS) Todi (PG) for hosting the Le×DA progeny in the field and for their technical assistance.

#### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2019.01760/full#supplementary-material

SUPPLEMENTARY FIGURE S1 | Geographical distribution of DSI alleles in the Italian and Spanish analyzed varieties.

#### REFERENCES

(Prunus avium L.) is associated with domesticated and bred germplasm. Sci. Rep. 9 (1), 5008. doi: 10.1038/s41598-019-41484-8

downy mildew resistance in Vitis aestivalis-derived 'Norton'. Theor. Appl. Genet. 132, 137–147. doi: 10.1007/s00122-018-3203-6

shape traits in Pisum sativum using SLAF sequencing. Front. Genet. 9, 615. doi: 10.3389/fgene.2018.00615


Conflict of Interest: The authors declared that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Mariotti, Fornasiero, Mousavi, Cultrera, Brizioli, Pandolfi, Passeri, Rossi, Magris, Scalabrin, Scaglione, Di Gaspero, Saumitou-Laprade, Vernet, Alagna, Morgante and Baldoni. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

# Ploidy-Dependent Effects of Light Stress on the Mode of Reproduction in the Ranunculus auricomus Complex (Ranunculaceae)

#### Fuad Bahrul Ulum<sup>1,2</sup>, Camila Costa Castro<sup>1</sup> and Elvira Hörandl<sup>1\*</sup>

<sup>1</sup> Department of Systematics, Biodiversity and Evolution of Plants, Albrecht-von-Haller Institute for Plant Sciences, University of Göttingen, Göttingen, Germany, <sup>2</sup> Biology Department, Faculty of Mathematics and Sciences, Jember University, Jember, Indonesia

#### Edited by:

Emidio Albertini, University of Perugia, Italy

#### Reviewed by:

Juan Pablo Amelio Ortiz, National Council for Scientific and Technical Research (CONICET), Argentina

Eric Javier Martínez, Instituto de Botánica del Nordeste (IBONE-CONICET), Argentina

\*Correspondence:

Elvira Hörandl elvira.hoerandl@biologie.uni-goettingen.de

#### Specialty section:

This article was submitted to Plant Breeding, a section of the journal Frontiers in Plant Science

Received: 29 November 2019 Accepted: 23 January 2020 Published: 20 February 2020

#### Citation:

Ulum FB, Costa Castro C and Hörandl E (2020) Ploidy-Dependent Effects of Light Stress on the Mode of Reproduction in the Ranunculus auricomus Complex (Ranunculaceae). Front. Plant Sci. 11:104. doi: 10.3389/fpls.2020.00104

Polyploidy in angiosperms is an influential factor in triggering apomixis, the asexual reproduction via seeds. Apomixis is usually facultative, which means that both sexual and apomictic seeds can be formed by the same plant. Environmental abiotic stress, e.g. light stress, can change the frequency of apomixis. Previous work suggested effects of stress treatments on meiosis and megasporogenesis. We hypothesized that polyploidy would alter the stress response and hence the reproductive phenotypes of different cytotypes. The main aim of this research was to explore, using prolonged photoperiods, whether polyploidy alters the proportions of sexual ovule and sexual seed formation under light stress conditions. We used three facultatively apomictic, pseudogamous cytotypes of the Ranunculus auricomus complex (diploid, tetraploid, and hexaploid). Stress treatments were applied as extended light periods (16.5 h) against a control (10 h) in climate growth chambers. Proportions of apomeiotic vs. meiotic development in the ovule were evaluated with clearing methods, and the mode of seed formation was examined by single seed flow cytometric seed screening (ssFCSS). We further studied pollen stainability to understand effects of pollen quality on seed formation. Results revealed that under extended photoperiod, all cytotypes produced significantly more sexual ovules than in the control, with the strongest effects on diploids. The stress treatment affected neither the frequency of seed set nor the proportion of sexual seeds nor pollen quality. Successful seed formation appears to depend on balanced maternal:paternal genome contributions. Diploid cytotypes had mostly sexual seed formation, while polyploid cytotypes formed predominantly apomictic seeds. Pollen quality was better in hexaploids than in diploids and tetraploids. These findings confirm our hypothesis that megasporogenesis is triggered by light stress treatments. Comparisons of cytotypes support the hypothesis that ovule development in polyploid plants is less sensitive to prolonged photoperiods and responds to a lesser extent with sexual ovule formation. Polyploids may better buffer environmental stress, which releases the potential for aposporous ovule development from somatic cells, and may facilitate the establishment of apomictic seed formation.

Keywords: apomixis, single seed flow cytometric seed screening, light stress, meiosis, pollen, polyploidy, Ranunculus, seed formation

## INTRODUCTION

Polyploidy is the heritable condition of possessing more than two sets of chromosomes in the nucleus (Comai, 2005). A polyploid arises either from intraspecific genome duplication (autopolyploidy) or from the merging of the genomes of distinct species through hybridization and subsequent genome duplication (allopolyploidy) (Grant, 1981). Polyploidy is quite common in flowering plants, estimated to occur in more than 50% of species (Soltis et al., 2015), and is considered a major factor in plant evolution (Soltis et al., 2014). Even though polyploidy carries several potential disadvantages, e.g., disruptive effects of the structural enlargement of nuclei, side effects of aneuploidy, and epigenetic mutation, it also provides advantages such as heterosis, gene redundancy, and novel gene combinations. Heterosis renders polyploids more vigorous than their diploid progenitors, while gene redundancy protects polyploids from the deleterious effects of mutation (Comai, 2005).

Polyploidisation, with higher DNA content, increases the cell size and promotes diversity of the genome, transcriptome, and metabolome. These improvements imply a greater resistance to environmental change (Schoenfelder and Fox, 2015). Several studies reported a better adaptivity of polyploid plants to abiotic stress conditions, such as salt (Chao et al., 2013), drought (del Pozo and Ramirez-Parra, 2014; Martínez et al., 2018), drought and heat stress (Godfree et al., 2017), cold (Klatt et al., 2018), and light (Coate et al., 2013). The better stress response and adaptation of polyploids to abiotic conditions are probably under epigenetic control (del Pozo and Ramirez-Parra, 2014). Polyploidy changes the methylation profile under stressful environments, as reported, e.g. for Brassica napus after drought (Jiang et al., 2019).

Notably, stress conditions can also influence mode of reproduction, especially apomixis, the asexual reproduction via seed (Nogler, 1984a). Apomixis is widespread in angiosperms (Hojsgaard et al., 2014b), and occurs most frequently in polyploid cytotypes, but occasionally also in diploids (Grant, 1981; Carman, 1997; Hojsgaard and Hörandl, 2019). Gametophytic apomixis, the form of interest here, involves formation of an unreduced embryo sac from an unreduced megaspore via meiotic restitution of the megaspore mother cell (diplospory) or from a somatic cell of the nucellus tissue (apospory) (Asker and Jerling, 1992; Koltunow and Grossniklaus, 2003). Functional seed development through gametophytic apomixis involves three components: (1) apomeiosis (formation of unreduced embryo sac); (2) parthenogenesis (embryo development without fertilization of egg cell); and (3) functional endosperm development with male genome contributions from the pollen (pseudogamously) or independent from pollen (autonomously) (Nogler, 1984a). Male development is usually meiotic, but microsporogenesis is often disturbed, and hence final pollen quality is often strongly reduced (Asker and Jerling, 1992; Izmaiłow, 1996; Hörandl et al., 1997; Mráz et al., 2009). Apomixis is heritable (Ozias-Akins and van Dijk, 2007), and under genetic and epigenetic control (Grimanelli, 2012; Hand and Koltunow, 2014). Natural apomixis is frequently facultative, which means that the plant produces sexual and asexual seeds within one generation, often within the same flower or inflorescence (Bicknell et al., 2003; Aliyu et al., 2010; Cosendai and Hörandl, 2010; Hojsgaard et al., 2013; Schinkel et al., 2016).

Alterations in the frequencies of asexual vs. sexual reproduction have been observed under abiotic stress conditions, e.g. temperature, drought stress, salt stress, and photoperiod, in many different genera (Evans and Knox, 1969; Saran and de Wet, 1976; Quarin, 1986; Gounaris et al., 1991; Klatt et al., 2016; Rodrigo et al., 2017; Klatt et al., 2018). Such condition-dependent sex is also known from other asexual eukaryotes (Ram and Hadany, 2016). Abiotic stress leads to the accumulation of reactive oxygen species (ROS) in plant tissues, which triggers oxidative damage but can also initiate various epigenetic, genetic, and hormonal signaling pathways for plant development (Halliwell, 2006; Foyer and Noctor, 2009; Huang et al., 2019). In the germline precursor cells, oxidative stress may increase the level of DNA double-strand breaks (DSBs), which act as initiators of meiosis. Here meiosis could act as a DNA repair system (Hörandl and Hadacek, 2013). The above-mentioned studies on condition-dependent sex in plants support this hypothesis. In polyploids, however, an improved tolerance of stress conditions might decrease the stimulus for meiosis and consequently trigger the alternative asexual development (Hörandl and Hadacek, 2013). A putative differential response of cytotypes to stress conditions with respect to mode of reproduction has, however, not been investigated so far.

We use here as a model system three cytotypes of the Ranunculus auricomus complex, a Eurasian polyploid complex with facultative, aposporous and pseudogamous apomixis (Nogler, 1984b; Hojsgaard et al., 2014a). In Central Europe, the R. auricomus complex comprises three closely related and genetically similar sexual progenitor species, and polyploid apomictic hybrids of these taxa (Hörandl et al., 2009; Hodač et al., 2014). One of the hexaploid hybrids (R. carpaticola x cassubicifolius) with facultative apomixis (Hojsgaard et al., 2014a) was used previously for testing the response to light stress. In this previous experiment, extended photoperiod enhanced sexual megaspore formation in these hexaploid R. auricomus clones, concomitant with oxidative stress (Klatt et al., 2016). In our study, we test the hypothesis that under the light stress treatment, diploids respond more intensively to stress conditions, with higher frequencies of sexual development, than higher ploidy levels. Here we extend the treatment of Klatt et al. (2016) to diploid, lower polyploid (tetraploid), and the same hexaploid plants to observe effects on mode of reproduction at different ploidy levels. To assess the effect of extended photoperiod on the components of gametophytic apomixis, we study here two developmental steps, namely ovule formation and seed formation. Since microsporogenesis is meiotic without an alternative asexual developmental pathway, we focus here on pollen quality as a possible factor for successful seed formation. The main aims of this research are to explore with light stress treatments whether ploidy level alters stress response with respect to mode of reproduction, and whether stress response correlates positively with sexual megaspore formation and/or proportions of sexual seed formation.

## MATERIALS AND METHODS

### Plant Material

For the extended photoperiod experiment, we used facultatively apomictic plants of the Ranunculus auricomus complex from three different cytotypes. These cytotypes are hybrids that originated from three Central European parental species (R. cassubicifolius, R. carpaticola, and R. notabilis). The diploid plants were synthetic F2 hybrids of R. carpaticola x notabilis and represent sister or sibling individuals from two parental lines; see details of the crossing design in Barke et al. (2018). We used these plants because natural diploid apomicts are not known in the R. auricomus complex. The tetraploids were garden offspring of Ranunculus variabilis, which is a putative natural allopolyploid of the R. carpaticola/cassubicifolius lineage and the R. notabilis lineage and occurs sympatrically with the parental species in Central Europe (Hodač et al., 2014). The hexaploids were garden offspring of Ranunculus carpaticola x cassubicifolius, the same plants as used by Klatt et al. (2016). Hence, all cytotypes are hybrids, and they share the genetic background of closely related parental species (Hörandl et al., 2009). Since the parental taxa and the natural hybrids all occur in the same geographical area and altitudinal zone (Hörandl et al., 2009), we can also assume that they are all pre-adapted to the same natural light conditions. The ploidy level of the tetraploids was ascertained using flow cytometry following the methods of Klatt et al. (2016). A list of materials with identity numbers and ploidy levels is given in the Appendix (Supplementary Table 1). Plants were cultivated in the old botanical garden of the University of Göttingen from summer to winter to expose them to natural conditions and stimulate flower initiation.

### Growth Chamber Setup

The plants were moved into the climate growth chambers when sprouting at the beginning of the spring season. We ran the experiments for 2 years to obtain a more complete sampling. The 1st-year experiment started in the first week of March 2017; the 2nd-year experiment started at the beginning of February 2018. A total of c. 25 plants from each cytotype were grown with a 10-h photoperiod (control) and a 16 plus 0.5-h photoperiod (stress treatment) following Klatt et al. (2016). Temperature and relative humidity were kept stable at 18°C and 60%, respectively. The light intensity was measured with a photometer (3415F Quantum Light Meter, Spectrum Technologies, Inc, Plainfield, USA) as photosynthetically active radiation (PAR), c. 250 µmol m⁻² s⁻¹ (measured at shoot tips).
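To make the difference between the two light regimes concrete, the following minimal sketch converts the photoperiods into a daily photon dose (daily light integral). It is only an illustration and not part of the original analysis: it assumes that the measured PAR of c. 250 µmol m⁻² s⁻¹ was constant over the whole light period, and the function name is hypothetical.

```python
def daily_light_integral(ppfd_umol_m2_s: float, photoperiod_h: float) -> float:
    """Return the daily photon dose in mol m^-2 d^-1.

    Assumes a constant photon flux density during the light period
    (an assumption of this sketch, not a statement from the study).
    """
    seconds_of_light = photoperiod_h * 3600.0
    return ppfd_umol_m2_s * seconds_of_light / 1_000_000.0  # µmol -> mol


# Values taken from the growth chamber description: c. 250 µmol m^-2 s^-1,
# 10 h (control) and 16.5 h (stress treatment).
control_dli = daily_light_integral(250.0, 10.0)
stress_dli = daily_light_integral(250.0, 16.5)

print(f"control: {control_dli:.2f} mol m^-2 d^-1")
print(f"stress:  {stress_dli:.2f} mol m^-2 d^-1")
print(f"ratio stress/control: {stress_dli / control_dli:.2f}")
```

Under this assumption the stress treatment delivers 1.65 times the daily photon dose of the control, which is simply the ratio of the two photoperiods.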

#### Plant Genotyping

Genotyping by simple sequence repeats (SSRs) was applied to verify the clonality of the plants and the relationships of the cytotypes. We conducted SSR genotyping only on tetraploid plants, following the methods of Klatt et al. (2016). The SSR data for the other two cytotypes were taken from Barke et al. (2018) for diploids and Klatt et al. (2016) for hexaploids. Genomic DNA was extracted from dried leaf samples using the Invisorb® Spin Plant Mini Kit (Qiagen, Hilden, Germany) according to the manufacturer's protocol. Multiplex polymerase chain reaction (PCR) was conducted in 25 µl volumes containing 1 µl template DNA, 12.5 µl Roti®-Pol TaqS Master mix (Carl Roth GmbH + Co. KG, Karlsruhe, Germany), 1 µl forward primer, 1 µl reverse primer, 0.125 µl MgCl2, and 1 µl CAG-primer (FAM or HEX labeled). PCR reactions were run in a BIO-RAD™ Thermal Cycler. The PCR program was: 94°C for 10 min, then 14 x (denaturation at 94°C for 60 s, annealing at 62°C + 0.5°C per cycle for 90 s, extension at 72°C for 60 s), followed by 35 x (denaturation at 94°C for 30 s, annealing at 55°C for 30 s, and extension at 72°C for 30 s), a last extension step at 72°C for 60 s, and final storage at 4°C. PCR samples were adjusted before 85 µl formamide (HiDi) was added. This mixture was run in an automatic capillary sequencer Genetic Analyzer 3130 (Applied Biosystems, Foster City, CA, USA) using Gene Scan 500 Rox (Applied Biosystems) as size standard after a denaturing pretreatment for 3 min at 92°C. Scoring of the electropherograms was done using Genemarker V2.4.2 (SoftGenetics LLC, State College, PA, USA), and the results were exported as a binary presence/absence matrix of alleles to characterize multilocus genotypes. We applied Neighbour-joining analysis based on the Jaccard similarity index in FAMD to test the SSR profiles (Schlüter and Harris, 2006). Branch support values were derived from the majority consensus tree of 1000 bootstrap replicates. The result was visualized with FigTree v1.4.2 (Rambaut, 2007).
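The Jaccard similarity that underlies the Neighbour-joining analysis can be illustrated with a short sketch. This is not the FAMD implementation; the allele profiles and plant names below are hypothetical, and the treatment of two empty profiles is an arbitrary choice made only for this example.

```python
from itertools import combinations


def jaccard_similarity(profile_a, profile_b):
    """Jaccard index of two binary (presence/absence) allele profiles.

    Shared presences are divided by the number of alleles present in at
    least one profile; shared absences are ignored.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same allele columns")
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 and b == 1)
    either = sum(1 for a, b in zip(profile_a, profile_b) if a == 1 or b == 1)
    # The index is undefined if neither profile shows any allele;
    # returning 0.0 is an assumption of this sketch.
    return shared / either if either else 0.0


# Hypothetical multilocus genotypes (1 = allele present, 0 = absent).
profiles = {
    "plant_A": [1, 1, 0, 1, 0, 1],
    "plant_B": [1, 1, 0, 1, 0, 1],  # identical to plant_A, i.e. same clone
    "plant_C": [1, 0, 1, 1, 0, 0],
}

similarity = {
    (p, q): jaccard_similarity(profiles[p], profiles[q])
    for p, q in combinations(sorted(profiles), 2)
}

for pair, value in similarity.items():
    print(pair, f"similarity = {value:.2f}", f"distance = {1 - value:.2f}")
```

Identical multilocus genotypes give a similarity of 1 (distance 0), which is the pattern expected among members of one clone; a distance matrix of this kind is the input to a Neighbour-joining tree.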

#### Female Development

Development of embryo sacs has been characterized previously within the R. auricomus complex in both apomictic and sexual species and is quite uniform (Nogler, 1984a; Hojsgaard et al., 2014a; Klatt et al., 2016; Barke et al., 2018): the megaspore mother cell differentiates near the micropyle and undergoes meiosis, resulting in a megaspore tetrad. In sexual development, only the chalazal megaspore develops further and produces, after three mitotic divisions, a typical 7-celled, 8-nucleate Polygonum type embryo sac (with three antipodals, a binucleate central cell, two synergids, and one egg cell). Apomictic development is characterized by the enlargement of a somatic cell in the nucellus, which emerges in parallel to and beside the megaspore tetrad and continues development into an unreduced Polygonum type embryo sac, whereas all megaspores abort. Embryological analysis of the female development was made at the end of sporogenesis and the beginning of gametogenesis, following Hojsgaard et al. (2014a) and Barke et al. (2018). R. variabilis, the only taxon that was analyzed for the first time here, did not show any deviations in timing or type of development. Flower buds were fixed in formalin:acetic acid:ethanol:dH2O (2:1:10:3.5; FAA) for 48 h and stored in 75% ethanol (Hojsgaard et al., 2014a). The flower buds were dehydrated in four 30-min incubation steps in 1 ml of 70%, 95%, and 100% (twice) ethanol. The flower buds were then cleared in five 30-min steps in 300 µl of an ascending series of methyl salicylate diluted in ethanol [25%, 50%, 70%, 85%, and 100%; (Young et al., 1979)]. The perianth of selected flower buds was removed, and ovaries were dissected and mounted in methyl salicylate on glass slides.
Female sporogenesis and early stages of sexual or aposporous gametophyte development were analysed with differential interference contrast (DIC) in a light transmission microscope (Leica DM5500B with DFC 450 Camera, LAS V41 software, Leica Microsystems, Wetzlar, Germany). Ovules were scored as sexual or asexual by the absence or presence of aposporous initial cells (AIC), respectively (van Baarlen et al., 2002). We excluded ovules with unclear structure and aborted ones. We only considered data from plants that had a minimum of five observable ovules. Additional data from Klatt et al. (2016) were added to increase the N value for the hexaploid cytotype.
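The scoring rules above can be summarized in a minimal sketch. It is a simplification under stated assumptions: the score labels, plant identifiers, and counts are hypothetical, and ovules are pooled per plant here, whereas the study computed percentages per flower before averaging per plant.

```python
MIN_OVULES = 5  # minimum number of observable ovules per plant, as stated above


def percent_sexual_ovules(records):
    """Return the percentage of sexual ovules for each plant.

    `records` maps a plant identifier to a list of scores:
    "sexual" (no AIC), "aposporous" (AIC present) or "excluded"
    (unclear structure or aborted). Plants with fewer than
    MIN_OVULES observable ovules are dropped.
    """
    result = {}
    for plant, scores in records.items():
        usable = [s for s in scores if s in ("sexual", "aposporous")]
        if len(usable) < MIN_OVULES:
            continue
        result[plant] = 100.0 * usable.count("sexual") / len(usable)
    return result


# Hypothetical example data.
records = {
    "plant_1": ["sexual"] * 6 + ["aposporous"] * 2 + ["excluded"],
    "plant_2": ["sexual"] * 3 + ["excluded"] * 4,  # only 3 observable ovules
}

percent = percent_sexual_ovules(records)
print(percent)
```

In this example only `plant_1` is retained, because `plant_2` has fewer than five observable ovules after exclusion.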

#### Seed Set

After we had collected the samples for embryological analysis, the remaining flowers were manually pollinated to increase fertilization rates. At the fruiting stage, we bagged a minimum of five peduncles with collective fruits in porous plastic bags to avoid seed loss. We harvested the mature collective fruits and evaluated the proportion of well-developed seeds (seed-set percentage) per flower for each individual of each ploidy level according to Hörandl (2008). Well-developed seeds were stored at room temperature and used for reproductive pathway analysis.

### Reproductive Pathway of Seed Formation

The reproductive pathway was evaluated by single seed flow cytometric seed screening (ssFCSS) (Matzk et al., 2000). Each single seed was ground with two steel balls (Ø 4 mm) in a 2 ml Eppendorf tube in a TissueLyzer II (Qiagen, Hilden, Germany; 30 Hz for 7 s). Nuclear isolation and staining were carried out in two steps using Otto buffers (Otto, 1990). In the first step, nuclear isolation, 200 µl Otto I buffer (0.1 M citric acid monohydrate, 0.5% v/v Tween 20) was added and shaken by hand with the ground material for 30 s. The solution was then filtered (30 µm mesh, Celltrics®, Münster, Germany) into plastic tubes (3.5 ml, 55 mm x 12 mm, Sarstedt, Nümbrecht, Germany). In the second step, staining, 800 µl Otto II buffer [0.4 M Na2HPO4 in ddH2O, supplemented with 3 ng/ml 4',6-diamidino-2-phenylindole (DAPI; Sigma-Aldrich, Munich, Germany)] was added to the filtrate, and the solution was measured directly in a flow cytometer (CyFlow® Ploidy Analyser, Sysmex Partec GmbH, Görlitz, Germany) in the blue fluorescence channel (UV LED, gain 365). Histograms were analyzed using CyView™ V.1.6 software (Partec GmbH). The coefficients of variation were less than 8%. The ploidy levels of embryo and endosperm were determined, and peak indices (PI; the mean peak value of the endosperm divided by the mean peak value of the embryo) were assessed (Supplementary Figure 5). For a Polygonum type embryo sac with two polar nuclei, the peak index of a sexual seed is c. 1.5, while for asexual seeds it can be 2.0, 2.5, or 3.0, depending on the contribution of pollen nuclei to endosperm formation. We observed the following developmental pathways: sexual, pseudogamous apomixis, autonomous apomixis, and BIII-hybrids (Hojsgaard and Hörandl, 2019). BIII-hybrids arise from an unreduced embryo sac in which both the egg cell and the polar nuclei are fertilized. The BIII-hybrids were excluded from the determination of the proportion of sexual seeds, since this mode of reproduction is intermediate between sexual and asexual seed formation.
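The assignment of a seed to a pathway from its peak index can be sketched as follows. This is an illustrative simplification, not the authors' scoring procedure: the peak index is taken here as the endosperm peak divided by the embryo peak (the reading that yields c. 1.5 for sexual seeds), the tolerance value is hypothetical, and BIII-hybrids are not handled because recognizing them additionally requires comparing the embryo ploidy with that of the mother plant.

```python
# Expected peak indices named in the text for a Polygonum type embryo sac.
EXPECTED_PI = {
    1.5: "sexual",
    2.0: "apomictic, autonomous endosperm",
    2.5: "apomictic, pseudogamous endosperm",
    3.0: "apomictic, pseudogamous endosperm",
}
TOLERANCE = 0.15  # hypothetical acceptance window around each expected PI


def peak_index(embryo_peak, endosperm_peak):
    """Ratio of the mean endosperm peak to the mean embryo peak."""
    if embryo_peak <= 0:
        raise ValueError("embryo peak position must be positive")
    return endosperm_peak / embryo_peak


def classify_seed(embryo_peak, endosperm_peak):
    """Return (PI, pathway label) for one seed histogram."""
    pi = peak_index(embryo_peak, endosperm_peak)
    nearest = min(EXPECTED_PI, key=lambda expected: abs(expected - pi))
    if abs(nearest - pi) > TOLERANCE:
        return pi, "unassigned"
    return pi, EXPECTED_PI[nearest]


# Hypothetical peak positions (arbitrary fluorescence units).
for embryo, endosperm in [(100.0, 150.0), (100.0, 300.0), (100.0, 205.0), (100.0, 167.0)]:
    pi, label = classify_seed(embryo, endosperm)
    print(f"PI = {pi:.2f}: {label}")
```

Seeds whose peak index falls outside the window around every expected value are left unassigned rather than forced into a category; in practice such seeds would need inspection of the histogram.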

### Pollen Stainability

Pollen stainability was determined on a minimum of 500 pollen grains per plant from all cytotypes in both chambers using 10% Lugol's iodine (I2KI) solution, following the methods of Schinkel et al. (2017). Staining of the starch content was used as an indicator of viable pollen under a light microscope (LEICA DM5500B with DFC 450 C camera, LAS V41 software, Leica Microsystems, Wetzlar, Germany) at 400x magnification. Viable pollen grains were indicated by black staining, whereas brownish, reddish, and translucent (empty) pollen grains were counted as non-viable.

### Statistical Analyses

All data were tested for normal distribution with the Kolmogorov-Smirnov and Shapiro-Wilk tests and for homogeneity of variance with the Levene test. Female development, seed set, reproductive pathway of seed formation, and pollen viability were determined per flower as percentages and subsequently averaged per plant. The percentage data were arcsine transformed before statistical analysis. We tested the influence of treatment on mean sexual ovules and seed set among ploidies with a univariate General Linear Model (GLM; two-way ANOVA) for a completely randomized factorial design, and means were compared with the least significant difference (LSD) test at the 0.05 probability level (p-value < 0.05). Tukey HSD was applied to the means of sexual ovules to determine the main factors. Nonparametric Kruskal-Wallis and Mann-Whitney U-tests were applied to test the influence of treatment on sexual seed formation per ploidy. Boxplots were plotted with untransformed percentage values and show the 25th and 75th percentile range as a box and the median as a black line; circles are outliers and asterisks are extreme values. All statistical analyses were performed with IBM SPSS Statistics 25 (IBM Deutschland GmbH).
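The arcsine transformation of percentage data mentioned above can be sketched in a few lines. The text does not state which variant was used; the sketch assumes the common arcsine square-root form, in which a percentage is converted to a proportion p and transformed as arcsin(√p). The function name and example values are hypothetical.

```python
import math


def arcsine_sqrt(percent):
    """Arcsine square-root transform of a percentage (result in radians).

    Assumes the common variant asin(sqrt(p)) with p = percent / 100;
    the exact variant used in the study is not specified.
    """
    if not 0.0 <= percent <= 100.0:
        raise ValueError("percentage must lie between 0 and 100")
    return math.asin(math.sqrt(percent / 100.0))


# Hypothetical per-plant percentages of sexual ovules.
values = [0.0, 25.0, 50.0, 100.0]
transformed = [arcsine_sqrt(v) for v in values]

for v, t in zip(values, transformed):
    print(f"{v:6.1f}% -> {t:.4f} rad")
```

The transform stretches the scale near 0% and 100%, which is why it is commonly applied to proportions before an analysis of variance; the boxplots in the study, in contrast, show untransformed percentages.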

## RESULTS

#### Female Development

The ovule development of all three cytotypes of the R. auricomus complex showed the same pattern of a typical Polygonum type embryo sac (Supplementary Figures 1–4). We had observed 6,505 ovules (c. 18 ovules per flower bud) among cytotypes at megasporogenesis and early megagametogenesis. At this stage, sexual and asexual ovules can be discriminated (Supplementary Figure 4). At the megasporogenesis stage, a meiotic division of a megaspore mother cell produced four cells, i.e. a megaspore tetrad. During the next step, three cells aborted, and only the chalazal cell remained as functional megaspore. At megagametogenesis stage, the functional megaspore enlarged with the presence of vacuoles and continued with three nuclear divisions, resulting in a total of eight nuclei. Development of sexual ovules was indicated by the absence of any aposporous initial cell (AIC) during megasporogenesis and early megagametogenesis. On the other hand, in asexual ovules, one or more AIC was observed directly near the megaspores at the chalazal pole or near to this area, but at a different optical layer (Figure 1).

#### Effects of Ploidy, Treatment, and Combined Effect of Ploidy/Treatment to the Proportion of Female Development

Extended photoperiod enhanced the proportion of sexual ovules in all three cytotypes of the R. auricomus complex. The mean proportion of sexual ovules (mean ± sd) increased significantly from control to stress treatment: from 80.37 ± 19.38% to 99.26 ± 1.26% in diploids (p-value < 0.001), from 57.90 ± 8.79% to 80.29 ± 10.67% in tetraploids (p-value < 0.001), and from 52.61 ± 26.11% to 70.36 ± 20.04% in hexaploids (p-value = 0.006) (Figure 2). ANOVA revealed significant main effects of photoperiod (p-value < 0.001) and ploidy (p-value < 0.001), but no significant interaction between them (p-value = ns) (Table 1). Tukey HSD revealed significant differences in the control treatment between diploids and hexaploids (p-value = 0.047) and in the stress treatment between diploids and polyploids (p-value < 0.001), but no significant difference between tetraploids and hexaploids in either treatment or between diploids and tetraploids in the control treatment (p-value = ns) (Supplementary Table 3).

### Seed Set

Extended photoperiod did not influence the proportion of well-developed seeds in any cytotype of the R. auricomus complex. Our investigation of 83 individuals revealed no significant difference in seed set between plants grown in the control and stress chambers (p-value = ns) (Figure 3). Diploid plants under stress treatment produced a higher, but not significantly different, mean proportion of well-developed seeds (mean value = 50.22%) compared to the control treatment (mean value = 39.84%; p-value = 0.300). Tetraploid plants under stress treatment produced a mean of 28.97% compared to a mean of 31.09% (p-value = 0.459) under control treatment. Hexaploid plants under stress treatment produced a mean of 43.04% compared to a mean of 42.17% (p-value = 0.880) under control treatment. A two-way ANOVA revealed significant differences only between the ploidies (p-value < 0.001), but neither a significant effect of treatment nor an interaction effect (p-value = ns) (Supplementary Table 4). Multiple comparison tests revealed significant differences between diploids and tetraploids (p-value < 0.001; Tukey HSD) and between tetraploids and hexaploids (p-value < 0.001; Tukey HSD) (Supplementary Table 5).

#### Reproductive Pathways of Seed Formation

Extended photoperiod did not enhance the proportion of sexual seeds at any ploidy level. The mean proportion of sexual seeds did not differ significantly between treatments within ploidies (p-value = ns, Mann-Whitney U-test) (Figure 4, Supplementary Table 6). Analysis of 1,468 seeds among ploidies indicated several reproductive pathways in the R. auricomus complex (Table 2). In diploid plants, the majority of seeds was formed sexually, while in tetraploid and hexaploid plants, asexuality was the most frequent reproduction mode (Figure 4). In diploid sexual seeds, we observed a ratio of embryo to endosperm DNA content of 2C:3C, which indicates double fertilization of the reduced egg cell by one sperm cell [1(m)+1(p)] and of the two polar nuclei by the other sperm cell [1(m)+1(m)+1(p)], producing a peak index (PI) of 1.5. A few apomictic seeds were observed (two with pseudogamous endosperm and one with autonomous endosperm), only in the stress treatment. Pseudogamous endosperm accompanies an unreduced embryo [2(m)] and arises from fertilization of the two polar nuclei by one or two reduced or unreduced sperm cells [2(m)+2(m)+1(p) or 2(p)], giving embryo-to-endosperm ratios of 2C:5C (PI = 2.5) and 2C:6C (PI = 3.0). Autonomous endosperm accompanies an unreduced embryo [2(m)] and develops from the two unfertilized polar nuclei [2(m)+2(m)], giving an embryo-to-endosperm ratio of 2C:4C (PI = 2.0), which is caused by the absence of a paternal genome in seed development.

FIGURE 2 | Proportions of sexual ovules in the R. auricomus complex plants grown in climatic chamber under prolonged photoperiod (stress) and shortened photoperiod (control). Mean values and statistical significance are given in figure. N = number of individuals. For the test statistics, see Supplementary Table 2.

TABLE 1 | P-values for the two way ANOVAs to determine the interaction effect of stress treatment and ploidy level on the proportion of sexual ovules.

R Squared = 0.574 (Adjusted R Squared = 0.551)

Tetraploid and hexaploid plants displayed more variation in the mode of seed formation. Sexual reproduction was present in 39 (6.2%) tetraploid seeds and 36 (7.5%) hexaploid seeds. Pseudogamous endosperm was the most frequent mode of seed formation and appeared in 543 (86.3%) tetraploid seeds and 433 (90.7%) hexaploid seeds. Generally, this mode of reproduction produced a PI value of 3.0. The less frequent forms of pseudogamous endosperm, with PI = 2.5 and PI = 4.0, originated from the contribution of one reduced sperm nucleus or two unreduced sperm nuclei, respectively. Autonomous endosperm (PI = 2.0) was the least frequent mode of seed formation, found in a total of four seeds (0.55%) from tetraploids and nine seeds (1.93%) from hexaploids. Another reproduction mode, i.e. partial apomixis with an unreduced egg cell fertilized by reduced pollen (BIII-hybrid), was more frequent in tetraploid plants (45 seeds or 12.43%) than in hexaploid plants, where only one case was found (Table 2).
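The peak indices reported above follow from counting maternal (m) and paternal (p) genome copies in embryo and endosperm. The short sketch below reproduces this arithmetic; genome copies are counted in units of one reduced gamete genome, as in the bracket notation of the text, and the function and parameter names are hypothetical.

```python
from fractions import Fraction


def expected_peak_index(embryo_m, embryo_p, polar_m, sperm_p):
    """Expected endosperm:embryo ratio from genome contributions.

    embryo_m, embryo_p: maternal and paternal genome copies in the embryo.
    polar_m: summed maternal copies of the two polar nuclei.
    sperm_p: summed paternal copies contributed to the endosperm.
    """
    embryo = embryo_m + embryo_p
    endosperm = polar_m + sperm_p
    return Fraction(endosperm, embryo)


cases = {
    # sexual: reduced egg 1(m)+1(p); polar nuclei 1(m)+1(m) plus 1(p)
    "sexual": expected_peak_index(1, 1, 2, 1),
    # apomictic: unreduced embryo 2(m); polar nuclei 2(m)+2(m)
    "autonomous endosperm": expected_peak_index(2, 0, 4, 0),
    "pseudogamous, one reduced sperm": expected_peak_index(2, 0, 4, 1),
    "pseudogamous, 2(p) in total": expected_peak_index(2, 0, 4, 2),
    "pseudogamous, two unreduced sperm": expected_peak_index(2, 0, 4, 4),
}

for label, pi in cases.items():
    print(f"{label}: PI = {float(pi):.1f}")
```

The paternal contribution of 2(p) in the PI = 3.0 case can come either from two reduced sperm nuclei or from one unreduced sperm nucleus; flow cytometry alone cannot separate these two origins, since both give the same DNA content.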

#### Pollen Stainability

Extended photoperiod did not alter the proportion of viable pollen. The assessment of 34,348 pollen grains from 67 plants revealed no significant differences in pollen viability between plants of the same cytotype grown in the two treatments (p-value = ns; see Supplementary Figure 6). Hexaploids produced a higher mean proportion of viable pollen (mean value = 64.6% in control treatment and 60.7% in stress treatment) compared to diploids (49.9% in control treatment and 52.9% in stress treatment) and tetraploids (50.3% in control treatment and 52.4% in stress treatment). Multiple comparison tests among ploidies revealed that the only significant difference was between tetraploid and hexaploid plants (p-value < 0.001; Tukey HSD; Supplementary Table 7).

## DISCUSSION

Mode of reproduction in facultatively apomictic plants is influenced by abiotic stress, e.g. by light (Knox, 1967; Saran and de Wet, 1976; Quarin, 1986; Klatt et al., 2016). However, these studies compared stress and control treatments only within the same cytotype. Under the same conditions, the degree of facultative apomixis is usually related to ploidy level (Delgado et al., 2016; Kaushal et al., 2018). In this study, we present for the first time developmental patterns among three cytotypes of the R. auricomus complex under stress and control conditions. We tested the hypothesis that prolonged photoperiod affects only the first component of apomixis, i.e., apomeiotic embryo sac development, with the expectation of a buffering effect against stress in polyploids. The other two apomixis components, i.e. parthenogenesis and endosperm development, were not affected by the different photoperiods.

#### Effects of Ploidy, Treatment, and Combined Effect of Ploidy/Treatment to the Proportion of Female Development

Prolonged photoperiod enhanced the proportion of sexual ovules, with a greater effect on diploids and a lesser effect on tetraploids and hexaploids. Enhancement of the proportion of sexual ovules after the same type of light stress had been reported before only in the hexaploid cytotype (Klatt et al., 2016). The hexaploids also formed a comparable proportion of sexual ovules under garden conditions (Hojsgaard et al., 2014a). The three cytotypes of the R. auricomus complex exhibited a similar mode of reproduction, as the pairwise comparison of data revealed non-significant differences between ploidies in control treatments. The result of the controls and the high genetic similarity of the three cytotypes (Supplementary Figure 7) make it unlikely that slightly different genetic backgrounds of the cytotypes influenced the results of our experiments. The proportion of sexual ovules of the diploid cytotype grown in the garden, ranging from 45% to 82% (Barke et al., 2018), was still within the range of our data. These plants represent recently formed synthetic F2 hybrids (Barke et al., 2018) with lower proportions of apospory than the polyploids, which had already established apomixis in the natural source populations. However, despite these more lineage-specific features, differential effects of treatments were observed in all three cytotypes in the early stages of development.

FIGURE 3 | Proportions of well-developed seeds in the R. auricomus complex plants grown in climatic chambers under prolonged photoperiod (stress) and shortened photoperiod (control). Mean values and statistical significance are given in figure. N = number of individuals. For the test statistic, see Supplementary Table 2.

The prolonged photoperiod (16 plus 0.5 h) may have increased the accumulation of reactive oxygen species (ROS) in the reproductive tissue, as reported for the hexaploids based on analysis of secondary metabolite profiles (Klatt et al., 2016). Our results support the hypothesis that oxidative lesions might mobilize the meiotic DNA repair system in the megaspore mother cell and trigger meiosis and megasporogenesis (Hörandl and Hadacek, 2013). This stimulus might increase the proportion of functional megaspores as a cellular survival strategy for the germline (Rodrigo et al., 2017), as shown most clearly in our diploids. Differential genetic stress regulation of sexual and apomictic plants was also observed in seedlings of Boechera and may be important for the bypass of the meiotic pathway (Shah et al., 2016).

In tetraploids and hexaploids, the oxidative stress caused by prolonged photoperiods might be different. This could be due to altered photosynthetic electron transport capacities (Coate et al., 2013), or to altered secondary metabolite profiles in polyploids and hybrids (Orians, 2000). We speculate that the lowered oxidative stress in polyploids might not be severe enough to induce the number of double-strand breaks (DSBs) that is essential for correct processing of meiosis (Keeney et al., 2014). Consequently, meiosis and megasporogenesis might be disturbed. Failure of megasporogenesis might release aposporous initial cell (AIC) development. Cell-specific transcriptome studies on aposporous Hieracium subg. Pilosella suggested that contact and cross-talk between AICs and functional megaspores could be the trigger for mitotic development of the former and degeneration of the latter (Juranic et al., 2018). We suppose a similar interaction of AICs and megaspores in the Ranunculus auricomus complex, as they always occur in close proximity, and we observed AICs together with young (2-nucleate stage) meiotic embryo sacs but not at later stages (Supplementary Figures 2C, D and Supplementary Figure 4). In the Ranunculus auricomus complex, aposporous initials mostly emerge at the end of megasporogenesis, and their emergence is correlated with disturbance of megasporogenesis. The surviving aposporous cells grow faster than the meiotic cell and take over megagametogenesis and seed development (Hojsgaard et al., 2014a; Barke et al., 2018). The stress affects only the megaspore and leaves apomeiosis as the surrogate for the sexual pathway (Hörandl and Hadacek, 2013). Alternatively, polyploids with a higher DNA content have more repair templates for DSBs, so a higher dose of stress would be required to break the DNA (Schoenfelder and Fox, 2015). Here, polyploidy might promote DNA damage tolerance under elevated stress, as described by Schoenfelder and Fox (2015), and buffer stress effects (Hörandl and Hadacek, 2013).


TABLE 2 | Observed reproductive pathways of three cytotypes of the R. auricomus complex.

†Autonomous endosperm.

‡Pseudogamous endosperm, polar nuclei were fertilized by one reduced/unreduced or two reduced/unreduced sperm nuclei.

Cx reflects ploidy based on DNA content: m, maternal genome contribution; p, paternal genome contribution; PI, peak index.

Environmental stress acts as an inhibiting factor through an epigenetic mechanism that disturbs or interrupts the silencing signal conditioning apomixis (Rodrigo et al., 2017). At least in diploid Ranunculus, the treatment might strengthen a signal transduction pathway that promotes switching from apomeiosis to meiosis, as demonstrated in facultative Boechera after drought stress (Mateo de Arias, 2015; Gao, 2018; Carman et al., 2019). In polyploids, whole-genome duplication (WGD) provides the conditions for co-loss or co-retention, which maintains a constant set of miRNAs for basic biological functions (Liu and Sun, 2019). Our data suggest that polyploids respond to the stress via homeostatic regulation of the frequency of apospory vs. megasporogenesis. The high variability in the proportions of sexual ovules among our genetically identical polyploids supports the findings of epigenetic and transcriptional control mechanisms as the background for the phenotypic expression of apospory (Grimanelli, 2012; Schmidt et al., 2014). Our results support the hypothesis that phenotypic features of apomixis in flowering plants are strongly affected by polyploidy (Delgado et al., 2016; Kaushal et al., 2018) and subject to epigenetic control (Rodrigo et al., 2017).

#### Effects of Ploidy, Treatment, and Combined Ploidy/Treatment on Seed Development and Mode of Reproduction

The prolonged photoperiod affected neither the frequency of seed set, nor the proportion of sexual seeds, nor pollen viability. Plants of the Ranunculus auricomus complex generally form far fewer seeds than ovules because of their high seed abortion rate, which exceeds one-half to two-thirds (Izmaiłow, 1996; Hörandl, 2008; Hörandl and Temsch, 2009; Klatt et al., 2016; Barke et al., 2018). This failure of seed formation arises at early stages and during the development of endosperm tissue (Barke et al., 2018). The diploid cytotype, which generally reproduces sexually, delivers a better seed set than the higher ploidy levels. In contrast, tetraploids and hexaploids, which are predominantly facultatively apomictic, showed the reverse pattern, with increased frequencies of asexual seeds.

Pollen quality is an external factor influencing the seed set of all cytotypes. The great variation in pollen quality, as observed here, is typical for apomictic plants (Asker and Jerling, 1992). The lower quality of tetraploid pollen was concomitant with a lower seed set of the tetraploids, while the better pollen quality in diploids and hexaploids corresponded to a higher seed set in these cytotypes. For seed formation, the contribution of a male gamete to fertilize the central nuclei is the major requirement for proper endosperm development (Vinkenoog et al., 2003). The diploids developed their sexual ovules into sexual seeds in both treatments, while the three apomictic seeds that survived in the stress treatment were rare exceptions to seed abortion. Similar results have been reported from the garden experiment (Barke et al., 2018). Diploid plants are sensitive to deviations from genomic imprinting in the endosperm (Hörandl and Temsch, 2009; Barke et al., 2018), i.e., from the constant 2:1 ratio of maternal (m) to paternal (p) genome contributions to the endosperm (Spielman et al., 2003; Vinkenoog et al., 2003). The occurrence of genome imbalance in pseudogamously (4m:1p and 4m:2p) and autonomously formed seeds (4m:0p) suggests that endosperm imbalance inhibited apomictic seed formation in our diploid cytotype.

In polyploids, on the other hand, sexual ovules aborted to a large extent and were replaced by aposporous initials that completed megagametogenesis. Apomictic seed formation in polyploids is influenced mainly by the competitive capacity of unreduced embryo sac formation rather than by the light regime during megagametogenesis and seed development (Hojsgaard et al., 2013; Klatt et al., 2016; Hodač et al., 2019). The surviving aposporous initials continue to develop into aposporous embryo sacs, and seeds are formed mostly via parthenogenesis and pseudogamous apomixis. This mode of reproduction is indicated by a parthenogenetic embryo (an unreduced egg cell develops without male gamete fusion) and pseudogamous endosperm (two unreduced polar nuclei fuse with one or two male gametes). Parthenogenesis occurs in most of our asexual polyploid seeds and is a significant factor favoring unreduced over reduced gametophytes and promoting seed formation (Hojsgaard and Hörandl, 2019). A significant number of BIII-hybrids in tetraploids were formed through fertilization of unreduced egg cells, i.e., partial apomixis, as has occasionally been observed in other FCSS studies (e.g. Schinkel et al., 2016; Barke et al., 2018; Klatt et al., 2018). BIII-hybrid formation probably reflects an extremely long period of egg cell receptivity in this cytotype, as assumed for diploid Ranunculus (Barke et al., 2018). Pollen-independent seed development via autonomous apomixis was a rare event in polyploids. Asexual seed formation via pseudogamy is predominant in most apomictic plants (Mogie, 1992), as observed in our polyploids. The most common developmental pathway, however, used both sperm nuclei, or the unreduced sperm nucleus, for fertilization of the polar nuclei, and hence restored the optimal 2m:1p ratio in the endosperm. These pathways result in a peak index of 3.0 in flow cytometric seed screening and account for the major proportion of apomictic seeds in both tetraploids (92%) and hexaploids (88%); see Table 2. Unbalanced genome contributions were also observed. Whereas the diploids are quite sensitive to genomic imprinting, the polyploids in Ranunculus are more relaxed, as expected (Grimanelli et al., 1997; Quarin, 1999). The current theory suggests that epigenetic mutation in polyploids relaxes genomic imprinting during endosperm development (Kaushal et al., 2018). This could be the reason for the higher seed set in hexaploid than in tetraploid cytotypes, similar to hexaploid Potentilla puberula, which had a higher seed set than the tetraploids (Dobeš et al., 2018). These findings suggest the presence of a buffer effect on genomic imprinting in polyploids.
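The genome-dosage arithmetic behind these ratios can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `endosperm_balance` is hypothetical, the peak index is assumed to be the ratio of endosperm to embryo DNA content (the usual convention in flow cytometric seed screening), and the example contributions in Cx are derived from the pathways described in the text, not taken from the measured data in Table 2.

```python
from fractions import Fraction


def endosperm_balance(embryo_m, embryo_p, endo_m, endo_p):
    """Return (endosperm m:p ratio, peak index) for one seed.

    All arguments are genome contributions in Cx units.
    The m:p ratio is None when the endosperm has no paternal genome.
    """
    embryo_cx = embryo_m + embryo_p
    endosperm_cx = endo_m + endo_p
    # Assumption: peak index = endosperm DNA content / embryo DNA content.
    peak_index = Fraction(endosperm_cx, embryo_cx)
    mp_ratio = Fraction(endo_m, endo_p) if endo_p > 0 else None
    return mp_ratio, peak_index


# Tetraploid apomict: parthenogenetic 4Cx embryo; two unreduced polar
# nuclei (8m) fertilized by two reduced sperm or one unreduced sperm (4p).
pseudogamous_4x = endosperm_balance(4, 0, 8, 4)
# Tetraploid sexual seed: 2m + 2p embryo, 4m + 2p endosperm.
sexual_4x = endosperm_balance(2, 2, 4, 2)
# Diploid apomict with one reduced sperm in the endosperm (4m:1p).
imbalanced_2x = endosperm_balance(2, 0, 4, 1)
# Diploid apomict with autonomous endosperm (4m:0p).
autonomous_2x = endosperm_balance(2, 0, 4, 0)

for label, (ratio, pi) in [
    ("4x pseudogamous", pseudogamous_4x),
    ("4x sexual", sexual_4x),
    ("2x 4m:1p", imbalanced_2x),
    ("2x autonomous", autonomous_2x),
]:
    print(f"{label}: endosperm m:p = {ratio}, PI = {float(pi):.1f}")
```

Under these assumptions, the pseudogamous tetraploid case reproduces both the balanced 2m:1p endosperm and the peak index of 3.0 mentioned above, while the 4m:1p and 4m:0p diploid cases deviate from the 2:1 ratio.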

Our results suggest that the light regime affects only the proportion of sexual ovules, and that this effect does not carry over to the mode of seed formation. This finding supports the oxidative stress initiation hypothesis (Hörandl and Hadacek, 2013), according to which light stress affects only female meiosis and has no relevance to further development. Polyploids predominantly express apospory, probably through improved mechanisms to buffer abiotic stress, and are able to establish apomictic seed formation. These findings are in line with the general observation that apomixis mostly occurs in polyploid plants, although the pathway can occur in diploids as well, albeit at much lower frequencies. Hence, stress resistance of polyploids may indirectly facilitate the establishment of apomixis, but is not necessarily essential for its expression, as proposed by Hojsgaard and Hörandl (2019).

#### CONCLUSIONS

All three cytotypes of the facultatively apomictic R. auricomus complex shifted from asexual toward sexual ovule formation after a prolonged photoperiod. We hypothesize that light stress increases ROS formation, which triggers oxidative stress. The oxidative stress might stimulate the meiotic DNA repair system in the megaspore mother cell and suppress mitotic division, resulting in sexual ovules. The effect of the prolonged photoperiod on megasporogenesis was most pronounced in diploids; the weaker effect of light stress in polyploids is probably a consequence of higher stress resistance. In polyploids, high rates of seed abortion left a lower proportion of sexual seeds, whereas in diploids the sexual pathway remained predominant. Seed formation is not influenced by environmental stress conditions, but rather depends on proper endosperm formation. Our findings shed light on the predominance of apomixis in polyploid plants.

## DATA AVAILABILITY STATEMENT

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation, to any qualified researcher.

## AUTHOR CONTRIBUTIONS

FU and EH designed the research. FU performed the research and analyzed and interpreted the data. CC contributed to FCSS and microsatellite analysis. FU wrote the manuscript with contributions from EH.

### FUNDING

This project was funded by the German Research Fund DFG (DFG Hörandl Ho 4395 4-1) to EH and by the Indonesia Endowment Fund for Education, grant no. PRJ-2369/LPDP.3/2016, to FU.

### ACKNOWLEDGMENTS

We thank Silvia Friedrichs for nursing the plants, Birthe Barke for help with data interpretation, and the referees for valuable comments on the manuscript.

### SUPPLEMENTARY MATERIAL

The Supplementary Material for this article can be found online at: https://www.frontiersin.org/articles/10.3389/fpls.2020.00104/full#supplementary-material

#### REFERENCES

Aliyu, O. M., Schranz, M. E., and Sharbel, T. F. (2010). Quantitative variation for apomictic reproduction in the genus Boechera (Brassicaceae). Am. J. Bot. 97 (10), 1719–1731. doi: 10.3732/ajb.1000188

Asker, S., and Jerling, L. (1992). Apomixis in plants (Boca Raton: CRC Press).


in the endosperm of diplosporous apomictic Tripsacum (Poaceae). Sexual. Plant Reprod. 10 (5), 279–282. doi: 10.1007/s004970050098

Grimanelli, D. (2012). Epigenetic regulation of reproductive development and the emergence of apomixis in angiosperms. Curr. Opin. Plant Biol. 15 (1), 57–62. doi: 10.1016/j.pbi.2011.10.002


triggers metabolic reprogramming in facultative apomictic Ranunculus auricomus. Front. Plant Sci. 7, 278. doi: 10.3389/fpls.2016.00278


lovegrass (Eragrostis curvula). PLoS One 12 (4), e0175852. doi: 10.1371/journal.pone.0175852


Conflict of Interest: The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Copyright © 2020 Ulum, Costa Castro and Hörandl. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.