Testing the Hypothesis of Multiple Origins of Holoparasitism in Orobanchaceae: Phylogenetic Evidence from the Last Two Unplaced Holoparasitic Genera, Gleadovia and Phacellanthus

Orobanchaceae is the largest family among the parasitic angiosperms. It comprises non-parasites, hemi- and holoparasites, making this family an ideal test case for studying the evolution of parasitism. Previous phylogenetic analyses showed that holoparasitism had arisen at least three times from the hemiparasitic taxa in Orobanchaceae. Until now, however, not all known genera of Orobanchaceae were investigated in detail. Among them, the unknown phylogenetic positions of the holoparasites Gleadovia and Phacellanthus are the key to testing how many times holoparasitism evolved. Here, we provide clear evidence for the first time that they are members of the tribe Orobancheae, using sequence data from multiple loci (nuclear genes ITS, PHYA, PHYB, and plastid genes rps2, matK). Gleadovia is an independent lineage whereas Phacellanthus should be merged into genus Orobanche section Orobanche. Our results unambiguously support the hypothesis that there are only three origins of holoparasitism in Orobanchaceae. Divergence dating reveals for the first time that the three origins of holoparasitism were not synchronous. Our findings suggest that holoparasitism can persist in specific clades for a long time and holoparasitism may evolve independently as an adaptation to certain hosts.


INTRODUCTION
The family Orobanchaceae belongs to the eudicot order Lamiales. It comprises more than 2,000 species in 90 genera with lifestyles from fully autotrophic to completely heterotrophic, and thus becomes an ideal system to study the origin of holoparasitism (McNeal et al., 2013). Orobanchaceae was established first by Ventenat (1799). Initially, its members were defined as the plants which cannot photosynthesize. Subsequent researchers merged some hemiparasites in Scrophulariaceae into Orobanchaceae based on evidence from morphology and anatomy (von Wettstein, 1891;Bellini, 1907;Boeshore, 1920;Armstrong and Douglas, 1989). Following a phylogenetic analysis of the plastid gene rps2, the free-living genus Lindenbergia was regarded as a member of Orobanchaceae (Young et al., 1999). This proposal was also supported by sequence data of another plastid gene, matK (Wolfe et al., 2005). In the latest classification by the Angiosperm Phylogeny Group (APG), 2016), the non-parasitic Lindenbergia and Rehmanniaceae were placed in Orobanchaceae. Phylogeny based on single plastid genes such as matK, rbcL, and rps2 partially resolved the phylogenetic relationships within Orobanchaceae (dePamphilis et al., 1997;Wolfe and dePamphilis, 1998;Young et al., 1999;Young and dePamphilis, 2005;Park et al., 2008), and nuclear genes (PHYA and ITS including the ITS1-5.8S-ITS2 region) provided improved resolution among species in Orobanchaceae (Wolfe et al., 2005;Bennett and Mathews, 2006). Compared with combined multiple genes, however, single genes do not always provide strong and clear resolution because of the dearth of informative sites, especially for closely related species (Zou et al., 2008;Niu et al., 2013;Yang et al., 2014). In a recent study, three nuclear genes (ITS, PHYA, and PHYB) and two plastid genes (rps2 and matK) were combined to reconstruct relationships within Orobanchaceae (McNeal et al., 2013). The backbone of the Orobanchaceae tree including six well-supported clades (Clade I-Clade VI) was determined. Previous studies suggested more than three independent origins of holoparasitism in Orobanchaceae (dePamphilis et al., 1997;Young et al., 1999), whereas the five-gene analysis showed that the holoparasitic species are only located in three clades (III, V, and VI) after excluding several species such as Alectra orobanchoides, Striga gesnerioides, Striga hermonthica, and Tozzia alpina based on the observation of life cycles in detail. Therefore, McNeal et al. (2013) concluded that holoparasitism had evolved independently three times in Orobanchaceae. Until now, however, this has not been tested on a sample of all holoparasitic species in Orobanchaceae. Furthermore, the order of origination time of the holoparasite lineages remains unknown.
The genera Gleadovia and Phacellanthus are just two ignored groups due to the lack of sampling for gene sequencing. They were placed into Orobanchaceae according to the number of valves, parietal placentas, and the shape of bracts and corolla (Zhang, 1990;Zhang and Tzvelev, 1998). Gleadovia comprises two species: Gleadovia ruborum is distributed in the northwest of the Himalaya, and Gleadovia mupinensis is endemic to southwest China. Both of them are small holoparasitic herbs. Our field observation showed that G. mupinensis parasitizes on species of Rubus. Phacellanthus is a monotypic genus which only contains a holoparasite, Phacellanthus tubiflorus. It is distributed in most provinces of China, as well as in Korea, Mongolia, and Russia. Although the originally recorded specimens parasitized on the roots of Fraxinus species, Xu et al. (1991) found that its host was Tilia mandschurica in Changbaishan nature reserve, China. In recent years, there were rarely reports about Gleadovia and Phacellanthus found in the wild (Chung et al., 2010;Liu et al., 2012). Ambiguous host information and uncertain time for collecting make it very difficult to resample Gleadovia and Phacellanthus from their type localities. Coupled with environmental changes and the impact of human activities, to collect the two holoparasitic genera successfully becomes a challenging job. Fortunately, after years of investigation, at last we sampled G. mupinensis and P. tubiflorus in the provinces Sichuan and Heilongjiang, China, respectively.
The phylogenetic positions of 12 holoparasitic genera were determined by McNeal et al. (2013), but those of some genera in Orobanchaceae including Gleadovia and Phacellanthus have still not been resolved by means of molecular methods. Moreover, Eremitilla mexicana was a newly reported holoparasitic species in Orobanchaceae (Yatskievych and Jiménez, 2009), which was suggested as a member of the clade including Epifagus, Conopholis, and Boschniakia based on ITS and rps2 data (Mathews et al., 2008). McNeal et al. (2013) mentioned they hadn't sampled Phelypaea. In fact, Nicolson (1975) had proposed Diphelypaea as a replacement name for this genus. Phylogenetic analysis of ITS sequences plus karyotype evidence suggested that Diphelypea is sister to genus Orobanche section Orobanche (Schneeweiss et al., 2004a,b). Recently, ITS phylogeny provided clear evidence to merge the unplaced genus Platypholis into Orobanche (Li et al., 2017). Therefore, Gleadovia and Phacellanthus are the last two holoparasitic genera whose phylogenetic positions remain unclear. These two genera together with Mannagettaea and Christisonia had been placed in Gleadovieae rather than Orobancheae according to morphological characters such as type and position of inflorescence and absence/presence of mechanical tissue (Zhang and Tzvelev, 1998). But the latter two genera were proposed as members of Orobancheae and Buchnereae, respectively (McNeal et al., 2013). McNeal et al. (2013) did not accept Gleadovieae. Instead, they speculated that Gleadovia and Phacellanthus should belong to Orobancheae, although the samples of these two genera had not been obtained yet. Therefore, it is crucial to investigate the phylogenetic positions of Gleadovia and Phacellanthus for correctly understanding the origins of holoparasitism in Orobanchaceae. We reason that if they do not nest in the three holoparasitic clades, more than three origins of holoparasitism will be found; otherwise, three origins will be determined. In this study, we revisited the phylogeny of Orobanchaceae based on the combined sequence data of multiple DNA regions (ITS, PHYA, PHYB, rps2, and matK) to identify the phylogenetic positions of Gleadovia and Phacellanthus. The number of origins of holoparasitism in Orobanchaceae was re-tested and the divergence dates were estimated to uncover the order of these origins.

Plant Materials and DNA Extraction
We collected G. mupinensis and P. tubiflorus which represent two holoparasitic genera from the provinces Sichuan and Heilongjiang, China, respectively. Voucher specimens were deposited in the herbarium of Fudan University (FUS). Total genomic DNA was extracted from fresh tissues dried with silicagel following the CTAB extraction method (Doyle and Doyle, 1987).

PCR Amplification and Sequencing
Specific primers of five genes (Table S1) were used for polymerase chain reaction (PCR). Primers used to amplify the nuclear genes PHYA and PHYB as previously described (Bennett and Mathews, 2006;McNeal et al., 2013) did not work in G. mupinensis and P. tubiflorus; we designed new primers (Table S1) according to the sequences of closely related taxa inferred from phylogeny of plastid (rps2 and matK) data. Each reaction volume of 50 µl contained ∼150 ng total DNA, 5 µl of 10X PCR buffer, 6 µl of MgCl 2 (2.5 mmol·l −1 ), 8 µl of dNTP mixture (2.5 mmol·l −1 ), 3 µl of each primer (10 µmol·l −1 ), and 2.5 units of Red Taq DNA polymerase. The running programs for these genes are presented in Table S2. PCR products were visualized on 1% agarose gels and purified using Gel Extraction System B Kit (BioDev-Tech, Beijing, China) according to the manufacturer's instructions. The purified products were sequenced directly with BigDye Terminator 3.1 Cycle Sequencing Kit (Applied Biosystems) and run in ABI PRISM 377XL DNA Autosequencer. All sequences generated in this study were submitted to GenBank (accession numbers: KY706614-KY706632).

Phylogenetic Analyses
The sequences of five genes from G. mupinensis and P. tubiflorus were assembled by the software Seqman II 5.05 (DNAStar, London, UK). Sequence data of other species in Orobanchaceae were obtained from GenBank ( Table S3). The alignment was undertaken by Clustal X (Larkin et al., 2007) and adjusted manually. All data sets were partitioned by gene, and gaps were treated as missing data. Modeltest 3.5 (Posada and Crandall, 1998) was executed to select the best fitting evolutionary model for each partition under the Akaike Information Criterion (AIC). Maximum likelihood (ML) and Bayesian inference (BI) analyses were performed on the independent and combined data sets for ITS, PHYA, PHYB, rps2, and matK. ML analyses were run in RAxML 7.0.4 (Stamatakis, 2006); the general time reversible (GTR) model and gamma distribution were used for all partitions (McNeal et al., 2013) because it is impossible to specify different models to different partitions in RAxML. Support values for branches of the ML trees were assessed based on 1,000 bootstrap replicates. Before combining all data sets, the partition homogeneity test was used to check the congruence among the five-gene data sets in PAUP* 4.0b10 (Swofford, 2002). Bayesian analyses were performed in MrBayes 3.1.2 (Huelsenbeck and Ronquist, 2001). Each partition was assigned its own nucleotide substitution model. We ran 25 million generations sampling every 1,000 generations. Stationarity of each analysis was assessed using Tracer 1.6.0  by checking the effective sample size (ESS) values. Completion was determined when the average standard deviation of split frequencies fell below 0.01. The 50% majority rule consensus trees were obtained after the first 25% of the samples were removed as burn-in.
In order to compare the resolution of different partitioning schemes, we also divided the data by codon to construct the phylogeny of Orobanchaceae based on single and combined gene data, respectively. Each protein-coding gene (PHYA, PHYB, rps2, matK) was partitioned by codon (1st, 2nd, and 3rd codon position) and the best fitting models were selected by Modeltest 3.5. The concatenated five-gene data set was partitioned into 13 blocks, including one for the ITS region and 12 for the 1st, 2nd, and 3rd codon positions of each protein-coding gene. ML analyses were conducted with RAxML 7.0.4 under the GTR + G model and Bayesian inference was run in MrBayes 3.1.2 with different nucleotide substitution models for each partition.

The Timing of Origin of Holoparasitic Clades in Orobanchaceae
Bayesian Evolutionary Analysis by Sampling Trees (BEAST 1.8.2; Drummond et al., 2012) was used to estimate the divergence time in Orobanchaceae. Bayesian Evolutionary Analysis Utility (BEAUti 1.8.2) was used to output BEAST input files. Owning to the lack of a reliable fossil record within the family (Soltis et al., 2011;Tank et al., 2015;Uribe-Convers and Tank, 2015), two external fossil calibration points (Call and Dilcher, 1992;Collinson et al., 1993) were used in the dating analysis. Both the stem age of Solanaceae and the age of the most recent common ancestor of Pedicularis and Olea were constrained with the same lognormal distribution prior (an offset of 44.3 Mya, a mean of 1.5, and a standard deviation of 0.5; Zanne et al., 2014). These two nodes are relatively close to Orobanchaceae (Zanne et al., 2014). Ten independent Monte chains Carlo Markov (MCMC) were conducted to ensure convergence in divergence time. Each run consisted of 25 million generations (sampling every 1,000 steps) with the GTR + G model, a Yule tree prior and an uncorrelated lognormal clock. Tracer 1.6.0  and an R package, RWTY (Warren et al., 2017), were used to assess convergence and stationarity of each MCMC chain. The RWTY test showed that each MCMC chain reached convergence within 5 million generations. Therefore, our analyses of 25 million generations guarantee convergence and stationarity of each MCMC chain. The samples from each run were combined by LogCombiner 1.8.2, until ESS ≥ 200. The final maximum clade credibility tree was generated using TreeAnnotator 1.8.2 after removing 10% of the samples as burn-in. The topology and node height with 95% highest posterior density (HPD) were visualized in FigTree 1.3.1 (Rambaut, 2006).

RESULTS
As expected, our ML trees do not present any difference with those of McNeal et al. (2013) in topology among the six main clades of Orobanchaceae except some details. The combined fivegene data provides a clearer resolution than any single gene. All phylogenetic trees show that P. tubiflorus is clustered with the members of the genus Orobanche within Clade III, i.e., Orobancheae (Figure 1, Figures S1-S11). Except for the trees based on ITS or rps2 sequences which present poor resolution in Clade III, all trees show G. mupinensis nested in Orobancheae (Figures S1-S11).
ML analyses based on different partitioning schemes (by gene and by codon) obtained similar results with slightly different support values. The results of the BI analyses are consistent with those of the ML analyses concerning the main clades and the phylogenetic positions of P. tubiflorus and G. mupinensis, expect for the PHYA tree which shows poor resolution based on codon partition. The phylogenetic information from ML and BI trees of combined five-gene data based on two kinds of partitioning schemes (Figures S11-S14) is summarized in Table 1. The corresponding ML results from single genes and combined plastid genes are attached in supplementary materials (Tables S4, S5).
Our divergence dating analyses clearly show that the three holoparasitic clades in Orobanchaceae originated nonsynchronously (Figure 3, Figure S15). The first origin of holoparasitism was found in Clade III, followed by the clade (Hyobanche, (Harveya, (Aeginetia, Christisonia))) within Clade VI and the genus Lathraea in Clade V.

Gene Selection for Phylogenetic Reconstruction
Of the genes used in our ML analyses, the nuclear gene ITS has a limited capacity to resolve the phylogeny of Orobanchaceae at the genus level (Schneeweiss et al., 2004a;Wolfe et al., 2005;Gussarova et al., 2008). Our ML tree based on ITS data shows good support in four clades (I, IV, V, and VI), but poor support for Clade II [maximum likelihood bootstrap (MLBS) < 50%]. Although Clade III collapses in the ML tree, section Orobanche forms a clade, in which the relationships among most species obtain strong support ( Figure S1). The plastid gene rps2, which is known to be retained in all hemi-and holoparasites in Orobanchaceae, behaves similarly to ITS. The coding gene matK has been widely used to infer phylogenetic relationships among closely related groups. Compared with rps2, combined plastid data (rps2 + matK) can resolve the basal clades and support the relationships of the species within the main clades well. The nuclear genes PHYA and PHYB are associated with seed germination (Mathews and Donoghue, 2000;Franklin and Whitelam, 2005). Previous studies suggested that PHYA is the most useful single locus to resolve the phylogenetic relationships within Orobanchaceae (Mathews and Donoghue, 2000;Bennett and Mathews, 2006), but the phylogenetic trees of PHYB, by contrast, showed higher resolution of the relationships among and within the main clades (Figures S5, S6, Table S4). In general, phylogenetic resolution of the combined data is superior to that of any single gene. To gain a better understanding of the phylogenetic relationships among the species of Orobanchaceae, it is necessary to combine nuclear genes with plastid genes as in the previous study by McNeal et al. (2013). In this study, our phylogenetic analyses based on the combined data show strong support among and within the major clades in Orobanchaceae (Figure 2).

Phylogenetic Position of Phacellanthus and Gleadovia
All ML trees based on single or combined genes support that P. tubiflorus clusters with the members of the Orobancheae (Figures 1, 2, Figures S1-S11). Traditionally, the genus Orobanche was divided into four sections: Gymnocaulis, Myzorrhiza, Trionychon, and Orobanche. As far as Orobanche in Clade III is concerned, our phylogenetic analyses based on both single and combined genes obtained the same results as the previous study by Schneeweiss et al. (2004a), i.e., Orobanche falls into two lineages: the subgenus Orobanche which contains the sections Orobanche and Diphelypaea, and the subgenus Phelipanche which contains the sections Gymnocaulis, Myzorrhiza, and Trionychon. Almost all ML trees in this study strongly support P. tubiflorus as the closest relative of section Orobanche (99 or 100% MLBS), with the exception of the ITS tree (72% MLBS). Independent analysis of either ITS or rps2 supports that P. tubiflorus is nested in section Orobanche (Figure 1,  Figures S7-S10). Although we looked for morphological differences between Phacellanthus and the section Orobanche, we found only one: the dehiscing capsule has three valves in Phacellanthus but two valves in the section Orobanche. Except for this difference, they present high similarities in the shape of the stem, leaf, calyx, corolla, and capsule, the arrangement of the leaves, the number of stamina, and so on. Considering all the evidence above, it is better to merge Phacellanthus into section Orobanche.
Unlike the situation of Phacellanthus, almost all analyses from single and combined genes support G. mupinensis as an independent lineage. Even if ITS shows poor resolution in Orobancheae, G. mupinensis forms a polytomy with several large clades ( Figure S1, Table S4). ML trees constructed with PHYA, PHYB, two plastid genes or combined data from five genes (Figures S5, S6, Table S4) group G. mupinensis within Clade III, near to Orobanche, but G. mupinensis is not nested in the clade including Boschniakia, Epifagus, and Conopholis. The phylogenetic positions of Gleadovia and Phacellanthus clearly support the tribe Orobancheae revised by McNeal et al. (2013), but do not support the tribe Gleadovieae which groups Gleadovia with the genera Mannagettaea, Phacellanthus, and Christisonia (Zhang and Tzvelev, 1998). Indeed, only one chamber is fertile in Christisonia and Aeginetia, supporting their sister-group relationship in Clade VI. However, the anthers of the species of Gleadovia and the members of tribe Orobancheae are fertile and have two equal chambers. By contrast, Gleadovia shares more morphological similarities with Mannagettaea and Phacellanthus despite the slight differences in the number of placentae among these three genera and the number of carpels between Gleadovia and Phacellanthus.

Origins of Holoparasitism
Several species in Orobanchaceae such as A. orobanchoides, S. gesnerioides, S. hermonthica, and T. alpina were regarded as holoparasites before McNeal et al. (2013) showed that these species retain functional chlorophyll and photosynthesize at least in part of their life cycle, although sometimes the photosynthesis rates are very low. Without considering these species, our results support three origins of holoparasitism in Orobanchaceae. We note that certain hemiparasitic genera cannot be included in this study due to the limitation of sampling. Because Clade III is strictly restricted at the circumscription of Orobanchaceae sensu stricto containing only holoparasites, and the holoparasites in Clade V consist of only the species of Lathraea, the holoparasitic clade (Hyobanche, (Harveya, (Aeginetia, Christisonia))) within Clade VI becomes the most likely lineage in which hemiparasitic species might be nested. However, these four closely related holoparasitic genera are morphologically so similar that their holoparasitism is most likely homologous. Any hemiparasites nested among them would therefore have regained the ability to photosynthesize. We consider this highly unlikely because no such reversal from holo-to hemiparasitism has been reported to date; but even in that case, the number of origins of holoparasitism would remain unchanged. Thus, if we accept that A. orobanchoides, S. gesnerioides, S. hermonthica, and T. alpina are hemiparasites (McNeal et al., 2013) and that Eremitilla and Platypholis belong to Clade III, there were three independent origins of holoparasitism in Orobanchaceae, each time from hemiparasites.
Orobanchaceae was estimated to have a mid-Tertiary Laurasian origin (Wolfe et al., 2005;Soltis et al., 2011;Tank et al., 2015;Uribe-Convers and Tank, 2015). Based on the fossil record of close relatives of Orobanchaceae, we can compare the relative divergence times of the holoparasitic clades (Figure 3, Figure S15). Our data show that the oldest holoparasitic clade is Orobancheae. The second holoparasitic clade is (Hyobanche, (Harveya, (Aeginetia, Christisonia))) in Buchnereae. The crown group of this clade seems to postdate those of Orobanche and the clade (Boschniakia, (Kopsiopsis, (Epifagus, Conopholis))). The youngest holoparasitic clade comprises only the genus Lathraea. The age of this genus is  Figure S11 and Table S3.  Figure S15.
almost equal to the divergence between Cistanche phelypaea and C. tubulosa.      Figure S6 | Phylogenetic tree of Orobanchaceae including Gleadovia mupinensis and Phacellanthus tubiflorus based on PHYB data with ML method. Figure S7 | Expanded phylogenetic trees based on ITS sequences for Orobanche with ML method. Figure S8 | Expanded phylogenetic trees based on ITS sequences for Orobanchaceae with ML method. Figure S9 | Expanded phylogenetic trees based on rps2 sequences for Orobanche with ML method.

CONCLUSION AND OUTLOOK
Figure S10 | Expanded phylogenetic trees based on rps2 sequences for Orobanchaceae with ML method.
Figure S11 | Expanded maximum likelihood phylogenetic tree of Orobanchaceae inferred from the combined five-gene data set (PHYA, PHYB, ITS, matK, and rps2) partitioned by gene. Figure S12 | Maximum likelihood phylogenetic tree of Orobanchaceae inferred from the combined five-gene data set (PHYA, PHYB, ITS, matK, and rps2) partitioned by codon. Figure S13 | Bayesian phylogenetic tree of Orobanchaceae inferred from the combined five-gene data set (PHYA, PHYB, ITS, matK, and rps2) partitioned by gene. Figure S14 | Bayesian phylogenetic tree of Orobanchaceae inferred from the combined five-gene data set (PHYA, PHYB, ITS, matK, and rps2) partitioned by codon. Figure S15 | Divergence time estimation of Orobanchaceae using combined five-gene data obtained from BEAST analysis.
Table S1 | The primer information for PCR amplification and cycle sequencing. Table S2 | PCR amplification programs of each gene used in this study. Table S3 | GenBank accession numbers for species used in this study. Table S4 | Summarized results of maximum likelihood analyses in Orobanchaceae based on single gene and combined matK and rps2 (plastid) data partitioned by gene. Table S5 | Summarized results of maximum likelihood analyses in Orobanchaceae based on single gene and combined matK and rps2 (plastid) data partitioned by codon.