Evolutionary Convergence of C4 Photosynthesis: A Case Study in the Nyctaginaceae

C4 photosynthesis evolved over 65 times, with around 24 origins in the eudicot order Caryophyllales. In the Caryophyllales family Nyctaginaceae, the C4 pathway is known in three genera of the tribe Nyctagineae: Allionia, Okenia and Boerhavia. Phylogenetically, Allionia and Boerhavia/Okenia are separated by three genera whose photosynthetic pathway is uncertain. To clarify the distribution of photosynthetic pathways in the Nyctaginaceae, we surveyed carbon isotope ratios of 159 species of the Nyctaginaceae, along with bundle sheath (BS) cell ultrastructure, leaf gas exchange, and C4 pathway biochemistry in five species from the two C4 clades and closely related C3 genera. All species in Allionia, Okenia and Boerhavia are C4, while no C4 species occur in any other genera of the family, including three that branch between Allionia and Boerhavia. This demonstrates that C4 photosynthesis evolved twice in Nyctaginaceae. Boerhavia species use the NADP-malic enzyme (NADP-ME) subtype of C4 photosynthesis, while Allionia species use the NAD-malic enzyme (NAD-ME) subtype. The BS cells of Allionia have many more mitochondria than the BS of Boerhavia. Bundle sheath mitochondria are closely associated with chloroplasts in Allionia which facilitates CO2 refixation following decarboxylation by mitochondrial NAD-ME. The close relationship between Allionia and Boerhavia could provide insights into why NADP-ME versus NAD-ME subtypes evolve, particularly when coupled to analysis of their respective genomes. As such, the group is an excellent system to dissect the organizational hierarchy of convergent versus divergent traits produced by C4 evolution, enabling us to understand when convergence is favored versus when divergent modifications can result in a common phenotype.


INTRODUCTION
C 4 photosynthesis is a complex trait that arises following modifications to hundreds if not thousands of individual genes within a genome . Despite this, it is one of the most convergent of evolutionary phenomena in the biosphere, with over 65 independent origins (Conway-Morris, 2003;Sage, 2016;Heyduk et al., 2019). Evolutionary convergence, however, does not necessarily reflect convergence throughout the hierarchy of traits that give rise to a complex phenotype, because multiple mechanisms can support a common function (Losos, 2011). This is well illustrated in the case of C 4 photosynthesis and crassulacean acid metabolism (CAM), each of which have been repeatedly assembled using disparate enzymes and structural modifications Christin and Osborne, 2013;Edwards, 2019). While examples of evolutionary convergence are many, the mechanisms of convergence remain a major question in the life sciences, particularly in the cases where complex traits such as C 4 photosynthesis repeatedly evolve (Blount et al., 2018). Because the complexity of the C 4 system is well-understood, as well as the phylogenetic distribution of the many C 4 clades, C 4 photosynthesis represents an excellent system to understand the mechanics of convergent evolution, and its implication for the rise of C 4 -dominated biomes over the past 30 million years (Christin and Osborne, 2013;Heyduk et al., 2019).
C 4 photosynthesis first captures CO 2 at low concentration in an outer mesophyll (M) compartment via the activity of phosphoenolpyruvate (PEP) carboxylase (PEPCase), and then concentrates it into an internal compartment, typically a layer of cells around the leaf vasculature termed the bundle sheath (BS; Edwards and Walker, 1983;Hatch, 1987). 1 The C 4 pathway begins with the conversion of CO 2 to bicarbonate (HCO 3 − ) by carbonic anhydrase (CA) in M cells, followed by PEP carboxylation (Figure 1). These two steps occur in all C 4 plants, and thus are universally convergent traits in C 4 photosynthesis; however, different paralogs have been recruited to carry out the PEPCase and CA functions, demonstrating evolutionary flexibility at a lower level of organization (Christin et al., 2010b(Christin et al., , 2013aLudwig, 2016a;Heyduk et al., 2019). The product of PEP carboxylation, oxaloacetate (OAA), is too labile to safely move between M and BS cells, so it must be converted to a stable metabolite (Edwards and Walker, 1983). Metabolite transport between M and BS cells is by diffusion, which necessitates that metabolites form steep concentration gradients to support rapid flux, and thus must be stable at high concentration . The solution to the challenge presented by OAA instability is to convert it to the stable metabolites malate or aspartate. This highlights another fundamental feature of evolutionary convergence, in that it occurs where there are strict physiochemical constraints such as OAA lability. A common step (OAA conversion) is accomplished via divergent metabolic solutions (formation of malate versus aspartate). The selected transport metabolite, as it turns out, reflects the enzyme that catalyzes the decarboxylation step.
There are three major decarboxylating enzymes co-opted for the decarboxylation step in C 4 photosynthesis: NADPmalic enzyme (NADP-ME), NAD-malic enzyme (NAD-ME) and PEP carboxykinase (PCK; Hatch, 1987). Most C 4 lineages use NADP-ME as the primary decarboxylating enzyme, about 1/3 use NAD-ME and five lineages predominantly use PCK 1 The tissue layer where CO 2 is concentrated is often comprised of parenchymatous bundle sheath cells; however, in an example of the non-convergence, other cell layers have also been co-opted as the site of CO 2 concentration. Despite this variation, bundle sheath (BS) is the term generally used to refer to the high CO 2 compartment. We generally follow this convention here, although we do use the anatomically correct name when discussing a specific Kranz anatomy cell type that is not technically BS tissue. . The consequence of the decarboxylase selection affects many aspects of C 4 photosynthesis, to include the pathway biochemistry, BS and M ultrastructure, M to BS transport processes, leaf energetics, and photosynthetic efficiency (Hatch, 1987;Drincovich et al., 2011;Ghannoum et al., 2011). To recognize the suite of traits associated with each decarboxylation mode, three subtypes of C 4 photosynthesis have been delineated -the NADP-ME subtype, the NAD-ME subtype, and the PCK subtype (Figure 1). Although each subtype conducts C 4 photosynthesis, they represent three evolutionary mechanisms to concentrate CO 2 that are derived from a distinct set of biochemical, structural and transport traits (Rao and Dixon, 2016). While each subtype represents a divergent means of concentrating CO 2 around Rubisco, traits associated within each subtype reflect strong convergence in response to constraints imposed by the decarboxylating enzyme. The NADP-ME enzyme co-opted by the C 4 cycle is located in the BS chloroplasts, where it uses malate and NADP + to produce CO 2 , NADPH and pyruvate, with the NADPH directly supporting reduction of PGA produced by Rubisco. NAD-ME is mitochondrial, and uses malate and NAD + to produce CO 2 , NADH and pyruvate. Because it is active in the BS mitochondria, substantial mitochondrial volume is needed to meet the metabolic requirements of the C 4 pathway, and consistently, species of the NAD-ME subtype often have more and/or larger mitochondria in BS cells than NADP-ME species (Dengler and Nelson, 1999;Edwards and Voznesenskaya, 2011). The PCK enzyme is located in the cytosol, and uses ATP and OAA to produce PEP, CO 2 , and ADP + P i. Large amounts of ATP are needed for the PCK reaction, and hence there is a large investment in BS mitochondrial volume to meet the ATP requirement (Yoshimura et al., 2004;Voznesenskaya et al., 2006). The decarboxylation type also constrains the selection of the transport metabolite. In NADP-ME species, NADPH produced by malate oxidation in the BS chloroplast is rapidly consumed in the metabolism of PGA generated by Rubisco, so there is no feedback onto malate flux. In NAD-ME species, the use of malate as a transport molecule would be problematic. NADH cannot readily exit the mitochondria, and there is an insufficient energy sink in the mitochondria to use the large amount of NADH that would be generated if NAD-ME oxidized malate imported from the M cells. The utilization of aspartate avoids these issues because there is no net import of reducing power into the BS mitochondria (Kanai and Edwards, 1999). Once in the BS mitochondria, aspartate forms OAA, which is reduced to malate via malate dehydrogenase using the NADH generated by NAD-ME (Figure 1).
PEP carboxykinase directly uses OAA, and hence to avoid flooding the BS cytosol with reducing power, aspartate is also imported from M cells. However, the PCK reaction requires large amounts of ATP to support rapid photosynthesis, and this can be generated in the mitochondria by oxidizing NADH created by NAD-ME using malate directly imported into the BS mitochondria from the M cells (Leegood and Walker, 1999). PEP carboxykinase species are thought to use NAD-ME at about a fourth to a third of the rate of the PCK reaction, in order to supply sufficient NADH for ATP production (Leegood and Walker, FIGURE 1 | A schematic outlining the C 4 pathway in NADP-ME type species, NAD-ME species, and PEP carboxykinase species. The diagram represents the classical depiction of the pathway (as per Hatch, 1987 andFurbank, 2011) assuming malate or aspartate predominate in their respective pathways. For clarity, nitrogen exchange reactions between aspartate and pyruvate are not shown in the NAD-ME and PCK subtypes. Thylakoid stacking is indicated by multi-layered rectangles in the chloroplasts. AL, alanine; ASP, aspartate; CA, carbonic anhydrase; MA, malate; OAA, oxaloacetate; PEP, phosphoenolpyruvate; PEP-C, PEP carboxylase; PEP-CK, PEP carboxykinase; PPDK, pyruvate, phosphate dikinase; PVA, pyruvate. 1999). Consistently, PCK species have many mitochondria in the BS (Dengler and Nelson, 1999).
To meet the energy requirements imposed by the decarboxylating enzymes and associated transport systems, a distinct arrangement of chloroplast membranes occurs in each of the three subtypes (Hatch, 1987;Edwards and Voznesenskaya, 2011;Rao and Dixon, 2016). NADP-ME species have low grana stacking in the BS chloroplasts, yet high stacking in the M chloroplasts. More grana stacking in M cells increases the PSII content in the thylakoids, and hence the potential to generate NADPH by linear electron transport. Because NADP-ME species import reducing power into the BS with malate, the demand for NADPH to reduce PGA in the BS chloroplasts of NADP-ME plants is halved.; hence less PSII and granal stacking are required (Kanai and Edwards, 1999). Chloroplasts in the M cells of NADP-ME species are well stacked to support malate reduction following PEP carboxylation. NAD-ME species, by contrast, have low grana stacking in M chloroplasts because their primary energetic function is to produce ATP for PEP regeneration. The thylakoids of the BS exhibit pronounced stacking in NAD-ME species, to supply sufficient NADPH to potentially reduce all the PGA generated by the C 3 cycle. However, there can be variation on these general chloroplast phenotypes, depending upon the degree to which multiple decarboxylases are employed, whether PGA is exported to the M tissue for reduction, or if aspartate is imported into the chloroplast in lieu of malate, as suggested to occur in NADP-ME Flaveria species (Leegood and Walker, 1999;Furbank, 2011).
The theoretical pattern that emerges in C 4 evolution is thus one of universal convergence in terms of overall outcome (CO 2 concentration into a BS-like compartment), with some steps being universal (CO 2 conversion to bicarbonate, PEP carboxylation), and others being divergent (decarboxylation, transport, chloroplast energetics, and possibly how PEP is regenerated). Convergence occurs within the subtype pathways where specific biophysical constraints are present, for example, in transport metabolites associated with the distinct decarboxylating enzymes. An organizational hierarchy between convergence and divergence can thus be envisioned for complex traits that scales from the level of the genes up through to the composite phenotype. The challenge for research on convergent evolution is to describe this hierarchy and to understand why and under what constraints convergent patterns emerge (Losos, 2011). To do this, one needs effective research systems, such as fast cycling microbes, or in terrestrial plants, multiple lineages where convergence has occurred, such as the dozens of clades that have independently evolved C 4 photosynthesis (Blount et al., 2018;Heyduk et al., 2019). However, most closely related C 4 clades are of the same subtype , which restricts the ability to examine sub-type impacts on convergent versus divergent solutions to creating a C 4 pathway. One potentially strong study system occurs in Portulaca, where both NADP-ME and NAD-ME species are present (Voznesenskaya et al., 2010;Ocampo et al., 2013). Another possibility occurs in the Nyctagineae tribe of the Nyctaginaceae. Here, three related genera of the tribe Nyctagineae -Allionia, Boerhavia and Okenia -contain C 4 species while their closest relatives in the genera Anulocaulis, Cyphomeris, Commicarpus, and Nytctaginia are not known to have any C 4 species . In the phylogeny of Douglas and Manos (2007), Okenia and Boerhavia form a common C 4 clade, while Allionia forms a distinct C 4 clade that is separated from the Okenia/Borehavia lineage by the Anulocaulis/Nyctaginia complex. However, there has been no systematic survey of the occurrence of C 3 and C 4 photosynthesis in the Nyctaginaceae, so it is unclear whether C 4 may exist in Anulocaulis, Commicarpus, Cyphomeris, and Nyctaginia, or even whether Allionia, Okenia, and Boerhavia are completely C 4 , as opposed to also containing C 3 and C 3 -C 4 intermediate species.
Boerhavia and Allionia are listed as being NADP-ME, although this is based on enzyme assays for Boerhavia only (Muhaidat et al., 2007). Intriguingly, our preliminary TEM images show many mitochondria in the BS ultrastructure of Allionia, suggesting it is NAD-ME. If so, then the Allionia and Okenia/Boerhavia clade could join Portulaca in forming a robust study system for addressing evolutionary convergence as influenced by C 4 subtype.
To evaluate the potential of the Nyctaginaceae to become a model for studying evolutionary convergence, we present here a detailed study of the distribution of the C 3 and C 4 pathways in the Nyctaginaceae. We present a survey of carbon isotope ratios from 560 herbarium specimens, in addition to an anatomical/ultrastructural study using five representative species of Allionia, Anulocaulis, Boerhavia, Commicarpus, and Nyctaginia. We also examine climate data and geographic distributions of species in Allionia, Anulocaulis, Boerhavia, Commicarpus, Cyphomeris, and Nyctaginia to evaluate ecological factors contributing to C 4 origins in the Nyctaginaceae. The biochemical sub-types of C 4 species in Allionia and Boerhavia were determined, and we present a transcriptomebased phylogeny that updates the phylogenies of Douglas and Manos (2007) and Douglas and Spellenberg (2010). We also use transcriptome data to evaluate whether there has been convergence in the gene sequences of the major C 4 pathway enzymes. For comparative purposes, we also present TEM images of leaf tissues from two Portulaca (Portulaceae) species previously shown to be NADP-ME (Portulaca pilosa) or NAD-ME (Portulaca oleracea). Through these efforts, we present an initial hierarchical assessment of how divergence occurs during the convergent evolution of C 4 photosynthesis.

Plant Material and Growth Conditions
Seeds of Allionia incarnata, Anulocaulis gypsogenus, Boerhavia burbigeana, Boerhavia coccinea, Commicarpus scandens, and Nyctaginia capitata were collected from naturally occurring field populations in Southwestern North America and Australia between 2000 and 2016 (see Supplementary Table 1 for collection and voucher information). Portulaca pilosa seeds were a gift from Gerry Edwards and Elena Vosnesenskaya (Washington State University), while P. oleracea grew as a weed in our greenhouse. Seeds were sown directly into 10 or 20 L pots containing a sandy-loam mixture and grown in the University of Toronto greenhouse complex housed on the roof of the Earth's Sciences Centre. Plants were watered as needed to avoid soil drying (daily in high summer, two to three times weekly in cooler weather), and fertilized bimonthly with a Miracle-Grow commercial fertilizer mix (All-Purpose Brand, 24-8-16) amended with 4 mM calcium nitrate and 1 mM magnesium sulfate. Greenhouse conditions were 27 to 33 • C daytime temperature, 20-24 • C night temperature, and a peak photon flux density (PFD) of 1500 µmol photons m −2 s −1 on clear days. We used 400 W sodium vapor lamps to supplement natural daylight to maintain a minimum PFD on cloudy days of 250 µmol m −2 s −1 over a >13 h photoperiod. Unless otherwise indicated, three to five plants were sampled for gas exchange, biochemical assay, leaf structural properties and transcriptomics between May and October of 2010 to 2017. Care was taken to sample tissues under the same environmental conditions in the greenhouse (full sun exposure, a leaf temperature near 27 • C, at a time between 9 am and 2 pm) to minimize year to year and month to month variation. For the characteristics examined here, subtle variation that may exist between sampling dates is not known to affect results pertinent to our hypotheses. For example, C 3 versus C 4 expression patterns are largely constitutive, and sun to shade variation in phenotype would not be present at the bright light intensities present in our greenhouses during summer.

Carbon Isotope Ratio
To determine the distribution of the C 3 and C 4 pathways in species of the Nyctaginaceae, 560 herbarium specimens representing 23 genera and 159 species were sampled from the herbaria at Kew gardens (Richmond, London, United Kingdom), Missouri Botanical Gardens (St Louis, MO United States), and New York Botanical Gardens (The Bronx, New York, NY, United States; Supplementary Tables 2, 3). Species were sampled from all genera in the Nyctagineae with the exception of the monospecific genus Cuscatlainia. Approximately 50% (Mirabilis) to 100% (Okenia, Allionia) of the known species from the sampled Nyctagineae genera are present in the survey. We also sampled species from 12 genera occurring in each of the other six tribes of the Nyctaginaceae (Table 1 and Supplementary  Table 2). To measure the carbon isotope ratio (δ 13 C) of each herbarium specimen, 2-4 mg of leaf or stem material were sampled from herbarium sheets and assayed for δ 13 C by the Washington State University Stable Isotope Core. 2 δ 13 C was determined for at least two distinct plant specimens if available in the herbaria. Carbon isotope ratios between −10 and −16 correspond to C 4 values, while carbon isotope ratios between −23 and −32 correspond to C 3 values. Species that are evolutionary intermediates between C 3 and C 4 photosynthesis exhibit δ 13 C ratios that are similar to C 3 values except where a strong C 4 metabolic cycle has been engaged; such C 4 -like plants typically exhibit δ 13 C values between −16 and −22 (Monson et al., 1988;Von Caemmerer, 1992).

Leaf Gas Exchange and Biochemical Assay
The response of net carbon assimilation rate (A) to intercellular CO 2 concentration (C i ) was measured for Allionia incarnata, B. coccinea, and N. capitata using a Li-COR 6400 photosynthetic gas analyzer at 33 • C, a vapor pressure differences of 2.0-2.5 kPa and a photosynthetic PFD of 1800 µmol m −2 s −1 . The most recent, fully mature leaves were first equilibrated in the leaf cuvette at an ambient CO 2 concentration of 400 µmol mol −1 and 1800 µmol photons m −2 s −1 . After equilibration and measurement, the ambient CO 2 was increased to 1200 µmol mol −1 and then gas exchange parameters were measured after equilibration. The CO 2 concentration was then reduced to near the CO 2 compensation point in approximately 13 steps, with steady-state gas exchange values determined at each step. The initial slope of the A versus C i response was estimated as the linear slope of the measurements below a C i of 100 µmol mol −1 . Intrinsic water use efficiency (WUE) was determined as the ratio of A to stomatal conductance (g s ) at an ambient CO 2 of 400 ppm, and the CO 2 -saturated rate of A (A 1200 ) was measured at 1200 µmol mol −1 CO 2 .
The activities of PEPC, NADP-malic enzyme, and NADmalic enzyme were assayed at 30 • C using a coupled-enzyme assay that measured oxidation/reduction rate of NADP(H) or NAD(H) at a wavelength of 340 nm using a Hewitt-Packard 8230 spectrophotometer following procedures in Ashton et al. (1990) as modified by Sage T. L. et al. (2011). Two to three cm 2 of recent, fully-mature leaves of A. incarnata, B. coccinea, and N. capitata were sampled under full illumination in the greenhouse and then rapidly ground using a glass tissue homogenizer in an extraction buffer (100 mM HEPES -pH 7.6, 5 mM MgCl 2 , 10 mM KHCO 3 , 2 mM EDTA, 10 mM 6-aminocaproic acid, 2 mM benzamide, 1 mM phenylmethylsulfonyl fluoride, 1% (w/v) PVPP, 2% (w/v) PVP, 0.5% Triton X-100, 2% (w/v) BSA, 5 mM DTT, 1% (w/v) casein). After removing two aliquots for chlorophyll assay (in 80% acetone at 645 and 663 nm; Arnon, 1949), the extract was centrifuged 30 s and divided into three aliquots and put on ice. The aliquot used for NAD-ME assay was immediately 2 www.isotopes.wsu.edu treated with sufficient MnCl 2 to give a 2 mM solution. PEPC was assayed by coupling the production of OAA to NADH oxidation via malate dehydrogenase in an assay buffer containing 50 mM Bicine (pH 8.0), 1 mM EDTA, 5 mM MgCl 2 , 2 mM NaHCO 3 , 2 mM DTT, 1 mM glucose 6-phosphate, 2 units ml −1 malate dehydrogenase, and 0.2 mM NADH. The reaction was initiated by the addition of PEP to give 5mM. NADP-ME was assayed by following NADP + reduction at 340 nm. The assay buffer contained 25 mM Tricine (pH 8.2), 1 mM EDTA, 20 mM MgCl 2 , 2 mM DTT, 0.5 mM NADP. The reaction was initiated by the addition of malic acid to give 5 mM. NAD-ME was assayed by measuring NAD + reduction at 340 nm. The reaction mixture contained 25 mM Hepes (pH 7.2), 0.2 mM EDTA, 8 mM ammonium sulfate, 1 unit ml −1 malate dehydrogenase, 5 mM malic acid, 0.025 mM NADH, and 2 mM NAD + . The reaction was initiated by adding MnCl 2 to give 5 mM.

Imaging and Quantification of Leaf Structure
For all anatomical and ultrastructural imaging, the middle portion of a leaf blade equidistant between the mid-rib and leaf margin were sampled from recent, fully expanded leaves. Leaf pieces approximately 2 mm 2 were collected between 9:00-11:00 am and were prepared for light and transmission electron microscopy as described previously (Khoshravesh et al., 2017). Cell and organelle features of BS cells were quantified using Image J software (Schneider et al., 2012). One leaf per plant was sampled from three to five plants. Each mean value per plant was compiled from measurements of 5-10 imaged cells from the adaxial region of the leaf. All measurements were conducted on planar images of cross (=tranverse) sections. Cells measured were randomly selected from the pool of cells where the plane of section passed through the central region of a BS cell, rather than the periphery. Parameters measured include BS cell area; number of chloroplasts per BS cell; number of mitochondria per BS cell; area of individual chloroplasts and mitochondria; and% BS cell area covered by all chloroplasts and mitochondria in a BS cell.

Phylotranscriptomic Analysis
To prepare a phylotranscriptome of the Nyctaginaceae, we used publicly availably RNA-sequence data from the NCBI Short Read Archive (Supplementary Table 4). 3 FASTQ read data was trimmed using Trimmomatic (Bolger et al., 2014) with minimum leading and trailing quality cutoffs at 30, a 5-base sliding window cutoff of 30, and 70bp minimum trimmed length. Transcriptomes were de novo assembled using Trinity (Grabherr et al., 2011) with default settings including in silico read normalization. Open reading frames were predicted and translated using the getorf program from the EMBOSS package (Rice et al., 2000), and the longest open reading frame for each putative locus identified by Trinity was selected using a Python script. OrthoFinder (Emms and Kelly, 2015) was used to predict groups of orthologous genes using 10 reference proteomes with MapMan annotations (Thimm et al., 2004) and 10 with all other annotation types TABLE 1 | δ 13 C data for sampled species from the Nyctaginaceae showing number of accepted species for each genus and the number of species sampled, the range of δ 13 C for species means and number of C 4 species we identified.

Tribe and genus
Accepted species number/sampled species number δ 13 C range of species means  Tables 2,3 for herbarium specimens, collection information and individual δ 13 C values. Accepted species numbers follows Spellenberg, 2003, Tropicos (2020, and the International Plant Names Index (IPNI, 2020). Sampled species number includes species listed in Supplementary Tables 2,3 whose acceptability is uncertain. Belemia fucsiodes is a type specimen examined with a dissecting scope for vein density only.

Number of C 4 species in Supplementary
available from the Phytozome v12 (Supplementary Table 5). 4 Single-or low-copy orthogroups were selected with a Python script based on the following criteria: a minimum of 40 of 53 species present; no more than 10% of the species having multiple sequences; and a minimum alignment length of 100 amino acids. For species with multiple sequences, a Python script was used to concatenate all orthogroup sequences from a given species if none overlapped by more than 10% of the shortest sequence length, or remove them all if they did. Sequences were assumed to represent fragmented assemblies in the former case and paralogs in the latter. This selection process resulted in 1084 genes for phylogenetic analysis. Initial protein alignments were produced with mafft (Katoh and Standley, 2013) and DNA codon alignments were generated from these using pal2nal (Suyama et al., 2006). The codon alignments were trimmed using trimAl (Capella-Gutiérrez et al., 2009) with a gap threshold of 0.5, then concatenated into a partitioned super-matrix using a Python script written by co-author M. Stata. Phylogenetic inference was conducted using RaxML (Stamatakis, 2014). This was conducted using separate evolutionary models for all partitions, rapid bootstrap analysis with a search for the best tree in one run (command line option -f a), and bootstrap convergence testing (autoMRE). Convergence testing showed that 50 bootstrap replicates were sufficient, but in order to be certain of support values we conducted 200. The species tree was visualized using FigTree. 5 All Python scripts are available at github.com/MattStata/Nyctaginaceae_Scripts.

C 4 Gene Phylogenies and Analysis
All homologs of PEPC, NADPME, NADME, and PPDK were identified in the Nyctaginaceae transcriptomes in the NCBI Short Read Archive (Supplementary Table 4) using BLASP with Arabidopsis sequences as queries. Matches were aligned using mafft (Katoh and Standley, 2013) and preliminary trees were inferred with FastTree (Price et al., 2010) in order to identify the number of copies present in Nyctaginaceae. For each copy, sequences from the Nyctagineae clade were extracted from the alignment and a consensus sequence was generated using Geneious (www.geneious.com) to create full-length consensus references for each copy onto which RNA-seq data for all Nyctagineae species in the NCBI SRA were mapped using HiSat2 (Kim et al., 2019) with the scoring argument -score-min L,0,-1.4. Read mappings were scrutinized manually using the software Geneious to be certain there were no additional paralogs beyond those we had detected, which would be evident as paralogous reads mapped onto the closest reference. Consensus sequences based on mapped reads were generated in Geneious. Gene trees were inferred using MrBayes (Huelsenbeck and Ronquist, 2001) with two runs, 40 chains, 2 million generations, and a heating factor of 0.05. About 10,000 trees from the end of each run where the average SD of split frequencies remained flat at or below 0.01, were used to generate the final tree and infer posterior probabilities using the consense program included with ExaBayes (Aberer et al., 2014). Trees were visualized using FigTree. 5 Gene expression values were calculated as reads per kb of transcript per million reads (RPKM) using the SAM files produced by HiSat and a Python script. Only species from the 1KP project 6 submitted by our lab were used for gene expression as these represent both C 4 lineages and a closelyrelated C 3 outgroup, and were grown and sampled identically. For these, the newest fully expanded leaves were sampled during a sunny day from plants grown in a glass house at the University of Toronto between 9 am and 1 pm. Because complete mappings to all transcripts were not conducted, numbers of total sequenced reads were used in RPKM calculations rather than total mapped reads. Scripts written by M Stata are available at github.com/MattStata/Nyctaginaceae_Scripts.

Species Distribution Data
Biogeographic distributions for species of the tribe Nyctagineae were obtained from the Global Biodiversity Information Facility website (GBIF). 7 Duplicate data points and those lacking herbarium records or corresponding to marine coordinates were removed (maptools, Bivand and Lewin-Koh, 2013). The remaining 15,870 observations represented 75 species within Allionia (two species), Boerhavia (40 species), Anulocaulis (five species), Commicarpus (25 species), Cyphomeris (two species), and Nyctaginia (one species). Bioclimate and monthly minimum and maximum temperature parameters (2.5 min resolution) were downloaded from the WorldClim 2.0 dataset 8 (Fick and Hijmans, 2017). Monthly potential evapotranspiration and Global Aridity Indexes (AI) were downloaded from CGIAR-CSI GeoPortal. 9 Values per observation for 19 bioclimatic variables in the Worldclim dataset, plus minimum and maximum temperatures and AI were then extracted using the extract function in the R raster package (Hijmans and van Etten, 2012). Median values per variable per species were calculated and normalized to Z-scores for use in subsequent analyses. Climate variables that significantly predicted the occurrence of photosynthetic type at p < 0.05 were selected using stepwise regression. Mixed-effect models (R package lme4, Bates et al., 2015) were built using the selected bioclimatic variables and photosynthetic subtypes (NAD-ME or NADP-ME) as the main effect and genus and species as random effects. These models were compared by ANOVA and Akaike's Information Criteria (AIC). The best model was selected based on the lowest AIC and p values <0.05. A principal component analysis was performed by R package FactoMineR (Lê et al., 2008) to evaluate species distribution across the multivariate predictors. A subset of the data for which we had phylogenetic data was also used to run a phylogenetically corrected ANOVA using phytools in R (Revell, 2012).

Carbon Isotope Ratios
For the δ 13 C survey, we sampled herbarium specimens from all genera of the Tribe Nyctagineae, except for one monospecific genus from El Salvador (Cuscatlainia vulcanicola; Table 1). All sampled species from the genera Allionia, Boerhavia, and Okenia exhibited C 4 δ 13 C values (−9 to −15 ) ( Table 1 and Supplementary Tables 2,3). All other species exhibited C 3 δ 13 C values (−22 to −30 ), including all assayed species in the Nyctagineae genera Abronia, Acleisanthes, Anulocaulis, Commicarpus, Cyphomeris, Mirabilis, and Tripterocalyx. The δ 13 C survey provides little evidence for C 3 -C 4 intermediate species in the Nyctagineae. Values from all species were clearly C 4 or within the more negative range of δ 13 C values typical of C 3 plants, with two possible exceptions -the arid zone species Acleisanthes angustifolia (δ 13 C = −23.9) and Mirabilis polyphylla (δ 13 C = −22.6). While high for a typical C 3 δ 13 C value, these exceptions are within the range of values observed in arid-zone C 3 species with high WUE (Farquhar et al., 1989).

Gas Exchange and Biochemistry
Both Allionia incarnata and B. coccinea exhibited typical C 4 photosynthetic parameters. The CO 2 compensation point of photosynthesis ( ) was below 5 µmol mol −1 in both species, the ratio of intercellular to ambient CO 2 concentration (C i /C a ) ratio was 0.31-0.35, and their carboxylation efficiency of photosynthesis, measured as the initial slope of the photosynthetic response to intercellular CO 2 concentration (A/C i response), was 5-7 times greater than the carboxylation efficiency in the C 3 N. capitata (Figure 2 and Table 2). Allionia exhibited a steeper initial slope of the A/C i response than Boerhavia. When relativized by dividing by the maximum net CO 2 assimilation rate at high CO 2 (A 1200 ) to correct for variation in photosynthetic capacity, the normalized carboxylation efficiency in Allionia was 50% greater than in Boerhavia ( Table 2).
The in vitro activity of PEPC was high in the C 4 species relative to the C 3 N. capitata, and similar on a leaf area basis in the two C 4 species; however, on a chlorophyll basis, the PEPC activity is 63% higher in Allionia than Boerhavia ( Table 2). FIGURE 2 | The response of the net CO 2 assimilation rate to intercellular CO 2 concentration in Allionia incarnata (C 4 ), Boerhavia coccinea (C 4 ) and Nyctaginia capitata (C 3 ) at 30 • C and a light intensity of 1800 µmol photons m −2 s −1 . Means ± SE, N = 4 (for the C 4 species) or 1 (for the C 3 species).
NADP-ME activity was high in Boerhavia and not detected in Allionia, while NAD-ME activity was negligible in Boerhavia and high in Allionia ( Table 2). Chlorophyll content was 43% higher in Boerhavia than Allionia, and chlorophyll a/b ratio was 21% higher in Boerhavia. Observed enzyme activities are considered robust because they are equivalent to or greater than the observed A values in each species. From these results, we conclude that Boerhavia belongs to the NADP-ME subtype, while Allionia is of the NAD-ME subtype.
Leaf Structure and Ultrastructure in Allionia, Boerhavia and Two Portulaca Species Allionia incarnata and B. coccinea exhibited typical C 4 -Atriplicoid Kranz anatomy, which is characterized by large BS cells in planar cross sections with centripetal organelle arrangements (Figures 3, 4; Edwards and Voznesenskaya, 2011). Notably, enlarged BS cells in both Allionia and Boerhavia do not wrap around the entire vascular bundle, mimicking patterns observed in Atriplex but not many other C 4 eudicots classified as having Atriplicoid Kranz anatomy (Muhaidat et al., 2007). A. gypsogenus, C. scandens, N. capitata, and Mirabilis jalapa have characteristic C 3 leaf anatomy (Figures 3, 4 and Supplementary Figure 1). Planar areas of the BS cells are comparable between the C 3 and C 4 species, indicating similar dimensions of the BS tissue in a radial direction with respect to the vasculature ( Table 3). Both C 4 species exhibit greater coverage of the BS cell by chloroplasts than the C 3 species, because there are more chloroplasts per BS cell and greater mean area per chloroplast in the C 4 species (Figure 4; Table 3). In A. incarnata, the planar area of BS cells covered by mitochondria is significantly greater than in C 3 species and the C 4 B. coccinea, due to the presence of larger mitochondria in planar section, and more mitochondria per cell area (Table 3; Figures 4A,C). The percent mitochondrial Means ± SE, N = 4 for the C 4 species, 1 for N. capitata gas exchange data, and 3 for N. capitata biochemical data. The p-values of one-tailed Students T-tests between the two C 4 species are indicated, with significant differences highlighted in bold. Gas exchange measurements were conducted at 33 • C and a light intensity of 1800 µmol m −2 s −1 . Biochemical assays were conducted at 30 • C. A, net CO 2 assimilation rate; C i , intercellular CO 2 concentration; C a , ambient CO 2 concentration in the leaf chamber; A 400 , A at a CO 2 concentration of 400 µmol CO 2 mol −1 air; A 1200 , A at a CO 2 concentration of 1200 µmol mol −1 ; CHL, leaf chlorophyll content; g s , leaf conductance to water vapor; ME, malic enzyme. coverage of the BS cells is similar in B. coccinea and the C 3 species (Table 3; Figures 4B,D-F).
The BS cells in species with C 3 δ 13 C values have smaller and fewer chloroplasts than the C 4 species of the Nyctagineae, and these are positioned mostly along the outer BS wall exposed to intercellular airspace (Figures 4E,F and Supplementary  Figure 1). None of the three C 3 species examined exhibit traits associated with C 3 -C 4 intermediacy or even the incipient intermediate state termed "proto-Kranz"; there was not a greater mitochondrial number, nor a repositioning of mitochondria and chloroplasts toward the inner side of the BS cells. Chloroplasts in the BS of Boerhavia and the M tissue of Allionia are largely agranal, whereas distinct levels of grana stacking are apparent in the thylakoids of the BS chloroplasts in Allionia, and the M chloroplasts of Boerhavia ( Figure 5) In P. oleracea (NAD-ME), chloroplasts with pronounced grana stacks cluster in the inner BS ( Figure 6A). Many mitochondria are interspersed between the chloroplasts, and exhibit distinct structural connections with nearby chloroplasts (Figure 6B). In P. pilosa (NADP-ME), chloroplasts also occur in the inner BS where they form clusters with no discernable thylakoid stacks (Figures 6C,D). Mitochondria are largely absent between chloroplasts, although some mitochondria occur along the inner wall of the BS in a pattern that is commonly observed in C 3 -C 4 intermediate species (Figure 6C).

Species Phylogeny
We update the molecular phylogeny of the Nyctagineae using transcriptome sequence data available at the NCBI short read archive (Figure 7). The tree largely replicates the phylogenies of Douglas and Manos (2007) and Douglas and Spellenberg (2010) showing the C 4 and C 3 clades of the Nyctagineae correspond to a clade of xerophytic herbs and shrubs that Douglas and Spellenberg term the North American Xeric (NAX) clade. C 4 Allionia branches with Cyphomeris at the base of a clade that includes the isotopically C 3 Anulocaulis and Nyctaginia species, and the C 4 Okenia and Boerhavia species. Anulocaulis and Nyctaginia form a clade that branches between Allionia/Cyphomeris and Boerhavia/Okenia. C. scandens branches at the base of the clade containing the C 4 species and their immediate non-C 4 sister clades, with Mirabilis species branching just below C. scandens. Unlike Douglas and Spellenberg (2010) who predicted Bougainvilleeae + Pisonieae to be sister to the Nyctagineae, we predict only Bougainvilleeae branches in a sister position, with maximal support. Also, the branching order of Phytolaccaceae, Sarcobataceae, Gisekiaceae, and Nyctaginaceae is unclear in the literature and in the APG Caryophllalyes tree, where they form a polytomy. 10 Our tree resolves them with 100% support, improving the phylogenetic context for the family.

Gene Phylogenies
We examined the trees of four important C 4 cycle genes -PEPCase, NADP-ME, NAD-ME, and PPDK (Figure 8 and Supplementary Figures 2-5). For each gene, we assumed that the functional copy in the C 4 pathway was the one with the highest expression in the transcriptome analysis. In this analysis, we numbered gene copies based on branching order from the base of the gene tree; the numbers assigned are not meant to imply direct orthology with gene copies from any other lineage. For Allionia incarnata and two Boerhavia species, the PEPC1 gene is the copy used in the C 4 pathway because it shows an expression strength that is orders of magnitude greater than PEPC2 and PEPC3 ( Table 4). By similar logic, NAD-ME3 is the main decarboxylase gene in the C 4 pathway of Allionia, and NADP-ME2 is the main decarboxylase gene in Boerhavia. Allionia also exhibited 10 www.mobot.org/MOBOT/research/APweb  Table 4). The expression of the PEP carboxykinase gene was minimally detectable in both C 4 clades (not shown). In the gene trees for PEPC1, PPDK and NADP-ME1, the respective orthologs from numerous C 3 species of the Nyctagineae branch between the two C 4 lineages, with strong support (Figure 8 and Supplementary Figures 2-5). This indicates the C 4 genes arose independently from an ancestral C 3 copy, rather than by lateral transfer between clades. If one or more of the genes had moved laterally between the C 4 clades, the orthologs from both C 4 lineages would form a common clade.

Positively Selected Amino Acid Substitutions
We examined PEPC1 sequences for evidence of convergence in the C 4 isoforms ( Table 5). Christin et al. (2008) showed that 21 amino acid sites were under positive selection in the PEPC1 of various C 4 grasses, with the best-known example of sequence convergence being a serine for alanine substitution near the maize 780 position in the PEPC1 sequence (position 774 in the Flaveria sequence; Gowik and Westhoff, 2011). We did not find evidence for convergence with other C 4 lineages at position 780 in Allionia nor Boerhavia, as they both exhibited an alanine, which is typical of C 3 isoforms ( Table 5 and Supplementary Figure 6). There was consistent convergence at sites 572, 761, and 807, where all C 4 species exhibited the same amino acids as those under positive selection in C 4 grasses. Convergence was observed in both C 4 lineages at site 813, but in only one of the six Boerhavia/Okenia species in the database ( Table 5). Boerhavia also converged on the same amino acids as the C 4 grass species at site 733 and 863, and partially at position 502 (B. purpurescens and B. torreyana only). In total, of the 21 C 3 to C 4 amino acid switches repeatedly observed in the grass PEPCase sequences (Christin et al., 2007), about a third were replicated in the Boerhavia lineage and a fifth in the Allionia lineage (Table 5). At one site, position 577, the C 3 and C 4 Nyctaginaceae species share a serine with the C 4 grass Zea mays. In the grasses, most C 3 species examined by Christin et al. (2007Christin et al. ( , 2012b exhibit an alanine at this site. It is possible that C 4 evolution in Nyctaginaceae did not require this substitution as the ancestral C 3 state was already a serine.

Ecological Distribution
Nyctagineae species from the genera Allionia, Boerhavia, Anulocaulis, Commicarpus, Cyphomeris, and Nyctaginia are distributed in tropical and subtropical regions of similar climate (Supplementary Figures 7-9). Six bioclimatic variables significantly predicted (p < 0.05) the C 3 and C 4 distribution using a stepwise regression model (Supplementary Figure 8 and Supplementary Table 6). To address whether the variation in distribution of C 3 and C 4 species could be species or genera dependent, GLMM models were performed with and without considering taxa as random effects. The model was significantly improved (p < 2e −16 ) when genus was added as the random effect (Supplementary Table 6); however, after this addition, none of the main effects significantly predicted the pattern of photosynthetic pathway distribution.    A phylogenetically-corrected ANOVA also failed to detect a significant difference between the C 3 and C 4 species in response to six variables (Supplementary Table 7). These analyses show that the distribution of Nyctagineae species in climate and geographic space follows taxonomic and phylogenetic affinities rather than photosynthetic pathway.

DISCUSSION
Carbon isotope ratios demonstrate that all examined species in Allionia, Boerhavia and Okenia are C 4 , while all examined species in other Nyctaginaceae genera are not, including species in Anulocaulis, Commicarpus, Cyphomeris, and Nyctaginia that   branch sister to the C 4 clades. The number of C 4 species in the Nyctaginaceae is 43-46 based on recent assessments that conclude there are 40 Boerhavia, one to four Okenia, and two Allionia species (Spellenberg, 2003;Tropicos, 2020). The typically C 3 δ 13 C values in the 31 examined species of Anulocaulis, Commicarpus, Cyphomeris, and Nyctaginia indicate that none of the examined species have a type of C 3 -C 4 intermediacy termed C 4 -like, where a strong C 4 metabolic cycle has been engaged. As a C 4 cycle becomes engaged, the values of δ 13 C increase from C 3 toward C 4 values, becoming distinct from typical C 3 values when δ 13 C rises above −22 (Von Caemmerer, 1992;Alonso-Cantabrana and von Caemmerer, 2016). This occurs because PEPCase does not discriminate against the 13 C isotope to the degree that Rubisco does (Farquhar et al., 1989). Carbon isotope screens cannot differentiate between C 3 plants and a type of C 3 -C 4 intermediacy utilizing C 2 photosynthesis, a metabolic pathway that concentrates CO 2 into the BS using photorespiratory metabolites to shuttle CO 2 from M to BS cells. In C 2 photosynthesis, the photorespiratory enzyme glycine decarboxylase is expressed only in BS mitochondria, which forces photorespiratory glycine to diffuse from the M cells where it is formed to the BS cell for metabolism . The released CO 2 from glycine decarboxylation accumulates in the BS cells, where it can be refixed by Rubisco in adjacent chloroplasts with high efficiency. C 2 photosynthesis improves carbon gain at low CO 2 concentrations, but because all CO 2 is fixed by Rubisco, the δ 13 C values of C 2 plants reflect those of C 3 plants (Von Caemmerer, 1992). Anatomically, C 2 plants have increased numbers of mitochondria and chloroplasts in BS cells, typically in a centripetal position against the BS wall facing the vascular tissue (Khoshravesh et al., 2016). This characteristic identifies candidate C 2 species, with low CO 2 compensation points of photosynthesis ( ) confirming the presence of the C 2 physiology. Our examination of the BS structure in Anulocaulis, Commicarpus and Nyctaginia, and leaf gas exchange in Nyctaginia, showed no evidence of C 2 photosynthesis as the structural characteristics and were typically C 3 . Hence, we conclude that the sister clades to the Allionia and Boerhavia/Okenia clades are most likely comprised of only C 3 plants. Gas exchange data and anatomical studies demonstrate strong C 4 features in selected species of Allionia and Boerhavia, which along with the consistently C 4 values of δ 13 C in these clades, lead us to conclude they are completely C 4 rather than C 4 -like. C 4 -like species have a fully functional C 4 cycle but retain limited C 3 photosynthesis in M tissues, and exhibit features suggesting they are newly evolved C 4 plants that have not yet optimized the C 4 pathway (Moore et al., 1989).
In the phylogeny, Anulocaulis and Nyctaginia species occur in a clade that branches sister to the Boerhavia/Okenia clade, while Cyphomeris branches sister to Allionia, supporting a hypothesis of two independent C 4 origins, one in ancestral Allionia and a second in ancestors to the Okenia/Boerhavia clade. An alternative possibility is a single C 4 origin followed by reversions to C 3 in ancestors of Cyphomeris and the Anulocaulis/Nyctaginia clade. This option is less parsimonious, requiring three changes -one acquisition of C 4 and two reversions -rather than two. It is also not favored because reversions are considered unlikely and have never been confirmed (Christin et al., 2010a;Oakley et al., 2014;Bräutigam and Gowik, 2016). In addition, the two C 4 clades exhibit different biochemical and structural subtypes. The high degree of structural, transport, and biochemical specialization associated with each subtype indicates switching subtypes would be difficult, and consistently, no instances of subtype switching have been documented. Gene sequences of C 4 pathway enzymes can also be used to assess reversal possibilities. If a reversion had occurred, the derived ortholog of a C 4 -pathway enzyme in any C 3 progeny might retain sequences from their ancestral C 4 function (Christin et al., 2010a;Khoshravesh et al., 2020). In Anulocaulis, Cyphomeris, and Nyctaginia, the C 4 -type orthologs of PEPCase, PPDK, and both decarboxylases exhibited no signatures of prior C 4 function. For example, the C 3 species in the Nyctagineae share only one positively selected amino acid substitution with the C 4 Nyctaginaceae and Zea mays in the PEPC1 sequence, a serine at position 577. C 3 species outside the clade where an ancestral C 4 transition could have occurred also share this serine, indicating it is not a relic from a C 4 -state. Based on these points, we conclude that Allionia and Boerhavia represent independent yet closely related C 4 clades that have diverged in terms of biochemical subtype. With other examples of subtype divergence in closely related species of Portulaca and possibly other clades (Salsola and Sesuvioideae; Voznesenskaya et al., 1999;Bohley et al., 2015), comparative methods of evolutionary biology can be used to address questions of convergence and divergence during C 4 evolution.
In contrast to a prior conclusion that both Boerhavia and Allionia are NADP-ME (Muhaidat et al., 2007), the biochemical assays demonstrate species in these two clades are different biochemical subtypes of C 4 photosynthesis. Consistently, Allionia and Boerhavia showed pronounced differences in BS and M ultrastructure that support their designation as NAD-ME and NADP-ME subtypes, respectively. In both B. coccinea (high NADP-ME activity, low NAD-ME activity) and A. incarnata (high NAD-ME activity, nil NADP-ME activity), chloroplasts and mitochondria occupy the inner half of the BS cells, as is typical in eudicot C 4 species (Edwards and Voznesenskaya, 2011). In A. incarnata, there is an abundance of mitochondria scattered among the elongated chloroplasts, while in B. coccinea, BS mitochondria are much less frequent and not commonly interspersed within the chloroplast cluster. In NAD-ME species, the juxtaposition of chloroplasts and mitochondria facilitates rapid movement of CO 2 released from mitochondria into adjacent chloroplasts, while the tight packing of chloroplasts and mitochondria reduces the chance of CO 2 escape. This tight packing is a major means by which CO 2 is trapped in species lacking a suberized layer around the BS wall, as is the case in eudicots (von Caemmerer and Furbank, 2003). In Portulaca species, the same patterns hold but with one important variation that is not reported in the literature, notably, mitochondria in the BS of NAD-ME P. oleracea form distinct attachment structures to adjacent chloroplasts that are not obvious in Allionia and other NAD-ME type C 4 species (for example, Anticharis, Cleome, Salsola, and Suaeda; Voznesenskaya et al., 1999Voznesenskaya et al., , 2018Khoshravesh et al., 2012). Such structures may facilitate rapid CO 2 diffusion between mitochondria and chloroplasts, enhancing refixation efficiency.
The structural characteristics observed here are consistent with patterns in other C 4 clades. NADP-ME species among C 4 grasses, sedges and eudicots (as shown in Flaveria, Euphorbia, Gomphrena, Heliotropium, Salsola, and Tribulus) share with Boerhavia and Portulaca pilosa the pattern of enlarged chloroplasts with weakly developed grana stacks, and few BS mitochondria (Carolin et al., 1978;Kim and Fisher, 1990;Ueno, 1996Ueno, , 2013Voznesenskaya et al., 1999;Yoshimura et al., 2004;Muhaidat et al., 2011;Sage T. L. et al., 2011;Lauterbach et al., 2019). NAD-ME eudicots in Amaranthus, Atriplex, Anticharis, Cleome, Gisekia, Salsola, Suaeda, and Tecticornia share with Allionia and P. oleracea the pattern of enlarged chloroplasts with well-developed grana and large numbers of interspersed mitochondria (Carolin et al., 1978;Voznesenskaya et al., 1999Voznesenskaya et al., , 2007Voznesenskaya et al., , 2008Khoshravesh et al., 2012;Bissinger et al., 2014;Oakley et al., 2014). In grasses, Dengler and Nelson (1999) describe NAD-ME species as having 5-to 20-fold more mitochondria than NADP-ME species, and NAD-ME mitochondria are often larger with greater internal membrane surface area in the BS. The differences between C 4 subtypes hold even when different tissue layers are co-opted as the site of CO 2 concentration, whether it is the mestome sheath as in grasses, an inner chlorenchyma layer as occurs in Salsola and other succulent chenopods, or single-cell type of C 4 photosynthesis as shown in the NAD-ME chenopods Bienertia cycloptera and Suaeda aralocaspica (Ueno, 1996(Ueno, , 2013Voznesenskaya et al., 1999;Pyankov et al., 2000;Edwards and Voznesenskaya, 2011;Khoshravesh et al., 2020). In the middle of M cells of Bienertia cycloptera, for example, there are many mitochondria within a ball of Rubisco-containing chloroplasts where CO 2 is concentrated (Voznesenskaya et al., 2002). The central chloroplasts in B. cycloptera have well-stacked grana, while peripheral chloroplast that are functionally equivalent to M chloroplasts of typical C 4 species do not.
Lateral transfer of genes encoding C 4 pathway elements has been observed in closely related grass clades within Alloteropsis and Neurachne (Christin et al., 2012a,b;Dunning et al., 2017;Khoshravesh et al., 2020). Lateral gene transfer is relevant to discussions of evolutionary convergence because the acquisition of previously evolved C 4 genes by a non-C 4 relative could facilitate C 4 evolution in the receptive clade, possibly creating a false impression of convergence. When we examined the phylogenies of C 4 pathway genes in the Nyctagineae, we found no evidence of lateral transfer. The gene trees consistently showed the two C 4 lineages do not share a common gene copy of a C 4 -adapted isoforms of PEPC, PPDK, or either malic enzyme, because the respective orthologs from related C 3 species branch between the C 4 lineages with high support (Figure 8 and Supplementary Figures 2-5). This evidence is consistent with a hypothesis that the two C 4 clades in the Nyctaginaceae arose de novo from a completely C 3 state, rather than via assisted origins involving gene introgression from a pre-existing C 4 clade (Christin et al., 2012a,b;Dunning et al., 2017).
With sequencing data becoming widely available, it may be possible to identify C 4 -subtypes using transcriptomics (Lauterbach et al., 2019). Such an approach, however, could produce errors if not coupled with enzyme activity assays. Here, the transcriptomes consistently showed high expression of the copies we designate as NADP-ME1 in Boerhavia and NAD-ME3 in Allionia. Allionia also exhibited significant expression of NADP-ME1 transcripts which approach expression of its NAD-ME3 gene. This data by itself would suggest co-function of NADP-ME and NAD-ME in Allionia; however, no NADP-ME activity was detected in Allionia, leading us to conclude its NADP-ME1 transcripts are not translated into functional enzyme. The possibility of error in this conclusion is unlikely because the BS ultrastructure in Allionia is consistent with patterns generally seen in NAD-ME subtypes. The high transcript level of both NAD-ME and NADP-ME may be a relic from ancestral Allionia species that may have utilized both decarboxylases during an early phase of C 4 photosynthesis. A chance selection event may have subsequently enhanced NAD-ME activity, after which leaf structure and energetics were optimized for the NAD-ME subtype.
One of the notable features of C 4 photosynthesis is that it is concentrated in relatively few orders of higher plants. The Poideae (grasses, sedges) and Caryophyllales, for example, account for over 90% of all C 4 species and about 50 of the estimated 65 independent origins of C 4 photosynthesis (Sage, 2016). The ability of these clades to repeatedly evolve the C 4 pathway is hypothesized to reflect the presence of enabling traits that facilitate C 4 evolution (Sage, 2004;Christin et al., 2013b). Wider BS cells in C 3 species, for example, could be a structural enabling trait (Muhaidat et al., 2011;Christin et al., 2013b;Griffiths et al., 2013), while greater numbers of organelles in BS tissues may reflect an activation of photosynthetic physiology which enables establishment of photorespiratory glycine shuttles Schulze et al., 2013). Photosynthetic activation of the BS can facilitate the rise of photorespiratory glycine shuttles because mitochondria and some chloroplasts can reposition to the inner BS region, forcing glycine formed in centrifugal chloroplasts to migrate to the inner BS for processing by glycine decarboxylase . To evaluate whether enabling traits are present in close C 3 relatives of the C 4 Nyctaginaceae clades, we examined their BS structure and organelle characteristics. In the C 3 species of Anulocaulus, Cyphomeris, and Nyctaginia, BS cells had similar cross-sectional areas as the C 4 BS cells, and exhibited numerous chloroplasts along the outer wall, indicating photosynthetic activation. We thus conclude the close C 3 relatives of Allionia and Boerhavia exhibit numerous enabling traits for C 4 evolution, indicating they were present in ancestral C 3 taxa from which the C 4 clades arose.

C 4 Selection Environments
The results indicate C 4 photosynthesis evolved in Nyctaginaceae species from hot and dry climates of the New World, most likely in the arid-to-semi-arid regions of Southwestern North America where the center of diversity occurs for the North American Xerophytic clade of the Nyctaginaceae (Douglas and Manos, 2007). Christin et al. (2011) estimate that the C 4 pathway appeared in the Boerhavia/Okenia clade about 4.7 million years ago, and 6.1 million years ago in Allionia, based on molecular phylogenies. This was during a period of aridification in Southwestern North American and reduced atmospheric CO 2 (Sage et al., 2018). The environmental habitat of Allionia versus Boerhavia did not differ much from each other or their C 3 ancestors, indicating the C 3 ancestors were adapted to the same kind of hot, dry environments where the C 4 species are common. This supports a hypothesis that C 4 arose in these sorts of environments, consistent with a model that high levels of photorespiration brought on by heat, drought and/or salinity promoted C 4 evolution in C 3 plants pre-adapted to such harsh environments (Sage et al., 2018).
Surveys of the floristic distribution of NADP-ME versus NAD-ME grasses have identified a trend where NADP-ME species predominate in grass floras of wetter climates while NAD-ME species predominate in drier climates (Vogel et al., 1978;Hattersley, 1992;Ghannoum et al., 2011). The reasons for this trend have not been clarified, although it has been hypothesized that the pattern reflects phylogenetic ancestry where NAD-ME species are more likely to arise in clades from drier areas while NADP-ME species evolve in wetter climate zones (Taub, 2000).
Explanations for habitat differences between C 4 subtypes can be evaluated using closely related NADP-ME and NAD-ME clades. In A. incarnata and B. coccinea, we observed similar photosynthetic capacities and intrinsic WUE (A/gs), indicating no obvious differences in photosynthetic parameters that might explain subtype segregation along a moisture gradient. Allionia did exhibit a steeper initial slope of the A/Ci response than Boerhavia, indicating a stronger C 4 metabolic cycle in the NAD-ME plant. Consistently, the NAD-ME clade in Allionia had greater PEPCase activity per unit chlorophyll than Boerhavia (NADP-ME clade). If a stronger metabolic cycle is an inherent feature of NAD-ME relative to NADP-ME species, there could be a photosynthetic advantage under low intercellular CO 2 concentrations that commonly occur where drought and low humidity reduce stomatal aperture. A comparative analysis using closely related species of differing subtype in the Nyctagineae, Portulaca and other lineages could evaluate this possibility.
Convergence Versus Non-convergence in the C 4 Functional Type As a highly convergent, complex trait, C 4 photosynthesis represents an excellent system to dissect evolutionary convergence, particularly the degree of convergence versus divergence in the mix of traits that give rise to a composite phenotype (Heyduk et al., 2019). Numerous studies have previously examined convergent properties of C 4 components, such as convergence in genes for C 4 enzymes (Christin et al., 2007(Christin et al., , 2009(Christin et al., , 2010bEmms et al., 2016); regulatory components John et al., 2014;Reyna-Llorens and Hibberd, 2017), structural features (Yoshimura et al., 2004;Kadereit et al., 2014;Stata et al., 2014;Danila et al., 2018); and biochemical sub-types (Gutierrez et al., 1974;Hatch et al., 1975;Ludwig, 2016b). However, the assembly of multiple datasets into a hierarchical framework has not been attempted in a C 4 context. In Table 6, we present a preliminary hierarchy of convergent and divergent traits observed in comparative studies of the many C 4 clades, thereby enabling deeper assessments of when and why convergence is favored, versus when divergent solutions to CO 2 concentration can occur. Convergence within C 4 photosynthesis is apparent in two key steps: the carboxylation of PEP by PEPCase, and the conversion of CO 2 to bicarbonate by CA. The ubiquitous use of PEPCase is probably due to three factors. First, it is readily available for co-option, because it is widely used in C 3 plants for processes such as pH control, metabolite generation for the Krebs cycle and nitrogen assimilation, and shuttling reducing power between cellular compartments (Aubry et al., 2011;Mallmann et al., 2014). Second, there are no obvious alternatives in vascular plants that may be co-opted to provide the carboxylation function of a C 4 cycle. While numerous carboxylases are active in prokaryotes, few occur in higher plants (Erb, 2011). One obvious candidate to recruit into a C 4 cycle is pyruvate carboxylase, which produces OAA from pyruvate and bicarbonate in bacteria, animals and algae, but it is not known to occur in plants (Tsuji et al., 2012). Third, it is worth considering the chain of events giving rise to C 4 photosynthesis. The evolutionary rise of C 2 photosynthesis in C 3 -C 4 intermediate  species establishes a Kranz-like leaf anatomy and M to BS transport systems that facilitate the subsequent upregulation of PEPCase and a C 4 metabolic cycle . A leading hypothesis to explain this upregulation is PEPCase provides carbon skeletons to support re-assimilation of photorespiratory ammonia (Mallmann et al., 2014). If so, then the ubiquitous use of PEPCase may arise out of its pre-exiting role supporting N metabolism in C 3 leaves, which make it the ready candidate for optimizing the performance of C 2 photosynthesis, which in turn positions it to be further upregulated as the benefits of the nascent C 4 cycle improve fitness (Heckmann et al., 2013). The convergence upon enhanced CA activity then becomes necessary to enable PEPCase to have enough bicarbonate to maintain rapid activity.
As the most heavily studied enzyme in C 4 photosynthesis, the sequence of the PEPCase gene serves as a model of how convergence versus divergence can operate at the gene sequence level. Convergence is apparent in that PEPC1 was separately co-opted for C 4 function in the Allionia and Okenia/Boerhavia lineages, while divergence is apparent in the sequence of PEPC1 in these clades, as indicated by the distinct branches on the gene trees and variation at specific sites in the amino acid sequences. Is this sequence divergence random, or could it reflect divergent solutions for optimizing PEPcase function in C 4 leaves? In C 4 plants, the affinity of PEPCase for its substrate bicarbonate should be increased to compensate for subsaturating bicarbonate concentrations in M tissues Di Mario and Cousins, 2019). Also, C 4 orthologs of PEPCase require reduced sensitivity to malate and aspartate, because these allosteric inhibitors of PEPCase must accumulate to high concentration in M cells to drive rapid diffusion into the BS (Jacobs et al., 2008;Gowik and Westhoff, 2011). Numerous studies have documented similar changes in the amino acid sequence of PEPCase in a pattern that is associated with changes in sensitivity to malate, aspartate and PEP, supporting hypotheses of convergent optimization of PEPCase for the C 4 leaf (Engelmann et al., 2003;Gowik et al., 2006;Christin et al., 2007). Convergence is indicated by a widely noted substitution of a serine for alanine at a homologous site in the PEPC sequence, position 780 in maize, which is proposed to alter PEP and possibly bicarbonate sensitivity (Christin et al., 2007;Gowik and Westhoff, 2011;Di Mario and Cousins, 2019). Christin et al. (2007) also observed common sequence substitutions in numerous distinct clades of C 4 grasses, some of which occur in gene regions controlling sensitivity to malate. However, transcriptome surveys of numerous chenopods indicate the sequence convergence observed in grasses is less common in eudicots, suggesting divergent solutions to meeting the kinetic and regulatory requirements of a C 4 PEPCase (Rosnow et al., 2014(Rosnow et al., , 2015. Our results with Boerhavia and Allionia support a hypothesis of flexibility in how PEPCase is modified for the C 4 leaf. Both clades lack the serine at the position near 780, instead sharing an alanine with their C 3 sisters. They also exhibit differences in half of the sequence positions where Christin et al. (2007) noted convergence in grasses, but do exhibit some of the same substitutions as C 4 grasses and certain C 4 eudicots (at positions 572, 577, 761, and 807; Table 5). These patterns raise a number of possibilities. On the one hand, these convergent substitutions may sufficiently alter kinetics to allow the C 4 PEPCase to efficiently function. Alternatively, other sequence shifts may be able to accomplish the necessary change in sensitivities. A third possibility is the C 4 Nyctaginaceae species may simply have a non-optimized PEPCase for the C 4 function, in which case other mechanisms may compensate for inefficiencies. One compensation mechanism may be environmental, for example, high temperature in the natural habitat may override a need for convergence in PEPCase sequences by stimulating catalytic capacity. Where convergence is beneficial, but not accomplished, the consequences and compensation mechanisms become as interesting as the convergence itself.
We next consider the evolutionary transition at the opposite end of the C 4 pathway, where there is clear divergence in decarboxylase function. The divergence arises because there are enzymatic alternatives to meeting the decarboxyation imperative, with NADP-ME, NAD-ME, and PCK having significant roles in numerous metabolic pathways in C 3 plants (Aubry et al., 2011). Elevated activities of NADP-ME, NAD-ME, and PCK have been detected in vascular tissues and BS cells of C 3 species, where they metabolize organic acids used in long-distance transport from the roots, and may have a role in pH homeostasis, coordinating carbon and nitrogen metabolism, and providing metabolites for numerous biosynthetic pathways (Hibberd and Quick, 2002;Aubry et al., 2011). The mechanism by which one decarboxylase is selected over another is not known, but since all three are active in BS tissues, it is possible that each may function in a nascent C 4 pathway, perhaps to support recovery of photorespired nitrogen in the BS (Mallmann et al., 2014). It may be that chance determines which of the three decarboxylases is upregulated as the C 4 pathway strengthens, with convergence on the distinctive subtype characteristics occurring afterward, as the C 4 pathway is optimized for the selected decarboxylase. Alternatively, preexisting traits in the C 3 or C 2 ancestors may determine which decarboxylase is selected and also influence the characteristics of each C 4 subtype.
To close, we note that comparative studies of evolutionary convergence and divergence in C 4 plants show that convergence is greatest where constraints are high and biochemical options are limited, while divergence occurs where the constraints are low and multiple alternative solutions can be selected during the evolutionary process. What remains unclear is the influence that chance, ancestry, and environment have over divergent possibilities. By demonstrating multiple sets of closely related clades differing in C 4 subtype, we have identified replicated examples with which to address these issues using comparative methods of evolutionary biology (Harvey and Pagel, 1991). C 4 photosynthesis can thus become a powerful tool to unravel the intricacies of convergence in complex trait evolution.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
RK led the imaging efforts and their quantification and ecological data analysis. MS conducted the phylotranscriptomic and gene sequence analyses. SA performed gas exchange and enzyme assays. TS and RS directed the labs where the work was conducted and supplied funding. RS wrote the manuscript with input from each co-author. All authors contributed to the article and approved the submitted version.

DEDICATION
This paper is dedicated to the memory of Dr. Udo Gowik (1971Gowik ( -2020, a fun-loving friend, helpful to all, a pioneer in C 4 plant biology and a fearsome political debater.