Insights into chloroplast genome variation across Opuntioideae (Cactaceae)

Chloroplast genomes (plastomes) are frequently treated as highly conserved among land plants. However, many lineages of vascular plants have experienced extensive structural rearrangements, including inversions and modifications to the size and content of genes. Cacti are one of these lineages, containing the smallest plastome known for an obligately photosynthetic angiosperm, including the loss of one copy of the inverted repeat (∼25 kb) and the ndh gene suite, but only a few cacti from the subfamily Cactoideae have been sufficiently characterized. Here, we investigated the variation of plastome sequences across the second major lineage of the Cactaceae, the subfamily Opuntioideae, to address 1) how variable the content and arrangement of chloroplast genome sequences are across the subfamily, and 2) how phylogenetically informative the plastome sequences are for resolving major relationships among the clades of Opuntioideae. Our de novo assembly of the Opuntia quimilo plastome recovered an organelle of 150,347 bp in length with both copies of the inverted repeat and the entire ndh gene suite. An expansion of the large single copy unit and a reduction of the small single copy unit were observed, including translocations and inversions of genes as well as the putative pseudogenization of numerous loci. Comparative analyses among all clades within Opuntioideae suggested that plastome structure and content vary across taxa of this subfamily, with putative independent losses of the ndh gene suite and pseudogenization of genes across disparate lineages, further demonstrating the dynamic nature of plastomes in Cactaceae. Our plastome dataset was robust in determining relationships among major clades and subclades within Opuntioideae, resolving three tribes with high support: Cylindropuntieae, Tephrocacteae and Opuntieae.
A plastome-wide survey for highly informative phylogenetic markers revealed previously unused regions for future use in Sanger-based studies, presenting a valuable dataset with primers designed for continued evolutionary studies across Cactaceae. These results bring new insights into the evolution of plastomes in cacti, suggesting that further analyses should be carried out to address how ecological drivers, physiological constraints and morphological traits of cacti may be related with the common rearrangements in plastomes that have been reported across the family.


Introduction
Cacti comprise one of the most charismatic plant clades of the world, exhibiting an enormous variety of growth forms, morphology and intriguing niche occupancy across the Americas (Britton & Rose 1919;Anderson 2001;Hunt et al. 2006;Hernández-Hernández et al. 2011). Members of the family are conspicuous elements of the arid and semiarid succulent biome of the New World, also inhabiting subtropical and tropical forests (Taylor and Zappi 2004;Hunt et al. 2006). This distributional pattern is accompanied by high species diversity with heterogeneous diversification rates across evolutionary lineages (Arakaki et al. 2011;Hernández-Hernández et al. 2014).
Some features that are uncommon in most angiosperms, such as succulent tissues, Crassulacean acid metabolism (CAM), betalain pigments and the reduction or absence of leaves, are typical characters of cacti that have long captured the attention of plant biologists and have been suggested as adaptations that allow survival in harsh environments (Mooney et al. 1977;Mauseth 1999;Landrum 2002;Nobel 2002). Besides major morphological and physiological adaptations, genetic and genomic-level changes derived from a host of selective pressures are also expected to be present. For example, whole genome duplication events have long been suggested to be associated with adaptations to extreme environments (e.g., Stebbins 1971; Soltis & Soltis 2000;Brochmann et al. 2004), and significant gene family expansion in genes related to stress adaptation, as well as more restricted events of gene duplication, were reported in lineages of Caryophyllales adapted to severe environments, including cacti.
Although the gene content, structural organization and size of chloroplast genomes (plastomes) of land plants are often considered highly conserved (Raubeson & Jansen 2005;Chumley et al. 2006;Wicke et al. 2011), deviations have been increasingly reported in some clades and have challenged the generality of this phenomenon (Daniell et  were lost in some parasites, carnivorous plants and xerophytes (Braukmann et al., 2009;Wicke et al., 2011;Iles et al., 2013;Peredo et al., 2013;Ruhlman et al., 2015;Sanderson et al. 2015).
Members of Cactaceae have also experienced different alterations in their chloroplast genome. A conserved inversion of ~6 kb in the large single copy unit comprising the trnM-rbcL genes has long been suggested (Wallace 1995) and was more recently confirmed (Sanderson et al. 2015;Solorzano et al. 2019;Majure et al. 2019). In addition, the first cactus plastome assembled, from the saguaro cactus (Carnegiea gigantea (Engelm.) Britton & Rose), exhibited an exceptional reduction in size (113 kb) and gene content, including the loss of one of the two inverted repeat regions and nine of the 11 ndh genes (Sanderson et al. 2015 (Taylor et al. 2002;Stuppy 2002 Hunt et al. 2006). Nonetheless, molecular phylogenetic studies showed that Opuntia s.l. was paraphyletic, which led to the recognition of numerous smaller genera corresponding to well-supported clades (Taylor and Stuppy 2002). Likewise, the tribal classification of Opuntioideae has been controversial, with up to six tribes proposed under different approaches (Doweld, 1999;Wallace and Dickie, 2002;Hunt 2011). Although previous studies have improved our understanding of the relationships among the major clades in Opuntioideae (Griffith & Porter 2009;Ritz et al. 2012;Majure et al. 2019), support for the relationships among those clades still needs to be strengthened.
Apart from the external and internal transcribed spacers (ETS and ITS) of the nuclear ribosomal repeats (NRR) and the ppc marker, most molecular phylogenies of cacti have historically been based on a few plastid markers (trnL-trnF, rpl16, trnK and matK) (Nyffeler 2002;Edwards et al. 2005;Korotova et al. 2010;Demaio et al. 2011;Arakaki et al. 2011;Bárcenas et al. 2011;Hernández-Hernández et al. 2011;Hernández-Hernández et al. 2014;Ritz et al. 2012;Bárcenas 2016;Luna-Vargas et al. 2018). While these markers have been shown to resolve some clades, some relationships still lack support (Nyffeler 2002;Griffiths & Porter 2009;Hernández-Hernández et al. 2011;Bárcenas et al. 2011). In this case, next-generation sequencing (NGS) could be a useful tool, since it has transformed the study of non-model plant taxa in phylogenetic inference, with high-throughput data allowing deep resolution across major plant clades (Xi et al. 2012;Ma et al. 2014;Gardner et al. 2016;Zong et al. 2019). NGS data are also proving to be extremely useful for discovering informative regions across genomes for marker development (Wu et al. 2010;Dong et al. 2012;Ripma et al. 2014;Reginato et al. 2016), as well as for investigating chloroplast genome evolution (Dong et al. 2013;Mower & Vickrey 2018;Yao et al. 2019). Nevertheless, this approach is still in its infancy across Cactaceae (Majure et al. 2019) and remains a path to be explored.
Here, we investigate the use of next-generation sequencing across Opuntioideae to address two major questions: (1) how homogeneous is the content and arrangement of chloroplast genomes across the subfamily? and (2) how phylogenetically informative are chloroplast genome sequences for resolving major relationships among the clades of Opuntioideae? We used a combination of de novo and reference-guided assemblies to process genome skimming data: (i) assembling and characterizing the first chloroplast genome of an Opuntia species, O. quimilo, (ii) investigating overall patterns of reference-guided assemblies and comparative chloroplast genome sequence analyses across the subfamily, (iii) inferring phylogenetic relationships with assembled sequences and (iv) surveying plastomes for highly informative phylogenetic markers for future use in Sanger-based studies.

Taxon sampling, DNA extraction and sequencing
All currently recognized genera in Opuntioideae (sensu Hunt et al. 2006, plus Majure et al. 2019 for Grusonia s.l.) were sampled with one accession per genus, resulting in a dataset of 17 taxa which were sequenced via genome-skimming (Straub et al., 2012;Majure et al. 2019).
Three additional samples were selected as outgroup taxa (Cactoideae: Parodia magnifica and Coryphantha macromeris; and Pereskia: Pereskia sacharosa). Plant materials were from wild collections or from the Desert Botanical Garden's living collection (see Table S1 for details).
DNA was extracted from silica-dried or fresh epidermal tissues using a standard CTAB incubation (Doyle & Doyle 1987) followed by chloroform/isoamyl alcohol precipitation and silica column-based purification steps, as described in Neubig et al. (2014).

De novo assembly and data processing for chloroplast genome sequences
Raw reads were imported into Geneious 11.  (Laslett and Canback, 2004); tRNA genes were annotated with tRNAscan-SE v2.0 (Lowe and Eddy, 1997), and BLAST searches were used to annotate ribosomal RNA (rRNA), tRNA and DNA genes conserved in embryophyte plastomes (Wommack et al., 2008). All annotations were cross-checked with the "Annotate from" feature in Geneious, transferring annotations with a 50% or greater similarity from the P. oleracea plastome, and start/stop codons were manually adjusted where necessary with the "Open Read Frame (ORF)" feature from Geneious. Genes whose structure was disrupted by many internal stop codons or ORFs, and which therefore did not form their full coding sequence (CDS), were annotated as putative pseudogenes. The graphical representation of the circular annotated O. quimilo plastome was created in OGDRAW (Lohse et al., 2013). To visualize changes in gene order and content, we compared the O. quimilo assembly with the canonical gene order of the P. oleracea plastome via multiple whole genome alignment using MAUVE (default options, assuming colinearity; Darling et al., 2004). Boundaries between the IRa, IRb, LSC, SSC and putative inversions were visually checked in Geneious using an in silico approach adapted from Oliver et al. 2010.

Comparative chloroplast genome sequence analyses
The newly annotated plastome of Opuntia quimilo, with one of the inverted repeats (IRa) manually stripped to avoid data duplication, was then used as the reference for a reference-guided assembly of the trimmed reads from all other taxa using the Geneious mapper with medium-low sensitivity, iterating up to 5 times (adapted from Ripma et al. 2014). For each mapped assembly, a majority threshold consensus sequence was generated, and annotations were transferred from the O. quimilo reference and manually adjusted. To identify highly variable regions across the subfamily, the 17 assembled Opuntioideae chloroplast genome sequences were compared using mVista (Frazer et al., 2004)

Phylogenetic analyses and informative regions
The assembled chloroplast genome sequences were aligned using MAFFT v. 7 with an automatic strategy search for algorithm selection (Katoh & Standley 2013), using 200PAM scoring matrix and an open gap penalty of 1.53 (offset value 0.123). The alignment was manually examined for misaligned areas following a similarity criterion (Simmons, 2004).
Sequence portions that contained gaps and/or ambiguities across more than 80% of the taxa were stripped using the "Mask Alignments" feature in Geneious. Phylogenetic inference was performed using Maximum Likelihood implemented in RAxML 8.2.4 (Stamatakis, 2014) in the CIPRES Portal (Miller et al. 2010) with GTR+G model employed for the entire sequence. Support values were estimated implementing 1,000 bootstrap replicates.
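The column-masking step described above can be illustrated with a minimal Python sketch. It is not the Geneious "Mask Alignments" implementation; the function name, the dictionary input format and the rule that every character other than A, C, G or T counts as a gap or ambiguity are assumptions made for illustration.

```python
def mask_gappy_columns(alignment, max_missing=0.8):
    """Drop alignment columns in which more than `max_missing` of the taxa
    carry a gap or an ambiguity (here: any character other than A, C, G, T).

    `alignment` maps taxon name -> aligned sequence (equal lengths).
    Sequences are returned in upper case.
    """
    names = list(alignment)
    seqs = [alignment[name].upper() for name in names]
    length = len(seqs[0])
    if any(len(seq) != length for seq in seqs):
        raise ValueError("all sequences must have the same aligned length")

    kept_columns = []
    for i in range(length):
        missing = sum(1 for seq in seqs if seq[i] not in "ACGT")
        # Keep the column unless the missing fraction exceeds the threshold.
        if missing / len(seqs) <= max_missing:
            kept_columns.append(i)

    return {
        name: "".join(seq[i] for i in kept_columns)
        for name, seq in zip(names, seqs)
    }


# Hypothetical toy alignment of five taxa and three columns.
toy = {"t1": "A-C", "t2": "A--", "t3": "A--", "t4": "A--", "t5": "A-N"}
masked = mask_gappy_columns(toy)
```

In the toy alignment the second column is missing in all five taxa and is removed, whereas the third column is missing in exactly 80% of the taxa and is therefore retained.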
In order to identify and rank highly phylogenetically informative regions in the Opuntioideae plastomes, we split the full plastome alignment into protein coding sequences (cpCDS; pseudogenes were included here), non-coding sequences (cpNCDS) and intergenic spacers (cpIGS) using the annotated O. quimilo plastome. Each individual marker (cpCDS, cpNCDS, cpIGS) was extracted from the above-mentioned alignment, and a Maximum Likelihood tree was inferred with RAxML (GTR+G model, 100 bootstrap replicates). For each marker, we report the number of variable sites, number of parsimony informative sites (PIS), mean sequence distance (under the K80 model), alignment length, mean sequence length, mean bootstrap support and distance to the full chloroplast genome sequence tree (RF distance; Robinson & Foulds, 1981). The metrics were retrieved using functions of the R packages ape and phangorn (Paradis, Claude & Strimmer, 2004;Schliep, 2011). Markers were ranked by phylogenetic information using a weighted mean of relative values of the following metrics: number of variable sites (weight = 1), mean bootstrap (weight = 2) and distance to the full plastid tree (weight = 3). We designed primer pairs for the top 5 markers identified in the previous step with a suitable size for PCR amplification (~800-900 bp). Primers flanking the target regions were designed with Primer3, using the default settings (Rozen & Skaletsky, 2000). All metrics reported, as well as primer design, were considered only for the ingroup (the 17 Opuntioideae chloroplast genome sequences).  Table S1.
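As an illustration of the weighted ranking described above, the following Python sketch scores markers from three metrics using the stated weights (1, 2, 3). The text does not specify how the "relative values" were computed; the min-max rescaling used here, the inversion of the RF distance so that smaller distances score higher, and all marker names and values are assumptions for illustration only.

```python
def rank_markers(metrics, weights=(1, 2, 3)):
    """Rank markers by a weighted mean of rescaled metrics.

    `metrics` maps marker name -> (variable_sites, mean_bootstrap, rf_distance).
    Assumption: each metric is min-max rescaled to [0, 1] across markers, and
    the RF distance is inverted so that a smaller distance gives a higher score.
    """
    def rescale(values, invert=False):
        lo, hi = min(values), max(values)
        if hi == lo:
            # No variation across markers: every marker scores the same.
            return [1.0] * len(values)
        scaled = [(v - lo) / (hi - lo) for v in values]
        return [1.0 - s for s in scaled] if invert else scaled

    names = list(metrics)
    variable = rescale([metrics[n][0] for n in names])
    bootstrap = rescale([metrics[n][1] for n in names])
    rf = rescale([metrics[n][2] for n in names], invert=True)

    w_var, w_bs, w_rf = weights
    total = w_var + w_bs + w_rf
    scores = {
        n: (w_var * v + w_bs * b + w_rf * r) / total
        for n, v, b, r in zip(names, variable, bootstrap, rf)
    }
    # Highest score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical markers and values, not taken from the study.
example = {
    "markerA": (100, 80, 6),
    "markerB": (50, 40, 28),
    "markerC": (10, 60, 10),
}
ranking = rank_markers(example)
```

Because the RF distance carries the largest weight, a marker with few variable sites but a topology close to the full plastome tree (markerC in this toy input) can outrank a more variable but more discordant marker (markerB).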

Opuntia quimilo plastome
The complete chloroplast genome of Opuntia quimilo was sequenced, assembled, annotated and deposited in GenBank with the accession number MN114084.  (Table 1).
The de novo assembly with the Geneious assembler produced 1,000 contigs; of these, 988 were longer than 1,000 bp, ranging from a minimum length of 1,026 bp to a maximum of 283,150 bp.
A MegaBLAST search found one consensus plastid contig of 128,909 bp that includes the full chloroplast sequence with the two putative inverted repeats assembled as a single IR unit (~22 kb).
The GetOrganelle and NOVOPlasty pipelines both yielded one plastid contig of 150,374 bp with the same gene content, order and structure as the plastid contig of the Geneious assembler, except that the two inverted repeats were separated by the LSC and SSC in the former, whereas in the Geneious contig they were merged as one IR.  (Fig. 2, orange genes).
When compared to the canonical angiosperm chloroplast genome of Portulaca oleracea, two block translocations in the LSC are present in the O. quimilo plastome: the first (Fig. 2, region II) is a simple colinear translocation of nine genes, while the second is a large block inversion and translocation comprising 50 genes within the trnG UCC -psbE region (Fig. 2, region III). Inside that block (region III), the putative synapomorphic inversion of cacti encompassing the trnM-rbcL genes is confirmed for Cactaceae, but in the O. quimilo plastome this inversion also encompasses the trnV UAC gene (Fig. 2, green bars). The remaining gene order is mainly colinear (Fig. 2, regions I, IV, V, VI, VII), except for the rearrangement comprising the SSC genes that were transferred to the IR regions, including a double inversion in the ycf1-rpl32 region, placing the ycf1 gene adjacent to rpl32 (Fig. 2, orange genes).

Reference-guided assemblies and comparative chloroplast sequence analyses
The reference-guided assemblies of the remaining Opuntioideae and outgroup taxa to the Opuntia quimilo plastome (one inverted repeat stripped) mapped an average of 616,615 reads with a mean genome depth of 721x (  (Table S3).
Pairwise comparison of the Opuntioideae chloroplast genome sequences using mVISTA with O. quimilo as a reference revealed both strikingly conserved and divergent regions across the chloroplast genome sequences (Figure 3). Overall, the alignment uncovered sequence divergence across assemblies, suggesting that chloroplast genome sequences are not uniformly conserved. Divergences were observed in both noncoding and coding regions.

Phylogenetic analyses and informative regions
The full chloroplast genome sequences resulted in an alignment of 118,930 bp with 86,484 identical sites (72.7%), a pairwise identity of 94.5% and 8,694 distinct alignment patterns. There are 8,922 parsimony informative sites (PIS) and 11,509 sites with gaps. Maximum Likelihood analyses resolved a well-supported Opuntioideae (bs = 100), with three major subclades currently circumscribed as tribes, i.e., Opuntieae, Cylindropuntieae and Tephrocacteae (Fig. 5).
All nodes had maximum bootstrap support (bs = 100), except for two nodes, which still had support values higher than 90 (Fig. 5).
The summary statistics for all markers (cpCDS, cpNCDS, cpIGS) are presented in Table S4.
A list of the top 10 markers ranked by phylogenetic information, considering topological distance to the plastome tree, mean bootstrap support and number of parsimony informative sites, is given in Table 3. All single-marker phylogenies showed some disagreement with the plastome tree (RF tree distance ranging from 6 to 28). Bootstrap support ranged from 0 to 89 (mean = 37), and the number of PIS from 0 to 619 (mean = 25). Primer pair sequences for PCR amplification are provided for the top 5 markers with suitable Sanger sequencing size (max ~900 bp) in Table 4.
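For readers less familiar with the site statistics reported in this section, the sketch below counts variable and parsimony informative sites in an alignment. It uses the standard definition of a parsimony informative site (at least two character states, each present in at least two sequences) and is written in Python rather than with the R packages used in the study; ignoring gaps and ambiguity codes is an assumption of this sketch.

```python
from collections import Counter


def count_variable_and_pis(sequences):
    """Return (variable_sites, parsimony_informative_sites) for aligned sequences.

    Only unambiguous bases (A, C, G, T) are considered; gaps and ambiguity
    codes are ignored, which is an assumption of this sketch.
    """
    variable = 0
    informative = 0
    for column in zip(*sequences):
        counts = Counter(
            base.upper() for base in column if base.upper() in ("A", "C", "G", "T")
        )
        if len(counts) >= 2:
            variable += 1
            # Informative: at least two states, each seen in two or more sequences.
            if sum(1 for n in counts.values() if n >= 2) >= 2:
                informative += 1
    return variable, informative


# Hypothetical toy alignment of four sequences and four columns.
toy_alignment = ["AAGT", "AAGA", "ACCA", "ACC-"]
toy_counts = count_variable_and_pis(toy_alignment)
```

In the toy alignment the first column is invariant, the second and third columns are both variable and parsimony informative, and the fourth column is variable but not informative because one of its two states occurs only once.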

Insights from chloroplast genome assemblies in Opuntioideae and Cactaceae
The first chloroplast genome of a species of Opuntia is here reported. Although the bulk of The major plastid regions marked by pseudogenization in the Opuntia quimilo plastome (ycf1, ycf2 and accD) are visually highlighted as non-conserved regions both in the reference-guided maps (Fig. 5, green stars) and in the mVista alignment across Opuntioideae (Fig. 3). These regions also stand out as hypervariable or moderately variable in terms of nucleotide diversity values (Fig. 4). All genes here reported as pseudogenes in the O. quimilo plastome (accD, rpl16, rps16, ycf1 and ycf2) have also been reported as pseudogenes in the Mammillaria plastomes (Solórzano et al. 2019), while accD was described as a pseudogene in Carnegiea gigantea (Sanderson et al. 2015). Pseudogenization of these genes has been repeatedly reported across different angiosperm lineages, such as Malpighiales, Campanulales, Ericales, Poales, Solanales, Geraniales, Santalales and Myrtales (Harris et al. 2013;Haberle et al. 2008;Fajardo et al. 2013;Weng et al. 2013;Bedoya et al. 2019;Cui et al. 2019;Machado et al. 2017). Even though these genes have been identified as having essential functions beyond photosynthesis and are retained in the plastome of most embryophytes (Drescher et al., 2000;Kuroda and Maliga, 2003;Kode et al., 2005;Kikuchi et al., 2013;Parker et al., 2014;Dong et al., 2015), there are several other plants in which these genes are missing from the chloroplast genome (Kim, 2004;Magee et al., 2010;Lei et al., 2016;Graham et al., 2017). The pseudogenization or loss of accD, rpl22 and several genes of the ndh suite from the plastids has been reported to be a consequence of their transfer to the nuclear genome (Jansen

Phylogenetic relationships of Opuntioideae tribes
The plastome phylogeny of Opuntioideae strongly resolves three major clades that are currently circumscribed as tribes Opuntieae, Tephrocacteae and Cylindropuntieae ( Fig. 5 and Schum. and Micropuntia Daston), which formed a well-supported subclade, but they also contain two genera that are found in tropical dry forest of Mexico/Northern Central America (Pereskiopsis Britton & Rose) and South America (Quiabentia Britton & Rose). Tribe Pereskiopsideae (Doweld 1999), previously described to accommodate only the leafy Pereskiopsis, is nested within Cylindropuntieae and is thus redundant. Deeper relationships within Cylindropuntieae were recently untangled using a phylogenomic approach and dense sampling, revealing biogeographic patterns as well as character evolution (Majure et al. 2019).
Tephrocacteae is a South American clade adapted to diverse climatic conditions over a wide area of the southern Andes and adjacent lowlands (Ritz et al. 2012;Guerrero et al. 2018;Las Peñas et al. 2019). The tribe includes morphologically diverse species, from geophytes and cushion-plants to dwarf shrubs, shrubs or small trees (Anderson 2001), and geophytes and cushion-forming species probably evolved several times from shrub-like precursors (Ritz et al. 2012). Tribes Austrocylindropuntieae and Pterocacteae (Wallace & Dickie 2002), described to circumscribe Austrocylindropuntia + Cumulopuntia and Pterocactus, respectively, are both nested in the Tephrocacteae, as amplified by Hunt (2011), and their use is mostly redundant.

Phylogenetically informative regions
Our plastome survey for phylogenetically informative markers revealed a list of potentially highly informative plastid markers for Sanger-based phylogenetic studies in Opuntioideae (Table S4). The top 10 markers come mostly from our cpCDS dataset (accD, ycf1, ndhD, petD, ccsA, clpP, rpoC1 and rpoC2) and include just one intron (the trnK intron comprising the matK gene, trnK/matK) and one intergenic spacer (psbE-rpl20) (Table 3). However, two of the better ranked markers (accD and ycf1) are putative pseudogenes and must be treated separately from traditional protein coding genes.
Of the top 10 markers ranked in our list, just one (trnK/matK) has been used in more than one phylogenetic study in cacti (Nyffeler 2002;Edwards et al. 2005;Korotova et al. 2010;Demaio et al. 2011;Arakaki et al. 2011;Bárcenas et al. 2011;Hernández-Hernández et al. 2011;Hernández-Hernández et al. 2014;Ritz et al. 2012 However, accD intergenic spacers, such as rbcL-accD and accD-psaI, have been much more widely used across disparate groups (Barfuss et al. 2005;Miikeda et al. 2006;Reginato et al. 2010;Sun et al. 2012;Michelangeli et al. 2012). The ycf1 gene appears to be moderately used (Gernandt et al. 2009;Guo et al. 2012;Majure et al. 2012;Whitten et al. 2013;Shi et al. 2013;Dastpak et al. 2018), and is increasingly reported to be a useful marker in phylogenetic inference (Neubig et al. 2009;Neubig and Abbott 2010;Dong et al. 2012;Thomson et al. 2018) and the most promising plastid DNA barcode of land plants (Dong et al. 2015). The petD intron has been used Borsh et al. 2009;Worberg et al. 2007;Scataglini et al. 2013), but in our analysis the entire gene was used (exon + intron), showing phylogenetic utility. The ccsA gene seems to be underexplored as a phylogenetic marker (Marx et al. 2010;Peterson et al. 2012) but was already suggested as convenient for phylogenetic inference (Logacheva et al. 2007). The rpoC1 and rpoC2 genes have been occasionally used together as markers (Liston and Wheeler 1994;Kulshreshtha et al. 2004) or combined with other markers (GPWS 2001;Zhang et al. 2011;Downie et al. 2000;Guo et al. 2012), yielding satisfactory results. The rpoC2 gene was recently found to be the best performing marker for recovering, with high levels of concordance, the "accepted tree" of the angiosperm phylogeny ). The ndhD gene seems to be scarcely used for phylogenetic inference (Panero and Funk 2002), while the psbE-rpl20 intergenic spacer has, to our knowledge, never been used individually.
Eight of the top 10 markers are longer than 900 bp, indicating that longer genes are superior for phylogeny reconstruction, as previously suggested by Walker et al. (2019), although they may require internal primer design for complete Sanger sequencing. A list of the top 10 markers shorter than 900 bp is reported (Table S5), and primer pair design for the top five is provided in Table 4. The clpP gene (~ 359 bp) is the best ranked under this criterion (< 900 bp), and the intergenic spacer psbB-clpP (~ 547 bp) is also ranked in the top 10 of this list; thus, the primer pair design included them as one marker (psbB/clpP). Many of the top 10 markers shorter than 900 bp have been occasionally used in phylogenetic studies across disparate plant lineages with variable resolution, but never in cacti. The clpP gene is usually employed with its introns (Stefanović et al. 2009;Lam et al. 2016), and the psbB-clpP intergenic spacer has been increasingly reported as a useful marker (Loera et al. 2012;Särkinen and George 2013;Prince 2015). The intergenic spacer ycf4-cemA has been used for phylogenetic studies in Asteraceae and Poaceae genera Ekenas et al. 2007) and the rps2 gene in Orobanchaceae and Ephedraceae (McNeal et al. 2013;Manen et al. 2004;Loera et al. 2012). The petA gene appears to be rarely used (Tsumura et al. 1996), while the intergenic spacer petA-psbJ has been employed across various groups ( ndhE-psaC appears to have not yet been used as a marker for phylogenetic studies, but it was also reported as useful for phylogenetic and phylogeographic studies in Liliaceae (Lu et al. 2017). The intergenic spacer ndhC-rbcL is putatively exclusive to Cactaceae, resulting from the trnV-rbcL inversion, and its phylogenetic utility must be further investigated, at least in clades where ndh genes are present.
Chloroplast markers have been used for testing evolutionary relationships among plants for the past 30 years (Gitzendanner et al. 2018). Under the assumption that these markers evolve as a single unit without recombination, routine analyses have concatenated data, producing highly supported phylogenies that underlie the current classification of angiosperms (APG, 2016). However, as reported here, no single marker (gene tree) recovered the same topology as the plastome inference (concatenated tree), and even within the top 10 markers listed, some showed high values of discordance (Table 3 and Table S5). Such results call for caution with phylogenetic approaches based on one or a few markers.
Recent studies have explored gene tree conflicts in plastome-inferred phylogenies and incongruence between gene trees and species trees in plastid genes (

Conclusions
Chloroplast genomes have long been considered conserved among land plants, but the recent generation of thousands of plastomes through NGS has shown that this is not always the case. Cactaceae are no exception to the variations that have been observed. Previously reported plastomes of cacti have been shown to have lost one copy of the inverted repeat regions and several genes of the ndh gene suite, as well as to possess divergent inverted repeat regions and the smallest chloroplast genome known for an obligately photosynthetic angiosperm. We showed that the Opuntia quimilo plastome also presents deviations from canonical angiosperm plastomes, with an expansion of the LSC incorporating genes that are typically in the IRs, a reduction of the SSC translocating some common genes of the SSC into the IR region, and one massive translocation with an inversion of a block of genes in the LSC. Our reference-guided assemblies across Opuntioideae allowed us to infer putative independent losses of some ndh genes across disparate taxa of the subfamily. We did not find synapomorphic plastome features within Opuntioideae clades; thus, we hypothesize that putative rearrangements across the subfamily result from homoplasious events. Further analyses should be carried out to address how ecological drivers and morphological traits of cacti may be related to positive selection of genes and the common rearrangements in chloroplast genomes that have been reported in the family. Phylogenetic analyses of chloroplast genome sequences strongly support Opuntioideae and its three tribes: Opuntieae, Cylindropuntieae and Tephrocacteae. Since computational and budget limitations are still a bottleneck in dealing with high-throughput data, especially in developing countries, a list of highly informative plastid markers is presented for future use, and several top ranked markers have not been used in phylogenetic studies of cacti.
Nonetheless, gene tree discordance between plastome markers must be carefully considered when inferring phylogenies in this remarkable group of plants.

Figure 2.
Plastid genome structure and gene order in Opuntia quimilo compared with purslane (Portulaca oleracea). Purslane has the canonical order typical of most angiosperms. For simplicity, the circular map has been linearized. The green line highlights the trnMCAU-rbcL synapomorphic inversion of Cactaceae, which in O. quimilo also includes the trnVUAC gene. Regions I, IV, V, VI and VII are colinear in both plastomes. Region II is colinear but is translocated in the O. quimilo plastome, while region III is inverted and translocated. Region V comprises the genes that are typically in the IR region but are translocated to the large single copy in O. quimilo. Genes highlighted in orange are those typically found in the SSC but transferred to the IR region in O. quimilo. The orange dashed line indicates the double inversion in the ycf1-rpl32 genes, placing the ycf1 gene adjacent to rpl32. Black triangles represent duplicated genes present in purslane but absent in O. quimilo; LSC = large single-copy region; SSC = small single-copy region; IR = inverted repeat.

Figure 4.
The x-axis represents the base sequence of the alignment, and the y-axis represents the nucleotide diversity (π value). Each variation hotspot for the alignment of Opuntioideae chloroplast genome sequences is annotated on the graph.

Figure 5.
Maximum likelihood phylogenetic tree from the RAxML analysis, shown as a cladogram, with the phylogram shown at reduced size and scaled by substitution rate. All nodes have maximum bootstrap support (bs = 100), except for those with values shown above the branch. Each tip is shown with the assembly map of raw read coverage from the Geneious mapper to the Opuntia quimilo chloroplast genome (one IR stripped, represented at the top with annotated genes). Red stars represent low coverage mapping and putative losses associated with the ndh gene suite; green stars represent partial low coverage associated with putative pseudogenization of the ycf1, ycf2 and accD genes. Tribe Opuntieae is highlighted in orange, Tephrocacteae in green and Cylindropuntieae in yellow.